[LGPR] Publication Success Series - Dr SKB Sangeetha in Scientific Reports

Dr SKB Sangeetha, LGPR Postdoctoral Candidate at LUC Malaysia.

Scientific Reports 15, 41825 (2025)

Journal Indexes: Impact factor 3.9

Indexing: SCI, SCIE & EI Compendex

H-Index: 347

Aims & Scope

Scientific Reports is an open access journal publishing original research from across all areas of the natural sciences, psychology, medicine and engineering.


Spatiotemporal multimodal emotion recognition using Temporal video sequences and pose features for child emotion classification

SKB Sangeetha

Lincoln University College, Petaling Jaya, Malaysia.

Raja Sarath Kumar Boddu

CSE (AI&ML) Department, Raghu Engineering College, Visakhapatnam, India

Amiya Bhaumik

Lincoln University College, Petaling Jaya, Malaysia.

Sandeep Kumar Mathivanan

School of Computing Science and Engineering, Galgotias University, Greater Noida, 203201, India

Usha Moorthy

School of Computer Engineering, Manipal Institute of Technology Bengaluru, Manipal Academy of Higher Education, Manipal, India


Abstract

Developmental psychology and affective computing have placed great emphasis on identifying children’s emotional cues in recent times. In this study, a novel Spatio-Temporal Multimodal Emotion Recognition Network (ST-MERN) for child emotion classification is proposed. Dense feature embeddings of the EmoReact dataset and temporal video sequences are utilized for the study. The proposed method uses 115 continuous frames per visual signal instance, e.g., rotational-translational vectors, facial keypoints, and pose predictions. With steady performance on each frame and a mean confidence of 0.967, this ensures the system maintains good detection fidelity. In order to track subtle emotional changes, our method captures dynamic data like scale variation and frame-to-frame variation (rx, ry, rz, tx, ty). Latent features (p24–p33) provide a profound explanation of emotional states. The model is designed to preserve spatiotemporal consistency and improve emotion recognition by combining these features. Curiosity, uncertainty, excitement, happiness, surprise, disgust, fear, frustration, and valence are the nine categories on which the system categorizes children’s emotional states. Preliminary results show that our system effectively captures expressive nuances, with stable pose data and low feature variability across sequences. The model surpassed earlier models such as LSTM and TCN in generalization, with a high validation accuracy of 93.6% and test accuracy of 94.3% for the BiLSTM-based architecture. The BiLSTM model had enhanced classification capacity for different emotional states with an F1-score of 0.92. The TCN model is well-suited to real-time deployment since it recorded a competitive test accuracy of 91.7% with quick inference times of ~ 0.8 s per clip, even though it was slightly slower than the BiLSTM. 
With an F1-score of 0.89 and test accuracy of 90.2%, the LSTM model performed robustly; it trained faster than the BiLSTM and TCN, although its accuracy was slightly lower. By providing strong and interpretable classification that is sensitive to the dynamic nature of children’s emotional displays, this technique improves emotion detection in children. Our work provides the foundation for socially sensitive systems, therapy treatments, and affect-conscious education materials.
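
For readers who want a concrete picture of the "frame-to-frame variation" the abstract mentions, the following is a minimal sketch rather than the authors' implementation. It assumes each frame is a dictionary holding the pose parameters named in the abstract (rx, ry, rz, tx, ty) and that a clip has 115 frames. The function names and the synthetic clip values are hypothetical and are not taken from the paper or from the EmoReact dataset.

```python
from statistics import mean

# Parameter names and clip length come from the abstract; everything else is assumed.
POSE_KEYS = ("rx", "ry", "rz", "tx", "ty")
FRAMES_PER_CLIP = 115


def frame_deltas(frames):
    """Return the change in each pose parameter between consecutive frames."""
    return [
        {key: cur[key] - prev[key] for key in POSE_KEYS}
        for prev, cur in zip(frames, frames[1:])
    ]


def mean_abs_variation(frames):
    """Average absolute frame-to-frame change for each pose parameter."""
    deltas = frame_deltas(frames)
    return {key: mean(abs(d[key]) for d in deltas) for key in POSE_KEYS}


# Synthetic clip with made-up values: rx drifts linearly, the rest stay constant.
clip = [
    {"rx": 0.01 * i, "ry": 0.0, "rz": 0.0, "tx": 1.0, "ty": 2.0}
    for i in range(FRAMES_PER_CLIP)
]
variation = mean_abs_variation(clip)
```

A summary like `variation` is one simple way to quantify how stable the pose data is across a sequence; a sequence model such as the BiLSTM described in the abstract would instead consume the per-frame values directly.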

Cite this article

Sangeetha, S., Boddu, R.S.K., Bhaumik, A. et al. Spatiotemporal multimodal emotion recognition using Temporal video sequences and pose features for child emotion classification. Sci Rep 15, 41825 (2025). https://doi.org/10.1038/s41598-025-25813-8



Multiclass ROC curve comparison.

 
 
 
