Emotion Recognition From Posed And Non-Posed Facial Expressions

R., Rashmi Adyapady

Please use this identifier to cite or link to this item: https://idr.l3.nitk.ac.in/jspui/handle/123456789/17718

Title:	Emotion Recognition From Posed And Non-Posed Facial Expressions
Authors:	R., Rashmi Adyapady
Supervisors:	B, Annappa
Keywords:	Facial Expression Recognition;Micro-Expression Recognition;Convolutional Neural Network;Posed Expressions
Issue Date:	2023
Publisher:	National Institute Of Technology Karnataka Surathkal
Abstract:	Facial Emotion Recognition is an important research topic in computer vision and artificial intelligence. It plays a vital role in analyzing a user's current behavioral state through their expressions. Recognizing emotion from the face, usually called Facial Expression Recognition (FER), has been addressed extensively by researchers in affective computing. Applications of this research include improving human-machine interaction, inferring another person's mental state and intentions, and detecting lies and crime, which improves safety and allows early action in emergencies. Although emotion recognition can be conducted using multiple sensors, this work focuses on facial images because visual expressions are one of the main information channels in interpersonal communication. It is challenging for machines to recognize emotions as humans do, because expressions vary with time, intensity, and appearance. Variations on the face, such as occlusions, rotation, illumination changes, and accessories, degrade recognition performance. This research presents an overview of facial expression recognition techniques based on machine learning and deep learning algorithms, classifies posed and non-posed expressions, and builds an automated system to recognize engagement levels. The first objective was to select relevant features, reduce dimensionality, and detect non-posed expressions using an ensemble model. A Micro-Expression Recognition (MER) system is proposed that uses Delaunay Triangulation (DT) and Voronoi Diagram (VD) approaches to retrieve Regions of Interest (ROIs) based on the Action Unit (AU) descriptions. Finally, the extracted features are concatenated and fed into an ensemble of machine learning models for classification.
The geometric and texture features retrieved from the ROIs complemented each other, giving better performance in distinguishing minute changes in facial areas. The observations obtained from this experiment using the selected features on the micro-expression (ME) databases are reported. The proposed MER system performed well in recognizing non-posed expressions, with accuracies of 76.47% and 67.19% on the micro-expression databases. The second objective was to detect the presence of occlusions, which make it difficult to localize and detect the facial region and result in substantial intra-expression variability. The primary goal of this part of the work is to identify facial occlusions and minimize data loss during face recognition. Hence, an Xception network with a residual attention mechanism (Xcep-RA), together with the Gradient-weighted Class Activation Mapping (Grad-CAM) visualization technique, is proposed to localize facial occlusions and combat erroneous predictions. The model achieved accuracies of 99.85% and 98.95% on the LFW-mask and RMFD datasets, respectively. Detecting posed expressions using a deep neural network is the next objective of this thesis. Two models have been developed to recognize posed expressions under pose variations: an ensemble model with a frequency-based voting approach (FV-EffNet) and one with a stacking classifier approach (SC-EffNet). Both are adopted to deal with profile and frontal pose variations and to classify the posed expressions into their respective classes. The extracted features are fed into the base classifier and passed through the meta classifier to analyze the data pattern and obtain accurate predictions. A stacking classifier is used because it decreases the risk of getting varied outputs from different machine learning classifiers. The proposed multi-stage posed expression classification model achieved accuracies of 98.71% and 98.56%, respectively, making the system robust against pose variations.
The assessment of engagement levels from visual cues is the final objective of this thesis. The Facial Engagement Analysis-Network (FEA-Net) is proposed for assessing learner engagement in Massive Open Online Course (MOOC) scenarios, which could help reduce dropout rates and overcome some educational problems by improving the quality of learning. In this work, spatial and temporal features generated by a Convolutional Recurrent Neural Network (CRNN) are fused with OpenFace features in FEA-Net, which helps discern the engaged state and improves the classification of engagement levels. The model achieved an accuracy of 62.16%. The proposed models have been evaluated on publicly available datasets, and their performance is compared against state-of-the-art systems.
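Illustrative note (not part of the thesis): the abstract does not define the frequency-based voting used in FV-EffNet. One common reading is hard majority voting, in which each base model predicts a label and the most frequent label wins. The sketch below assumes that reading; the function name frequency_vote, the example labels, and the tie-breaking rule (the tied label seen first, in model order, wins) are assumptions for illustration only.

```python
from collections import Counter
from typing import List, Sequence


def frequency_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Fuse per-model label predictions by majority (frequency) vote.

    predictions[m][i] is the label that model m assigns to sample i.
    Assumption: ties go to the tied label that appears first in model order.
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")

    fused = []
    # zip(*predictions) groups the labels that all models gave to one sample.
    for sample_labels in zip(*predictions):
        # most_common(1) yields the label with the highest count; labels with
        # equal counts keep their first-encountered order.
        label, _count = Counter(sample_labels).most_common(1)[0]
        fused.append(label)
    return fused


# Hypothetical predictions from three base models on three images.
model_a = ["happy", "sad", "angry"]
model_b = ["happy", "fear", "angry"]
model_c = ["sad", "fear", "angry"]
fused_labels = frequency_vote([model_a, model_b, model_c])
```

A stacking classifier differs in that a meta classifier is trained on the base models' outputs instead of counting votes, so it can learn which base model to trust for which class.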
URI:	http://idr.nitk.ac.in/jspui/handle/123456789/17718
Appears in Collections:	1. Ph.D Theses

Files in This Item:

File	Description	Size	Format
177029-CO004-Rashmi-Adyapady-R.pdf		17.9 MB	Adobe PDF