A Master of Science thesis in Mechatronics Engineering by Muhannad Alkaddour entitled “Face Flow: Constrained Optical Flow Framework for Faces,” submitted in May 2020. The thesis advisors are Dr. Usman Tariq and Dr. Abhinav Dhall. A soft copy is available (Thesis, Approval Signatures, Completion Certificate, and AUS Archives Consent Form).
In computer vision, models and methods for facial expression recognition are under continual development. Several models aim to describe the highly complex structure of different faces, which in turn allows researchers to process faces digitally for various tasks. Since the rise of deep learning in 2012, many works have used deep networks to learn facial expressions from both static images and dynamic image sequences. Optical flow algorithms are a main source of dynamic features. Given a sequence of frames, these algorithms predict where each pixel moves from one frame to the next. Recent optical flow algorithms based on deep learning employ frameworks similar to those of deep convolutional autoencoders. Faces have a distinctive structure, so it is reasonable to expect that optical flow on faces should be constrained by the physically allowable movements of facial features. Combined with robust facial alignment algorithms, good optical flow estimates for faces can serve as features for emotion recognition in robots. Such vision-based techniques can help a robot interact better with humans by incorporating affect recognition, which adds a psychological element to the interaction. To carry out this investigation, we propose to construct a dataset with ground truth optical flow generated by observing the deformation of face keypoints and their neighborhoods between any two consecutive face images. The dataset is then used to train FlowNetS, a deep network specialized in learning optical flow, with the aim of inferring the constrained optical flow from a given pair of face images. In the testing phase, the network trained with this dataset is compared to other setups, and the overall results show that using the generated data during training helps the network predict better optical flow representations on face sequences.
The results of this thesis can serve as a precursor to extracting and using dynamic features for unsupervised learning of facial expressions, which is important in applications such as human-robot interaction, online learning, and electronic consumer relationship management.
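
As an illustration of the idea of deriving a dense flow field from the displacement of face keypoints between two frames, the following is a minimal sketch only. It is not the procedure used in the thesis: the inverse-distance weighting scheme, the function names `dense_flow_from_keypoints` and `average_endpoint_error`, and the parameters `power` and `eps` are all assumptions introduced here for illustration. The average endpoint error shown is a commonly used way to compare a predicted flow field against a reference one.

```python
import numpy as np


def dense_flow_from_keypoints(kp_a, kp_b, height, width, power=2.0, eps=1e-6):
    """Spread sparse keypoint displacements over an image grid.

    Assumption: inverse-distance weighting stands in for whatever
    neighborhood deformation model the thesis actually uses.
    kp_a, kp_b: (K, 2) arrays of (x, y) keypoints in frames A and B.
    Returns an (height, width, 2) array of (dx, dy) per pixel.
    """
    kp_a = np.asarray(kp_a, dtype=np.float64)
    kp_b = np.asarray(kp_b, dtype=np.float64)
    disp = kp_b - kp_a  # (K, 2) displacement of each keypoint

    # Pixel coordinates as (x, y) pairs, flattened to (H*W, 2).
    ys, xs = np.mgrid[0:height, 0:width]
    pixels = np.stack([xs, ys], axis=-1).reshape(-1, 2).astype(np.float64)

    # Distance from every pixel to every keypoint in frame A: (H*W, K).
    dist = np.linalg.norm(pixels[:, None, :] - kp_a[None, :, :], axis=-1)

    # Closer keypoints get larger weights; each row is normalized to sum to 1.
    weights = 1.0 / (dist ** power + eps)
    weights /= weights.sum(axis=1, keepdims=True)

    flow = weights @ disp  # weighted average of keypoint displacements
    return flow.reshape(height, width, 2)


def average_endpoint_error(flow_pred, flow_ref):
    """Mean Euclidean distance between predicted and reference flow vectors."""
    return float(np.mean(np.linalg.norm(flow_pred - flow_ref, axis=-1)))
```

A usage example under the same assumptions: calling `dense_flow_from_keypoints([[2, 5], [7, 1]], [[3, 5], [7, 3]], 10, 10)` yields a 10 by 10 field in which the flow near each keypoint closely follows that keypoint's own displacement and blends smoothly in between. A field like this could then be compared against a network's prediction with `average_endpoint_error`.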