Stabilized Real-time Face Tracking via a Learned Dynamic Rigidity Prior


Our rigid stabilization method produces stable head poses under exaggerated facial expressions. We compare the baseline method without rigid stabilization (top) to our method (bottom) by applying both to virtual makeup and avatar retargeting on face-squeezing (left panel) and face-enlarging (right panel) expressions. Note the abrupt scale changes of the makeup and avatar in the baseline, caused by its unstable head-pose estimates in depth. We also show the rigidity weights from our learned dynamic rigidity prior, which provide per-face-region adaptivity in the rigid pose optimization.


Despite the popularity of real-time monocular face tracking systems in many successful applications, one overlooked problem with these systems is rigid instability. It occurs when the input facial motion can be explained by either a head pose change or a facial expression change, creating ambiguities that often lead to jittery and unstable rigid head poses under large expressions. Existing rigid stabilization methods either employ heavy anatomically motivated approaches that are unsuitable for real-time applications, or rely on heuristic rules that can be problematic under certain expressions. We propose the first rigid stabilization method for real-time monocular face tracking using a dynamic rigidity prior learned from realistic datasets. The prior is defined on a region-based face model and provides dynamic region-based adaptivity for rigid pose optimization during real-time performance. We introduce an effective offline training scheme to learn the dynamic rigidity prior by optimizing the convergence of the rigid pose optimization to the ground-truth poses in the training data. Our real-time face tracking system is an optimization framework that alternates between rigid pose optimization and expression optimization. To ensure tracking accuracy, we combine both robust, drift-free facial landmarks and dense optical flow into the optimization objectives. We evaluate our system extensively against state-of-the-art monocular face tracking systems and achieve significant improvement in tracking accuracy on a high-quality face tracking benchmark. Our system can improve facial-performance-based applications such as facial animation retargeting and virtual face makeup with accurate expressions and stable poses. We further validate the dynamic rigidity prior by comparing its tracking accuracy against other variants.
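The alternating optimization between rigid pose and expression described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the weighted Procrustes rigid solver, the linear blendshape solve, and all function names are assumptions, and the learned per-region rigidity weights are stood in by a per-vertex weight array that down-weights deformable regions during the rigid step.

```python
import numpy as np

def solve_rigid_pose(src, tgt, weights):
    """Weighted Procrustes: find rotation R and translation t minimizing
    sum_i w_i ||R src_i + t - tgt_i||^2 (a stand-in for the rigid pose step)."""
    w = weights / weights.sum()
    mu_s = (w[:, None] * src).sum(axis=0)
    mu_t = (w[:, None] * tgt).sum(axis=0)
    # Weighted cross-covariance between centered source and target points.
    cov = (w[:, None] * (src - mu_s)).T @ (tgt - mu_t)
    U, _, Vt = np.linalg.svd(cov)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_t - R @ mu_s
    return R, t

def solve_expression(neutral, blendshapes, tgt, R, t):
    """Linear least squares for blendshape coefficients in the pose-aligned
    frame (a stand-in for the expression optimization step)."""
    tgt_local = (tgt - t) @ R  # row-vector form of R^T (tgt_i - t)
    A = blendshapes.reshape(blendshapes.shape[0], -1).T  # (3N, K) basis matrix
    b = (tgt_local - neutral).ravel()
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coeffs

def track_frame(neutral, blendshapes, observed, rigidity, n_iters=5):
    """Alternate rigid pose and expression solves for one frame. `rigidity`
    plays the role of the learned dynamic rigidity weights (hypothetical)."""
    coeffs = np.zeros(blendshapes.shape[0])
    R, t = np.eye(3), np.zeros(3)
    for _ in range(n_iters):
        shaped = neutral + np.tensordot(coeffs, blendshapes, axes=1)
        R, t = solve_rigid_pose(shaped, observed, rigidity)
        coeffs = solve_expression(neutral, blendshapes, observed, R, t)
    return R, t, coeffs
```

In the paper's system the objectives also incorporate facial landmarks and dense optical flow, and the rigidity weights are predicted dynamically per region by the learned prior rather than fixed per vertex as here.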

Resources:    Paper »    Video »

@article{cao2018stabilized,
  author  = "Chen Cao and Menglei Chai and Oliver Woodford and Linjie Luo",
  title   = "Stabilized Real-time Face Tracking via a Learned Dynamic Rigidity Prior",
  journal = "ACM Transactions on Graphics (Proc. SIGGRAPH Asia)",
  year    = "2018",
  month   = "nov",
  volume  = "37",
  number  = "6"
}