Stabilized Real-time Face Tracking via a Learned Dynamic Rigidity Prior

Despite the popularity of real-time monocular face tracking systems in many successful applications, one overlooked problem with these systems is rigid instability. It occurs when the input facial motion can be explained by either head pose change or facial expression change, creating ambiguities that often lead to jittery and unstable rigid head poses under large expressions. Existing rigid stabilization methods either employ a heavy anatomically-motivated approach that is unsuitable for real-time applications, or utilize heuristic-based rules that can be problematic under certain expressions. We propose the first rigid stabilization method...

3D Hair Synthesis Using Volumetric Variational Autoencoders

Recent advances in single-view 3D hair digitization have made the creation of high-quality CG characters scalable and accessible to end-users, enabling new forms of personalized VR and gaming experiences. To handle the complexity and variety of hair structures, most cutting-edge techniques rely on the successful retrieval of a particular hair model from a comprehensive hair database. Not only are the aforementioned data-driven methods storage intensive, but they are also prone to failure for highly unconstrained input images, exotic hairstyles, or failed face detection. Instead of using a large collection of 3D...

Unsupervised Domain Adaptation for 3D Keypoint Estimation via View Consistency

In this paper, we introduce a novel unsupervised domain adaptation technique for the task of 3D keypoint prediction from a single depth scan/image. Our key idea is to utilize the fact that predictions from different views of the same or similar objects should be consistent with each other. Such view consistency provides effective regularization for keypoint prediction on unlabeled instances. In addition, we introduce a geometric alignment term to regularize predictions in the target domain. The resulting loss function can be effectively optimized via alternating minimization. We demonstrate the effectiveness...

StarMap for Category-Agnostic Keypoint and Viewpoint Estimation

Semantic keypoints provide concise abstractions for a variety of visual understanding tasks. Existing methods define semantic keypoints separately for each category with a fixed number of semantic labels. As a result, these keypoints are not suitable when objects have a varying number of parts, e.g., chairs with a varying number of legs. We propose a category-agnostic keypoint representation encoded with their 3D locations in the canonical object views. Our intuition is that the 3D locations of the keypoints in canonical object views contain rich semantic and compositional information. Our representation thus...

Deep Volumetric Video From Very Sparse Multi-View Performance Capture

We present a deep learning-based volumetric capture approach for performance capture using a passive and highly sparse multi-view capture system. We focus on a template-free, per-frame 3D surface reconstruction from as few as three RGB sensors, where conventional visual hull or multi-view stereo methods would fail. State-of-the-art performance capture systems require either pre-scanned actors, a large number of cameras, or active sensors. We introduce a novel multi-view Convolutional Neural Network (CNN) that maps 2D images to a 3D volumetric field that encodes the probabilistic distribution of surface points of the captured...

AutoScaler: Scale-Attention Networks for Visual Correspondence

Finding visual correspondence between local features is key to many computer vision problems. While defining features with larger contextual scales usually implies greater discriminativeness, it could also lead to less spatial accuracy of the features. We propose AutoScaler, a scale-attention network to explicitly optimize this trade-off in visual correspondence tasks. Our architecture consists of a […]

Customized Expression Recognition for Performance-Driven Cutout Character Animation

Figure: Performance-driven cutout character animation. Actors perform customized expressions in (a), e.g. “disdainful” (top) and “daydreaming” (bottom), to animate the expressions of various cutout characters in (b). Note the large inter-person expression variations even within the same expression category.

Performance-driven character animation enables users to create expressive results by performing the desired motion of […]

High-Quality Hair Modeling from A Single Portrait Photo

We propose a novel system to reconstruct a high-quality hair depth map from a single portrait photo with minimal user input. We achieve this by combining depth cues such as occlusions, silhouettes, and shading, with a novel 3D helical structural prior for hair reconstruction. We fit a parametric morphable face model to the input photo and construct a base shape in the face, hair and body regions using occlusion and silhouette constraints. We then estimate the normals in the hair region via a Shape-from-Shading-based optimization that uses the lighting inferred...

Level-Set-Based Partitioning and Packing Optimization of a Printable Model

As 3D printing technology starts to revolutionize our daily life and the manufacturing industries, a critical problem is about to emerge: how can we find an automatic way to divide a 3D model into multiple printable pieces, so as to save space, to reduce printing time, or to make a large model printable by small printers? In this paper, we present a systematic study on the partitioning and packing of 3D models under the multi-phase level set framework. We first construct analysis tools to evaluate the qualities...

Single-View Hair Modeling Using A Hairstyle Database

Hair digitization has been one of the most critical and challenging tasks necessary to create virtual avatars. Most existing hair modeling techniques require either expensive capture devices or tedious manual effort. In this paper, we present a data-driven approach to create a complex 3D hairstyle from a single-view photograph. We first build a database of 343 manually created 3D example hairstyles from some online repositories. Given a reference photo of the target hairstyle and a few user strokes as guidance, we automatically search for several best matching...

Capturing Braided Hairstyles

From fishtail to princess braids, these intricately woven structures define an important and popular class of hairstyle, frequently used for digital characters in computer graphics. In addition to the challenges created by the infinite range of styles, existing modeling and capture techniques are particularly constrained by the geometric and topological complexities. We propose a data-driven method to automatically reconstruct braided hairstyles from input data obtained from a single consumer RGB-D camera. Our approach covers the large variation of repetitive braid structures using a family of compact procedural braid models. From...

Robust Hair Capture Using Simulated Examples

We introduce a data-driven hair capture framework based on example strands generated through hair simulation. Our method can robustly reconstruct faithful 3D hair models from unprocessed input point clouds with large amounts of outliers. Current state-of-the-art techniques use geometrically-inspired heuristics to derive global hair strand structures, which can yield implausible hair strands for hairstyles involving large occlusions, multiple layers, or wisps of varying lengths. We address this problem using a voting-based fitting algorithm to discover structurally plausible configurations among the locally grown hair segments from a database of simulated examples....

3D Self-Portraits

We develop an automatic pipeline that allows ordinary users to capture complete and fully textured 3D models of themselves in minutes, using only a single Kinect sensor, in the uncontrolled lighting environment of their own home. Our method requires neither a turntable nor a second operator, and is robust to the small deformations and changes of pose that inevitably arise during scanning. After the users rotate themselves with the same pose for a few scans from different views, our system stitches together the captured scans using multi-view non-rigid registration, and...

Structure-Aware Hair Capture

Existing hair capture systems fail to produce strands that reflect the structures of real-world hairstyles. We introduce a system that reconstructs coherent and plausible wisps aware of the underlying hair structures from a set of still images without any special lighting. Our system first discovers locally coherent wisp structures in the reconstructed point cloud and the 3D orientation field, and then uses a novel graph data structure to reason about both the connectivity and directions of the local wisp structures in a global optimization. The wisps are then completed and...

Wide-baseline Hair Capture using Strand-based Refinement

We propose a novel algorithm to reconstruct the 3D geometry of human hair in wide-baseline setups using strand-based refinement. The hair strands are first extracted in each 2D view, and projected onto the 3D visual hull for initialization. The 3D positions of these strands are then refined by optimizing an objective function that takes into account cross-view hair orientation consistency, the visual hull constraint, and smoothness constraints defined at the strand, wisp and global levels. Based on the refined strands, the algorithm can reconstruct an approximate hair surface: experiments with...

Chopper: Partitioning Models into 3D-Printable Parts

3D printing technology is rapidly maturing and becoming ubiquitous. One of the remaining obstacles to wide-scale adoption is that the object to be printed must fit into the working volume of the 3D printer. We propose a framework, called Chopper, to decompose a large 3D object into smaller parts so that each part fits into the printing volume. These parts can then be assembled to form the original object. We formulate a number of desirable criteria for the partition, including assemblability, having few components, unobtrusiveness of the seams, and structural...

Multi-View Hair Capture Using Orientation Fields

Reconstructing realistic 3D hair geometry is challenging due to omnipresent occlusions, complex discontinuities and specular appearance. To address these challenges, we propose a multi-view hair reconstruction algorithm based on orientation fields with structure-aware aggregation. Our key insight is that while hair's color appearance is view-dependent, the response to oriented filters that captures the local hair orientation is more stable. We apply the structure-aware aggregation to the MRF matching energy to enforce the structural continuities implied by the local hair orientations. Multiple depth maps from the MRF optimization are then fused...

Temporally Coherent Completion of Dynamic Shapes

We present a novel shape completion technique for creating temporally coherent watertight surfaces from real-time captured dynamic performances. Because of occlusions and low surface albedo, scanned mesh sequences typically exhibit large holes that persist over extended periods of time. Most conventional dynamic shape reconstruction techniques rely on template models or assume slow deformations in the input data. Our framework sidesteps these requirements and directly initializes shape completion with topology derived from the visual hull. To seal the holes with patches that are consistent with the subject's motion, we first minimize...

Estimating the Laplace-Beltrami Operator by Restricting 3D Functions

We present a novel approach for computing and solving the Poisson equation over the surface of a mesh. As in previous approaches, we define the Laplace-Beltrami operator by considering the derivatives of functions defined on the mesh. However, in this work, we explore a choice of functions that is decoupled from the tessellation. Specifically, we use basis functions (second-order tensor-product B-splines) defined over 3D space, and then restrict them to the surface. We show that in addition to being invariant to mesh topology, this definition of the Laplace-Beltrami operator allows...

Video Diver: Generic Video Indexing with Diverse Features

Semantic video indexing is critical for practical video retrieval systems, and a generic and scalable indexing framework is a must for indexing a large semantic lexicon with over 1000 concepts. This paper fully explores the idea of incorporating many kinds of diverse features into a single framework, combining them to obtain a larger degree of invariance than any of the component features offers alone, and thus achieving genericness and scalability. We scale down the formidable computational expense with a clever design of the classification and fusion schemes. To...