Accurate real-time full-body motion capture using a single depth camera


This paper proposes a method for real-time motion capture that combines a tracking step with a detection step. In the tracking step, the authors use a MAP framework to find the pose most similar to the observed depth data, but this optimization can get stuck in a local minimum. The detection step is therefore used to re-initialize the pose whenever the system falls into a local minimum.
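The interplay between the two steps can be sketched as a simple per-frame loop. This is only an illustration under stated assumptions: the names `run_capture`, `track`, `detect`, `residual` and `threshold` are hypothetical, and the failure test (a residual above a threshold) is a stand-in, since these notes do not say how the paper decides that tracking has failed.

```python
from typing import Callable, List, Optional, Sequence


def run_capture(
    frames: Sequence[float],
    track: Callable[[float, float], float],
    detect: Callable[[float], float],
    residual: Callable[[float, float], float],
    threshold: float,
) -> List[float]:
    """Track frame by frame; re-initialize from detection when the fit is poor."""
    poses: List[float] = []
    previous: Optional[float] = None
    for frame in frames:
        if previous is None:
            # No history yet: start the local optimizer from a detected pose.
            pose = track(frame, detect(frame))
        else:
            pose = track(frame, previous)
            # Hypothetical failure test: a large residual suggests a local minimum.
            if residual(frame, pose) > threshold:
                pose = track(frame, detect(frame))
        poses.append(pose)
        previous = pose
    return poses


# Toy 1D stand-ins: the "pose" is a single number that should match the frame.
def toy_track(frame: float, init: float) -> float:
    # Local optimizer: can move at most 1.0 away from its starting point.
    step = max(-1.0, min(1.0, frame - init))
    return init + step


detect_calls: List[float] = []


def toy_detect(frame: float) -> float:
    # Coarse detector: returns only a rounded estimate.
    detect_calls.append(frame)
    return float(round(frame))


def toy_residual(frame: float, pose: float) -> float:
    return abs(frame - pose)


poses = run_capture([0.0, 0.5, 10.0, 10.2], toy_track, toy_detect, toy_residual, threshold=0.5)
```

In this toy run the jump from 0.5 to 10.0 is too large for the local tracker, so the detector is called once more to supply a new starting point, and tracking continues normally afterwards.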

Here are the steps of the system framework.

1 Real-time pose tracking.

C is the input data at the current frame, consisting of a depth map D and a binary silhouette image S. We want to infer from C the most probable skeletal pose q for the current frame, given the sequence of m previously reconstructed poses, denoted $Q_m = [q_{-1}, \ldots, q_{-m}]$. We estimate the most likely pose by solving the following MAP estimation problem:

$$q^* = \arg\max_q \Pr(q \mid C, Q_m)$$

where $\Pr(\cdot \mid \cdot)$ denotes the conditional probability. Using Bayes’ rule, we obtain

$$q^* = \arg\max_q \Pr(C \mid q, Q_m)\,\Pr(q \mid Q_m)$$

Assuming that C is conditionally independent of $Q_m$ given q, we can write

$$q^* = \arg\max_q \Pr(C \mid q)\,\Pr(q \mid Q_m)$$

where the first term is the likelihood term, which measures how well the reconstructed pose q matches the current observation data C, and the second term is the prior term, which describes the prior distribution of the current pose q given the previously reconstructed poses $Q_m$.
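To make the likelihood-times-prior structure concrete, here is a minimal 1D sketch that picks the candidate pose maximizing the sum of a log-likelihood and a log-prior. Everything specific in it is an assumption for illustration: the pose is a single number, both terms are Gaussian, the prior simply prefers poses close to the most recent previous pose, and the search is a brute-force scan over candidates rather than the optimizer used in the paper.

```python
from typing import List, Sequence


def log_likelihood(q: float, c: float, sigma_obs: float) -> float:
    # Toy likelihood term: how well pose q explains the observation c.
    return -0.5 * ((c - q) / sigma_obs) ** 2


def log_prior(q: float, previous: Sequence[float], sigma_prior: float) -> float:
    # Toy prior term (assumption): stay close to the latest reconstructed pose,
    # stored here as the last element of `previous`.
    return -0.5 * ((q - previous[-1]) / sigma_prior) ** 2


def map_estimate(
    c: float,
    previous: Sequence[float],
    candidates: Sequence[float],
    sigma_obs: float = 1.0,
    sigma_prior: float = 1.0,
) -> float:
    # argmax over q of log Pr(C | q) + log Pr(q | previous poses)
    return max(
        candidates,
        key=lambda q: log_likelihood(q, c, sigma_obs) + log_prior(q, previous, sigma_prior),
    )


candidates: List[float] = [i / 100.0 for i in range(0, 301)]
q_star = map_estimate(c=2.0, previous=[0.8, 1.0], candidates=candidates)
```

With equal standard deviations the estimate lands halfway between the observation (2.0) and the latest previous pose (1.0), which shows how the prior pulls the solution away from a pure data fit.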

Given a “hypothesized” joint-angle pose q, a forward kinematics function maps the local coordinates $p_0$ of a surface point on the k-th bone segment to its global 3D coordinates p. We denote the global 3D coordinates of these “rendered” 3D points by p(q). All “rendered” 3D points are then projected onto the 2D image space using the calibrated camera parameters, which yields a “rendered” depth image $D_{render}$ under the current camera viewpoint.
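The rendering step described above can be sketched in a few lines. This is a simplified illustration, not the paper's implementation: the skeleton is assumed to be a planar chain of equal-length bones rotating in the image plane, the camera is an ideal pinhole with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, and a depth value of 0 marks pixels where no surface point lands.

```python
import math
from typing import List, Sequence, Tuple

Point3 = Tuple[float, float, float]


def forward_kinematics(
    q: Sequence[float],
    k: int,
    p_local: Point3,
    bone_length: float = 1.0,
    root: Point3 = (0.0, 0.0, 2.0),
) -> Point3:
    """Map a point given in the local frame of bone k to global coordinates."""
    x, y, z = root
    angle = 0.0
    for j in range(k + 1):
        angle += q[j]  # joint angles accumulate along the chain
        if j < k:
            # Advance to the start of the next bone.
            x += bone_length * math.cos(angle)
            y += bone_length * math.sin(angle)
    lx, ly, lz = p_local
    gx = x + lx * math.cos(angle) - ly * math.sin(angle)
    gy = y + lx * math.sin(angle) + ly * math.cos(angle)
    return (gx, gy, z + lz)


def render_depth(
    points: Sequence[Point3],
    width: int,
    height: int,
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[List[float]]:
    """Project 3D points with a pinhole model, keeping the nearest depth per pixel."""
    depth = [[0.0] * width for _ in range(height)]
    for X, Y, Z in points:
        if Z <= 0.0:
            continue  # behind the camera
        u = int(round(fx * X / Z + cx))
        v = int(round(fy * Y / Z + cy))
        if 0 <= u < width and 0 <= v < height:
            if depth[v][u] == 0.0 or Z < depth[v][u]:
                depth[v][u] = Z
    return depth


q = [0.0, math.pi / 2]
surface = [(0, (0.5, 0.0, 0.0)), (1, (0.5, 0.0, 0.0)), (0, (0.5, 0.0, -0.5))]
points = [forward_kinematics(q, k, p_local) for k, p_local in surface]
rendered = render_depth(points, width=9, height=9, fx=4.0, fy=4.0, cx=4.0, cy=4.0)
```

Two of the three sample points fall on the same pixel, and the nearer one wins, which is the behaviour one expects from a depth camera viewing an opaque surface.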

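Assuming Gaussian noise with a standard deviation of $\sigma_{depth}$ for each pixel x, we obtain the following likelihood term for depth image registration:

$$\Pr(D \mid q) \propto \exp\left(-\frac{\lVert D_{render}(x(q)) - D \rVert^2}{2\sigma_{depth}^2}\right)$$

where $D_{render}$ is the “rendered” depth image and x(q) is a column vector containing the pixel coordinates of the “rendered” image.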
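Under the Gaussian noise assumption, maximizing this likelihood is the same as minimizing a sum of squared depth differences scaled by the noise variance. The sketch below computes that negative log-likelihood (up to an additive constant) for two small depth images. The function name and the choice to skip pixels with a depth of 0 in either image are assumptions made for this example, not details taken from the paper.

```python
from typing import List, Sequence


def depth_neg_log_likelihood(
    rendered: Sequence[Sequence[float]],
    observed: Sequence[Sequence[float]],
    sigma_depth: float,
) -> float:
    """Sum of squared depth residuals divided by 2 * sigma^2."""
    energy = 0.0
    for row_r, row_o in zip(rendered, observed):
        for d_r, d_o in zip(row_r, row_o):
            # Assumption: 0 marks a missing depth value, so only pixels
            # valid in both images contribute.
            if d_r > 0.0 and d_o > 0.0:
                energy += (d_r - d_o) ** 2 / (2.0 * sigma_depth ** 2)
    return energy


observed: List[List[float]] = [[2.0, 2.0], [0.0, 1.5]]
good_render: List[List[float]] = [[2.0, 2.0], [0.0, 1.5]]
bad_render: List[List[float]] = [[2.2, 2.0], [0.0, 1.0]]

energy_good = depth_neg_log_likelihood(good_render, observed, sigma_depth=0.1)
energy_bad = depth_neg_log_likelihood(bad_render, observed, sigma_depth=0.1)
```

A pose whose rendered depth matches the observation gets zero energy, while a mismatched pose is penalized more heavily as `sigma_depth` shrinks.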

posted on 2013-01-30 21:49 by leo_leo
