Tuesday, April 29, 2008

3D Visual Detection of Correct NGT Sign Production (Lichtenauer 2007)

Summary:
In this paper, a 3D visual detection system is proposed to aid children in learning Dutch sign language (NGT). Two video cameras with wide-angle lenses capture 640x480 images at 25 frames per second. The user's head and hands are tracked by following skin-colored segments of the image from frame to frame. Adaptive skin color modeling determines how skin color appears under different lighting conditions, but must first be initialized by selecting some pixels within the face and a square of pixels surrounding the head. In practice, the colors of skin pixels were distributed bimodally because of two different light sources. Each mode was modeled separately to reduce misclassification of pixels as skin or non-skin. Gestures are classified using fifty hand features. Dynamic Time Warping is used to find the time correspondence between an input gesture and a reference gesture.
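To make the two-mode skin model concrete, here is a minimal sketch. It assumes each mode is a Gaussian with per-channel variance in a two-channel chromaticity space; the mode values, the threshold, and the function names are all hypothetical and are not taken from the paper.

```python
def mahalanobis_sq(pixel, mean, var):
    """Squared distance of a pixel from one color mode (diagonal covariance)."""
    return sum((p - m) ** 2 / v for p, m, v in zip(pixel, mean, var))


def is_skin(pixel, modes, threshold=9.0):
    """A pixel counts as skin if it lies close to ANY of the modes."""
    return any(mahalanobis_sq(pixel, mean, var) <= threshold
               for mean, var in modes)


# Hypothetical (mean, variance) pairs, one per light source.
MODES = [
    ((0.45, 0.31), (0.0004, 0.0004)),  # skin as lit by the first source
    ((0.38, 0.33), (0.0004, 0.0004)),  # skin as lit by the second source
]

print(is_skin((0.44, 0.31), MODES))  # close to the first mode
print(is_skin((0.20, 0.40), MODES))  # far from both modes
```

Keeping two tight modes instead of one wide model is what lets colors far from both light-source clusters be rejected.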
A set of 120 different NGT signs performed by 70 individuals was used to test the sign classification. Cross-validation was performed to effectively multiply the amount of test data. Overall, the true positive classification rate was 95%. Dynamic Time Warping only improved recognition of signs with repetitive motion.
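For readers unfamiliar with Dynamic Time Warping, the following is a sketch of the textbook algorithm on one-dimensional sequences. It is not the paper's implementation, which works on fifty hand features; the function name and the sample sequences are made up for illustration.

```python
def dtw_distance(a, b):
    """Textbook DTW cost between two 1-D sequences (absolute difference)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


reference = [0, 1, 2, 1, 0]
slow = [0, 0, 1, 1, 2, 2, 1, 1, 0, 0]  # same motion at half speed
other = [0, 2, 0, 2, 0]                # a different motion

print(dtw_distance(reference, slow))   # 0.0: only the timing differs
print(dtw_distance(reference, other))  # greater than 0
```

The half-speed sequence aligns with zero cost because warping absorbs the difference in timing, while the different motion does not.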

Discussion:
The classification algorithm relies on knowing the approximate start and end times of a sign. Some other segmentation scheme could be applied so that the hands need not always come to rest on the table between signs. As a teaching tool, requiring a rest between signs limits learning to single words; multi-gesture phrases and sentences cannot be recognized without segmentation.
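To illustrate what rest-based segmentation amounts to, here is a rough sketch that splits a stream of per-frame hand speeds into moving intervals. The speed threshold, the units, and the function name are assumptions for illustration only; the paper's own way of finding start and end times is not described here.

```python
def segment_by_rest(speeds, threshold=0.5):
    """Return (start, end) frame ranges where the hand is moving; end is exclusive."""
    segments, start = [], None
    for i, s in enumerate(speeds):
        moving = s > threshold
        if moving and start is None:
            start = i                      # motion begins
        elif not moving and start is not None:
            segments.append((start, i))    # hand came to rest
            start = None
    if start is not None:                  # still moving at the last frame
        segments.append((start, len(speeds)))
    return segments


# Hypothetical per-frame hand speeds: rest, sign, rest, sign, rest.
speeds = [0.0, 0.1, 2.0, 3.0, 1.5, 0.2, 0.0, 1.0, 2.5, 0.1]
print(segment_by_rest(speeds))  # [(2, 5), (7, 9)]
```

A scheme like this fails as soon as one sign flows directly into the next, which is why phrases would need a different segmentation approach.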
The left and right blobs identified during skin tracking are assigned to the left and right hands, regardless of which hand is actually on which side of the body. Crossing the hands will therefore cause the skin blobs to be mislabeled, which makes it possible for two distinct gestures (the same basic motion, but with hands crossed in one and uncrossed in the other) to be classified as the same sign.
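The labeling problem can be shown with a toy example. This sketch assumes labels are assigned purely by horizontal image position; the hand names, coordinates, and function name are hypothetical.

```python
def label_by_side(hand_x):
    """Map each hand to 'left' or 'right' using only its x coordinate."""
    ordered = sorted(hand_x, key=hand_x.get)  # smallest x first
    return {ordered[0]: "left", ordered[1]: "right"}


# True hand identities with hypothetical x positions in the image.
uncrossed = {"hand_A": 100, "hand_B": 300}
crossed = {"hand_A": 320, "hand_B": 90}

print(label_by_side(uncrossed))  # hand_A is labeled left
print(label_by_side(crossed))    # hand_A is now labeled right
```

Once the hands cross, the same physical hand receives the opposite label, so features computed per label no longer describe the same hand.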
