Friday, May 2, 2008

A Survey of Hand Posture and Gesture Recognition Techniques and Technology (LaViola 1999)

Summary:
This aptly named paper presents a variety of gesture and posture recognition techniques. The techniques are loosely grouped as feature extraction and modeling, learning algorithms, or miscellaneous. Template matching is one of the simplest techniques to implement and is accurate for small sets of postures, but it is not suited for gestures. Feature extraction and analysis uses a layered architecture that can handle both postures and gestures, but it can be computationally expensive if large numbers of features are extracted. Active shape models allow for real-time recognition but track only the open hand. Principal component analysis can recognize around thirty postures, but it requires training by more than one person to achieve accurate results for multiple users. Linear fingertip models are concerned only with the starting and ending points of fingertips and recognize only a small set of postures. Causal analysis uses information about how humans interact with the world to identify gestures and can therefore be applied only to a limited set of gestures. Neural networks can recognize large posture or gesture sets with high accuracy given enough training data. However, training can be very time-consuming, and the network must be retrained whenever items are added to or removed from the set to be recognized. Hidden Markov models are well covered in the literature and can be used with either a vision-based or an instrumented approach, but training HMMs can be time-consuming and does not necessarily give good recognition rates. Instance-based learning techniques are relatively simple to implement, but they require a large amount of memory and computation time and are not suited for real-time recognition. Spatio-temporal vector analysis is a non-obtrusive, computationally intensive, vision-based approach for which no recognition accuracy results have been reported.
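To make the simplest of these techniques concrete, here is a minimal sketch of template matching for static postures. It is my own illustration, not code from the paper: the five-value flexion vector, the template values, the `classify_posture` name, and the `max_distance` threshold are all assumptions chosen for the example.

```python
import math

# Hypothetical templates: one vector of five finger-flexion values per posture,
# ordered thumb, index, middle, ring, pinky (0.0 = fully extended, 1.0 = fully bent).
# The numbers are invented for illustration, not taken from the paper.
TEMPLATES = {
    "fist": [1.0, 1.0, 1.0, 1.0, 1.0],
    "open_hand": [0.0, 0.0, 0.0, 0.0, 0.0],
    "point": [1.0, 0.0, 1.0, 1.0, 1.0],
}


def classify_posture(sample, templates=TEMPLATES, max_distance=0.8):
    """Return the name of the nearest template, or None if none is close enough."""
    best_name, best_dist = None, float("inf")
    for name, template in templates.items():
        dist = math.dist(sample, template)  # Euclidean distance to this template
        if dist < best_dist:
            best_name, best_dist = name, dist
    # Reject samples that are far from every stored template (assumed threshold).
    return best_name if best_dist <= max_distance else None
```

Calling `classify_posture([0.9, 0.1, 0.95, 1.0, 0.9])` picks the "point" template, while a half-bent hand such as `[0.5] * 5` is rejected. Each sample is compared against every stored template, and a single static vector carries no timing information, which fits the summary's point that the technique suits small posture sets but not gestures.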

Discussion:
One aspect of this paper I liked was the summary after each subsection, which highlighted the key points of the preceding paragraphs. We have discussed HMM-based techniques so often in class that it was refreshing to see a wider variety of approaches. This paper was good for brainstorming which techniques to use, expand on, or combine in continuing work on gesture recognition. The experiments we did in class, where a gesture was shown and each person described it in words, gave us some experience with the linguistic approach. Through some of the difficulties in class, I realized that it can be hard to describe a posture in words alone, both accurately and in a way that everyone understands. The linguistic approach in the paper only considered postures in which fingers were fully extended or contracted, which covers only a small set of all possible postures. The paper says the approach is simple, but I would say it is difficult to apply when considering a set of postures that is not tightly constrained.
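A rough sketch shows how small that posture set is. Assuming each of five fingers is either fully extended or fully contracted, a posture is just five booleans. The `FINGERS` ordering, the `describe` function, and the wording of the descriptions are my own invention, not the notation used in the paper.

```python
from itertools import product

# Assumed finger ordering for this illustration.
FINGERS = ("thumb", "index", "middle", "ring", "pinky")


def describe(states):
    """Turn five booleans (True = fully extended) into a word description."""
    extended = [name for name, is_up in zip(FINGERS, states) if is_up]
    if not extended:
        return "all fingers contracted"
    if len(extended) == len(FINGERS):
        return "all fingers extended"
    return ", ".join(extended) + " extended; the rest contracted"


# Enumerate every posture this binary vocabulary is able to express.
ALL_POSTURES = [describe(s) for s in product((False, True), repeat=len(FINGERS))]
```

Under these assumptions `len(ALL_POSTURES)` is 2**5 = 32, and anything in between, such as a half-curled finger or fingers spread apart, has no description at all.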
