Monday, April 28, 2008

A Dynamic Gesture Recognition System for the Korean Sign Language (Kim 1996)

Summary:
The paper describes a system for recognizing Korean Sign Language (KSL) gestures and converting them into text. The system employs two ten-sensor VPL Data-Gloves to measure the bend in the joints of each digit on both hands. The gloves also sense the position and orientation of each hand relative to a fixed source. Because gestures must be recognized regardless of where the hand starts, and that initial position varies, the position data is recorded as the difference between the previous position and the current position. For the paper, 25 of around 6000 KSL gestures were analyzed and partitioned into ten sets based on their general direction. To determine which of the ten direction classes a motion falls into during recognition, the change over the five most recent readings of region data is examined. Hand postures are recognized using a Fuzzy Min-Max Neural (FMMN) Network. Input from the data gloves is identified as one of the direction classes and then recognized by the FMMN network. The system classifies gestures correctly almost 85% of the time.
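To make the position-difference idea concrete, here is a minimal sketch of my own, not code from the paper. The function name `to_deltas` and the sample coordinates are made up for illustration. It shows that the same motion performed from two different starting points yields identical difference data.

```python
def to_deltas(positions):
    """Convert absolute (x, y, z) readings into per-step differences.

    Hypothetical helper for illustration; the paper does not give code.
    """
    return [
        tuple(cur - prev for cur, prev in zip(b, a))
        for a, b in zip(positions, positions[1:])
    ]


# The same motion, performed from two different starting positions.
start_a = [(0, 0, 0), (1, 0, 0), (2, 1, 0)]
start_b = [(10, 5, 3), (11, 5, 3), (12, 6, 3)]

# Both produce the same sequence of differences, so the starting
# position no longer matters to whatever consumes the data.
same_motion = to_deltas(start_a) == to_deltas(start_b)
```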

Discussion:
Figure 8 shows a diagram of the min-max neural network used for classification. There are ten input nodes at the bottom, one for each of the flex angles measured from the fingers. However, there should be fourteen class nodes at the top of the diagram instead of ten, since there are fourteen posture classes.
When motion data is expressed in its compressed form, the order of region data is preserved, but data relating to the length in time of each region is lost.
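A small sketch of what this loss looks like, assuming that "compressed form" means collapsing consecutive repeats of the same region (that is my reading, not a detail confirmed by the paper). The region numbers and the name `compress` are invented for the example.

```python
from itertools import groupby


def compress(regions):
    """Collapse runs of identical consecutive region labels into one.

    Assumed interpretation of the paper's compression step.
    """
    return [key for key, _run in groupby(regions)]


# Two motions that pass through the same regions in the same order,
# but spend different amounts of time in each one.
slow_start = [1, 1, 1, 1, 2, 2, 3]
slow_end = [1, 2, 2, 2, 2, 3, 3]

# After compression the two are indistinguishable: the order survives,
# the time spent in each region does not.
compressed_a = compress(slow_start)
compressed_b = compress(slow_end)
```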
Although the paper states that the FMMN network requires no pre-learning about posture class and has on-line adaptability, I'd say the basic idea of neural networks does require learning since that is how the weights of the network are adjusted.
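To illustrate why I still count this as learning, here is a rough sketch of a single hyperbox in the style of Simpson's fuzzy min-max classifier, which I am assuming is close to what the paper uses. The class name, the sensitivity value `gamma=4.0`, and the sample points are my own. A real FMMN network also limits hyperbox size and resolves overlap between classes, both of which are left out here.

```python
class Hyperbox:
    """One fuzzy min-max hyperbox over inputs scaled to [0, 1].

    Illustrative sketch only; size limits and contraction are omitted.
    """

    def __init__(self, point, label):
        self.v = list(point)  # min corner
        self.w = list(point)  # max corner
        self.label = label

    def membership(self, a, gamma=4.0):
        """Return 1.0 inside the box, decreasing with distance outside."""
        total = 0.0
        for ai, vi, wi in zip(a, self.v, self.w):
            total += max(0.0, 1.0 - max(0.0, gamma * min(1.0, ai - wi)))
            total += max(0.0, 1.0 - max(0.0, gamma * min(1.0, vi - ai)))
        return total / (2 * len(a))

    def expand(self, a):
        """Stretch the min and max corners so the box covers point a."""
        self.v = [min(vi, ai) for vi, ai in zip(self.v, a)]
        self.w = [max(wi, ai) for wi, ai in zip(self.w, a)]


box = Hyperbox([0.2, 0.2], label="posture A")
before = box.membership([0.4, 0.2])  # outside the box: below 1.0
box.expand([0.4, 0.2])               # a labeled example arrives
after = box.membership([0.4, 0.2])   # now inside the box: 1.0
```

The min and max corners play the role of the weights, and `expand` changes them in response to labeled examples. That happens one example at a time rather than in a separate training phase, but it is still an adjustment of the network's parameters from data.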
The mis-classifications are partially blamed on abnormal motions in gestures and postures, but dealing with data that exhibits less than ideal characteristics seems to be part of the point of applying complex solutions to gesture recognition.
