Thursday, May 8, 2008

A Hidden Markov Model Based Sensor Fusion Approach for Recognizing Continuous Human Grasping Sequences (Bernardin 2005)

Summary:
This paper proposes a system that uses information about both hand shape and contact points, obtained from a combination of a data glove and tactile sensors, to recognize continuous human grasp sequences. The long-term goal of the study is to teach a robotic system to perform a task simply by observing a human teacher instead of explicitly programming the robot. For this kind of learning to take place, the robot must be able to infer what has been done and map that to a known skill that can be described symbolically. This paper analyzes grasping gestures in order to aid the construction of their symbolic descriptions.
An 18-sensor Cyberglove is used in conjunction with an array of 16 capacitive pressure sensors affixed to the fingers and palm. Classification of grasping gestures is made according to Kamakura's grasp taxonomy, which identifies 14 different kinds of grasps. The regions of the hand covered by pressure sensors were chosen to maximize the detection of contact with a minimal number of sensors, while also corresponding to the main regions in Kamakura's grasp types. An HMM was built for each type of grasp using 112 training gestures. The HMMs have a flat topology with 9 states and were trained offline.
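The one-HMM-per-grasp scheme can be sketched as classification by maximum forward-algorithm likelihood. This is a minimal illustration, not the paper's implementation: the paper's models have 9 states and observe glove/tactile feature vectors, while the toy models below use 2 states and 2 discrete symbols, and the grasp names are placeholders.

```python
import numpy as np

def forward_log_likelihood(obs, pi, A, B):
    """Log P(obs | model) for a discrete-observation HMM, computed with
    the forward algorithm in log space for numerical stability."""
    log_alpha = np.log(pi) + np.log(B[:, obs[0]])
    for o in obs[1:]:
        log_alpha = np.array([
            np.logaddexp.reduce(log_alpha + np.log(A[:, j])) + np.log(B[j, o])
            for j in range(len(pi))
        ])
    return np.logaddexp.reduce(log_alpha)

def classify(obs, models):
    """Return the grasp label whose HMM scores the observation sequence highest."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))

# Two hypothetical grasp classes: "grasp_A" tends to emit symbol 0,
# "grasp_B" tends to emit symbol 1. Each model is (pi, A, B):
# initial distribution, state transitions, emission probabilities.
models = {
    "grasp_A": (np.array([0.9, 0.1]),
                np.array([[0.8, 0.2], [0.2, 0.8]]),
                np.array([[0.9, 0.1], [0.9, 0.1]])),
    "grasp_B": (np.array([0.9, 0.1]),
                np.array([[0.8, 0.2], [0.2, 0.8]]),
                np.array([[0.1, 0.9], [0.1, 0.9]])),
}
```

In practice the per-class models would be trained offline with Baum-Welch, as the paper's were; only the scoring step is shown here.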
The main benefit gleaned from the tactile sensors was easier segmentation of the continuous gesture stream into individual grasps.

Discussion:
The customization required of the capacitive pressure sensors indicates that there is not currently a mass-produced component to fill the demand for grasp detection hardware. In the description of the HMM recognizer, it is mentioned that a task grammar was used to reduce the search space of the recognizer. Since only grasp and release sequences are recognized, the segmentation problem is avoided.
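The task grammar mentioned above can be pictured as a small finite-state acceptor over recognizer output: since only grasp-then-release sequences are legal, candidate hypotheses that violate that alternation can be pruned. This is a hypothetical sketch of the idea; the paper's grammar and token names are not specified here.

```python
# Finite-state acceptor: from the start state only a grasp may occur,
# every grasp must be followed by a release, and vice versa.
ALLOWED_NEXT = {
    "START": {"grasp"},
    "grasp": {"release"},
    "release": {"grasp"},
}

def accepts(tokens):
    """Return True if the token sequence obeys the grasp/release grammar.
    Tokens starting with 'grasp' count as grasps; anything else as a release."""
    state = "START"
    for t in tokens:
        kind = "grasp" if t.startswith("grasp") else "release"
        if kind not in ALLOWED_NEXT[state]:
            return False
        state = kind
    return True
```

Restricting the search space this way is what lets the recognizer sidestep general segmentation: boundaries are implied by the enforced grasp/release alternation.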
If the end goal is to teach a robot to learn grasps by observation, I think an experiment that used both visual-based and glove-based inputs would be required to discern a link between the visual and tactile realms. The visual signal could be analyzed and possibly mapped to a tactile response.
