Wednesday, January 30, 2008

Flexible Gesture Recognition for Immersive Virtual Environments (Deller 2006)

Summary:
The paper gives an overview of an inexpensive gesture recognition system which can be extended to recognize additional gestures and work with multiple users. The author argues that gestures are a natural form of interaction. When a person sees a human hand perform an action, the person immediately knows how to perform the action. There is much less mental translation involved with hand gestures than there is when a person sees a mouse pointer perform some action and tries to map the action into physical movement. The paper mentions some downsides of current approaches: fixed installations, expensive hardware, and high computational requirements, all of which combine to exclude these solutions from ordinary working environments.
The solution described by the paper uses a P5 glove with an infrared-based position and orientation tracker. The virtual environment is displayed in 2D on a high-resolution monitor and in stereoscopic 3D on a SeeReal C-I. Since the system learns postures by having each user perform them, it can easily adapt to multiple users. When determining which gesture is being performed, the main decision is made by analyzing the position of each finger and the orientation of the hand. The relevance of hand orientation in identifying a gesture can be adjusted per gesture.
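To make the matching idea concrete, here is a minimal sketch of how a trained-posture classifier with a per-gesture orientation weight might look. The data layout, field names, and distance function are my own assumptions for illustration; the paper does not specify its internal representation.

```python
# Hypothetical sketch of posture matching: each stored template is a vector
# of finger bend values plus a hand-orientation vector, and each gesture
# carries its own weight for how much orientation matters in the decision.

def posture_distance(sample, template, orientation_weight):
    """Distance between a live hand sample and a trained posture template.

    sample/template: dicts with 'fingers' (five bend values in [0, 1]) and
    'orientation' (unit vector for the back of the hand). These names are
    illustrative, not taken from the paper.
    """
    finger_d = sum((s - t) ** 2
                   for s, t in zip(sample["fingers"], template["fingers"]))
    # Angular difference between orientation vectors, scaled per gesture:
    # a weight of 0 makes the posture orientation-independent.
    dot = sum(a * b
              for a, b in zip(sample["orientation"], template["orientation"]))
    orient_d = orientation_weight * (1.0 - max(-1.0, min(1.0, dot)))
    return finger_d + orient_d

def classify(sample, templates):
    """Return the name of the closest trained posture."""
    return min(templates, key=lambda name: posture_distance(
        sample, templates[name], templates[name]["orientation_weight"]))
```

A "fist", say, could be stored with an orientation weight of zero so it is recognized regardless of how the hand is turned, while a "thumbs up" would need a high weight to be distinguished from a "thumbs down".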

Discussion:
The author mentions interacting "in a natural way by just utilizing [one's] hands in ways [one] is already used to". This seems a bit vague. When interacting with physical objects, some people may prefer to slide them, while others may prefer to pick them up and move them. These different methods of movement achieve the same goal, and a choice must be made as to which gesture will be associated with an intended action in the program.
The gestures recognized by the example application seemed fairly limited, and the associated actions seemed to lack innovation, mimicking a mouse-based interface. For example, the "tapping" and "moving" gestures provide no more usefulness as gestures than the 2D clicking and dragging movements of a mouse. I doubt the user of such a gesture-based system would gain any real productivity over a traditional interface.
The paper presents no quantitative results, serving as more of a proof-of-concept than a scientific study. The authors claim that the engine provides a fast and reliable gesture recognition interface on standard consumer computers, but fail to give specific data defining "reliable" and "standard consumer".

Monday, January 28, 2008

Environmental Technology - Making the Real World Virtual (Krueger 1993)

Summary:

In this paper, Krueger reviews some of his contributions to virtual environments, which are mainly concerned with compositing video with virtual environments to create a new scene. Some of the applications mentioned are multi-point control for sculpting and videoconferencing. Other applications include range-of-motion therapy and an educational tool for teaching children about the scientific method. He rejects the idea of using input which is unnatural to the user, such as a head-mounted display, and advocates a “come as you are” approach.


Discussion:

The paper is generally not very academic and lacks quantitative results. One point of interest to me was the fact that Krueger coined the term “artificial reality”.

The author briefly mentions combining gesture input with speech recognition. The paper by Rabiner and Juang on hidden Markov models uses speech recognition as its example application. One possible extension to the research may be combining speech and gesture data into a single HMM to more quickly or accurately recognize a user's intended command.
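As a toy sketch of what that fusion might look like: treat each time step's observation as a (speech symbol, gesture symbol) pair and run an ordinary discrete HMM, here decoded with Viterbi, over the joint alphabet. All of the states, symbols, and probabilities below are invented for illustration; none of them come from the papers discussed.

```python
import itertools

# Toy fusion of speech and gesture into one discrete HMM: each observation
# is a (speech_symbol, gesture_symbol) pair, and Viterbi decoding recovers
# the most likely sequence of intended commands. All values are invented.

STATES = ["move", "delete"]                      # hypothetical user commands
SPEECH = ["'move'", "'delete'"]                  # recognized spoken words
GESTURE = ["drag", "tap"]                        # recognized hand gestures
OBS = list(itertools.product(SPEECH, GESTURE))   # joint observation alphabet

start = {"move": 0.5, "delete": 0.5}
trans = {"move": {"move": 0.7, "delete": 0.3},
         "delete": {"move": 0.3, "delete": 0.7}}

def emit(state, obs):
    """P(joint observation | state), assuming the two channels are
    conditionally independent given the intended command."""
    speech, gesture = obs
    p_speech = 0.8 if speech == f"'{state}'" else 0.2
    p_gesture = 0.8 if (state, gesture) in {("move", "drag"),
                                            ("delete", "tap")} else 0.2
    return p_speech * p_gesture

def viterbi(observations):
    """Most likely command sequence for a joint observation sequence."""
    probs = {s: start[s] * emit(s, observations[0]) for s in STATES}
    paths = {s: [s] for s in STATES}
    for obs in observations[1:]:
        new_probs, new_paths = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: probs[p] * trans[p][s])
            new_probs[s] = probs[prev] * trans[prev][s] * emit(s, obs)
            new_paths[s] = paths[prev] + [s]
        probs, paths = new_probs, new_paths
    best = max(STATES, key=lambda s: probs[s])
    return paths[best]
```

The appeal of the joint model is that a noisy channel can be rescued by the other: a mumbled "delete" accompanied by a clear tap gesture still decodes to the delete command, which is exactly the kind of speed-up and robustness the combination might buy.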