Tuesday, April 22, 2008

3D Object Modeling Using Spatial and Pictographic Gestures (Nishino 1998)

Summary:
The paper suggests creating an "image externalization loop", which is just a fancy way of saying they provide visual feedback of a virtual object as it is being manipulated. Instead of representing and manipulating objects at the vertex or polygon level, the 3D objects are described by mathematical expressions that take deformation parameters, which can be altered through gestures.
The system is implemented in C, using the OpenGL library for rendering graphics. Finger joint angles are read by two CyberGloves and fed into a static posture recognizer to classify hand shape, while position and orientation data for both hands are read by a Polhemus tracker and sent to a dynamic gesture recognizer. The recognized hand shape determines which operation to perform, and the hand movement determines how to adjust the deformation parameters. Segmentation is performed under the assumption that static hand posture remains generally fixed while the hands are in motion. The left hand is used as a frame of reference for the motion of the right hand, which scales or rotates objects. It seems only three static postures were recognized: an open hand for "deform", a closed fist for "grasp", and a pointing index finger for "point".
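As a rough sketch of how such a pipeline might be wired together in C (the paper's implementation language), here is a minimal loop. All of the names, posture thresholds, and data layouts below are my own illustrative assumptions, not the paper's actual code:

    /* Hypothetical sketch of the two-glove pipeline described above.
     * Names, thresholds, and data layouts are assumptions for
     * illustration only, not the paper's actual code. */
    #include <stdio.h>
    #include <math.h>

    enum posture { POSTURE_DEFORM, POSTURE_GRASP, POSTURE_POINT, POSTURE_NONE };

    typedef struct {
        double joints[18];   /* CyberGlove-style joint angles (radians) */
        double pos[3];       /* Polhemus-style position */
    } hand_t;

    /* Stub: a real system would poll the glove and tracker here. */
    static void read_hand(hand_t *h) { (void)h; }

    /* Stub static-posture classifier: the paper feeds joint angles to a
     * recognizer; this toy version just thresholds average flexion. */
    static enum posture classify(const hand_t *h)
    {
        double flex = 0.0;
        for (int i = 0; i < 18; i++) flex += h->joints[i];
        flex /= 18.0;
        if (flex < 0.2) return POSTURE_DEFORM;  /* open hand   */
        if (flex > 1.2) return POSTURE_GRASP;   /* closed fist */
        return POSTURE_POINT;
    }

    int main(void)
    {
        hand_t left = {0}, right = {0};
        double deform_param = 1.0;  /* e.g., one scale parameter */

        for (int frame = 0; frame < 100; frame++) {
            read_hand(&left);
            read_hand(&right);

            /* The left hand is the frame of reference: measure the right
             * hand's offset from it and use that to drive a parameter. */
            double dx = right.pos[0] - left.pos[0];
            double dy = right.pos[1] - left.pos[1];
            double dz = right.pos[2] - left.pos[2];
            double d  = sqrt(dx*dx + dy*dy + dz*dz);

            switch (classify(&right)) {
            case POSTURE_DEFORM: deform_param = d; break;  /* adjust shape */
            case POSTURE_GRASP:  /* move/rotate object */  break;
            case POSTURE_POINT:  /* select object      */  break;
            default: break;
            }
        }
        printf("final deform parameter: %f\n", deform_param);
        return 0;
    }

The two ideas this mirrors are the split between a static-posture classifier and motion handling, and the use of the left hand as the reference frame for the right hand's motion.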
A gesture learning and recognition function, called the Two-Handed dynamic Gesture environment Shell (TGSH), uses a self-organizing feature map algorithm to let users specify their preferred gestures.
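The paper does not spell out the map's topology or training schedule, but the heart of any self-organizing feature map is the same two steps: find the unit whose weight vector best matches the input, then pull that unit and its map neighbors toward the input. A minimal sketch in C, where the map size, input dimension, and learning parameters are all my assumptions:

    /* Minimal self-organizing feature map update, sketching the kind of
     * algorithm TGSH uses for gesture learning.  Dimensions, learning
     * rate, and the 1-D map layout are illustrative assumptions. */
    #include <stdlib.h>
    #include <math.h>

    #define UNITS 16   /* map nodes */
    #define DIM   18   /* input dimension, e.g. glove joint angles */

    static double w[UNITS][DIM];   /* weight vectors (randomize in real use) */

    static void som_train(const double *x, double alpha, int radius)
    {
        /* 1. Find the best-matching unit (smallest squared distance). */
        int bmu = 0;
        double best = 1e300;
        for (int u = 0; u < UNITS; u++) {
            double d = 0.0;
            for (int i = 0; i < DIM; i++) {
                double t = x[i] - w[u][i];
                d += t * t;
            }
            if (d < best) { best = d; bmu = u; }
        }
        /* 2. Move the BMU and its neighbors toward the input, with the
         *    step shrinking as map distance from the BMU grows. */
        for (int u = 0; u < UNITS; u++) {
            int dist = abs(u - bmu);
            if (dist > radius) continue;
            double h = alpha * exp(-(double)(dist * dist) / (2.0 * radius));
            for (int i = 0; i < DIM; i++)
                w[u][i] += h * (x[i] - w[u][i]);
        }
    }

    int main(void)
    {
        double sample[DIM] = {0};   /* stand-in for one glove reading */
        for (int epoch = 0; epoch < 100; epoch++)
            som_train(sample, 0.1, 4);
        return 0;
    }

After training, each user-defined gesture ends up owning a cluster of map units, and recognition is just another best-matching-unit lookup.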
One experiment involved recreating virtual objects to match real ones. The real objects were scanned with Minolta's Vivid 700 3D digitizer to capture their dimensions, and users were then able to recreate the models using the system.

Discussion:
Storing an object by its deformation parameters requires far less data than a polygonal representation (ratios of 1:670 and 1:970 were given for two example objects). The idea of using the deformation parameters that describe the objects as searchable categories in a database of 3D objects is interesting.
The authors identified the trade-off between rendering quality and interactivity: the number of polygons was limited to maintain sufficient drawing rates and interactive responsiveness. With today's technology (10 years of improvements), I doubt the limits of 8,000 and 32,000 polygons per object would need to be so strict.
Since implicit functions are used to model the objects, collision detection is cheap: evaluate the function at a location in space and compare the result to a constant. This is much simpler than performing collision detection with polygon-based models.
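To make that concrete, here is the standard inside-outside function for a superellipsoid (Barr's formulation; the parameter names are the conventional ones, not necessarily the paper's). A point-versus-object collision query is a single function evaluation and a comparison:

    /* Superellipsoid inside-outside test: F < 1 inside, F = 1 on the
     * surface, F > 1 outside.  Sketch using Barr's conventional form. */
    #include <stdio.h>
    #include <math.h>

    /* e1, e2 control squareness; a1, a2, a3 are the axis extents. */
    static double superellipsoid_F(double x, double y, double z,
                                   double a1, double a2, double a3,
                                   double e1, double e2)
    {
        double fx = pow(fabs(x / a1), 2.0 / e2);
        double fy = pow(fabs(y / a2), 2.0 / e2);
        double fz = pow(fabs(z / a3), 2.0 / e1);
        return pow(fx + fy, e2 / e1) + fz;
    }

    /* Collision query: evaluate F at the point and compare to 1. */
    static int point_inside(double x, double y, double z)
    {
        return superellipsoid_F(x, y, z, 1.0, 1.0, 1.0, 1.0, 1.0) < 1.0;
    }

    int main(void)
    {
        /* The origin is inside the unit superellipsoid; (2,0,0) is not. */
        printf("%d %d\n", point_inside(0, 0, 0), point_inside(2, 0, 0));
        return 0;
    }

With e1 = e2 = 1 this reduces to an ordinary ellipsoid test; other exponents give boxier or more pinched shapes at the same query cost.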
The fact that the blending function produces G2-continuous surfaces is important since it allows reflections off the surface, including specular highlights, to be displayed with G1 continuity. Curves with less than G1 continuity do not look smooth and are easily detected by human observers.
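A one-line calculation shows why exactly one order of continuity is lost: shading depends on the unit surface normal, which for an implicit surface $F(\mathbf{p}) = c$ is built from first derivatives,
\[
  \mathbf{n}(\mathbf{p}) = \frac{\nabla F(\mathbf{p})}{\lVert \nabla F(\mathbf{p}) \rVert}.
\]
Differentiating once costs one order of smoothness, so a G2 (curvature-continuous) surface yields G1 (tangent-continuous) normals, and the highlights that follow those normals vary smoothly instead of creasing.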
The system implements only one of the four superquadric families, the superellipsoid. The other three are the superhyperboloids of one and two sheets and the supertoroid. Adding these primitives would be a natural extension of the existing system.
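As an indication of how little such an extension would require, the supertoroid's inside-outside function (again Barr's conventional formulation, with a4 the ring-radius parameter; this is my sketch, not the paper's code) has the same single-evaluation form as the superellipsoid above:

    /* Supertoroid inside-outside function: F < 1 inside, F > 1 outside.
     * a4 is the ring radius relative to the tube cross-section; the
     * other parameters are as in the superellipsoid.  Sketch only. */
    #include <stdio.h>
    #include <math.h>

    static double supertoroid_F(double x, double y, double z,
                                double a1, double a2, double a3, double a4,
                                double e1, double e2)
    {
        double ring = pow(pow(fabs(x / a1), 2.0 / e2) +
                          pow(fabs(y / a2), 2.0 / e2), e2 / 2.0) - a4;
        return pow(fabs(ring), 2.0 / e1) + pow(fabs(z / a3), 2.0 / e1);
    }

    int main(void)
    {
        /* With a4 = 2 and unit tube, the point (3,0,0) lies exactly on
         * the tube surface, so F evaluates to 1. */
        printf("%f\n", supertoroid_F(3, 0, 0, 1, 1, 1, 2.0, 1.0, 1.0));
        return 0;
    }

The point-versus-object collision test from the earlier sketch carries over unchanged, which is part of what makes implicit primitives attractive here.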
