Summary:
The partially observable Markov decision process (POMDP) model builds directly on the regular Markov decision process model, and it also has close ties to the Hidden Markov models that have been the focus of other papers we have read. The defining parts of a POMDP model are three finite sets and three functions. As in HMMs, there is a set of states, a set of actions, and a set of observations. The transition and observation functions are the familiar probability distributions of HMMs. The main difference is the addition of an immediate reward function, which gives the immediate benefit of performing each action while in each state.
The paper gives an overview of several application areas for the POMDP model. In the area of industrial applications, the example of finding a policy for machine maintenance is given. This is one of the oldest applications of the POMDP model, and the model fits the problem very well since the observations are probabilistically related to internal states. Other industrial applications include structural inspection, elevator control policies, and the fishery industry. Autonomous robots are given as another possible application of the POMDP model. Currently, finding control rules for robots with this method is done at a higher level of abstraction than the sensor and actuator level. The paper mentions that it might be possible to use a hierarchical arrangement of POMDP models to bring the model closer to the level of the hardware. Other scientific applications include behavioral ecology and machine vision. Business applications include network troubleshooting, distributed database queries, and marketing. Some military and social applications are also mentioned.
Discussion:
One observation from the machine vision application example is that POMDP models work best in special purpose visual systems where the domain has been restricted. One of the special systems mentioned was a gesture recognizer which attempts pattern recognition. Since this area is the most closely related to instrumented gesture recognition, I think it may be beneficial to read the paper referenced in this section (Active Gesture Recognition using Partially Observable Markov Decision Processes by Trevor Darrell and Alex Pentland).
One of the more interesting application areas to me was education, where the model represents the internal mental state of an individual. The reward function could even be applied at an individual level, taking into consideration a student's learning style. I think the requirement of discretizing a space of concepts limits how accurately the model can capture learning, since concepts are inherently difficult to define. Some of the application areas seemed somewhat far-fetched in that too much information would be required to build an accurate model. In addition to these theoretical limitations, the practical limitations of representing and performing computation on the models are even more severe. Because of the difficulty of correct POMDP modeling, algorithms that exploit characteristics of the problem area must be used to achieve timely results.