

UBC CS TR-2004-02 Summary

Decision Theoretic Learning of Human Facial Displays and Gestures, March 11, 2004, Jesse Hoey and James J. Little, 45 pages

ian learning facial displays and gestures in interaction. Changes in the human face occur due to many factors, including communication, emotion, speech, and physiology. Most systems for facial expression analysis attempt to recognize one or more of these factors, resulting in a machine whose inputs are video sequences or static images, and whose outputs are, for example, basic emotion categories. Our approach is fundamentally different. We make no prior commitment to any particular recognition task. Instead, we consider that the meaning of a facial display for an observer is contained in its relationship to actions and outcomes. Agents must distinguish facial displays according to their affordances, that is, according to how they help an agent maximize utility. To this end, our system learns relationships between the movements of a person's face, the context in which they are acting, and a utility function. The model is a partially observable Markov decision process, or POMDP. The video observations are integrated into the POMDP using a dynamic Bayesian network, which creates spatial and temporal abstractions amenable to decision making at the high level. The parameters of the model are learned from training data using an a-posteriori constrained optimization technique based on the expectation-maximization algorithm. One of the most significant advantages of this type of learning is that it does not require labeled data derived from expert knowledge about which behaviors are significant in a particular interaction. Rather, the learning process automatically discovers clusters of facial motions and their relationship to the context. As such, it can be applied to any situation in which non-verbal gestures are purposefully used in a task. We present an experimental paradigm in which we record two humans playing a collaborative game, or a single human playing against an automated agent, and learn the human behaviors. We use the resulting model to predict human actions. We show results on three simple games.
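
To illustrate how observations enter a POMDP in general, the following is a minimal sketch of the standard discrete belief update, in which a distribution over hidden states is propagated through a transition model and then reweighted by the likelihood of the observation. It is not the report's model: the two hidden "display" states, the single action, the two observation symbols, and all probabilities in T and O are hypothetical values chosen only for illustration.

```python
def belief_update(belief, action, observation, T, O):
    """Standard discrete POMDP belief update (illustrative sketch).

    T[a][s][s2] = P(s2 | s, a)   transition model
    O[a][s2][o] = P(o | s2, a)   observation model
    """
    n = len(belief)
    # Prediction step: push the current belief through the transition model.
    predicted = [
        sum(T[action][s][s2] * belief[s] for s in range(n))
        for s2 in range(n)
    ]
    # Correction step: weight each predicted state by the observation likelihood.
    unnormalized = [O[action][s2][observation] * predicted[s2] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    # Normalize so the new belief is a probability distribution.
    return [u / total for u in unnormalized]


# Hypothetical toy model: two hidden states, one action, two observation symbols.
T = [[[0.9, 0.1], [0.2, 0.8]]]
O = [[[0.8, 0.2], [0.3, 0.7]]]

b = belief_update([0.5, 0.5], action=0, observation=0, T=T, O=O)
```

In this toy setting the observation is more likely under the first state, so the updated belief shifts toward it. In a learned model the entries of T and O would be estimated from training data rather than fixed by hand.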

