2.4 Acting with Reasoning

2.4.1 Agents Modeling the World

The definition of a belief state is very general and does not constrain what should be remembered by the agent. Often it is useful for the agent to maintain some model of the world, even if its model is incomplete and inaccurate. A model of a world is a representation of the state of the world at a particular time and/or the dynamics of the world.

At one extreme, a model may be so good that the agent can ignore its percepts. The agent can then determine what to do just by reasoning. This approach requires a model of both the state of the world and the dynamics of the world. Given the state at one time, and the dynamics, the state at the next time can be predicted. This process is known as dead reckoning. For example, a robot could maintain its estimate of its position and update the estimate based on its actions. When the world is dynamic or when there are noisy actuators (e.g., a wheel slips, the wheel is not of exactly the right diameter, or acceleration is not instantaneous), the noise accumulates, so that the estimates of position soon become so inaccurate that they are useless. However, if the model is accurate at some level of detail, it may still be useful. For example, finding a plan on a map is useful for an agent, even if the plan does not specify every action of the agent.
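
The accumulation of error in dead reckoning can be illustrated with a minimal sketch. It assumes a robot moving along a line whose model says each commanded step covers exactly `step_length`, while the real wheel is slightly too large. The names `dead_reckon`, `simulate_true_position`, and `wheel_bias` are hypothetical and chosen only for this illustration.

```python
def dead_reckon(position, commanded_steps, step_length=1.0):
    """Predict the position from the model alone, ignoring all percepts."""
    for _ in range(commanded_steps):
        position += step_length
    return position


def simulate_true_position(position, commanded_steps,
                           step_length=1.0, wheel_bias=0.02):
    """Assumed 'real world': each step is 2% longer than the model believes."""
    for _ in range(commanded_steps):
        position += step_length * (1.0 + wheel_bias)
    return position


def drift(commanded_steps):
    """Absolute gap between the dead-reckoned estimate and the true position."""
    estimate = dead_reckon(0.0, commanded_steps)
    truth = simulate_true_position(0.0, commanded_steps)
    return abs(truth - estimate)
```

Under these assumptions the gap is about 0.2 units after 10 steps and about 20 units after 1000 steps. The estimate is never corrected, so even a small per-step error grows without bound.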

At the other extreme is a purely reactive system that bases its actions on the percepts, but does not update its internal belief state. The command function in this case is a function from percepts into actions. As an example, the middle layer of the robot in the previous section, if we ignore the timeout, could be considered to be a reactive system.
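
A purely reactive command function might be sketched as follows. The percept fields `obstacle_ahead` and `target_direction` and the action names are assumptions made for this example; they are not the interface of the robot in the previous section.

```python
def reactive_command(percept):
    """Map the current percept directly to an action.

    No belief state is read or written, so the same percept always
    produces the same action regardless of what happened earlier.
    """
    if percept["obstacle_ahead"]:
        return "turn_left"          # steer away from the obstacle
    if percept["target_direction"] == "left":
        return "turn_left"
    if percept["target_direction"] == "right":
        return "turn_right"
    return "go_straight"
```

Because the function has no memory, it cannot tell whether it has just spent ten steps turning left to avoid the same obstacle. That is the price paid for ignoring the belief state.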

A more promising approach is to combine the agent’s prediction of the world state with sensing information. This can take a number of forms:

  • If both the noise of forward prediction and sensor noise are modeled, the next belief state can be estimated using Bayes’ rule. This is known as filtering.

  • With more complicated sensors such as vision, a model can be used to predict where visual features can be found, and then vision can be used to look for these features close to the predicted location. This makes the vision task much simpler, and in turn vision can greatly reduce the errors in position that arise from forward prediction alone.
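
As an illustration of filtering, the following is a minimal sketch of one predict-then-update cycle over a discrete belief state. It assumes a robot in a circular corridor of cells, a motion model in which a commanded move succeeds with probability `p_move`, and a door sensor that is correct with probability `p_correct`. All of these names and numbers are hypothetical.

```python
def predict(belief, p_move=0.8):
    """Forward prediction: the robot advances one cell with probability
    p_move and stays where it is otherwise (corridor wraps around)."""
    n = len(belief)
    return [p_move * belief[(i - 1) % n] + (1 - p_move) * belief[i]
            for i in range(n)]


def update(belief, saw_door, doors, p_correct=0.9):
    """Bayes' rule: weight each cell by the likelihood of the observation
    there, then normalize. Assumes 0 < p_correct < 1 so the total is nonzero."""
    weighted = []
    for cell, prior in enumerate(belief):
        agrees = (cell in doors) == saw_door
        likelihood = p_correct if agrees else 1 - p_correct
        weighted.append(likelihood * prior)
    total = sum(weighted)
    return [w / total for w in weighted]


def filter_step(belief, saw_door, doors):
    """One filtering cycle: predict with the dynamics, then correct with the percept."""
    return update(predict(belief), saw_door, doors)
```

Starting from a uniform belief over five cells with a single door at cell 0, one call such as `filter_step([0.2] * 5, True, {0})` concentrates most of the probability on cell 0. Repeating the cycle keeps the prediction error from accumulating as it does in pure dead reckoning.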

A control problem is separable if the best action can be obtained by first finding the best model of the world and then using that model to determine the best action. Unfortunately, most control problems are not separable. This means that the agent should consider multiple models to determine what to do. Usually, there is no “best model” of the world that is independent of what the agent will do with the model.
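
One way to see why separability can fail is a small sketch in which "best model" is read as "most probable world" and actions are scored by expected utility. Both readings are assumptions made for this illustration, as are the worlds, actions, and utility values.

```python
# Hypothetical belief over two candidate models of the world.
BELIEF = {"safe": 0.6, "cliff": 0.4}

# Hypothetical utility of each action in each world.
UTILITY = {
    ("fast", "safe"): 10, ("fast", "cliff"): -100,
    ("slow", "safe"): 5,  ("slow", "cliff"): 4,
}
ACTIONS = ["fast", "slow"]


def act_on_best_model(belief):
    """Separable strategy: commit to the most probable world, then
    choose the action that is best in that world alone."""
    best_world = max(belief, key=belief.get)
    return max(ACTIONS, key=lambda a: UTILITY[(a, best_world)])


def act_on_all_models(belief):
    """Non-separable strategy: score each action against every world,
    weighted by how probable that world is."""
    def expected_utility(action):
        return sum(p * UTILITY[(action, w)] for w, p in belief.items())
    return max(ACTIONS, key=expected_utility)
```

With these numbers, committing to the most probable world selects `fast`, whereas weighing both worlds selects `slow`, because the unlikely world makes `fast` very costly. Which model matters depends on the decision it is used for.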