12 Learning to Act

12.9 Reinforcement Learning with Features

Dimensions: flat, features, infinite horizon, fully observable, stochastic, utility, learning, single agent, online, bounded rationality

Usually, there are too many states to reason about explicitly. The alternative to reasoning explicitly in terms of states is to reason in terms of features. In this section, we consider reinforcement learning that approximates the Q-function using a linear combination of features of the state and the action. There are more complicated alternatives, such as using a decision tree or a neural network, but the linear function often works well.
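The idea can be sketched as follows. Suppose the designer supplies feature functions F_i(s, a); the Q-function is approximated as Q(s, a) = Σ_i w_i · F_i(s, a), and each experience ⟨s, a, r, s′, a′⟩ adjusts the weights in the direction of the temporal-difference error. The feature functions and the toy one-dimensional domain below are hypothetical, chosen only to make the update concrete; a minimal sketch:

```python
# Hypothetical feature functions for a toy 1-D chain domain:
# a state is an integer position; the actions are -1 (left) and +1 (right).
def features(state, action):
    """Return the feature vector F(s, a) for the linear approximation."""
    return [1.0,                    # bias feature
            float(state),           # position
            float(action),          # action direction
            float(state * action)]  # interaction of position and action

def q(weights, state, action):
    """Q(s, a) = sum_i w_i * F_i(s, a)."""
    return sum(w * f for w, f in zip(weights, features(state, action)))

def sarsa_lfa_update(weights, s, a, r, s2, a2, gamma=0.9, eta=0.01):
    """One SARSA update with linear function approximation:
    w_i := w_i + eta * delta * F_i(s, a),
    where delta = r + gamma * Q(s', a') - Q(s, a)."""
    delta = r + gamma * q(weights, s2, a2) - q(weights, s, a)
    return [w + eta * delta * f
            for w, f in zip(weights, features(s, a))]

# One hypothetical experience: at state 2, move right, receive reward 1,
# end up at state 3, and plan to move right again.
weights = [0.0, 0.0, 0.0, 0.0]
weights = sarsa_lfa_update(weights, 2, 1, 1.0, 3, 1)
print(weights)  # each weight moves in proportion to its feature value
```

Note that, unlike the tabular case, a single update changes the estimated Q-value of every state–action pair that shares features with ⟨s, a⟩; this generalization is the point of the feature-based representation.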

The feature-based learners require more information about the domain than the reinforcement-learning methods considered so far. Whereas the previous reinforcement learners were provided only with the states and the possible actions, the feature-based learners require extra domain knowledge in the form of features. This approach requires careful selection of the features: the designer must find features adequate to represent the Q-function. This is often a difficult feature engineering problem.
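To see what such domain knowledge might look like, consider a grid navigation domain where a state is a position and an action is a step. A designer might encode knowledge such as the distance to the goal and whether an action moves toward it. The specific features and the goal position below are purely illustrative assumptions, not part of any fixed recipe:

```python
def manhattan(p, q):
    """Manhattan distance between two grid positions."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def grid_features(state, action, goal=(4, 4)):
    """Hand-designed features for a hypothetical grid world.
    state is an (x, y) position; action is a (dx, dy) step."""
    next_pos = (state[0] + action[0], state[1] + action[1])
    return [1.0,                                  # bias feature
            float(manhattan(state, goal)),        # distance to goal
            1.0 if manhattan(next_pos, goal)      # does the action
                   < manhattan(state, goal)       # move closer to
               else 0.0]                          # the goal?

print(grid_features((0, 0), (1, 0)))  # -> [1.0, 8.0, 1.0]
```

Features like these let the learner generalize across positions, but if the chosen features cannot distinguish states that require different actions, no setting of the weights will represent a good Q-function, which is why feature selection matters.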