Due: 10:00am, Wednesday 11 January 2012.
Consider the 6-state, 4-action domain at:
What is the policy that optimizes the accumulated reward? Explain how you found it. (You should do this just by playing with the applet and trying to work out which actions give the highest accumulated reward.)
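To make "accumulated reward of a policy" concrete, here is a minimal Python sketch. It is not the applet's code: the two-state domain `toy_step`, the function `accumulated_reward`, and the two example policies are hypothetical names invented for illustration, and the real 6-state, 4-action dynamics still have to be discovered by playing with the applet. A policy is represented as a table that maps each state to one action.

```python
import random

def toy_step(state, action, rng):
    """Hypothetical 2-state domain (an assumption, not the applet's domain).

    Returns (next_state, reward). "stay" keeps the state and pays 1 in
    state 1; "move" switches state with probability 0.8 and pays nothing.
    """
    if action == "stay":
        return state, (1.0 if state == 1 else 0.0)
    if rng.random() < 0.8:
        return 1 - state, 0.0
    return state, 0.0

def accumulated_reward(policy, step, start_state, num_steps, seed=0):
    """Follow a state -> action table for num_steps and sum the rewards."""
    rng = random.Random(seed)
    state, total = start_state, 0.0
    for _ in range(num_steps):
        state, reward = step(state, policy[state], rng)
        total += reward
    return total

# Two candidate policies for the toy domain.
always_stay = {0: "stay", 1: "stay"}
seek_state_1 = {0: "move", 1: "stay"}

print(accumulated_reward(always_stay, toy_step, 0, 100))
print(accumulated_reward(seek_state_1, toy_step, 0, 100))
```

Comparing the two printed totals is the same kind of comparison the question asks you to make by hand: try an action in each state, watch the reward accumulate, and keep the choices that do best.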
Choose a particular application domain that can be abstracted as an agent, for example, a particular game, the diagnosis and repair of a particular type of object, a robot in a particular environment, a tutoring system in a particular area, or (preferably) some domain you are interested in and knowledgeable about.
Use full sentences in your answers. Explain your answers well enough so that the reader will be able to understand what you mean. If there are no examples, explain why.
Describe the domain.
For a particular type of agent in the domain:
What are its abilities?
What prior knowledge would an agent have?
What observations would an agent make?
What past experiences could an agent learn from?
Give an example of a modular decomposition of an agent in your domain.
Give an example of a hierarchical decomposition of an agent in your domain.
Give examples of features in your domain.
Give examples of types of individuals.
Give examples of relations that are not properties.
Give examples of what an agent may be uncertain about.
Give examples of stochastic actions.
Give examples of deterministic actions.
Are there subproblems that are fully observable? Explain.
Are there subproblems that are partially observable? Explain.
Give examples of achievement goals.
Give examples of ordinal preferences.
Give examples of cardinal preferences.
Give a finite horizon problem for this domain.
Give an indefinite or infinite horizon problem.
How are multiple agents involved? Do they have competing preferences?
Could an agent be perfectly rational? Explain.
How long did this assignment take? What did you learn? Was it reasonable?
Remember that everyone needs to give a mini-project for January. This involves giving a 3-minute talk to the class and posting a summary (including references) to WebCT. This should be something that is interesting and informative. Tell us something that you would be interested in knowing about. Each assignment will give some suggestions.
Find some real-world application, and explain to the class an agent in that domain: what it observes, what it actually does, what background knowledge it has.
At http://artint.info/demos/rl/sGame.html is a simple game with a simplistic controller. All code is open source. Try to build a better controller, and report back on how your controller works and the average reward received. We will use this example for reinforcement learning.
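One way to report "average reward received" is to run each controller for many steps and divide the total reward by the number of steps. The Python sketch below shows that measurement under stated assumptions: `ToyGame`, `random_controller`, `greedy_controller`, and `average_reward` are hypothetical names, and the toy game is a stand-in invented here, not the game at the URL above or its controller interface.

```python
import random

class ToyGame:
    """Hypothetical stand-in game (an assumption, not the sGame demo).

    The agent stands at a position 0..4 on a line and a prize sits at a
    random position. Landing on the prize pays 10 and a new prize appears;
    every other step pays -1.
    """
    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.agent = 0
        self.prize = self.rng.randrange(5)

    def observe(self):
        return self.agent, self.prize

    def step(self, action):
        """action is -1, 0, or +1; returns the reward for this step."""
        self.agent = min(4, max(0, self.agent + action))
        if self.agent == self.prize:
            self.prize = self.rng.randrange(5)
            return 10.0
        return -1.0

def random_controller(obs, rng):
    """Baseline: ignore the observation and move at random."""
    return rng.choice([-1, 0, 1])

def greedy_controller(obs, rng):
    """Step toward the prize (0 if already on it)."""
    agent, prize = obs
    return (prize > agent) - (prize < agent)

def average_reward(controller, num_steps=1000, seed=0):
    """Run the controller for num_steps and return reward per step."""
    game = ToyGame(seed)
    rng = random.Random(seed + 1)
    total = 0.0
    for _ in range(num_steps):
        total += game.step(controller(game.observe(), rng))
    return total / num_steps

print("random:", average_reward(random_controller))
print("greedy:", average_reward(greedy_controller))
```

Using the same seed and the same number of steps for both controllers keeps the comparison fair; averaging over several seeds would give a steadier estimate.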