Chapter 3: Task Environments

3.1 What is a Task Environment?

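A task environment is the problem to which an agent is the solution. It is specified by four components, often abbreviated PEAS: the performance measure, the environment, the actuators, and the sensors.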

3.2 Properties of Task Environments

Property          Description                                       Example
Fully Observable  Agent's sensors reveal the complete state         Chess (vs. poker)
Deterministic     Next state fixed by current state and action      Crossword puzzle (vs. poker)
Episodic          Experience splits into independent episodes       Image classification
Static            Environment is unchanged while agent deliberates  Crossword puzzle
Discrete          Finite number of states and actions               Chess (vs. autonomous driving)
Single Agent      No other agents affect the environment            Sudoku solver
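One way to make the six properties concrete is to record them as boolean flags. The sketch below is illustrative only: the class name `EnvironmentProperties`, the method `harder_poles`, and the two example labelings are assumptions made for this example, and chess is treated as static by ignoring the clock.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EnvironmentProperties:
    """One flag per property in the table; False means the opposite pole."""
    fully_observable: bool
    deterministic: bool
    episodic: bool
    static: bool
    discrete: bool
    single_agent: bool

    def harder_poles(self) -> List[str]:
        """Names of the flags that are False (partially observable, stochastic, ...)."""
        return [name for name, value in vars(self).items() if not value]


# Hypothetical labelings; chess is treated as static here (no clock).
CHESS = EnvironmentProperties(
    fully_observable=True, deterministic=True, episodic=False,
    static=True, discrete=True, single_agent=False,
)
SUDOKU = EnvironmentProperties(
    fully_observable=True, deterministic=True, episodic=False,
    static=True, discrete=True, single_agent=True,
)
```

Calling `CHESS.harder_poles()` lists the flags on which chess is not on the easier pole: it is sequential rather than episodic, and it involves an opponent.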

3.3 Environment Types

Fully Observable vs Partially Observable

Fully Observable: Agent sensors give access to complete state of environment (e.g., chess)

Partially Observable: Agent has limited/partial information (e.g., poker, real-world navigation)
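The distinction can be sketched as the difference between two observation functions over the same underlying state. The card-game state, the player names, and both function names below are hypothetical, chosen only to illustrate the idea.

```python
from typing import Dict, List

State = Dict[str, List[str]]


def observe_fully(state: State) -> State:
    """Fully observable: the percept is a copy of the whole state."""
    return {key: list(cards) for key, cards in state.items()}


def observe_partially(state: State, player: str) -> State:
    """Partially observable: the player sees only its own hand and the board."""
    visible = {player, "board"}
    return {key: list(cards) for key, cards in state.items() if key in visible}


# Hypothetical poker-like state: two private hands and a shared board.
state = {
    "alice": ["AS", "KD"],
    "bob": ["7C", "2H"],
    "board": ["QS", "JS", "10S"],
}
alice_view = observe_partially(state, "alice")
```

In the partial view, Bob's hand is missing, so an agent playing as Alice has to reason about what Bob might hold instead of reading it from the percept.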

Deterministic vs Stochastic

Deterministic: Next state completely determined by current state + action (e.g., solving a puzzle)

Stochastic: The same action in the same state can lead to different next states (e.g., dice games, robot navigation with wheel slip)
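A minimal sketch of the two kinds of transition function, using a walk along a number line. The function names and the `slip_probability` parameter are assumptions made for this illustration, not part of any standard model.

```python
import random


def step_deterministic(position: int, action: int) -> int:
    """Deterministic: the next state is a function of state and action only."""
    return position + action


def step_stochastic(position: int, action: int, rng: random.Random,
                    slip_probability: float = 0.2) -> int:
    """Stochastic: with some probability the move fails and the agent stays put."""
    if rng.random() < slip_probability:
        return position
    return position + action


rng = random.Random(0)  # fixed seed so the run is repeatable
# Set of distinct next states reached from position 0 with action +1.
outcomes = {step_stochastic(0, 1, rng) for _ in range(1000)}
```

The deterministic step always returns the same next state for the same inputs, while repeated calls to the stochastic step from the same state and action produce more than one distinct outcome.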

Episodic vs Sequential

Episodic: Each decision is independent of previous ones, so the agent does not need to plan ahead (e.g., image classification)

Sequential: Current decision affects future ones (e.g., chess, navigation)
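The contrast can be sketched with two toy loops. Both functions are hypothetical stand-ins: a parity labeler plays the role of an episodic classifier, and a walk along a number line plays the role of a sequential task.

```python
from typing import List


def run_episodic(inputs: List[int]) -> List[str]:
    """Each input is its own episode: the label depends on that input alone."""
    return ["even" if x % 2 == 0 else "odd" for x in inputs]


def run_sequential(actions: List[int], start: int = 0) -> List[int]:
    """Each action changes the state that the next action starts from."""
    positions = []
    position = start
    for action in actions:
        position += action
        positions.append(position)
    return positions
```

Reordering the inputs to `run_episodic` only reorders the labels, whereas reordering the actions passed to `run_sequential` changes the intermediate states the agent passes through.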

3.4 Common Task Environments

Environment           Observability  Determinism    Episodic/Sequential  Static/Dynamic
Chess                 Fully          Deterministic  Sequential           Semidynamic (with clock)
Poker                 Partial        Stochastic     Sequential           Static
Robot Navigation      Partial        Stochastic     Sequential           Dynamic
Image Classification  Fully          Deterministic  Episodic             Static
Medical Diagnosis     Partial        Stochastic     Sequential           Dynamic
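For revision, the first three columns of the table can be held as data and queried. This is a sketch only: the names `Row`, `TABLE`, and `matching` are made up for this example, and the Static/Dynamic column is left out to keep it short.

```python
from typing import Dict, List, NamedTuple


class Row(NamedTuple):
    observability: str
    determinism: str
    episodes: str


# First three columns of the table, keyed by environment name.
TABLE: Dict[str, Row] = {
    "Chess": Row("Fully", "Deterministic", "Sequential"),
    "Poker": Row("Partial", "Stochastic", "Sequential"),
    "Robot Navigation": Row("Partial", "Stochastic", "Sequential"),
    "Image Classification": Row("Fully", "Deterministic", "Episodic"),
    "Medical Diagnosis": Row("Partial", "Stochastic", "Sequential"),
}


def matching(**wanted: str) -> List[str]:
    """Return environments whose columns equal every requested value."""
    return [
        name for name, row in TABLE.items()
        if all(getattr(row, field) == value for field, value in wanted.items())
    ]
```

For example, `matching(observability="Partial")` returns the partially observable rows, and combining keywords such as `determinism="Deterministic", episodes="Episodic"` narrows the result further.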

3.5 Frequently Asked Exam Questions

  1. Compare and contrast fully observable and partially observable environments with examples.
  2. Why is chess considered a deterministic environment while poker is stochastic?
  3. Explain how the properties of an environment affect agent design choices.
  4. Classify the following environments: self-driving car, spam filter, chess AI, weather prediction system.
  5. What makes an environment dynamic versus static? Give two examples of each.