Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

CS 440/ECE 448

Fall 2023

Quiz 6 skills list

The quiz will cover material through the lectures on constraint satisfaction problems.

Markov Decision Processes

· Model and terminology for an MDP

· Quantized representation of continuous state variables via randomized actions

· Bellman equation

· Methods of solving the Bellman equation

Value iteration

Policy iteration

Asynchronous dynamic programming

· How to choose a policy?

Reinforcement Learning

· Basic setup for reinforcement learning (e.g. main loop)

· Model-based reinforcement learning

· Model-free reinforcement learning

Q-learning version of Bellman equation (expressing Q in terms of itself, without reference to the utility or transition probability functions)

TD update algorithm

SARSA update algorithm

How do TD and SARSA differ?

· Selecting an action

Deriving a policy from utility values or from Q values.

Incorporating exploration

· Online learning, offline learning, experience replay

Constraint satisfaction problems

Historical trivia:

· Waltz line labelling

4-color theorem

Key examples (be familiar with basic rules)

· n-queens, map/graph coloring, Sudoku, Cryptarithmetic, Waltz line labelling, task scheduling

· Graph coloring is NP-complete.

Hill-climbing

· how it works (high-level idea only)

· how it differs from backtracking search

Backtracking search (DFS)

· Variable assignments can be done in any order, search is to a known depth

· Why does DFS work well? Why isn't looping a worry?

Heuristics for variable and value selection

most constrained/most constraining variable

least constraining value

· Forward checking, constraint propagation

· AC-3 algorithm

· How to incorporate constraint propagation into backtracking search