CS 6501-005/SYS 6581-010: LEARNING IN ROBOTICS FALL 2022 HOMEWORK 1
Posted: 2022-09-21
Problem 1 (50 points, Bayes filter). In this problem we are given a robot operating in a 2D grid world. Every cell in the grid world is characterized by a color (0 or 1). The robot is equipped with a noisy odometer and a noisy color sensor. Given a stream of actions and corresponding observations, implement a Bayes filter to keep track of the robot’s current position. The sensor reads the color of the robot’s current cell correctly with probability 0.9 and incorrectly with probability 0.1. At each step, the robot can take an action to move in one of 4 directions (north, east, south, west). Execution of these actions is noisy: after the robot performs an action, it actually makes the move with probability 0.9 and stays at the same spot without moving with probability 0.1.
When the robot is at the edge of the grid world and is tasked with executing an action that would take it outside the boundaries of the grid world, the robot remains in the same state with probability 1. Start with a uniform prior over all states. For example, if you have a world with 4 states (x1, x2, x3, x4), then P(X0 = x1) = P(X0 = x2) = P(X0 = x3) = P(X0 = x4) = 0.25.
You are given a zip file with some starter code for this problem; you will find it on Collab under Resources » Homeworks » hw1. It consists of the Python scripts example_test.py and histogram_filter.py (the Bayes filter is also called the histogram filter) and a starter file containing some data: starter.npz. The starter.npz file contains a binary color-map (the grid), a sequence of actions, a sequence of observations, and a sequence of the correct belief states. This is provided for you to debug your code. You should implement your code in histogram_filter.py. Be careful not to change the function signature, or your code will not pass the tests on the autograder. You should upload your histogram_filter.py file to Homework 1 Problem 1 Code on Gradescope. In the PDF of your solutions, you are free to include any relevant plots of the grid world and the robot’s belief which show that your code works as intended.
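As a sanity check, the motion and sensor models above can be sketched as a single predict–update step of a histogram filter. This is only a minimal sketch: the (H, W) grid layout, the (dx, dy) action encoding, and the function name here are my own choices, not the starter code’s conventions, whose given function signature you must keep.

```python
import numpy as np

def bayes_filter_step(belief, cmap, action, obs):
    """One predict-update step of a histogram (Bayes) filter.

    belief: (H, W) array summing to 1; cmap: (H, W) array of 0/1 colors;
    action: (dx, dy) displacement; obs: observed color (0 or 1).
    """
    H, W = belief.shape
    dx, dy = action
    # Predict: with prob 0.9 the move succeeds, with prob 0.1 the robot stays.
    predicted = 0.1 * belief
    for i in range(H):
        for j in range(W):
            ni, nj = i + dy, j + dx
            if 0 <= ni < H and 0 <= nj < W:
                predicted[ni, nj] += 0.9 * belief[i, j]
            else:
                # A move off the grid fails with probability 1: stay put.
                predicted[i, j] += 0.9 * belief[i, j]
    # Update: the sensor reads the true color with prob 0.9, the wrong one with 0.1.
    likelihood = np.where(cmap == obs, 0.9, 0.1)
    posterior = likelihood * predicted
    return posterior / posterior.sum()

# Uniform prior on a 2x2 world, then one "east" step with observation 1.
cmap = np.array([[0, 1], [1, 0]])
belief = np.full((2, 2), 0.25)
belief = bayes_filter_step(belief, cmap, action=(1, 0), obs=1)
```

After one step the belief concentrates on the east-column cell whose color matches the observation; checking that your filter’s belief always sums to 1 is a useful debugging invariant against starter.npz.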
Problem 2 (35 points, Learning HMMs). In this problem, we are going to implement what is called the Baum-Welch algorithm for HMMs. Recall that an HMM with observation matrix M has an underlying Markov chain with an initial distribution π and state-transition matrix T. Let us denote our HMM by λ = (π, T, M). Say someone gave us an HMM λ which produced a sequence of observations Y1, ..., Yt. Since these observations actually happened, they tell us something more about the given state-transition and observation matrices: given these observations, we can go back and modify T, M so that the observation sequence becomes more likely, i.e., so as to increase
P(Y1, ..., Yt | λ).
(a) (20 points) Assume that we are tracking a criminal who shuttles merchandise between Los Angeles (x1) and New York (x2). The state transition matrix between these states is
T = [ ];
e.g., given the person is in LA, he is equally likely to stay in LA or go to NY. We can make observations about this person: we either observe him to be in LA (y1), observe him to be in NY (y2), or do not observe anything at all (null, y3), with observation matrix
M = [ ].
Say that we are tracking the person and we observed an observation sequence of 20 periods
(null, LA, LA, null, NY, null, NY, NY, NY, null, NY, NY, NY, NY, NY, null, null, LA, LA, NY).
Assume that we do not know where the criminal is at the first time-step, so the initial distribution π is uniform.
We first compute the probability of each state given all the observations, i.e., the smoothing probability
γk(x) = αk(x) βk(x) / Σx αt(x),
which is computed using the HMM λ via the forward and backward variables αk, βk. You should check that γk(x) is a legitimate probability distribution, i.e., Σx γk(x) = 1.
Compute the smoothing probabilities γk(x) for all 20 time-steps and both states and show them in a table like the one in the lecture notes. You should also show the corresponding tables for the forward and backward probabilities αk and βk. Write down the point-wise most likely sequence of states based on the smoothing probabilities.
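The forward, backward, and smoothing computations above can be sketched generically as follows. The matrices pi, T, M and the observation indices in the usage lines are illustrative placeholders only, not the assignment’s actual values, which you should take from the problem statement.

```python
import numpy as np

def forward_backward(pi, T, M, obs):
    """Unnormalized forward/backward variables and smoothing probabilities.

    pi: (n,) initial distribution; T: (n, n) transition matrix;
    M: (n, m) observation matrix; obs: list of observation indices.
    """
    t, n = len(obs), len(pi)
    alpha = np.zeros((t, n))
    beta = np.ones((t, n))  # beta_t(x) = 1 by convention
    alpha[0] = pi * M[:, obs[0]]
    for k in range(1, t):
        # alpha_k(x') = sum_x alpha_{k-1}(x) T_{x,x'} * M_{x',y_k}
        alpha[k] = (alpha[k - 1] @ T) * M[:, obs[k]]
    for k in range(t - 2, -1, -1):
        # beta_k(x) = sum_x' T_{x,x'} M_{x',y_{k+1}} beta_{k+1}(x')
        beta[k] = T @ (M[:, obs[k + 1]] * beta[k + 1])
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)  # each row is a distribution
    return alpha, beta, gamma

# Placeholder two-state model (values are NOT the problem's T and M).
pi = np.array([0.5, 0.5])
T = np.array([[0.5, 0.5], [0.5, 0.5]])
M = np.array([[0.4, 0.1, 0.5], [0.1, 0.5, 0.4]])
obs = [2, 0, 0, 2, 1]  # e.g. (null, LA, LA, null, NY) as indices
alpha, beta, gamma = forward_backward(pi, T, M, obs)
```

A useful check, as the text notes: each row of gamma sums to 1, and Σx αk(x) βk(x) equals P(Y1, ..., Yt) for every k.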
(b) (5 points) We now discuss how to update the model λ given our observations. First observe that
E[number of times the Markov chain was in state x] = Σ_{k=1}^{t−1} γk(x).
Let us define a new quantity
ξk(x, x′) = P(Xk = x, Xk+1 = x′ | Y1, ..., Yt, λ)
to be the probability of being at state x at time k and then moving to state x′ at time k + 1, conditioned on all the observations we have received from our HMM. Show that
ξk(x, x′) = η αk(x) T_{x,x′} M_{x′,y_{k+1}} β_{k+1}(x′),
where η is a normalizing constant that makes Σ_{x,x′} ξk(x, x′) = 1.
(c) (5 points) We can now use our estimate from the previous part to get
E[number of transitions from x to x′] = Σ_{k=1}^{t−1} ξk(x, x′).
We now construct an updated model for our HMM as follows. The initial distribution of the states, instead of being π, is our smoothing probability: π′(x) = γ1(x). Entries of the new transition matrix can be computed by
T′_{x,x′} = Σ_{k=1}^{t−1} ξk(x, x′) / Σ_{k=1}^{t−1} γk(x),
i.e., the expected number of transitions from x to x′ divided by the expected number of times the chain was in state x. Entries of the new observation matrix can be computed by
M′_{x,y} = Σ_{k=1}^{t} γk(x) 1{yk = y} / Σ_{k=1}^{t} γk(x),
where 1{yk = y} indicates that the observation at the k-th time-step was y. Let the new HMM model be
λ′ = (π′, T′, M′).
Write down the new HMM model λ′ and see if any entries have changed as compared to λ.
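The re-estimation step in parts (b) and (c) can be sketched end-to-end as one function. The function name and array layout are my own, and the model in the usage lines again uses placeholder values rather than the assignment’s T and M.

```python
import numpy as np

def baum_welch_update(pi, T, M, obs):
    """One Baum-Welch re-estimation step: returns (pi', T', M')."""
    t, n = len(obs), len(pi)
    alpha = np.zeros((t, n))
    beta = np.ones((t, n))
    alpha[0] = pi * M[:, obs[0]]
    for k in range(1, t):
        alpha[k] = (alpha[k - 1] @ T) * M[:, obs[k]]
    for k in range(t - 2, -1, -1):
        beta[k] = T @ (M[:, obs[k + 1]] * beta[k + 1])
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    # xi[k, x, x'] ∝ alpha_k(x) T_{x,x'} M_{x',y_{k+1}} beta_{k+1}(x')
    xi = alpha[:-1, :, None] * T[None] * (M[:, obs[1:]].T * beta[1:])[:, None, :]
    xi /= xi.sum(axis=(1, 2), keepdims=True)
    pi_new = gamma[0]
    # Expected transitions from x to x' over expected visits to x.
    T_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    # Expected emissions of y from x over expected visits to x.
    counts = np.zeros((n, M.shape[1]))
    for k, y in enumerate(obs):
        counts[:, y] += gamma[k]
    M_new = counts / gamma.sum(axis=0)[:, None]
    return pi_new, T_new, M_new

# Placeholder model, NOT the problem's values.
pi = np.array([0.5, 0.5])
T = np.array([[0.5, 0.5], [0.5, 0.5]])
M = np.array([[0.4, 0.1, 0.5], [0.1, 0.5, 0.4]])
obs = [2, 0, 0, 2, 1]
pi2, T2, M2 = baum_welch_update(pi, T, M, obs)
```

A quick invariant to check in your own solution: the rows of T′ and M′, and the vector π′, must each sum to 1.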
(d) (5 points) Compare the two HMMs λ and λ′. Compute the two probabilities and show that
P(Y1, ..., Yt | λ) < P(Y1, ..., Yt | λ′).
This is exactly what Baum et al. proved in Baum, Leonard E., et al. "A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains." The Annals of Mathematical Statistics 41.1 (1970): 164–171. You can see that the Baum-Welch algorithm is quite powerful for learning real-world Markov chains: given some elementary model of the dynamics T and an observation model M, you can sequentially update both of them to arrive at a better model of the dynamics.
Problem 3 (20 points, Forward-Backward algorithm). Answer the following questions.
(a) (15 points) Using our forward and backward variables
αk(x) = P(Y1, ..., Yk, Xk = x),
βk(x) = P(Yk+1, ..., Yt | Xk = x),
for a sequence of observations Y1, ..., Yt of length t > 1, the state transition matrix Tij = P(Xk+1 = xj | Xk = xi), and the observation matrix Mij = P(Yk = yj | Xk = xi), write down how you will compute the following probabilities:
(1) P(Xk+1 = xj | Xk = xi, Y1, ..., Yt),
(2) P(Xk = xi | Xk+1 = xj, Y1, ..., Yt),
(3) P(Xk−1 = xi, Xk = xj, Xk+1 = xl | Y1, ..., Yt).
You do not need to consider boundary cases like k ∈ {1, t}.
(b) (5 points) Viterbi’s algorithm finds the most likely state trajectory of the HMM’s associated Markov chain given a sequence of observations Y1, ..., Yt. Explain why, in general, the solution of the decoding problem, i.e.,
(x1*, ..., xt*) = argmax_{(x1, ..., xt)} P(X1 = x1, ..., Xt = xt | Y1, ..., Yt),
is not the same as the sequence obtained by taking the most likely state at each time computed by smoothing, i.e.,
(x̂1, ..., x̂t) where x̂k = argmax_x P(Xk = x | Y1, ..., Yt).
Give an example where the two are the same.
Problem 4 (15 points, Optimal estimation). We will study the simplest case of a filtering problem, namely estimation of a static, scalar variable X ∈ R. We take two noisy measurements of the scalar X of the form
Yi = hi X + ϵi,  i = 1, 2,
where h1 = 1 and h2 = 2. The noise terms ϵi ∈ R in our measurements are distributed with E[ϵi] = 0 and E[ϵi²] = σi². The two noise terms are uncorrelated with each other. Assume also that our signal X is not correlated with the noise: E[X ϵi] = 0 for both i = 1, 2.
(a) (10 points) Assume that the optimal estimate of the variable is of the form
X̂ = a1 Y1 + a2 Y2,
where the constants a1 and a2 are independent of X. Compute the values of a1, a2 that
(i) make sure that the estimate X̂ is unbiased, and
(ii) minimize the mean-square estimation error E[(X − X̂)²].
Use these values of a1, a2 to compute the minimum value of the mean-square estimation error. You should solve this problem from first principles, without using any expressions for the Kalman filter that we may derive in class.
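Before deriving a1, a2 in closed form, it may help to check the unbiasedness constraint numerically: since E[ϵi] = 0, any pair (a1, a2) with a1 h1 + a2 h2 = 1 gives E[X̂] = X, whatever the noise variances are. The noise standard deviations, the value of X, and the particular a2 below are arbitrary choices for the simulation, not part of the problem data.

```python
import numpy as np

rng = np.random.default_rng(0)
h1, h2 = 1.0, 2.0
s1, s2 = 0.5, 1.5          # assumed noise standard deviations (illustrative)
X = 3.0                    # treat the unknown X as a fixed value for the check
a2 = 0.25                  # arbitrary feasible choice, not the optimal one
a1 = (1 - a2 * h2) / h1    # enforce the unbiasedness constraint a1*h1 + a2*h2 = 1

N = 200_000
Y1 = h1 * X + rng.normal(0, s1, N)
Y2 = h2 * X + rng.normal(0, s2, N)
Xhat = a1 * Y1 + a2 * Y2
print(abs(Xhat.mean() - X))  # close to 0: the estimate is unbiased
```

The same simulation lets you compare the empirical mean-square error of different feasible (a1, a2) pairs against the minimum you derive analytically.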
(b) (5 points) Discuss how the optimal estimator uses the two measurements for the following cases:
(i) σ2 ≫ σ1,
(ii) σ2 = σ1,
(iii) σ2 ≪ σ1.
Do your answers agree with your intuition? Explain.