CS 6501-005/SYS 6581-010: LEARNING IN ROBOTICS FALL 2022 HOMEWORK 1
Posted: 2022-09-21
Problem 1 (50 points, Bayes filter). In this problem we are given a robot operating in a 2D grid world. Every cell in the grid world is characterized by a color (0 or 1). The robot is equipped with a noisy odometer and a noisy color sensor. Given a stream of actions and corresponding observations, implement a Bayes filter to keep track of the robot’s current position. The sensor reads the color of the robot’s current cell correctly with probability 0.9 and incorrectly with probability 0.1. At each step, the robot can take an action to move in one of 4 directions (north, east, south, west). Execution of these actions is noisy: after the robot performs an action, it actually makes the move with probability 0.9 and stays at the same spot without moving with probability 0.1.
When the robot is at the edge of the grid world and is tasked with executing an action that would take it outside the boundaries of the grid world, the robot remains in the same state with probability 1. Start with a uniform prior over all states. For example, if you have a world with 4 states (x1, x2, x3, x4), then P(X0 = x1) = P(X0 = x2) = P(X0 = x3) = P(X0 = x4) = 0.25.
You are given a zip file with some starter code for this problem; you will find it on Collab under Resources » Homeworks » hw1. It consists of the Python scripts example_test.py and histogram_filter.py (the Bayes filter is also called the histogram filter) and a starter file containing some data: starter.npz. The starter.npz file contains a binary color-map (the grid), a sequence of actions, a sequence of observations, and a sequence of the correct belief states. This is provided for you to debug your code. You should implement your code in histogram_filter.py. Be careful not to change the function signature, or your code will not pass the tests on the autograder. You should upload your histogram_filter.py file to Homework 1 Problem 1 Code on Gradescope. In the PDF of your solutions, you are free to include any relevant plots of the grid world and the robot’s belief which show that your code works as intended.
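As a sanity check, the motion and sensor models above can be sketched as a single predict–update step of a histogram filter. This is only a minimal sketch: the (H, W) grid layout, the (dx, dy) action encoding, and the function name here are my own choices, not the starter code’s conventions, whose given function signature you must keep.

```python
import numpy as np

def bayes_filter_step(belief, cmap, action, obs):
    """One predict-update step of a histogram (Bayes) filter.

    belief: (H, W) array summing to 1; cmap: (H, W) array of 0/1 colors;
    action: (dx, dy) displacement; obs: observed color (0 or 1).
    """
    H, W = belief.shape
    dx, dy = action
    # Predict: with prob 0.9 the move succeeds, with prob 0.1 the robot stays.
    predicted = 0.1 * belief
    for i in range(H):
        for j in range(W):
            ni, nj = i + dy, j + dx
            if 0 <= ni < H and 0 <= nj < W:
                predicted[ni, nj] += 0.9 * belief[i, j]
            else:
                # A move off the grid fails with probability 1: stay put.
                predicted[i, j] += 0.9 * belief[i, j]
    # Update: the sensor reads the true color with prob 0.9, the wrong one with 0.1.
    likelihood = np.where(cmap == obs, 0.9, 0.1)
    posterior = likelihood * predicted
    return posterior / posterior.sum()

# Uniform prior on a 2x2 world, then one "east" step with observation 1.
cmap = np.array([[0, 1], [1, 0]])
belief = np.full((2, 2), 0.25)
belief = bayes_filter_step(belief, cmap, action=(1, 0), obs=1)
```

After one step the belief concentrates on the east-column cell whose color matches the observation; checking that your filter’s belief always sums to 1 is a useful debugging invariant against starter.npz.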
Problem 2 (35 points, Learning HMMs). In this problem, we are going to implement what is called the Baum-Welch algorithm for HMMs. Recall that an HMM with observation matrix M has an underlying Markov chain with an initial distribution π and state-transition matrix T. Let us denote our HMM by λ = (π, T, M). Say someone gave us an HMM λ which produced a sequence of observations Y1, ..., Yt. Since these observations actually happened, they tell us something more about the given state-transition and observation matrices: given these observations, we can go back and modify T, M so that the observation sequence becomes more likely, i.e., so as to increase
P(Y1, ..., Yt | λ).
(a) (20 points) Assume that we are tracking a criminal who shuttles merchandise between Los Angeles (x1) and New York (x2). The state transition matrix between these states is
T = [ ];
e.g., given the person is in LA, he is equally likely to stay in LA or go to NY. We can make observations about this person: we either observe him to be in LA (y1), observe him to be in NY (y2), or do not observe anything at all (null, y3), with observation matrix
M = [ ].
Say that we are tracking the person and we observed an observation sequence of 20 periods
(null, LA, LA, null, NY, null, NY, NY, NY, null, NY, NY, NY, NY, NY, null, null, LA, LA, NY).
Assume that we do not know where the criminal is at the first time-step, so the initial distribution π is uniform.
We first compute the probability of each state given all the observations, i.e., the smoothing probability
γk(x) = αk(x) βk(x) / Σx αt(x),
which is computed using the HMM λ via the forward and backward variables αk, βk. You should check that γk(x) is a legitimate probability distribution, i.e., Σx γk(x) = 1.
Compute the smoothing probabilities γk(x) for all 20 time-steps and both states and show them in a table like the one in the lecture notes. You should also show the corresponding tables for the forward and backward probabilities αk and βk. Write down the point-wise most likely sequence of states based on the smoothing probabilities.
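The forward, backward, and smoothing computations above can be sketched generically as follows. The matrices pi, T, M and the observation indices in the usage lines are illustrative placeholders only, not the assignment’s actual values, which you should take from the problem statement.

```python
import numpy as np

def forward_backward(pi, T, M, obs):
    """Unnormalized forward/backward variables and smoothing probabilities.

    pi: (n,) initial distribution; T: (n, n) transition matrix;
    M: (n, m) observation matrix; obs: list of observation indices.
    """
    t, n = len(obs), len(pi)
    alpha = np.zeros((t, n))
    beta = np.ones((t, n))  # beta_t(x) = 1 by convention
    alpha[0] = pi * M[:, obs[0]]
    for k in range(1, t):
        # alpha_k(x') = sum_x alpha_{k-1}(x) T_{x,x'} * M_{x',y_k}
        alpha[k] = (alpha[k - 1] @ T) * M[:, obs[k]]
    for k in range(t - 2, -1, -1):
        # beta_k(x) = sum_x' T_{x,x'} M_{x',y_{k+1}} beta_{k+1}(x')
        beta[k] = T @ (M[:, obs[k + 1]] * beta[k + 1])
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)  # each row is a distribution
    return alpha, beta, gamma

# Placeholder two-state model (values are NOT the problem's T and M).
pi = np.array([0.5, 0.5])
T = np.array([[0.5, 0.5], [0.5, 0.5]])
M = np.array([[0.4, 0.1, 0.5], [0.1, 0.5, 0.4]])
obs = [2, 0, 0, 2, 1]  # e.g. (null, LA, LA, null, NY) as indices
alpha, beta, gamma = forward_backward(pi, T, M, obs)
```

A useful check, as the text notes: each row of gamma sums to 1, and Σx αk(x) βk(x) equals P(Y1, ..., Yt) for every k.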
(b) (5 points) We now discuss how to update the model λ given our observations. First observe that
E[number of times the Markov chain was in state x] = Σ_{k=1}^{t−1} γk(x).
Let us define a new quantity
ξk(x, x′) = P(Xk = x, Xk+1 = x′ | Y1, ..., Yt, λ)
to be the probability of being at state x at time k and then moving to state x′ at time k + 1, conditioned on all the observations we have received from our HMM. Show that
ξk(x, x′) = η αk(x) T_{x,x′} M_{x′,y_{k+1}} β_{k+1}(x′),
where η is a normalizing constant that makes Σ_{x,x′} ξk(x, x′) = 1.
(c) (5 points) We can now use our estimate from the previous part to get
E[number of transitions from x to x′] = Σ_{k=1}^{t−1} ξk(x, x′).
We now construct an updated model for our HMM as follows. The initial distribution of the states, instead of being π, is our smoothing probability: π′(x) = γ1(x). Entries of the new transition matrix can be computed by
T′_{x,x′} = Σ_{k=1}^{t−1} ξk(x, x′) / Σ_{k=1}^{t−1} γk(x),
i.e., the expected number of transitions from x to x′ divided by the expected number of times the chain was in state x. Entries of the new observation matrix can be computed by
M′_{x,y} = Σ_{k=1}^{t} γk(x) 1{yk = y} / Σ_{k=1}^{t} γk(x),
where 1{yk = y} indicates that the observation at the k-th time-step was y. Let the new HMM model be
λ′ = (π′, T′, M′).
Write down the new HMM model λ′ and see if any entries have changed as compared to λ.
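The re-estimation step in parts (b) and (c) can be sketched end-to-end as one function. The function name and array layout are my own, and the model in the usage lines again uses placeholder values rather than the assignment’s T and M.

```python
import numpy as np

def baum_welch_update(pi, T, M, obs):
    """One Baum-Welch re-estimation step: returns (pi', T', M')."""
    t, n = len(obs), len(pi)
    alpha = np.zeros((t, n))
    beta = np.ones((t, n))
    alpha[0] = pi * M[:, obs[0]]
    for k in range(1, t):
        alpha[k] = (alpha[k - 1] @ T) * M[:, obs[k]]
    for k in range(t - 2, -1, -1):
        beta[k] = T @ (M[:, obs[k + 1]] * beta[k + 1])
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    # xi[k, x, x'] ∝ alpha_k(x) T_{x,x'} M_{x',y_{k+1}} beta_{k+1}(x')
    xi = alpha[:-1, :, None] * T[None] * (M[:, obs[1:]].T * beta[1:])[:, None, :]
    xi /= xi.sum(axis=(1, 2), keepdims=True)
    pi_new = gamma[0]
    # Expected transitions from x to x' over expected visits to x.
    T_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    # Expected emissions of y from x over expected visits to x.
    counts = np.zeros((n, M.shape[1]))
    for k, y in enumerate(obs):
        counts[:, y] += gamma[k]
    M_new = counts / gamma.sum(axis=0)[:, None]
    return pi_new, T_new, M_new

# Placeholder model, NOT the problem's values.
pi = np.array([0.5, 0.5])
T = np.array([[0.5, 0.5], [0.5, 0.5]])
M = np.array([[0.4, 0.1, 0.5], [0.1, 0.5, 0.4]])
obs = [2, 0, 0, 2, 1]
pi2, T2, M2 = baum_welch_update(pi, T, M, obs)
```

A quick invariant to check in your own solution: the rows of T′ and M′, and the vector π′, must each sum to 1.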
(d) (5 points) Compare the two HMMs λ and λ′. Compute the two probabilities and show that
P(Y1, ..., Yt | λ) < P(Y1, ..., Yt | λ′).
This is exactly what Baum et al. proved in Baum, Leonard E., et al. "A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains." The Annals of Mathematical Statistics 41.1 (1970): 164–171. You can see that the Baum-Welch algorithm is quite powerful for learning real-world Markov chains: given some elementary model of the dynamics T and an observation model M, you can sequentially update both of them to arrive at a better model of the dynamics.
Problem 3 (20 points, Forward-Backward algorithm). Answer the following questions.
(a) (15 points) Using our forward and backward variables
αk(x) = P(Y1, ..., Yk, Xk = x),
βk(x) = P(Yk+1, ..., Yt | Xk = x),
for a sequence of observations Y1, ..., Yt of length t > 1, the state transition matrix Tij = P(Xk+1 = xj | Xk = xi), and the observation matrix Mij = P(Yk = yj | Xk = xi), write down how you will compute the following probabilities:
(1) P(Xk+1 = xj | Xk = xi, Y1, ..., Yt),
(2) P(Xk = xi | Xk+1 = xj, Y1, ..., Yt),
(3) P(Xk−1 = xi, Xk = xj, Xk+1 = xl | Y1, ..., Yt).
You do not need to consider boundary cases like k ∈ {1, t}.
(b) (5 points) Viterbi’s algorithm finds the most likely state trajectory of the HMM’s associated Markov chain given a sequence of observations Y1, ..., Yt. Explain why, in general, the solution of the decoding problem, i.e.,
(x1*, ..., xt*) = argmax_{(x1, ..., xt)} P(X1 = x1, ..., Xt = xt | Y1, ..., Yt),
is not the same as the sequence obtained by taking the most likely state at each time computed by smoothing, i.e.,
(x̂1, ..., x̂t) where x̂k = argmax_x P(Xk = x | Y1, ..., Yt).
Give an example where the two are the same.
Problem 4 (15 points, Optimal estimation). We will study the simplest case of a filtering problem, namely estimation of a static, scalar variable X ∈ R. We take two noisy measurements of the scalar X of the form
Yi = hi X + ϵi,  i = 1, 2,
where h1 = 1 and h2 = 2. The noise terms ϵi ∈ R in our measurements are distributed with E[ϵi] = 0 and E[ϵi²] = σi². The two noise terms are uncorrelated with each other. Assume also that our signal X is not correlated with the noise: E[X ϵi] = 0 for both i = 1, 2.
(a) (10 points) Assume that the optimal estimate of the variable is of the form
X̂ = a1 Y1 + a2 Y2,
where the constants a1 and a2 are independent of X. Compute the values of a1, a2 that
(i) make sure that the estimate X̂ is unbiased, and
(ii) minimize the mean-square estimation error E[(X − X̂)²].
Use these values of a1, a2 to compute the minimum value of the mean-square estimation error. You should solve this problem from first principles, without using any expressions for the Kalman filter that we may derive in class.
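Before deriving a1, a2 in closed form, it may help to check the unbiasedness constraint numerically: since E[ϵi] = 0, any pair (a1, a2) with a1 h1 + a2 h2 = 1 gives E[X̂] = X, whatever the noise variances are. The noise standard deviations, the value of X, and the particular a2 below are arbitrary choices for the simulation, not part of the problem data.

```python
import numpy as np

rng = np.random.default_rng(0)
h1, h2 = 1.0, 2.0
s1, s2 = 0.5, 1.5          # assumed noise standard deviations (illustrative)
X = 3.0                    # treat the unknown X as a fixed value for the check
a2 = 0.25                  # arbitrary feasible choice, not the optimal one
a1 = (1 - a2 * h2) / h1    # enforce the unbiasedness constraint a1*h1 + a2*h2 = 1

N = 200_000
Y1 = h1 * X + rng.normal(0, s1, N)
Y2 = h2 * X + rng.normal(0, s2, N)
Xhat = a1 * Y1 + a2 * Y2
print(abs(Xhat.mean() - X))  # close to 0: the estimate is unbiased
```

The same simulation lets you compare the empirical mean-square error of different feasible (a1, a2) pairs against the minimum you derive analytically.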
(b) (5 points) Discuss how the optimal estimator uses the two measurements for the following cases:
(i) σ2 ≫ σ1,
(ii) σ2 = σ1,
(iii) σ2 ≪ σ1.
Do your answers agree with your intuition? Explain.