ECS708 Machine Learning

Question 1

a) Define the joint probability p(A, B) and the conditional probability p(A | B). You may want to use a diagram/sketch. Give the formula that relates them.

[4 marks]

Answer:

The conditional probability p(A | B) is the probability that the event A occurs, given that the event B has occurred. The joint probability p(A, B) is the probability that both events occur. Their relation is given by the formula p(A | B) = p(A, B) / p(B).

b) Give the relation between the joint probability P(X, Y) and the probabilities P(X) and P(Y) that holds in the case that X and Y are independent random variables. Give the condition that holds when X and Y are uncorrelated. Are these conditions the same?

[4 Marks]

Answer:

X and Y are independent if p(x, y) = p(x)p(y). They are uncorrelated if their covariance is zero, that is, if cov{x, y} = E{(x − E{x})(y − E{y})} = E{xy} − E{x}E{y} = 0. The conditions are not the same: independence is much stronger. Independent variables are always uncorrelated, but uncorrelated variables need not be independent.
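A small numerical check can make the difference concrete. The sketch below uses a hypothetical example that is not part of the exam: X is uniform on {-1, 0, 1} and Y = X². Y is fully determined by X, yet the covariance comes out as zero.

```python
from fractions import Fraction

# Hypothetical example: X is uniform on {-1, 0, 1} and Y = X**2.
support = [-1, 0, 1]
p_x = {x: Fraction(1, 3) for x in support}

# Joint distribution p(x, y): all mass sits on the pairs (x, x**2).
joint = {(x, x * x): p for x, p in p_x.items()}

# Marginal p(y), obtained by summing the joint over x.
p_y = {}
for (x, y), p in joint.items():
    p_y[y] = p_y.get(y, Fraction(0)) + p

# Covariance via E{xy} - E{x}E{y}.
e_x = sum(x * p for (x, y), p in joint.items())
e_y = sum(y * p for (x, y), p in joint.items())
e_xy = sum(x * y * p for (x, y), p in joint.items())
covariance = e_xy - e_x * e_y

# Independence requires p(x, y) = p(x) p(y) for every pair (x, y).
independent = all(
    joint.get((x, y), Fraction(0)) == p_x[x] * p_y[y]
    for x in p_x
    for y in p_y
)

print("cov(X, Y) =", covariance)
print("independent:", independent)
```

Exact fractions are used instead of floats so that the zero covariance is not blurred by rounding.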

c) Show that the expected value of the sum of two independent random variables X and Y is equal to the sum of the expected values of X and Y. That is, show that

E{X + Y} = E{X} + E{Y}.

You may show it for either continuous or discrete variables. (Hint: You need to work with the joint probability P(X, Y).)

[6 Marks]

Answer
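One way to show this for discrete variables, sketched here under the assumption that the sums converge, is to expand the expectation over the joint probability P(x, y) and then marginalise:

```latex
\begin{align*}
E\{X + Y\} &= \sum_{x}\sum_{y} (x + y)\, P(x, y) \\
           &= \sum_{x} x \sum_{y} P(x, y) + \sum_{y} y \sum_{x} P(x, y) \\
           &= \sum_{x} x\, P(x) + \sum_{y} y\, P(y) \\
           &= E\{X\} + E\{Y\}
\end{align*}
```

The third line uses the marginalisation identities, that is, summing P(x, y) over y gives P(x) and summing over x gives P(y). Independence of X and Y is not used at any step, so the result also holds for dependent variables.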

d) An IT worker works from home 2 days a week. When she works from home there is a 30% chance she will NOT answer an email within one hour, a 10% chance that she will not answer an email within two hours, and it is certain that she will answer all the emails within the day. When she is at the office, there is a 50% chance she will NOT answer an email within one hour, a 10% chance that she will not answer an email within two hours, and it is certain that she will answer all the emails within the day.

i. If you send her an email, what is the probability that she will answer within 2 hours?

ii. Given that she hasn’t replied to your email within 1 hour, what is the probability that she is working from home? Does the information that she hasn’t answered the email within 1 hour make it more or less likely that she works from home?

[11 Marks]

Answer

The joint probability table is given by:

            Within 1 hour       Between 1 and 2 hours   After 2 hours
Home        0.4 * 0.7 = 0.28    0.4 * 0.2 = 0.08        0.4 * 0.1 = 0.04
Office      0.6 * 0.5 = 0.30    0.6 * 0.4 = 0.24        0.6 * 0.1 = 0.06

The probability that she will answer within two hours is the sum of the values of the first two columns and equals 0.90.

Let D denote the event that she answers after one hour, that is, she has not answered within the first hour. P(D) = 0.42 (sum of the last two columns).

Let H denote the event that she works from home. P(H) = 0.4 (sum of the first row).

P(D, H) = 0.12 is the probability that she works from home AND answers after one hour.

P(H | D) = P(D, H) / P(D) = 0.12 / 0.42 = 0.2857 < P(H) = 0.4

That means that if she has not answered the email within the first hour, she is less likely to be working from home: the observation gives some evidence that she is at the office.
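The arithmetic above can be reproduced with a short script. This is a minimal sketch that assumes a 5-day working week (so P(home) = 2/5 = 0.4, as in the answer); the dictionary keys and variable names are illustrative only.

```python
# Prior over location: 2 of 5 working days at home (assumed 5-day week).
p_location = {"home": 0.4, "office": 0.6}

# From the question: P(no reply within 1 hour | location)
# and P(no reply within 2 hours | location).
p_not_within_1h = {"home": 0.3, "office": 0.5}
p_not_within_2h = {"home": 0.1, "office": 0.1}

# Joint table: one row per location, one column per reply-time interval.
joint = {}
for loc, prior in p_location.items():
    joint[loc] = {
        "within_1h": prior * (1 - p_not_within_1h[loc]),
        "1h_to_2h": prior * (p_not_within_1h[loc] - p_not_within_2h[loc]),
        "after_2h": prior * p_not_within_2h[loc],
    }

# i. Reply within two hours: sum of the first two columns.
p_within_2h = sum(row["within_1h"] + row["1h_to_2h"] for row in joint.values())

# ii. D = no reply within one hour: sum of the last two columns.
p_d = sum(row["1h_to_2h"] + row["after_2h"] for row in joint.values())
p_d_and_home = joint["home"]["1h_to_2h"] + joint["home"]["after_2h"]
p_home_given_d = p_d_and_home / p_d

print(f"P(answer within 2 hours) = {p_within_2h:.2f}")
print(f"P(D) = {p_d:.2f}")
print(f"P(H | D) = {p_home_given_d:.4f}")
```

Comparing `p_home_given_d` with `p_location["home"]` shows the direction of the update: the posterior is smaller than the prior.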

[Total 25 marks]

Question 4

a) With the help of a diagram, explain the main principles of a first-order Markov model. Explain what is meant by the term "first-order". What are the differences with a hidden Markov model (HMM)? In your answer, define the states ω_i, the symbols v_k, and the matrices A = [a_ij] and B = [b_jk].

[6 marks]

Answer:

A first-order Markov model is a generalisation of a finite state machine, in the sense that transitions between the states are probabilistic. In the example above there are three states ω_i, and transition probabilities a_ij from state i to state j, which together form the matrix A = [a_ij]. It is called a first-order model because the probability of going to state j from state i depends only on the current state i and not on states further in the past.

In a hidden Markov model:

1) At each state ω_j an observation symbol v_k is emitted with probability b_jk; these probabilities form the matrix B = [b_jk].

2) We do not have direct evidence of the state; we observe only the symbols.

b) The decoding problem can be stated as follows: Given an HMM and a sequence of observation symbols V_1:T, determine the most likely sequence of hidden states ω_1:T. What are the other two types of problems that arise in the context of HMMs?

[6 marks]

Answer:

1) The evaluation problem: Given an HMM (that is, ω_i(0), A = [a_ij] and B = [b_jk]), what is the probability of generating the sequence V_1:T?

2) The learning problem: Given the structure of the HMM (that is, the number of states, the connections and the symbols), learn the state transition matrix and the symbol emission matrix. These are learned from a set of observation sequences.
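The evaluation problem is commonly solved with the forward algorithm, which the answer above does not name. The sketch below is a minimal pure-Python version under stated assumptions: the model starts in a known state at t = 0, a symbol is emitted after each transition, and the 2-state, 2-symbol matrices A and B are hypothetical numbers chosen only for illustration.

```python
def forward_probability(A, B, init_state, observations):
    """Return P(V_1:T | model) using the forward recursion."""
    n_states = len(A)
    # alpha[i]: probability of the symbols seen so far AND being in state i now.
    alpha = [1.0 if i == init_state else 0.0 for i in range(n_states)]
    for v in observations:
        # alpha_j(t) = [sum_i alpha_i(t-1) * a_ij] * b_j,v(t)
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][v]
            for j in range(n_states)
        ]
    # Sum over all states the model could be in at time T.
    return sum(alpha)


# Hypothetical model: rows of A and B each sum to 1.
A = [[0.7, 0.3],
     [0.4, 0.6]]   # A[i][j] = a_ij, transition probability from state i to j
B = [[0.9, 0.1],
     [0.2, 0.8]]   # B[j][k] = b_jk, probability that state j emits symbol k

print(forward_probability(A, B, init_state=0, observations=[0, 1, 1]))
```

A useful sanity check is that the probabilities of all possible observation sequences of a fixed length add up to 1.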

c) Describe an algorithm that solves the decoding problem, as this is described in part (b). What is the name of this algorithm?

[13 marks]

Answer:

The most likely sequence of hidden states can be estimated by the Viterbi algorithm, as given below.

1. Initialize: δ_i(0) = 1 for the initial state ω̂(0), 0 otherwise; t ← 0

2. Recursion: t ← t + 1

3. δ_j(t) ← max_i [δ_i(t − 1) a_ij] b_j,v(t)

4. ψ_j(t) ← arg max_i [δ_i(t − 1) a_ij]

5. If t < T repeat from 2.

6. Termination: ω̂(T) ← arg max_i δ_i(T)

7. Backtracking:

8. t ← t − 1; ω̂(t) ← ψ_ω̂(t+1)(t + 1)
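The steps above can be sketched in Python as follows. This is an illustrative implementation, not the course's reference code: it assumes a known initial state at t = 0, uses 0-based state and symbol indices, and runs on a hypothetical 2-state, 2-symbol model. The lists `delta` and `psi` play the roles of δ and ψ.

```python
def viterbi(A, B, init_state, observations):
    """Return the most likely hidden-state sequence for t = 1..T."""
    n_states = len(A)
    # Initialization: delta_i(0) = 1 for the initial state, 0 otherwise.
    delta = [1.0 if i == init_state else 0.0 for i in range(n_states)]
    backpointers = []  # backpointers[t - 1][j] holds psi_j(t)
    for v in observations:
        psi = []
        new_delta = []
        for j in range(n_states):
            # Best predecessor of state j: arg max_i delta_i(t-1) * a_ij.
            best_i = max(range(n_states), key=lambda i: delta[i] * A[i][j])
            psi.append(best_i)
            new_delta.append(delta[best_i] * A[best_i][j] * B[j][v])
        backpointers.append(psi)
        delta = new_delta
    # Termination: most likely state at time T.
    state = max(range(n_states), key=lambda i: delta[i])
    path = [state]
    # Backtracking: follow psi(T), psi(T-1), ..., psi(2) to recover t = T-1..1.
    for psi in reversed(backpointers[1:]):
        state = psi[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical model, chosen only for illustration.
A = [[0.7, 0.3],
     [0.4, 0.6]]   # A[i][j] = a_ij
B = [[0.9, 0.1],
     [0.2, 0.8]]   # B[j][k] = b_jk

print(viterbi(A, B, init_state=0, observations=[0, 1, 1]))
```

For long sequences the products in `delta` underflow, so practical implementations usually work with sums of log-probabilities instead; the plain products are kept here to mirror the steps as written.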