
ELEC 475/575 Homework 2

Learning From Sensor Data

1    Problem 1

We are provided with a set of training examples for the unknown target function (X1, X2) → Y. Each row of the table below indicates the values observed and how many times that set of values was observed. For example, (+, T, T) was observed 3 times, while (-, T, T) was never observed.

Y    X1    X2    Count
+    T     T       3
+    T     F       4
+    F     T       4
+    F     F       1
-    T     T       0
-    T     F       1
-    F     T       3
-    F     F       5

1. Compute the sample entropy H(Y) for this training data; assume the logarithm is base 2. (A computational sketch for items 1-3 follows this list.)

2. What is the mutual information between X1 and Y, I(X1; Y), from the sample of training data?

3. What is the mutual information between X2 and Y, I(X2; Y), from the sample of training data?

4. Draw the decision tree that would be learned from this sample of training data. Hint: think about which feature, X1 or X2, would be split on first; it should be the one with the highest mutual information with Y.
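A minimal Python sketch for items 1-3, computing the sample entropy and both mutual informations directly from the counts; the only assumption is the table transcription hard-coded in the script.

```python
import numpy as np

# Transcription of the table above; each row is (Y, X1, X2, count).
data = [
    ('+', 'T', 'T', 3), ('+', 'T', 'F', 4), ('+', 'F', 'T', 4), ('+', 'F', 'F', 1),
    ('-', 'T', 'T', 0), ('-', 'T', 'F', 1), ('-', 'F', 'T', 3), ('-', 'F', 'F', 5),
]
total = sum(row[3] for row in data)
labels = ['+', '-']

def entropy(probs):
    """Shannon entropy in bits, treating 0 * log(0) as 0."""
    p = np.asarray(probs, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

# Sample entropy H(Y) from the marginal distribution of Y.
p_y = [sum(r[3] for r in data if r[0] == y) / total for y in labels]
H_Y = entropy(p_y)

def mutual_info(col):
    """I(X; Y) = H(Y) - H(Y | X) for the feature in column `col` (1 for X1, 2 for X2)."""
    H_cond = 0.0
    for v in ['T', 'F']:
        rows = [r for r in data if r[col] == v]
        n = sum(r[3] for r in rows)
        p_y_given_v = [sum(r[3] for r in rows if r[0] == y) / n for y in labels]
        H_cond += (n / total) * entropy(p_y_given_v)
    return H_Y - H_cond

print(f"H(Y)     = {H_Y:.4f} bits")
print(f"I(X1; Y) = {mutual_info(1):.4f} bits")
print(f"I(X2; Y) = {mutual_info(2):.4f} bits")
```

The feature whose printed mutual information is larger is the one the decision tree in item 4 would split on at the root.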

2    Problem 2

Show that I(X1, X2; Y1) = I(X1; Y1) + I(X2; Y1 | X1). This identity is the chain rule for mutual information.
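One possible route, sketched below: expand the joint mutual information in terms of entropies, then add and subtract H(Y_1 \mid X_1) so that each bracket becomes a mutual information. This is a sketch of one approach, not the only valid proof.

I(X_1, X_2; Y_1) = H(Y_1) - H(Y_1 \mid X_1, X_2)
                 = [H(Y_1) - H(Y_1 \mid X_1)] + [H(Y_1 \mid X_1) - H(Y_1 \mid X_1, X_2)]
                 = I(X_1; Y_1) + I(X_2; Y_1 \mid X_1)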

3    Problem 3

Here we examine the connection between mutual information and KL-divergence, a quantity that is important in information theory and machine learning. The KL-divergence from a distribution f(x) to a distribution g(x) can be thought of as a measure of the dissimilarity of F from G:

D_{KL}(f \| g) = \int_{\mathcal{X}} f_X(x) \log \frac{f_X(x)}{g_X(x)} \, dx                (1)

Mutual information can be defined as the KL-divergence from the observed joint distribution of X and Y to the product of their observed marginals:

I(X;Y) = \sum_{i,j} p_{X,Y}(x_i, y_j) \log \frac{p_{X,Y}(x_i, y_j)}{p_X(x_i)\, p_Y(y_j)} = \int_{\mathbb{R}^2} f_{X,Y}(x, y) \log \frac{f_{X,Y}(x, y)}{f_X(x)\, f_Y(y)} \, dx \, dy                (2)

Show that I(X;Y) = H(X) - H(X | Y), or equivalently I(X;Y) = H(Y) - H(Y | X).

Hint: start off with the second equation in terms of probabilities, and remember to use your conditional probability theorems and the fact that probabilities sum to one.
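One way the hint plays out in the discrete case, sketched below: substitute p_{X,Y}(x_i, y_j) = p_X(x_i)\, p_{Y \mid X}(y_j \mid x_i) inside the logarithm of Equation (2) and split the sum; summing the joint over i recovers the marginal p_Y in the first term.

I(X;Y) = \sum_{i,j} p_{X,Y}(x_i, y_j) \log \frac{p_{Y \mid X}(y_j \mid x_i)}{p_Y(y_j)}
       = -\sum_{i,j} p_{X,Y}(x_i, y_j) \log p_Y(y_j) + \sum_{i,j} p_{X,Y}(x_i, y_j) \log p_{Y \mid X}(y_j \mid x_i)
       = H(Y) - H(Y \mid X)

The symmetric factorization p_{X,Y} = p_Y \, p_{X \mid Y} gives I(X;Y) = H(X) - H(X \mid Y) by the same steps.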

4    Problem 4

Say that X, Y, Z form a Markov chain, denoted X → Y → Z. Show that:

1. I(X;Y) ≥ I(X;Z)

2. I(Y;Z) ≥ I(X;Z)

This is known as the data-processing inequality; equality holds in the first statement when I(X;Y | Z) = 0.
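A standard route for the first inequality, sketched below: expand I(X; Y, Z) with the chain rule from Problem 2 in two different orders, then use the Markov property, which gives I(X; Z \mid Y) = 0.

I(X; Y, Z) = I(X; Z) + I(X; Y \mid Z)
           = I(X; Y) + I(X; Z \mid Y) = I(X; Y)

Hence I(X; Y) = I(X; Z) + I(X; Y \mid Z) ≥ I(X; Z), with equality exactly when I(X; Y \mid Z) = 0. The second inequality follows from the same argument applied to the reversed chain Z → Y → X, which is also Markov.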

5    Problem 5

- We will have at least one homework problem in every set focused on processing, manipulating, and learning from one data set.

- You can find the data here: http://www.ieeg.org

- Sign in, access the data, and interface with it from Matlab or Python.

- Choose a time series and examine a section of the data to see if it is ergodic; that is, check whether the running time average \bar{x}_T = \frac{1}{T} \sum_{t=1}^{T} x_t converges as T grows (a sketch of this check follows the list). Since we do not know the true mean, convergence of this average to any fixed number is enough to suggest ergodicity in the mean.
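A minimal Python sketch of the running-average check, assuming you have already exported one channel from ieeg.org to a NumPy array; the file name 'channel.npy' is a placeholder for wherever you saved your data.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical input: one EEG channel saved as a 1-D NumPy array.
x = np.load('channel.npy')

# Running time average (1/T) * sum_{t=1}^{T} x_t for T = 1..N.
running_mean = np.cumsum(x) / np.arange(1, len(x) + 1)

# If the curve flattens out to a fixed value as T grows, the section
# is consistent with ergodicity in the mean.
plt.plot(running_mean)
plt.xlabel('T (samples)')
plt.ylabel('running mean')
plt.title('Convergence of the time average')
plt.show()
```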

6    Problem 6

With the IEEG data that you downloaded, you will use some of the methods that you have learned or will learn:

1. You want to estimate the density of your time-series data. Try using a histogram to see what the density looks like. Is a histogram a good way to estimate your density? Explain in detail, and comment on the type of data you are using to show why it is good or bad. (A sketch for both items follows the list.)

2. With multiple time series available, calculate the correlation between at least two different channels of the EEG data.
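A minimal Python sketch covering both items, again assuming channels exported from ieeg.org as NumPy arrays; the file names 'channel1.npy' and 'channel2.npy' are placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical inputs: two EEG channels saved as 1-D NumPy arrays.
ch1 = np.load('channel1.npy')
ch2 = np.load('channel2.npy')

# Item 1: a histogram as a crude density estimate of one channel.
plt.hist(ch1, bins=100, density=True)
plt.xlabel('amplitude')
plt.ylabel('estimated density')
plt.title('Histogram density estimate')
plt.show()

# Item 2: Pearson correlation between two channels, truncated to a common length.
n = min(len(ch1), len(ch2))
r = np.corrcoef(ch1[:n], ch2[:n])[0, 1]
print(f"Pearson correlation between the two channels: {r:.3f}")
```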