
Math 4570 Matrix methods for DA and ML

Homework 6.

Use Python or Matlab for the matrix calculations. Do not use the scikit-learn or statsmodels libraries for the calculations.

Use Mathematica or https://www.wolframalpha.com/ to help with the calculation of integrals.

Question 1. Find a least squares approximation to the function $e^{-x}$ by a linear function $a + bx$ on the interval $[1, 2]$. Use the inner product $\langle f, g \rangle = \int_1^2 f(x) g(x)\,dx$.

Question 2. Find a least squares approximation for the function $\sin x$ as a quadratic function $a + bx + cx^2$ on the interval $[0, \pi]$. Use the inner product $\langle f, g \rangle = \int_0^\pi f(x) g(x)\,dx$.
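The setup shared by Questions 1 and 2 can be sketched numerically: with a basis $\phi_1, \dots, \phi_k$, the coefficients solve the linear system $G c = r$, where $G_{ij} = \langle \phi_i, \phi_j \rangle$ and $r_i = \langle \phi_i, f \rangle$. The sketch below is only an illustration under stated assumptions: the function name `continuous_least_squares` is hypothetical, the integrals are approximated with Gauss-Legendre quadrature rather than computed symbolically, and the target $x^2$ on $[0, 1]$ is deliberately not one of the homework functions.

```python
import numpy as np

def continuous_least_squares(f, basis, lo, hi, n_nodes=40):
    """Coefficients c minimizing the integral of (f - sum_j c_j * basis_j)^2 over [lo, hi]."""
    # Gauss-Legendre nodes and weights on [-1, 1], rescaled to [lo, hi].
    t, w = np.polynomial.legendre.leggauss(n_nodes)
    x = 0.5 * (hi - lo) * t + 0.5 * (hi + lo)
    w = 0.5 * (hi - lo) * w

    # Column j of B holds basis_j evaluated at the quadrature nodes.
    B = np.column_stack([phi(x) for phi in basis])
    gram = B.T @ (w[:, None] * B)   # gram[i, j] approximates <basis_i, basis_j>
    rhs = B.T @ (w * f(x))          # rhs[i] approximates <basis_i, f>
    return np.linalg.solve(gram, rhs)

# Illustrative target (not a homework function): x^2 on [0, 1] with basis {1, x}.
basis = [lambda x: np.ones_like(x), lambda x: x]
coeffs = continuous_least_squares(lambda x: x**2, basis, 0.0, 1.0)
print(coeffs)  # approximately [-1/6, 1], i.e. x^2 is approximated by x - 1/6
```

A numerical result like this is mainly useful as a check on the exact integrals obtained by hand or with Mathematica.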

Question 3. For any two continuous functions $f(x)$ and $g(x)$, let the inner product be

$$\langle f, g \rangle = \int_0^\pi f(x) g(x)\,dx.$$

(1) Find an orthogonal basis for the inner product space P = Span{1, 2x 3 2x}

(2) Find the least squares approximation to the function $f(x) = \sin x$ by a quadratic function $a + bx + cx^2$ on the interval $[0, \pi]$.

(You may need: $\int_0^\pi \sin(x)\,dx = 2$; $\int_0^\pi x \sin(x)\,dx = \pi$; $\int_0^\pi x^2 \sin(x)\,dx = \pi^2 - 4$; $\int_0^\pi x^3 \sin(x)\,dx = \pi(\pi^2 - 6)$.)
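The Gram-Schmidt process under an integral inner product can be sketched with numpy's `Polynomial` class, which integrates polynomials exactly. This is a minimal illustration, not the intended solution: the helper names `inner` and `gram_schmidt` are hypothetical, and the interval $[-1, 1]$ is an assumption chosen because the result there is the familiar (unnormalized) Legendre polynomials.

```python
import numpy as np
from numpy.polynomial import Polynomial

def inner(p, q, lo, hi):
    """<p, q> = integral of p(x) q(x) over [lo, hi], exact for polynomials."""
    antiderivative = (p * q).integ()
    return float(antiderivative(hi) - antiderivative(lo))

def gram_schmidt(polys, lo, hi):
    """Orthogonalize a list of polynomials with respect to inner() (no normalization)."""
    ortho = []
    for p in polys:
        v = p
        for u in ortho:
            # Subtract the projection of p onto each earlier orthogonal vector.
            v = v - (inner(p, u, lo, hi) / inner(u, u, lo, hi)) * u
        ortho.append(v)
    return ortho

# Illustrative interval [-1, 1]; the homework interval would be passed the same way.
monomials = [Polynomial([1.0]), Polynomial([0.0, 1.0]), Polynomial([0.0, 0.0, 1.0])]
basis = gram_schmidt(monomials, -1.0, 1.0)
for b in basis:
    print(b.coef)  # coefficients in increasing degree
```

Once an orthogonal basis $u_1, \dots, u_k$ is available, the least squares approximation of $f$ is $\sum_j \frac{\langle f, u_j \rangle}{\langle u_j, u_j \rangle} u_j$, so no linear system has to be solved.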

Question 4. Let $X \in \mathbb{R}^{n \times d}$ and $\vec{y} \in \mathbb{R}^n$, and let $J(\theta) = \|X\theta - \vec{y}\|^2$. Here the norm $\|\cdot\|$ is the standard $\ell_2$-norm defined by the dot product. You can use any results in the lecture notes.

(1) Calculate the gradient of the function $J(\theta)$.

(2) Calculate the Hessian matrix of $J(\theta)$.

(3) Write down the update formula for approximating $\operatorname{argmin}_\theta J(\theta)$ using gradient descent, using $\alpha$ for the learning rate.

(4) Write down the update formula for approximating $\operatorname{argmin}_\theta J(\theta)$ using Newton's method.

(5) Find $\operatorname{argmin}_\theta J(\theta)$.
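A gradient derived by hand can be sanity-checked against finite differences. The sketch below is one possible check, not part of the assignment: `numerical_gradient` is a hypothetical helper name, and it is demonstrated on $f(\theta) = \theta \cdot \theta$, whose gradient $2\theta$ is well known, rather than on $J(\theta)$ itself.

```python
import numpy as np

def numerical_gradient(f, theta, eps=1e-6):
    """Central-difference approximation of the gradient of f at theta."""
    theta = np.asarray(theta, dtype=float)
    grad = np.zeros_like(theta)
    for k in range(theta.size):
        step = np.zeros_like(theta)
        step[k] = eps
        grad[k] = (f(theta + step) - f(theta - step)) / (2 * eps)
    return grad

# Illustrative function with a known gradient: f(theta) = theta . theta, grad f = 2 * theta.
theta0 = np.array([1.0, -2.0, 0.5])
approx = numerical_gradient(lambda t: t @ t, theta0)
print(np.allclose(approx, 2 * theta0, atol=1e-5))
```

To check an answer to part (1), pass a function that evaluates $J(\theta)$ for some fixed test $X$ and $\vec{y}$ and compare the output with the derived formula.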

Question 5. Consider the data

[Data table with header $x^{(i)}$; the values are missing.]

You may use Python (with only the numpy library) for the matrix calculations.

(1) Use the method of least squares to fit a linear model $f(x) = \theta_0 + \theta_1 x$ to this dataset.

(2) Use the method of least squares to fit a quadratic model $g(x) = \theta_0 + \theta_1 x + \theta_2 x^2$ to this dataset.

(3) Calculate and compare the RSS cost $\mathrm{RSS}(\theta) = \|X\theta - \vec{y}\|^2$ for the above linear fit and quadratic fit.

(4) Plot the graph for the data and the linear fit and quadratic fit.
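The workflow in parts (1) to (3) can be sketched with numpy alone: build a design matrix whose columns are powers of $x$, solve the least squares problem, and evaluate the RSS. The data below are hypothetical (not the homework dataset), the name `fit_polynomial` is illustrative, and `np.linalg.lstsq` is used here as one reasonable way to solve the problem; solving the normal equations directly is an equally valid choice.

```python
import numpy as np

def fit_polynomial(x, y, degree):
    """Least squares fit of y ~ theta_0 + theta_1 x + ... + theta_degree x^degree."""
    # Design matrix with columns 1, x, x^2, ..., x^degree.
    X = np.vander(x, degree + 1, increasing=True)
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((X @ theta - y) ** 2))
    return theta, rss

# Hypothetical data, for illustration only.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])
theta_lin, rss_lin = fit_polynomial(x, y, 1)
theta_quad, rss_quad = fit_polynomial(x, y, 2)
print("linear:   ", theta_lin, rss_lin)
print("quadratic:", theta_quad, rss_quad)
```

Because the quadratic model contains the linear model as a special case, its RSS on the same data can never be larger than the linear RSS.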


Question 6. Let X be the data matrix with mean zero, Y be the label vector, and $\theta = \begin{bmatrix} \theta_1 \\ \vdots \end{bmatrix}$ be the parameter vector.

(1) Find an expression for the location of the critical point of $\mathrm{Ridge}(\theta)$ by solving $\nabla \mathrm{Ridge}(\theta) = 0$.

(2) Consider the following data points                            . The mean of each column is zero.

a) Fit a linear model $y = \theta_1 x_1 + \theta_2 x_2$ to this dataset when the loss is $\mathrm{RSS} = \|X\theta - \vec{y}\|^2$. You should report the best fit function and the RSS cost value. Use Python (with only the numpy library).

b) Fit a linear function to this dataset when the loss is the Ridge loss $J(\theta) = \|X\theta - \vec{y}\|^2 + \lambda(\theta_1^2 + \theta_2^2)$ with $\lambda = 1$ and with $\lambda = 10$. You should report the best fit function and the RSS cost value.
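For reporting the cost values, two small helpers that evaluate the RSS and the Ridge loss for a given $\theta$ may be convenient. This is a minimal sketch: the names `rss` and `ridge_loss` are illustrative, and both the data and the candidate $\theta$ below are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import numpy as np

def rss(X, y, theta):
    """Residual sum of squares ||X theta - y||^2."""
    r = X @ theta - y
    return float(r @ r)

def ridge_loss(X, y, theta, lam):
    """RSS plus the penalty lam * (theta_1^2 + ... + theta_d^2)."""
    return rss(X, y, theta) + lam * float(theta @ theta)

# Hypothetical zero-mean data and a hypothetical candidate theta.
X = np.array([[1.0, -1.0], [-1.0, 0.0], [0.0, 1.0]])
y = np.array([2.0, -1.0, -1.0])
theta = np.array([1.0, 0.0])
print(rss(X, y, theta), ridge_loss(X, y, theta, lam=1.0))
```

Note that the question asks for the RSS value even for the Ridge fits, so the Ridge minimizer should be passed to `rss`, not only to `ridge_loss`.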

Question 7. Logistic Regression. Consider the categorical learning problem consisting of a data set with two labels:

Label 1:

X1: 3.81, 0.23, 3.05, 0.68, 2.67

X: 3.37, 3.53, 1.84, 2.74

 

Label 2:

(1) Use gradient descent to find the logistic regression model

$$p(Y = 1 \mid x) = \frac{1}{1 + e^{-\theta^T x}}$$

and the boundary. (Plot the boundary; use only numpy and Matplotlib.)

Hint for code:

    import numpy as np

    def sigmoid(x):
        return 1 / (1 + np.exp(-x))

    def grad_cost(theta, x, y):
        # Gradient of the mean cross-entropy cost with respect to theta.
        z = x.dot(theta)
        gradcost = (1 / len(x)) * np.matmul(x.T, (sigmoid(z) - y))
        return gradcost

Define the gradient descent function with the number of iterations and the learning rate alpha:

    def GradientDescent(x, y, theta, alpha, iteration):
        for i in range(iteration):
            # Step against the gradient of the cost.
            theta = theta - alpha * grad_cost(theta, x, y)
        # Returning theta (not a loop-local name) also works when iteration is 0.
        return theta

The result for $\theta$ depends on your initial value $\theta_0$, the number of iterations, and the learning rate $\alpha$. With $\theta_0 =$ [value missing], $\alpha = 0.02$, and 1000 iterations, we get our $\theta$:

[-0.04617983, -1.37920924, -1.25274956]

(Your answer may be very different from this. But after dividing by $\theta_2$, the answer should be similar, or the boundary graph should be similar.)
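Since the reported $\theta$ has three components, the feature matrix passed to the hint functions presumably carries a leading column of ones for the intercept. The sketch below runs the same update end to end under that assumption. The data are hypothetical (two well-separated clusters, not the homework data), the function is renamed `gradient_descent` for illustration, and the zero starting vector is an assumption as well.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def grad_cost(theta, x, y):
    # Gradient of the mean cross-entropy cost.
    return (1 / len(x)) * x.T @ (sigmoid(x @ theta) - y)

def gradient_descent(x, y, theta, alpha, iterations):
    for _ in range(iterations):
        theta = theta - alpha * grad_cost(theta, x, y)
    return theta

# Hypothetical two-feature data: four points per class.
features = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0], [2.0, 2.0],
                     [-1.0, -1.0], [-2.0, -1.0], [-1.0, -2.0], [-2.0, -2.0]])
labels = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0])

# Prepend a column of ones so that theta[0] acts as the intercept.
x = np.c_[np.ones(len(features)), features]
theta = gradient_descent(x, labels, np.zeros(3), alpha=0.02, iterations=1000)

# Classify with the usual 0.5 threshold on the predicted probability.
predicted = (sigmoid(x @ theta) >= 0.5).astype(float)
print(theta)
print(predicted)
```

Rescaling $\theta$ by a positive constant leaves the line $\theta^T x = 0$ unchanged, which is why different runs can report different $\theta$ values yet draw the same boundary.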

If you want, you can also record the cross-entropy cost values and plot them. The cross-entropy function can be defined as:

    def CELoss(x, y, theta):
        z = x.dot(theta)
        # Cross-entropy: sum of y*log(p) + (1 - y)*log(1 - p), with p = sigmoid(z).
        CE = np.sum(np.matmul(y.T, np.log(sigmoid(z)))
                    + np.matmul((np.ones(y.shape) - y).T, np.log(1 - sigmoid(z))))
        return -(1 / len(x)) * CE

 

The boundary $\theta_0 + \theta_1 X_1 + \theta_2 X_2 = 0$ can be plotted using plt.plot(X1, (-X1 * θ1 - θ0) / θ2, color="blue"). (Here, you only need to plot two points for X1, i.e., the min and the max.)
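The two-point idea can be sketched as follows. The helper names `boundary_endpoints` and `plot_boundary` are hypothetical, the $\theta$ is the one reported in the text, and the X1 values are made up for illustration; the plotting call is kept in a separate function so the computation itself needs only numpy.

```python
import numpy as np

def boundary_endpoints(theta, x1_values):
    """Two points on the line theta[0] + theta[1]*X1 + theta[2]*X2 = 0 (needs theta[2] != 0)."""
    x1 = np.array([np.min(x1_values), np.max(x1_values)], dtype=float)
    x2 = (-theta[1] * x1 - theta[0]) / theta[2]
    return x1, x2

def plot_boundary(theta, x1_values):
    """Draw the boundary as a line segment between the two endpoints."""
    import matplotlib.pyplot as plt  # imported here so the helper above needs only numpy
    x1, x2 = boundary_endpoints(theta, x1_values)
    plt.plot(x1, x2, color="blue")
    plt.show()

# Theta as reported in the text; the X1 values are hypothetical.
theta = np.array([-0.04617983, -1.37920924, -1.25274956])
x1, x2 = boundary_endpoints(theta, [-2.0, 0.5, 3.0])
print(x1, x2)
```

In practice the data points would be drawn with `plt.scatter` before calling `plt.show()`, so the line and the two classes appear in the same figure.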

 

(2) Try the quadratic logistic regression method for this question and obtain a quadratic boundary. (bonus) (Hint: this means using the new features $X_1$, $X_2$, $X_1^2$, $X_1 X_2$, $X_2^2$.)

Remark: You can build the polynomial features with basic code: numpy.c_[x, x1*x1, x1*x2, x2*x2] adds the columns. If scikit-learn is allowed in labs, we can use sklearn.preprocessing (see CVBootstrap.ipynb in the lecture notes):

    from sklearn.preprocessing import PolynomialFeatures
    # Quadratic
    poly = PolynomialFeatures(degree=2)
    x_poly = poly.fit_transform(x)
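The numpy-only route from the remark can be sketched like this. The function name `quadratic_features` and the sample input are hypothetical; the column order follows the hint $X_1, X_2, X_1^2, X_1 X_2, X_2^2$.

```python
import numpy as np

def quadratic_features(x):
    """Map columns (X1, X2) to (X1, X2, X1^2, X1*X2, X2^2)."""
    x1, x2 = x[:, 0], x[:, 1]
    # np.c_ stacks 1-D arrays as new columns next to the 2-D array x.
    return np.c_[x, x1 * x1, x1 * x2, x2 * x2]

# Hypothetical input with two rows.
x = np.array([[1.0, 2.0], [3.0, -1.0]])
print(quadratic_features(x))
```

This version does not add a column of ones; if the model includes an intercept, that column has to be prepended separately, whereas `PolynomialFeatures(degree=2)` includes a bias column by default.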

Graphing: You may use the following code to draw the graph:

    import numpy as np
    import matplotlib.pyplot as plt

    X, Y = np.meshgrid(np.arange(-4, 4, 0.05), np.arange(-4, 4, 0.05))
    plt.contour(X, Y,
                -0.01614066 - 1.33955452*X - 1.23265001*Y
                + 0.02176921*X*X + 0.20651087*X*Y - 0.11120619*Y*Y, [0])
    plt.show()

The same drawing in different ranges.

 

Question 8. Consider the classification problem consisting of a data set with two labels:

Label 0:

 

0.2, 0.6, 2, 2.6, 3.1, 3.8

X: 3.4, 1.8, 2, 2.7, 3.5, 1.5

Label 1:

Use logistic regression to classify the data, with $p(Y = 1 \mid \vec{x}) = h(\vec{x})$.

(1) Find the logistic function $h(\vec{x}) = \dfrac{1}{1 + e^{-\theta^T \vec{x}}}$. (You can use either scikit-learn, gradient descent, or Newton's method.)

(2) Find the formula for the line forming the decision boundary.

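(3) Find the probabilities $P(y = 0 \mid \vec{x})$ and $P(y = 1 \mid \vec{x})$ for the test point $\vec{x} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$ using the logistic model from the question above.

(4) What is the predicted label for the point $\vec{x} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$?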

(5) Plot the graph for the data and the boundary.
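Of the three options mentioned in part (1), Newton's method is the one without a code hint, so here is a minimal sketch of it for the mean cross-entropy cost. Everything specific in it is an assumption: the name `newton_logistic`, the zero starting point, the fixed iteration count, and the hypothetical one-feature data, which were chosen to overlap because Newton's method has no finite minimizer to converge to when the classes are perfectly separable.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def newton_logistic(x, y, iterations=25):
    """Newton's method for the mean cross-entropy cost; x must include a column of ones."""
    theta = np.zeros(x.shape[1])
    for _ in range(iterations):
        p = sigmoid(x @ theta)
        grad = x.T @ (p - y) / len(x)
        # Hessian: X^T diag(p * (1 - p)) X / n.
        hessian = x.T @ (x * (p * (1 - p))[:, None]) / len(x)
        theta = theta - np.linalg.solve(hessian, grad)
    return theta

# Hypothetical one-feature data with overlapping classes.
feature = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
labels = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0])
x = np.c_[np.ones(len(feature)), feature]
theta = newton_logistic(x, labels)

# Probabilities at a test point: P(y = 1 | x) = sigmoid(theta^T x), P(y = 0 | x) = 1 - P(y = 1 | x).
p1 = sigmoid(np.array([1.0, 2.5]) @ theta)
print(theta, p1, 1 - p1)
```

The predicted label is 1 when $P(y = 1 \mid \vec{x}) \ge 0.5$, which is the same as $\theta^T \vec{x} \ge 0$; the decision boundary is the set where $\theta^T \vec{x} = 0$.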