Comp 642: Assignment #1

2022

Submission Instructions: For coding questions, please submit a Python notebook along with all the plots and 2-3 paragraphs explaining what you observe and what your conclusions are. Please do familiarize

them in your browser without any installations.

1 Logistic Regression by Hand 20 Points

You are given a data set of three samples with 2-dimensional feature vectors. D = {5, 10,TargetValue =

We will train a logistic regression model with parameter vector x, using cross-entropy loss.

Questions: Provide full pseudocode wherever an algorithm is expected.

• How many parameters are there in this logistic regression model?

• Write down the loss as a function of the parameters.

• Calculate the partial derivative of the loss function with respect to each parameter.

• Write the iterative gradient descent update algorithm, assuming a step size η.

• Assume that we calculate stochastic gradients, i.e., pick a random point and compute the gradient with respect to that point only. Write down the algorithm for the ADAM update rule.
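As an illustration of the last two bullets, here is a minimal sketch of ADAM driven by single-sample (stochastic) gradients of the cross-entropy loss. It is not the assignment's reference solution: the toy data, the bias column, the function names, and the hyperparameter defaults (`eta`, `beta1`, `beta2`, `eps`) are all assumptions made for the example, and the parameter vector is called `w` here purely for readability.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(w, X, y):
    # Mean binary cross-entropy over all samples; eps guards against log(0).
    p = sigmoid(X @ w)
    eps = 1e-12
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def stochastic_grad(w, x_i, y_i):
    # Gradient of the single-sample cross-entropy: (sigmoid(w . x_i) - y_i) * x_i
    return (sigmoid(x_i @ w) - y_i) * x_i

def adam(X, y, eta=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=2000, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    m = np.zeros_like(w)  # running average of gradients (first moment)
    v = np.zeros_like(w)  # running average of squared gradients (second moment)
    for t in range(1, steps + 1):
        i = rng.integers(len(y))             # pick one random sample
        g = stochastic_grad(w, X[i], y[i])
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g**2
        m_hat = m / (1 - beta1**t)           # bias correction
        v_hat = v / (1 - beta2**t)
        w = w - eta * m_hat / (np.sqrt(v_hat) + eps)
    return w

# Hypothetical toy data (NOT the assignment's data set):
# a constant bias column followed by two features.
X = np.array([[1.0, 0.5, 1.0],
              [1.0, 2.0, 0.5],
              [1.0, 1.5, 2.5],
              [1.0, 3.0, 3.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

w0 = np.zeros(3)
w_adam = adam(X, y)
print("initial loss:", cross_entropy(w0, X, y))
print("loss after ADAM:", cross_entropy(w_adam, X, y))
```

Replacing the body of the loop with `w = w - eta * g` gives plain stochastic gradient descent, which makes it easy to compare the two update rules on the same samples.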

2 (Coding) Python Notebook and Variants of Gradient Descent 40 Points

We fit a regression model with mean squared error (MSE) loss. The notebook demonstrates and compares

loss functions). We have given a template for plotting different things and comparing the convergence

Questions: We expect full working code and plots. Please submit the notebook and all the plots. Write a short conclusion about what you observe.

• (SGD) Write a new method that gives stochastic gradient descent (SGD) (randomly pick one data sample and return the gradient only on that sample). Compare the convergence and accuracy of SGD with others.

• (Averaged SGD) Implement averaged SGD (see http://dustintran.com/blog/on-asymptotic-convergence-of-averaged-sgd). That is, take the running average of the parameters during each iteration.

• Assuming SGD, implement ADAM and any two of your favorite gradient descent ideas from https://ruder.io/optimizing-gradient-descent/. Compare and contrast convergence, accuracy, etc.
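The SGD and averaged SGD bullets above can be sketched as follows for a linear model with MSE loss. This is only an illustrative outline under stated assumptions: the course notebook's template is not reproduced here, so the function names (`sgd_gradient`, `run_sgd`, `mse`), the synthetic data, and the step size are all hypothetical.

```python
import numpy as np

def sgd_gradient(w, X, y, rng):
    # Gradient of the squared error on ONE randomly chosen sample.
    i = rng.integers(len(y))
    residual = X[i] @ w - y[i]
    return 2.0 * residual * X[i]

def run_sgd(X, y, eta=0.01, steps=5000, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    w_avg = np.zeros_like(w)
    for t in range(1, steps + 1):
        w = w - eta * sgd_gradient(w, X, y, rng)
        # Averaged SGD: running mean of the iterates w_1, ..., w_t.
        w_avg += (w - w_avg) / t
    return w, w_avg

def mse(w, X, y):
    return np.mean((X @ w - y) ** 2)

# Hypothetical synthetic regression problem (not the notebook's data).
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=200)

w_last, w_avg = run_sgd(X, y)
print("MSE at zero init:   ", mse(np.zeros(3), X, y))
print("MSE, last iterate:  ", mse(w_last, X, y))
print("MSE, averaged SGD:  ", mse(w_avg, X, y))
```

Recording `mse(w, X, y)` and `mse(w_avg, X, y)` inside the loop, rather than only at the end, gives the convergence curves that the questions ask you to compare.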

3 (Coding) Double Descent 20 Points

We demonstrate the double descent phenomenon with ridge regression. We used an example of a feature

Questions: We expect full working code and plots. Please submit the notebook and all the plots. Write a short conclusion about what you observe.

• Instead of ridge regression, use any of your favorite classifiers for MNIST and see if you still get the double descent phenomenon.

• Using https://github.com/gwgundersen/random-fourier-features/blob/master/rffridge.py as an example, write your own feature generator (Random Fourier Features) to replace the generate_synthetic_data function. See if you can again demonstrate the double descent phenomenon.
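A random Fourier feature generator of the kind the last bullet asks for might look like the sketch below. It assumes the target kernel is the RBF kernel k(x, x') = exp(-gamma * ||x - x'||^2); the function names, the `gamma` parameter, and the test points are hypothetical, and the signature of the notebook's generate_synthetic_data function is not known here, so this is not a drop-in replacement.

```python
import numpy as np

def random_fourier_features(X, n_features, gamma=1.0, seed=0):
    # Map X of shape (n, d) to Z of shape (n, n_features) such that
    # Z @ Z.T approximates the RBF kernel exp(-gamma * ||x - x'||^2).
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Frequencies are drawn from N(0, 2 * gamma * I), phases from U[0, 2*pi).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def rbf_kernel(X, gamma=1.0):
    # Exact RBF kernel matrix, used only to sanity-check the approximation.
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

# Hypothetical check on a handful of random points.
rng = np.random.default_rng(1)
X = rng.normal(size=(5, 3))
Z = random_fourier_features(X, n_features=20000, gamma=0.5)
K_approx = Z @ Z.T
K_exact = rbf_kernel(X, gamma=0.5)
max_err = np.max(np.abs(K_approx - K_exact))
print("largest kernel approximation error:", max_err)
```

For the double descent experiment, `n_features` plays the role of model size: one option is to sweep it from well below to well above the number of training samples, fit the model on `Z` at each size, and plot the test error against `n_features`.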