ST420 Assignment 1: Exploring neural networks
25/01/2023
Aims
This assignment, worth 10% of the final module mark, aims to give you hands-on experience with some of the ideas discussed in the early part of the module. Your task is to investigate the use of neural networks for univariate regression.
Submission details
This assignment must be submitted, via the submission portal on moodle, by 12 noon on 10th February 2023. You must submit two files:
1. A two-page .pdf file containing your work, the requirements of which are set out in the tasks below. This pdf may be generated by any means you wish. I suggest the use of R Markdown, but submissions compiled using LaTeX or any other means are also acceptable. The page limit is strict: nothing will be marked after the first two pages. Zero marks will be awarded for files that are not in pdf format. (Note that it is easy to create a pdf from any other format using the print-to-pdf option.)
2. A single file containing all of the code you used to generate the results.
The usual rules about plagiarism and collusion apply. In particular, your code should be your own; it will be checked for signs of collusion.
Background
The goal of this assignment is to investigate the use of neural networks for regression. You will not be required to write code for training neural networks from scratch, but can make use of existing packages. You are free to use any programming language you wish, but I recommend R because of the convenient R package neuralnet, which I will use for illustration throughout. However, if you are familiar with Python, you are welcome to use PyTorch instead.
In RStudio, the neuralnet package can be installed and loaded as follows:
install.packages("neuralnet")
library(neuralnet)
Generating training data
We will be generating data from a sinusoidal function, plus noise:
Y = f∗(X) + ϵ,    ϵ ∼ N(0, 0.1²),
with
f∗(x) = sin(15x).
We will take the x-values to be the sequence 0, 0.05, 0.10, …, 1.
x = seq(from=0, to=1, by=0.05)
y = sin(15*x) + rnorm(length(x), sd=0.1)
xy_data = data.frame(x, y)
1. (1 mark) Plot your generated training data and the true function f ∗ on a single figure.
2. (1 mark) Throughout this question we will use squared loss. Using a regular grid of x values, and by generating new y-values, estimate the risk of the true function f ∗ . (This is an estimate of the optimal risk.)
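As a hint on the structure of such a calculation (not the required answer), one possible sketch is a Monte Carlo estimate: generate fresh y-values on a grid and compare them to f∗. Since Y = f∗(X) + ϵ, the risk of f∗ under squared loss is Var(ϵ) = 0.1² = 0.01, so your estimate should be close to this value. The grid size below is an arbitrary choice.

```r
# Sketch: Monte Carlo estimate of the risk of the true function f*.
# The true risk is Var(eps) = 0.1^2 = 0.01; the estimate should be near this.
x_grid = seq(from=0, to=1, by=0.001)                    # regular grid of x values
y_new  = sin(15*x_grid) + rnorm(length(x_grid), sd=0.1) # fresh y-values
risk_fstar = mean((y_new - sin(15*x_grid))^2)           # empirical squared loss
```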
Fitting neural networks
You can easily fit a neural network in R using the function neuralnet ; see ?neuralnet for the details:
xy_nn = neuralnet(y~x, xy_data, hidden=c(2), act.fct = "logistic",
lifesign = "full", lifesign.step = 5000, stepmax=10^6, rep=1)
You can use the default choices of optimization algorithm, learning parameters and initialisation. Note that since both the initialisation and the training procedure are stochastic, it may be the case that for certain runs the training algorithm does not converge within the maximum number of steps. Simply re-run the code in this case (it should converge most of the time).
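If you prefer to automate the re-running, one possible sketch is a retry loop. This assumes that a non-converged neuralnet fit can be detected by its weights being NULL, which is common behaviour when the algorithm stops at stepmax; check your own runs before relying on it.

```r
# Sketch: retry training until convergence (assumes a failed run leaves
# xy_nn$weights as NULL; verify this behaviour on your own installation).
repeat {
  xy_nn = neuralnet(y~x, xy_data, hidden=c(2), act.fct = "logistic",
                    stepmax=10^6, rep=1)
  if (!is.null(xy_nn$weights)) break   # converged: keep this fit
}
```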
To visualise the network and see the values of the weights and biases, you can use plot :
plot(xy_nn)
You can see the output, i.e. the predictions, of the neural network using the predict function:
x_pred = seq(0,1,0.01)
y_pred = predict(xy_nn, newdata = as.matrix(x_pred))
3. (3 marks) Adapt the code to train a neural network with 1 hidden layer containing 3 neurons, using the tanh activation function. (I.e. there are a total of three layers: the input layer, 1 hidden layer, and then the output layer.) Plot the predicted values of this neural network on the set of x_pred points, along with the true function f∗ on the same plot. (Do not be alarmed if the fit is poor.)
4. (1 mark) Generate an independent test set of 10 points from the model (you can draw the x values uniformly in [0, 1]), and estimate the risk of the neural net estimator on this test set.
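As a hint on the shape of this step (you should still write your own version), a minimal sketch of generating a test set and estimating the risk is:

```r
# Sketch: independent test set of 10 points and an empirical risk estimate.
x_test = runif(10)                                 # x drawn uniformly in [0,1]
y_test = sin(15*x_test) + rnorm(10, sd=0.1)        # y from the same model
test_pred = predict(xy_nn, newdata = as.matrix(x_test))
test_risk = mean((y_test - test_pred)^2)           # squared-loss estimate
```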
5. (1 mark) Repeat questions 3-4 for a neural network with 1 hidden layer and 4 neurons.
6. (3 marks) Plot now estimates of the risk for networks trained with 1 hidden layer, with widths ranging from 5 to 30. (As before, due to the stochastic nature of neural network training, you may need to re-run the code for certain widths if it does not converge.) When the number of parameters exceeds the number of training points, this is known as the overparameterized regime. Comment on the behaviour of the neural network in the overparameterized regime; is there evidence of overfitting?
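One way to organise this experiment (a sketch only; the names x_test and y_test are assumed to hold your test set from question 4) is to loop over widths and record the estimated risk for each:

```r
# Sketch: estimated test risk as a function of hidden-layer width.
# Assumes x_test and y_test were generated as in question 4.
widths = 5:30
risks = sapply(widths, function(w) {
  nn = neuralnet(y~x, xy_data, hidden=c(w), act.fct = "tanh", stepmax=10^6)
  preds = predict(nn, newdata = as.matrix(x_test))
  mean((y_test - preds)^2)                 # empirical risk at this width
})
plot(widths, risks, type="b", xlab="hidden-layer width", ylab="estimated risk")
```

Remember that individual runs may fail to converge, so you may need to re-run for particular widths.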