
MA2647: MATHEMATICS OF DEEP LEARNING, MAJOR PROJECT (SECOND SIT) 2021–22

Deep Learning for the MNIST Data Set

You will: • Implement and train a deep learning network for handwritten digit recognition • Assess the hyperparameters, and compare activation functions and error measures • Produce a written report detailing all of the points above • Produce a recorded audio-video presentation showcasing your achievement.

This project has three components: a single MATLAB source file (which must be suitable for direct loading into MATLAB); a written report (in PDF); and an audio-visual presentation in an mp4 audio-video file. The brief for each component is given below: each is CORE and each must be submitted. The project is worth 70 marks, as detailed below, of which there are two marks for professionalism for each of the three components. Two marks will require polished, professional and substantially error-free items; one mark will be given for work that is cosmetically flawed but nevertheless fit for purpose; zero marks will be given for very untidy and/or incoherent items.

The written report should consist of a cover page plus no more than five additional A4 sides, be written in a font size of no less than 10 point, and have margins of no less than 2cm all around. The report may be handwritten or typed but it should be well-structured and legible. All figures referred to below must be included in this report. The audio-visual presentation should be no longer than 5 minutes (be accurate and concise).

All three components of your project should be submitted on Wise Flow by or before 3:00pm UK time (strict) on Monday 8 August 2022. NOTE: submission after 3:00pm will be treated as a late submission.

 

Coding: The MATLAB code should be a function contained in just one file with the name ANN#######.m, where the # symbols should be replaced by the seven digits of your student ID. The function itself should be organised as shown over the page in Figure 1.

From the Assessment seven MATLAB grader page on Blackboard, obtain your personalized data, consisting of the integers Nep, D, u, v and w, and a real positive number α. This step carries no marks but it will be recorded: zero participation = zero marks.

Also, obtain the MNIST training and test data files, MNIST_train_1000.csv (1000 data points) and MNIST_test_100.csv (100 data points), from the MA2647 Assessment seven Blackboard page. Note that these MNIST data files are different from those used in the lectures. These data files must be in the same folder as the MATLAB code that you develop in the tasks below, and they must not be altered in any way.

Next, execute the following steps. You are strongly urged to use the code ann09demo.m as your starting point, but adjust it to conform with Figure 1.

1.  Implement a five (one input, three hidden, one output) layer artificial neural network (ANN) that will read from the MNIST data sets referred to above.  The first, second and third hidden layers (counting from the input layer) respectively should contain u, v and w neurons (or nodes).  You should use calculus-based backpropagation for training with Nep epochs and a learning rate of α.  If af = 0 in Figure 1, the hidden layer activation functions should be sigmoidal: 1/(1 + e^(-x)). We'll refer to this as the sigmoid code.
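As a purely illustrative sketch (in Python, not part of the required MATLAB submission), the sigmoid activation and the derivative that the calculus-based backpropagation step relies on look like this; the function names are our own choices:

```python
import math

def sigmoid(x):
    # Logistic activation: 1 / (1 + e^(-x))
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_prime(x):
    # Derivative s(x) * (1 - s(x)), the factor used in the
    # backpropagation chain rule for a sigmoidal hidden layer
    s = sigmoid(x)
    return s * (1.0 - s)
```

Note that the derivative can be computed from the activation value itself, which is why implementations typically cache the forward-pass outputs.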

2.  Configure your sigmoid code so that it uses only a randomly selected fraction of the 1000 data points for training. This fraction is specified as the variable tsf in the Figure 1 code template. For example, setting tsf=0.15 should result in 150 randomly selected points being used from the 1000.
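The subset-selection step can be sketched as follows (illustrative Python, not the required MATLAB; the function name and seed parameter are our own):

```python
import random

def select_training_subset(data, tsf, seed=None):
    # Randomly choose round(tsf * len(data)) items without replacement,
    # mirroring the tsf behaviour described above: tsf=0.15 on 1000
    # points yields 150 distinct randomly selected points.
    rng = random.Random(seed)
    n = round(tsf * len(data))
    return rng.sample(data, n)
```

Sampling without replacement matters here: the same data point should not appear twice in one training subset.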

3.  Use thick solid lines to plot, with a logarithmic x axis, the percentage success at classifying all ten digits from the test set against training set sizes of 50, 250, 750, 1000 for the total squared error (TSE) performance index in red, and for cross-entropy (XE) in blue. These success figures should be averages taken over at least four runs (using different randomly selected training data each time). Using thinner broken lines of matching colour, add plots of averaged success plus and minus two standard deviations.
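For reference, the two performance indices named above are commonly taken in the following standard forms, where \(t_k^{(q)}\) and \(o_k^{(q)}\) denote the \(k\)-th target and network output for training item \(q\) (your implementation may scale or normalise these differently):

```latex
\mathrm{TSE} = \sum_{q}\sum_{k}\bigl(t_k^{(q)} - o_k^{(q)}\bigr)^{2},
\qquad
\mathrm{XE} = -\sum_{q}\sum_{k} t_k^{(q)} \ln o_k^{(q)}
```

With one-hot targets, XE penalises confident wrong answers much more heavily than TSE does, which is one reason the two indices can produce visibly different success curves.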

4.  Enhance your code so that if af ≠ 0 in Figure 1, the hidden layer activation functions are ReLU: max{0, x}. We'll refer to this as the ReLU code. Repeat all the steps above on a different set of axes. Make sure that both plots are of high quality, are properly labelled and captioned, and include them in your report.

5.  Use the sigmoid code with tsf=0.3 to train your three-hidden-layer network with TSE to determine whether an item in the test set is the digit D or not. Give an example of the 2 × 2 confusion matrix along with the values of α and Nep that you used (u, v and w should not alter). Give the sensitivity and specificity of the classifier. Repeat this task with XE.
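A minimal sketch of the 2 × 2 confusion matrix and the derived sensitivity and specificity (illustrative Python only; the layout [[TP, FN], [FP, TN]] is one common convention — state whichever convention you use in your report):

```python
def binary_confusion(y_true, y_pred):
    # 2x2 confusion matrix [[TP, FN], [FP, TN]] for a "digit D vs
    # not-D" classifier; labels are booleans (True = "is the digit D").
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    return [[tp, fn], [fp, tn]]

def sensitivity_specificity(cm):
    # Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)
    (tp, fn), (fp, tn) = cm
    return tp / (tp + fn), tn / (tn + fp)
```

Because only about 10% of MNIST items are any one digit, the "not-D" class dominates, so high overall accuracy can hide poor sensitivity — which is exactly why the brief asks for both measures.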

6.  Repeat all of the last task but with the ReLU code.

Report: Prepare and submit a report that has a cover page giving at least your student ID, the module code, and the project title, and that contains the following information.

1.  A brief but complete mathematical description of your ANN, the training formulae and algorithm. (You do not need to derive the formulae.)

2.  Include and refer to the plots you created above to discuss how the sigmoid and ReLU, TSE and XE, codes compare when used to classify all ten digits. Extend this discussion to how the sigmoid and ReLU, TSE and XE, codes compare when used to classify a digit as D or not-D. Refer to the 2 × 2 confusion matrices and explain in detail how you computed them. Explain also the choices you made for α and Nep. For the 2 × 2 case, do you recommend TSE or XE, sigmoid or ReLU? Justify your recommendation.

Presentation:  Prepare and submit an audio-visual presentation in mp4 format.  You are strongly recommended to use Zoom to create this, and to follow the guidelines in the tu- torial video posted on Blackboard. The presentation should contain the following.

1.  An introduction in which you appear live and which contains a clear picture of your student ID card. This is to verify authenticity of authorship and is not optional.

2.  A screen share where you show your report and code (both as submitted) and go through the report and cross-reference its contents to the MATLAB file. Explain where and how the algorithmic and mathematical details are implemented in the code. You must cover forward propagation, back propagation and gradient descent.

3.  Give a live demonstration of your sigmoid code working when tsf=0.15. Show the 10 × 10 confusion matrix for the test data and explain the meaning of at least two non-zero entries: one on and one off the main diagonal. You may alter α and Nep if necessary in order to get low enough accuracy that your confusion matrix contains enough 'interesting' non-zero results.

4.  Use this live demo to demonstrate how you arrived at the 2 × 2 confusion matrices.
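The 10 × 10 confusion matrix referred to above can be sketched as follows (illustrative Python, not the required MATLAB; the row-is-true, column-is-predicted orientation is an assumption — state the orientation you use):

```python
def confusion_matrix_10(y_true, y_pred):
    # cm[i][j] counts test items whose true digit is i and whose
    # predicted digit is j; diagonal entries are correct
    # classifications, off-diagonal entries are misclassifications.
    cm = [[0] * 10 for _ in range(10)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm
```

An off-diagonal entry such as cm[3][8] = 2 would mean two test items that are truly a 3 were predicted as 8 — exactly the kind of 'interesting' entry the demonstration should explain.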

function [Yn On Yt Ot wt] = ANN#######(lr, N_ep, tsf, af, pichoice)

% lr is the learning rate, N_ep the number of epochs, tsf training size fraction

% af = 0 for sigmoid code, af not zero for ReLU code

% pichoice = 0 (or 1) for TSE (or XE)

% Y/O are the exact/predicted labels/targets (n=train, t=test); wt is test success

% MNIST CSV files not to be altered, and in the same folder as the MATLAB code.

... YOUR CODE GOES HERE ...

end

Figure 1: The skeleton form of the MATLAB function that you must use: replace the hashes with your student ID. No deviation from this is permitted. Your code will be auto-run and verified without human intervention. This step may fail if you deviate and your overall score could then be zero.