
Assignment #2: COMP5434 Big Data Computing

Due Date: 23:59, Wednesday, 22 March 2023

Question 1 [20 marks]

Given a three-dimensional neural network structure as follows.

Here ai = Σj wj(i) Zj and Zi = fi(ai) for i = 1, 2, 3, 4, 5, with Z0 = a0 (the input neuron),

f2(X) = f3(X) = relu(X), and f1(X) = f4(X) = f5(X) = sigmoid(X). relu(X) corresponds to the rectified linear unit transfer function defined as relu(X) = max{0, X}. The cost function is defined as J(w) = (Z5 − y)^2.

We consider a single data sample X = 1.0 with the corresponding label y = 0.1. Use the training data to train the neural network model, solving it with the gradient descent algorithm under the initial setting:

w[0] = [w0(1), w0(2), w0(3), w0(4), w1(2), w1(3), w2(4), w3(4), w1(5), w2(5), w3(5), w4(5)] = [0.3, 0, 0.5, 0.4, 1.0, 0.8, 0, 0.2, 0.4, 0, 0.7, 0.3], and η = 0.01.

(a) Write a function  F  to simulate the neural network. [5 marks]

(b) Find the values of w3(5)[1], w4(5)[1], w3(4)[1]  after the first iteration. Please show your steps clearly. [15 marks]

Answer:

(a)

a1 = w0(1) a0

a2 = w0(2) a0 + w1(2) Z1 = w0(2) a0 + w1(2) · sigmoid(w0(1) a0)

a3 = w0(3) a0 + w1(3) Z1 = w0(3) a0 + w1(3) · sigmoid(w0(1) a0)

a4 = w0(4) a0 + w2(4) Z2 + w3(4) Z3
   = w0(4) a0 + w2(4) · relu(w0(2) a0 + w1(2) · sigmoid(w0(1) a0)) + w3(4) · relu(w0(3) a0 + w1(3) · sigmoid(w0(1) a0))

a5 = w1(5) Z1 + w2(5) Z2 + w3(5) Z3 + w4(5) Z4
   = w1(5) · sigmoid(w0(1) a0) + w2(5) · relu(w0(2) a0 + w1(2) · sigmoid(w0(1) a0)) + w3(5) · relu(w0(3) a0 + w1(3) · sigmoid(w0(1) a0)) + w4(5) · sigmoid(w0(4) a0 + w2(4) · relu(w0(2) a0 + w1(2) · sigmoid(w0(1) a0)) + w3(4) · relu(w0(3) a0 + w1(3) · sigmoid(w0(1) a0)))

F(a0) = Z5 = sigmoid(a5)
       = sigmoid(w1(5) · sigmoid(w0(1) a0) + w2(5) · relu(w0(2) a0 + w1(2) · sigmoid(w0(1) a0)) + w3(5) · relu(w0(3) a0 + w1(3) · sigmoid(w0(1) a0)) + w4(5) · sigmoid(w0(4) a0 + w2(4) · relu(w0(2) a0 + w1(2) · sigmoid(w0(1) a0)) + w3(4) · relu(w0(3) a0 + w1(3) · sigmoid(w0(1) a0))))
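A minimal Python sketch of such a function F, using w01, w12, ..., w45 as shorthand for the weights w0(1), w1(2), ..., w4(5):

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    return max(0.0, x)

def F(a0, w):
    # Weight order follows w[0] in the question:
    # [w0(1), w0(2), w0(3), w0(4), w1(2), w1(3), w2(4), w3(4), w1(5), w2(5), w3(5), w4(5)]
    w01, w02, w03, w04, w12, w13, w24, w34, w15, w25, w35, w45 = w
    z1 = sigmoid(w01 * a0)                                     # f1 = sigmoid
    z2 = relu(w02 * a0 + w12 * z1)                             # f2 = relu
    z3 = relu(w03 * a0 + w13 * z1)                             # f3 = relu
    z4 = sigmoid(w04 * a0 + w24 * z2 + w34 * z3)               # f4 = sigmoid
    z5 = sigmoid(w15 * z1 + w25 * z2 + w35 * z3 + w45 * z4)    # f5 = sigmoid (output)
    return z5

With the initial weights w[0] and a0 = 1.0, F returns approximately 0.7492, matching z5 in part (b).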

(b)

a1 = 0.3, a2 = 0.574442, a3 = 0.9595528, a4 = 0.59191056, a5 = 1.09460475

z1 = 0.574442, z2 = 0.574442, z3 = 0.9595528, z4 = 0.6438033, z5 = 0.7492478

∂J/∂w3(5) = ∂J/∂z5 · ∂z5/∂a5 · ∂a5/∂w3(5) = 2(z5 − y) · z5(1 − z5) · z3

∂J/∂w4(5) = ∂J/∂z5 · ∂z5/∂a5 · ∂a5/∂w4(5) = 2(z5 − y) · z5(1 − z5) · z4

∂J/∂w3(4) = ∂J/∂z5 · ∂z5/∂a5 · ∂a5/∂z4 · ∂z4/∂a4 · ∂a4/∂w3(4) = 2(z5 − y) · z5(1 − z5) · w4(5) · z4(1 − z4) · z3

w3(5)[1] = w3(5)[0] − η · ∂J/∂w3(5) = 0.698

w4(5)[1] = w4(5)[0] − η · ∂J/∂w4(5) = 0.299

w3(4)[1] = w3(4)[0] − η · ∂J/∂w3(4) = 0.199

Note: small deviations are acceptable.
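As a numerical cross-check of these updates, a small PyTorch sketch (assuming the label y = 0.1 and η = 0.01 from the question) that performs one gradient-descent step using autograd:

import torch

# Initial weights w[0], in the order given in the question.
w = torch.tensor([0.3, 0, 0.5, 0.4, 1.0, 0.8, 0, 0.2, 0.4, 0, 0.7, 0.3],
                 requires_grad=True)
a0, y, eta = 1.0, 0.1, 0.01

w01, w02, w03, w04, w12, w13, w24, w34, w15, w25, w35, w45 = w
z1 = torch.sigmoid(w01 * a0)
z2 = torch.relu(w02 * a0 + w12 * z1)
z3 = torch.relu(w03 * a0 + w13 * z1)
z4 = torch.sigmoid(w04 * a0 + w24 * z2 + w34 * z3)
z5 = torch.sigmoid(w15 * z1 + w25 * z2 + w35 * z3 + w45 * z4)

J = (z5 - y) ** 2          # cost function J(w)
J.backward()               # back-propagate dJ/dw
w_new = w - eta * w.grad   # one gradient-descent update, giving w[1]
print(w_new[10], w_new[11], w_new[7])   # w3(5)[1], w4(5)[1], w3(4)[1]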

Question 2 [20 marks]

The architecture of a simple convolutional neural network (CNN) is illustrated below. There are two layers in the architecture. The first is a convolutional layer with a filter of size 3 × 3 applied with a stride of 2. The second is a pooling layer with a max pooling filter of size 2 × 2 applied with a stride of 1. Neither the convolutional layer nor the pooling layer uses a bias.

a)  Given an input image of size 7 × 7 and the corresponding output after the convolutional layer as shown above, please give the convolutional filter used in the above calculation. (Hint: the value of each element in the convolutional filter is taken from the set {-1, 0, 1}.) [5 marks]

b)  Please give the output after the max pooling layer. If the pooling layer is replaced by an average pooling filter with the same size and stride, please give the new output after the pooling layer. [5 marks]

c)  A convolutional layer with a filter of size F × F applied with a stride of S accepts an input image of size H × H. Please compute the size of the output image. [5 marks]

d)  Given an input image of size 1000 × 1000 where the value of each pixel is listed in input_img.xlsx, use PyTorch to implement a CNN with the following two layers: the first layer is a convolutional layer with a filter of size 5 × 5 applied with a stride of 2; the second layer is a pooling layer with a max pooling filter of size 2 × 2 applied with a stride of 1. Then please give the value of the pixel at [50, 100] of the output image. [5 marks]

Solution:

a)

-1   0   0
-1   1   1
 0   1  -1

b)

Max pooling output:

4   4
3   1

Average pooling output:

2     1.75
0.5  -0.75
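For illustration, a short PyTorch sketch of 2 × 2 pooling with a stride of 1; the 3 × 3 input below is made-up example data, not the convolution output from the figure:

import torch

# Hypothetical 3 x 3 convolution output, with batch and channel dimensions added.
conv_out = torch.tensor([[[[1., 2., 0.],
                           [3., -4., -1.],
                           [0., -2., 5.]]]])

max_out = torch.nn.functional.max_pool2d(conv_out, kernel_size=2, stride=1)
avg_out = torch.nn.functional.avg_pool2d(conv_out, kernel_size=2, stride=1)
print(max_out)   # 2 x 2 max-pooled result: [[3, 2], [3, 5]]
print(avg_out)   # 2 x 2 average-pooled result: [[0.5, -0.75], [-0.75, -0.5]]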

c)  ((H − F)/S + 1)  ×  ((H − F)/S + 1)
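As a quick sanity check, a small Python sketch applying this formula to the two layers used in parts (a) and (b):

def conv_output_size(H, F, S):
    # Side length of the output of a (no-padding) F x F filter applied with stride S.
    return (H - F) // S + 1

conv_size = conv_output_size(7, 3, 2)          # 3: the 3 x 3 convolution output
pool_size = conv_output_size(conv_size, 2, 1)  # 2: the 2 x 2 pooled output
print(conv_size, pool_size)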

d) 9 (or 4)   Code:

import torch
import pandas as pd

# Read the 1000 x 1000 input image and add batch and channel dimensions.
df = pd.read_excel('input_img.xlsx', header=None)
input_img = df.values
input = torch.tensor([[input_img]], dtype=torch.float)

# 5 x 5 convolutional filter (no bias), applied with a stride of 2.
conv_kernel = torch.tensor([[[[ 0, -1,  0,  0,  1],
                              [ 1,  0,  2, -1,  0],
                              [-1,  1, -1,  0,  1],
                              [ 1, -1,  2,  1, -1],
                              [ 0,  2, -1,  0,  1]]]], dtype=torch.float)
output1 = torch.nn.functional.conv2d(input, conv_kernel, stride=2)

# 2 x 2 max pooling with a stride of 1.
output2 = torch.nn.functional.max_pool2d(output1, kernel_size=2, stride=1)
print(output2[0][0][50][100])

Question 3 [20 marks]

Given an input image X with 4 × 4 pixels, we have separated the image into four patches of size 2 × 2, arranged as follows:

Patch 1   Patch 2
Patch 3   Patch 4

with pixel values

Patch 1:   Patch 2:   Patch 3:   Patch 4:
1 1        1 2        0 1        0 1
1 0        1 0        0 1        1 0

These four patches have been represented in patch embeddings as follows:

Patch Embedding 1: [1, 1, 1, 0]

Patch Embedding 2: [1, 2, 1, 0]

Patch Embedding 3: [0, 1, 0, 1]

Patch Embedding 4: [0, 1, 1, 0]
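A small NumPy sketch of this patching step, assuming the 2 × 2 patches are flattened row by row into the 4-dimensional embeddings above:

import numpy as np

# The 4 x 4 input image X, with Patch 1 / Patch 2 on top and Patch 3 / Patch 4 below.
X = np.array([[1, 1, 1, 2],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 1, 1, 0]])

# Split X into four 2 x 2 patches and flatten each patch into an embedding vector.
embeddings = [X[r:r + 2, c:c + 2].flatten() for r in (0, 2) for c in (0, 2)]
print(embeddings)   # [1 1 1 0], [1 2 1 0], [0 1 0 1], [0 1 1 0]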

1) Given the following weight matrices W^q, W^k, and W^v, compute the q, k, and v of each embedded image patch. [10 marks]

W^q =
1 1 0 1
0 1 0 0
1 1 1 0

W^k =
1 1 1 0
0 1 0 1
0 1 1 0

W^v =
1 1 1 1
2 3 0 1
0 1 2 0

2) Compute the self-attention results on the input image. [10 marks]

1) Because q = W^q a, k = W^k a, and v = W^v a for each patch embedding a, we can get:

Q =
q1  q2  q3  q4
 2   3   2   1
 1   2   1   1
 3   4   1   2

K =
k1  k2  k3  k4
 3   4   1   2
 1   2   2   1
 2   3   1   2

V =
v1  v2  v3  v4
 3   4   2   2
 5   8   4   3
 3   4   1   3
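A small NumPy sketch that reproduces Q, K, and V from the patch embeddings and the weight matrices above:

import numpy as np

# Patch embeddings as the columns of a 4 x 4 matrix (one column per patch).
A_embed = np.array([[1, 1, 0, 0],
                    [1, 2, 1, 1],
                    [1, 1, 0, 1],
                    [0, 0, 1, 0]])

Wq = np.array([[1, 1, 0, 1], [0, 1, 0, 0], [1, 1, 1, 0]])
Wk = np.array([[1, 1, 1, 0], [0, 1, 0, 1], [0, 1, 1, 0]])
Wv = np.array([[1, 1, 1, 1], [2, 3, 0, 1], [0, 1, 2, 0]])

Q = Wq @ A_embed   # columns are q1, q2, q3, q4
K = Wk @ A_embed   # columns are k1, k2, k3, k4
V = Wv @ A_embed   # columns are v1, v2, v3, v4
print(Q, K, V, sep="\n")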

2) The attention between q and k is computed as A(i, j) = qi · kj / √d, i.e., A = Q^T K / √d, where d = 3 is the dimension of q and k:

A =
 7.5056  10.9697   4.0415   6.3509
10.9697  16.1658   6.3509   9.2376
 5.1962   7.5056   2.8868   4.0415
 4.6188   6.9282   2.8868   4.0415

Final results are computed as O = V · Softmax(A)^T, where the softmax is taken along each row of A (over the keys for each query), and column i of O is the self-attention output for patch i:

O =
3.9492  3.9924  3.8407  3.7902
7.8588  7.9784  7.5669  7.4482
3.9577  3.9934  3.8595  3.8228
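Continuing the sketch, the attention scores A and the final outputs O can be reproduced from Q, K, and V (values from part 1):

import numpy as np

Q = np.array([[2, 3, 2, 1], [1, 2, 1, 1], [3, 4, 1, 2]])   # columns q1..q4
K = np.array([[3, 4, 1, 2], [1, 2, 2, 1], [2, 3, 1, 2]])   # columns k1..k4
V = np.array([[3, 4, 2, 2], [5, 8, 4, 3], [3, 4, 1, 3]])   # columns v1..v4
d = 3

A = Q.T @ K / np.sqrt(d)                               # A[i, j] = q_i . k_j / sqrt(d)
S = np.exp(A) / np.exp(A).sum(axis=1, keepdims=True)   # softmax over each row of A
O = V @ S.T                                            # column i is the output for patch i
print(np.round(A, 4))
print(np.round(O, 4))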