闪电代写 -代写CS作业_CS代写_Finance代写_Economic代写_Statistics代写_代码代做_IT代写_加急帮助

Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

Primary Examination, Semester 1, 2019

Parallel and Distributed Computing

COMP SCI 3305, 7305

Question 1

(a) Provide brief answers to the following questions:

(a) What is a lock?

(b) What is a race condition? [4 marks]

(b) Below is pseudo-code for two threads, running in parallel. The shared

variable counter should be incremented by two. But you observe that sometimes this does not happen as expected.

i. Explain carefully circumstances in which an incorrect value could be produced. Note that tmp1 and tmp2 are local variables. You may assume that counter is initialized to 0.

// Task 1:

tmp1 = counter; tmp1++;

counter = tmp1;

// Task 2:

tmp2 = counter; tmp2++;

counter = tmp2; [5 marks]

(d) This question is about high performance computers with a distributed memory architecture, and the interconnection networks used with them.

i. Draw a diagram showing the structure of a distributed memory high performance computer. Brieﬂy explain each item on the dia- gram, in particular why the interconnection network is important. [3 marks]

ii. Suppose that you are designing a high performance computing, with a requirement to efﬁciently support applications that have a moderate to high intrinsic communication. Discuss suitable inter- connection network architectures. Compare at least two alterna- tives. [4 marks]

[Total for Question 1: 19 marks]

Question 2

(a) Suppose that you wish to use a group of threads to form the sum, in

a shared variable result, of an array of numbers. This can be done by giving each thread responsibility for a portion of the array, such that it forms, in its local memory space, the sum of its portion of the array. Outline (using pseudo-code), with explanation, an algorithm that can be executed in parallel by each task in an MPI program.

The array should be distributed, and global sum should be computed, using suitable collective operations.

Include in your explanation a diagram showing the respective tasks and address spaces, in particular, the placement of the array, and the variables representing local and global sums. [8 marks]

(b) i. Deﬁne the term speedup. [2 marks]

ii. The efﬁciency of a parallel program executed on N processors is deﬁned as S/N, where S is the speedup.

If a parallel program is run on N processors, what is the (ideal) maximum speedup normally expected? What is the corresponding efﬁciency? Explain brieﬂy. [3 marks]

P 1 2 3 4 5 6 7

t 100 60 40 30 23 20 18

Using these results, draw two graphs. The ﬁrst graph is the above data, t against P. For the second graph, compute the efﬁciency at each

value of P, and plot efﬁciency against P.

Explain your results. [8 marks]

[Total for Question 2: 21 marks]

Question 3

(a) Batch scheduling is commonly used in high-performance computing.

Brieﬂy discuss what is meant by batch scheduling, and why it is often preferred to interactive operation on high-performance computers. [3 marks]

(b) The following pseudo-code describes the code executing in each of

several MPI tasks.

It is intended to represent nodes communicating with each other in a ring topology, with communications commands in the following order:

send value to right

receive value from right

The purpose of the code is to cyclically shift the values one place to the right around the ring.

i. What is the problem with this pattern of communication? Explain clearly, with diagrams. [3 marks]

ii. How would you ﬁx it? Again explain with diagrams. [3 marks]

i. Explain, with diagrams, how an MPI Reduce collective operation is executed in MPI. You may use the addition of numbers as an example. [3 marks]

ii. Suppose that we are using MPI reduce with addition. What can we say about the order in which the individual addition operations are done? [2 marks]

iii. MPI does not provide a reduction with subtraction as the operator. Why? Give an example to illustrate your answer. [3 marks]

[Total for Question 3: 17 marks]

Question 4

(a) Discuss brieﬂy the role of the Netﬂix Simian Army, and what it is capa- ble of doing. In your answer, address this question: would it be better

to build a distributed system that does not fail, rather than spending a lot of time and effort to develop something like the Simian Army? [6 marks]

(b) Consider each of the statements below. State whether the statement

is true or false, then write one line of explanation for your answer.

1. MPI is a programming language.

2. Java uses automatic parallelization.

3. Deadlock is impossible when MPI is used.

4. Increasing problem size always increases speedup.

5. A barrier is a form of global synchronisation. [10 marks]