CSCI2122 Final Exam

Candidate Name :............................... Banner Num. :.............

Thursday, December 10th (8.30am to 10.30am)


• The final exam is open book, but is to be completed independently.

• It is your responsibility to be aware of the university Plagiarism Policy.

• Each question carries an equal contribution to the overall grade unless indicated oth-erwise.

• Submit your final exam by writing your answers on the question sheet, photographing all pages and uploading all pages as if you were submitting an assignment. Only the most recent copy of each question will be graded.


Qu 1 - Quick C coding questions [Qu(a) and Qu(b)]1

Qu(a): What operation does the following code perform. Why would the function return a pointer to p1 if it is also passed as a parameter? Are there any circumstances in which undefined behaviour may result? [if so, define these circumstances]

char *something(char *p1, const char *p2)

{

char *p = p1;

while (*p)

p++;

while (*p++ = *p2++)

;

return p1;

}

Qu(b): Re-write the following nested loop using a single loop and eliminate the array indexes i, j and k using instead a single pointer for manipulating matrix content:2

int sum = 0;

int A[L][M][N];

for (int i = 0; i < L; i++)

for (int j = 0; j < M; j++)

for (int k = 0; k < N; k++)

sum += A[i][j][k];


Qu 2 - Amdahl’s Law

Parallelization in the ideal case implies that the speedup equals the number of simultane-ously parallelized computations. However, in practice this is limited by the amount that an application can actually make use of the parallelization and overheads/ bottlenecks between parallelized parts of the application. With this in mind consider the following:

1. Ignoring any overheads, what is the speedup for N processors if 80% of the application can make use of the parallelization? [Answer is an expression, see Amdalh’s Law]

2. What is the speedup for N processors if each time the number of processors is doubled, the overhead of communication increases by 5%? [Answer will be a modified version of your expression from 1]

Your answer [space for supporting work]


Qu 3 - Integer Arithmetic in C

In the case of the following code snippet, let the values of type int and unsigned be expressed using 32-bit precision. As int is a Two’s complement type, assume that they are right shifted arithmetically.

    Also assume that we have a function random() that returns a random 32 bit int subject to INT MIN ≤ x, y ≤ INT MAX3 . You then cast this to a corresponding unsigned variable resulting in a total of 4 variables. For example:

/* define some legal values for x and y */

int x = random();

int y = random();

unsigned ux = (unsigned) x;

unsigned uy = (unsigned) y;

In each of the following expressions, are there any numerical values for the variables x, y ux and uy for which the equality / inequality does not hold? i.e. either show why the equality / inequality is always true, or give example numerical values for which it fails.

    1. ((x>>2) << 2) <= x

    2. (x < y) == (-x>-y)

    3. (ux - uy) == -(unsigned)(y-x)

    4. ~x+~y+1 == ∼(x + y)

    5. ((x + y) << 4) + y - x == 17 * y + 15 * x

Your answer [space for supporting work]


Qu 4 - Address indexing in assembly code

Array indexing can be generalized to any number of dimensions. With that in mind the following C function initializes a pointer to an element from a 3-dimensional matrix, Data:

long Data[K][L][M];

long Dptr(long x, long y, long z, long *ptr3)

{

*ptr3 = Data[x][y][z];

return sizeof(Data);

}

Let the following represent the corresponding assembly code:

    

Question: What are the values for K, L and M? [Hint: first establish an expression for address indexing of a 3-dimensional array using the expression for 2-dimensional arrays as a starting point (see your course notes).]

Your answer [space for supporting work]


Qu 5 - Memory hierarchy and cache efficient code

With respect to the following C code, assume that:

• The cache in each of the following questions is initially empty.

• Variables ‘sum’ and ‘i’ are associated with registers.

• The array ‘x’ is stored in main memory using row major order with block alignment and a start address of ‘0x0’.

• Applying the standard library C function ‘sizeof’ to ‘int’ returns a value of ‘4’ on the target computer.

int main(void)

{

int x[2][128];

int sum = 0;

for (int i = 0; i < 128; i++) {

sum += x[0][i] * x[1][i];

}

return 0;

}

Questions:

1. Under the following cache configuration, what is the miss rate?

• Total cache capacity: 512 bytes

• Cache block size: 16 bytes

• Direct mapping

2. What is the miss rate of the above cache configuration if the cache capacity is doubled, but all the other properties remain unchanged?

3. Under the following cache configuration, what is the miss rate?

• Total cache capacity: 512 bytes

• Cache block size: 16 bytes

• 2-way set associative mapping

• LRU replacement policy

4. What is the miss rate of the Qu 3 cache configuration above if you could either double the cache capacity or keep the capacity the same, but double the block size? The approach you recommend should minimize the miss rate.

Your answer [space for supporting work]