CSE120 Spring 2022 Mid-term

Q4. Assume you have three arrays, A, B, and C, holding elements of size 8 bytes each. Assume the base address of A is stored in $t0, that of B in $t1, and that of C in $t2. Convert the following C code to RV64I assembly code: A[i] = B[C[i*2]];

Assume the i variable is stored in $t3.

Your lines of assembly code should not exceed 20!

Make sure to comment your code. Precede your comments with "#" for better readability.

You don't need to follow any register convention while writing the code.

Ans:

 

slli $t3, $t3, 3      # $t3 = i*8, byte offset for A[i]
slli $t4, $t3, 1      # $t4 = (i*2)*8, byte offset for C[i*2]
add  $t2, $t2, $t4    # base + offset for C
ld   $t5, 0($t2)      # load C[i*2] from memory
slli $t5, $t5, 3      # $t5 = C[i*2]*8, byte offset for B
add  $t1, $t1, $t5    # base + offset for B
ld   $t6, 0($t1)      # load B[C[i*2]] from memory
add  $t0, $t0, $t3    # base + offset for A
sd   $t6, 0($t0)      # store B[C[i*2]] into A[i]
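
The address arithmetic in this answer can be checked with a small simulation. The sketch below is only an illustration: it models memory as a Python dict keyed by byte address, and the base addresses, array contents, and helper names (run_snippet, make_memory) are made up for the example and are not part of the exam.

```python
def run_snippet(mem, a_base, b_base, c_base, i):
    """Mirror the nine instructions of the answer, one statement per instruction."""
    t0, t1, t2, t3 = a_base, b_base, c_base, i
    t3 = t3 << 3      # slli $t3, $t3, 3   -> i*8
    t4 = t3 << 1      # slli $t4, $t3, 1   -> (i*2)*8
    t2 = t2 + t4      # add  $t2, $t2, $t4 -> address of C[i*2]
    t5 = mem[t2]      # ld   $t5, 0($t2)   -> C[i*2]
    t5 = t5 << 3      # slli $t5, $t5, 3   -> C[i*2]*8
    t1 = t1 + t5      # add  $t1, $t1, $t5 -> address of B[C[i*2]]
    t6 = mem[t1]      # ld   $t6, 0($t1)   -> B[C[i*2]]
    t0 = t0 + t3      # add  $t0, $t0, $t3 -> address of A[i]
    mem[t0] = t6      # sd   $t6, 0($t0)   -> A[i] = B[C[i*2]]
    return mem


def make_memory(base, values):
    """Lay out 8-byte elements starting at a (hypothetical) base address."""
    return {base + 8 * k: v for k, v in enumerate(values)}


# Assumed base addresses and contents, chosen only for illustration.
A_BASE, B_BASE, C_BASE = 0x1000, 0x2000, 0x3000
A = [0, 0, 0, 0]
B = [10, 11, 12, 13, 14, 15]
C = [5, 4, 3, 2, 1, 0]

mem = {}
mem.update(make_memory(A_BASE, A))
mem.update(make_memory(B_BASE, B))
mem.update(make_memory(C_BASE, C))

i = 1
run_snippet(mem, A_BASE, B_BASE, C_BASE, i)
result = mem[A_BASE + 8 * i]
expected = B[C[i * 2]]   # the C statement evaluated directly: B[C[2]] = B[3]
print(result, expected)  # prints "13 13"
```

Note that the sequence overwrites $t0 to $t3, which is acceptable here because the question does not require any register convention.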

Q7. Assume that spell checking a large file requires 100,000 instructions. The instructions in the program are broken down into 4 different classes, and each class requires its own number of clock cycles to execute.

Specific information is given in the table below.

Instruction Class    Clock Cycles per Instruction    Number of Instructions
Branch               3                               40,000
Store                4                               20,000
Load                 5                               30,000
ALU / R-type         4                               10,000

Part A (10 POINTS)

If the total execution time for this program is found to be 2 seconds, what is the clock rate (expressed in kHz) of the computer on which it was run?

Ans:

CPI*IC = CPI_Branch*IC_Branch + CPI_Store*IC_Store + CPI_Load*IC_Load + CPI_R*IC_R

= (3*40000) + (4*20000) + (5*30000) + (4*10000)

= 120000 + 80000 + 150000 + 40000

= 390000 cycles

Execution time = CPI*IC/clock rate

=> clock rate = CPI*IC/Execution time

= 390000 cycles/2 seconds

= 195000 Hz = 195 kHz
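
The Part A arithmetic can be reproduced in a few lines. This is a minimal sketch that takes the per-class numbers from the table; the dictionary layout and the helper name total_cycles are assumptions made for illustration.

```python
# (clock cycles per instruction, number of instructions) for each class
instruction_mix = {
    "Branch": (3, 40_000),
    "Store": (4, 20_000),
    "Load": (5, 30_000),
    "ALU/R-type": (4, 10_000),
}


def total_cycles(mix):
    """Sum CPI * instruction count over every instruction class."""
    return sum(cpi * count for cpi, count in mix.values())


cycles = total_cycles(instruction_mix)        # 390000 cycles
execution_time_s = 2                          # given in the question
clock_rate_hz = cycles / execution_time_s     # 195000.0 Hz
clock_rate_khz = clock_rate_hz / 1000         # 195.0 kHz

instruction_count = sum(count for _, count in instruction_mix.values())
average_cpi = cycles / instruction_count      # 3.9 cycles per instruction

print(cycles, clock_rate_khz, average_cpi)
```

The average CPI of 3.9 is not asked for in the question, but it is a quick sanity check: it must lie between the smallest (3) and largest (5) per-class CPI.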

Part B (5 POINTS)

Now, assume that as part of the 100,000 instruction spell check, 20% of the original Load instructions are immediately followed by an ALU/R-type instruction that uses the data that was just loaded. To speed up the original spell check program, we are contemplating adding a new type of instruction to our architecture: an ALU instruction where one of the source operands is a value from memory. Example: add rd, rs1, mem[address]

- This new instruction will replace the previous 2 instruction sequence (Load followed by ALU/R type).

- It will take 7 clock cycles.

Will this change offer any speedup over the original design? If so, by how much?

You may assume that the clock rate does not change and your answer to this question does not depend on your answer to Part A.

Ans:

No. of Load instructions replaced by new instruction = 20% of 30000 = 6000

No. of remaining original Load instructions = 30000 - 6000 = 24000

No. of remaining ALU/R-type instructions = 10000 - 6000 = 4000

CPI*IC = CPI_Branch*IC_Branch + CPI_Store*IC_Store + CPI_Load*IC_Load + CPI_R*IC_R + CPI_New*IC_New

= (3*40000) + (4*20000) + (5*24000) + (4*4000) + (7*6000)

= 120000 + 80000 + 120000 + 16000 + 42000

= 378000

Execution time = CPI*IC/clock rate

= 378000 cycles/195000 Hz

= 378/195

= 1.94 s (approx)

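Yes, the change offers a speedup. Speedup = Execution time_A/Execution time_B = 2/1.94 = 1.03 (approx)

Because the clock rate is unchanged, this is the same as the ratio of cycle counts, 390000/378000 = 1.03 (approx), so the result does not depend on Part A.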
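
The Part B comparison can be sketched in the same way, working purely in cycle counts so that the clock rate never enters the calculation. The variable names and the label for the new instruction class are assumptions made for illustration.

```python
def total_cycles(mix):
    """Sum CPI * instruction count over every instruction class."""
    return sum(cpi * count for cpi, count in mix.values())


# Original mix from the table: (cycles per instruction, instruction count)
original_mix = {
    "Branch": (3, 40_000),
    "Store": (4, 20_000),
    "Load": (5, 30_000),
    "ALU/R-type": (4, 10_000),
}

# 20% of the original loads are fused with the ALU instruction that follows them.
fused_pairs = 30_000 * 20 // 100              # 6000

modified_mix = {
    "Branch": (3, 40_000),
    "Store": (4, 20_000),
    "Load": (5, 30_000 - fused_pairs),        # 24000 loads remain
    "ALU/R-type": (4, 10_000 - fused_pairs),  # 4000 ALU instructions remain
    "Load+ALU (new)": (7, fused_pairs),       # 6000 new 7-cycle instructions
}

cycles_before = total_cycles(original_mix)    # 390000
cycles_after = total_cycles(modified_mix)     # 378000

# With an unchanged clock rate, speedup is just the ratio of cycle counts.
speedup = cycles_before / cycles_after
print(cycles_before, cycles_after, round(speedup, 2))  # prints "390000 378000 1.03"
```

Each fused pair saves 5 + 4 - 7 = 2 cycles, so the total saving is 2 * 6000 = 12000 cycles, which matches 390000 - 378000.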