ELEC2441 Computer Organization and Microprocessors
Department of Electrical & Electronic Engineering
Examination
Date: 20 December 2022 Time: 9:30am-12:30pm
Answer ALL questions.
Q1. A Review on Essential Concepts (30%)
Carefully read each question and write down the most appropriate option in your answer book, e.g. 1) X; 2) Y; … etc., where X and Y are the selected options for the corresponding questions. Each question carries 3 marks. 1 mark will be deducted for each wrong answer or for multiple options given for a question. No marks will be deducted for a blank answer.
1) After performing the binary computation (1011 - 0010), an ARM processor outputs the result 1001, which will be interpreted as
A. +9, since positive integers are the default setting in any ARM processor;
B. -6, with the MSB = 1 under the 1's complement scheme;
C. -7, with the MSB = 1 under the 2's complement scheme;
D. a signed or unsigned integer, depending on the programmer's knowledge;
E. an invalid result, as overflow occurs.
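As an illustration of how the same 4-bit pattern can be read in different ways, the sketch below evaluates a bit string under four common encodings. The function name `interpretations` is hypothetical and chosen only for this example; the hardware itself stores no information about which reading is intended.

```python
def interpretations(bits: str) -> dict:
    """Return the value of a bit string under four common integer encodings."""
    n = len(bits)
    unsigned = int(bits, 2)
    msb_set = bits[0] == "1"
    # Magnitude of the bits below the MSB (used by sign-magnitude).
    magnitude = int(bits[1:], 2) if n > 1 else 0
    return {
        "unsigned": unsigned,
        "sign_magnitude": -magnitude if msb_set else magnitude,
        # 1's complement: a negative pattern equals unsigned - (2^n - 1).
        "ones_complement": unsigned - (2**n - 1) if msb_set else unsigned,
        # 2's complement: a negative pattern equals unsigned - 2^n.
        "twos_complement": unsigned - 2**n if msb_set else unsigned,
    }


result = interpretations("1001")
for scheme, value in result.items():
    print(f"{scheme:16s} {value:+d}")
```

The pattern is identical in every row of the output; only the convention applied by the reader changes.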
2) The system throughput of an ARM processor is 1 CPI after the pipeline is filled. A further reduction of the CPI can be attained by
A. increasing the total number of instructions entered into the pipeline;
B. increasing the total number of registers available in the processor;
C. increasing the largest difference between the latencies of all pipeline stages;
D. increasing the total number of pipeline stages;
E. none of the above.
3) For a 32-bit signed integer under the 2's complement scheme, both the MSB and LSB are 1 while the remaining bits are 0. The value of this 32-bit signed integer is
A. -(2^32) + 1;
B. -(2^31) + 1;
C. -(2^(32/2)) + 2;
D. -(2^31) - 1;
E. -(2^32) - 1.
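One way to check a value like this is to sum the positional weights directly: in 2's complement the MSB carries a negative weight and every other bit a positive one. The helper `signed_value` below is a hypothetical name used only for this sketch.

```python
def signed_value(bits_set, width=32):
    """Sum the 2's complement weights of the bit positions that are 1.

    bits_set: iterable of bit positions (0 = LSB, width - 1 = MSB).
    """
    total = 0
    for pos in bits_set:
        weight = 1 << pos
        # Only the MSB contributes a negative weight.
        total += -weight if pos == width - 1 else weight
    return total


value = signed_value([31, 0])  # MSB and LSB set, all other bits clear
print(value)
```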
4) Assume the register r1 holds -1 while another register r2 holds +20, both values being represented under the 2's complement scheme. A student enrolled in ELEC2441 wrote the following ARM instructions to perform the index out-of-bounds check for register r1.
CMP r1, r2;
BGE IndexOutOfBounds;
Which of the following suggestions is correct about the above ARM instructions to perform the relevant check for register r1?
A. the first instruction should be revised as "CMP r2, r1;";
B. the second instruction should be revised as "BLE IndexOutOfBounds;";
C. the second instruction should be revised as "BVS IndexOutOfBounds;";
D. the second instruction should be revised as "BHS IndexOutOfBounds;";
E. none of the above.
5) When using PC-relative addressing for any branch instruction like BEQ on ARM computers, the branch address will be computed as:
A. [PC] + 8;
B. [PC] + (32-bit address specified in the compiled instruction) + 8;
C. [PC] + (24-bit address specified in the compiled instruction) × 4 + 8;
D. [PC] + (16-bit address specified in the compiled instruction) × 4 + 8;
E. none of the above.
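For reference, the classic 32-bit ARM branch encoding holds a signed 24-bit word offset. The sketch below models the target calculation under the assumption that `instr_addr` is the address of the branch instruction itself, so the +8 accounts for the pipeline prefetch; `branch_target` is a hypothetical helper name.

```python
def branch_target(instr_addr: int, imm24: int) -> int:
    """Model the target address of a 32-bit ARM B/BL/Bcc instruction."""
    imm24 &= 0xFFFFFF
    # Sign-extend the 24-bit field.
    if imm24 & 0x800000:
        imm24 -= 1 << 24
    # The offset counts words, so shift left by 2 to get bytes.
    return (instr_addr + 8 + (imm24 << 2)) & 0xFFFFFFFF


# An offset field of 0xFFFFFE (-2 words) makes an instruction branch to itself.
print(hex(branch_target(0x8000, 0xFFFFFE)))
print(hex(branch_target(0x8000, 2)))
```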
6) A GPU card has its own Video RAM (VRAM) of 4GB. Assume the whole VRAM is equally divided into 4 data blocks of 1GB each, and the starting address of the first data block is $0000 0000. The ending address of the first data block should be:
A. $01FF FFFF;
B. $02FF FFFF;
C. $04FF FFFF;
D. $3FFF FFFF;
E. $5FFF FFFF.
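The address range of a fixed-size block can be sketched as follows, assuming byte-addressable memory and 1GB = 2^30 bytes (both assumptions, as the question does not state them). `block_range` is a hypothetical helper.

```python
GIB = 1 << 30  # assumed: 1GB = 2^30 bytes


def block_range(index: int, block_size: int = GIB, base: int = 0):
    """Return (start, end) byte addresses of the block with the given index."""
    start = base + index * block_size
    end = start + block_size - 1  # last byte that still belongs to the block
    return start, end


for i in range(4):
    start, end = block_range(i)
    print(f"block {i}: ${start:08X} to ${end:08X}")
```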
7) As discussed in LT-2, a multiplexer can be implemented by:
A. a decoder with a number of AND-gates and OR-gates;
B. an encoder, a decoder with a number of XOR-gates;
C. an encoder with a number of AND-gates and NOR-gates;
D. a decoder with a number of NOR-gates;
E. none of the above.
8) Assume it takes 1 clock cycle for each of the basic steps for executing any instruction, and the operand values of $2B and $3A are already stored at memory locations $80 and $81. After executing all the subsequent instructions,
LDAA $80 ; Load into ACC-A from address $80
LDAA $81 ; Load into ACC-A from address $81
ADDA $80 ; Add into ACC-A from address $80
which of the following observations is correct?
A. it takes 9 cycles in total to produce the final result of [ACC-A] as $56;
B. it takes 10 cycles in total to produce the final result of [ACC-A] as $65;
C. it takes 11 cycles in total to produce the final result of [ACC-A] as $65;
D. it takes 12 cycles in total to produce the final result of [ACC-A] as $74;
E. none of the above.
9) A Python program consisting of 570 instructions in total is run on a computer X with the user and system CPU times measured as 3.25 ns and 1.28 ns respectively. Suppose we run the same Python program on computers Y and Z of the same instruction set. Computer Y has a clock cycle time of 265 ps and a CPI of 2.3, while computer Z has a clock cycle time of 552 ps and a CPI of 1.2 for the same Python program. Which computer is the fastest to run the Python program, and by what factor is it faster than the slowest computer running the same program?
A. X is the fastest and is faster than Y by a factor of 76.6;
B. Y is the fastest and is faster than Z by a factor of 53.3;
C. X is the fastest and is faster than Z by a factor of 83.3;
D. Z is the fastest and is faster than X by a factor of 63.6;
E. none of the above.
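The comparison rests on the relation CPU time = instruction count × CPI × clock cycle time. A minimal sketch follows, assuming that the CPU time of X is the sum of its user and system CPU times; `cpu_time_ns` is a hypothetical helper.

```python
def cpu_time_ns(instructions: int, cpi: float, cycle_time_ps: float) -> float:
    """CPU time in nanoseconds = instructions x CPI x cycle time (ps) / 1000."""
    return instructions * cpi * cycle_time_ps / 1000.0


times = {
    "X": 3.25 + 1.28,               # assumed: user + system CPU time, in ns
    "Y": cpu_time_ns(570, 2.3, 265),
    "Z": cpu_time_ns(570, 1.2, 552),
}

fastest = min(times, key=times.get)
slowest = max(times, key=times.get)
speedup = times[slowest] / times[fastest]

for name, t in times.items():
    print(f"{name}: {t:.3f} ns")
print(f"{fastest} is faster than {slowest} by a factor of {speedup:.2f}")
```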
10) Suppose A < 0 and B ≥ 0. The computation of (A - B) will give rise to the overflow condition when
A. the computed result of (A - B) > 0;
B. the computed result of (A - B) < 0;
C. the computed result of (A - B) = 0;
D. the computed result of (A - B) ≤ 0;
E. the computed result of (A - B) ≥ 0.
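Signed overflow in a subtraction can be explored with a small fixed-width model. The sketch below assumes an 8-bit word purely to keep the search space small; `to_signed` and `sub_overflows` are hypothetical helpers.

```python
def to_signed(value: int, width: int) -> int:
    """Wrap an integer to `width` bits and read it as 2's complement."""
    value &= (1 << width) - 1
    return value - (1 << width) if value & (1 << (width - 1)) else value


def sub_overflows(a: int, b: int, width: int = 8):
    """Return (overflow flag, wrapped result) for a - b in `width` bits."""
    wrapped = to_signed(a - b, width)
    # Overflow means the wrapped result differs from the true difference.
    return wrapped != (a - b), wrapped


print(sub_overflows(-100, 100))  # true difference -200 does not fit in 8 bits
print(sub_overflows(-1, 0))      # true difference -1 fits
```

Looping over every negative A and non-negative B in the chosen width is a quick way to see which sign of the computed result coincides with the overflow flag.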
Q2.
(a) For a specific I/O interface board of a robot, the control/status register (CSR) and data buffer register (DBR) of input devices are assigned to the memory addresses 0x8018 and 0x8016 respectively. Below is a program fragment written for the original micro-controller of the robot to read in 1 byte of input data from a light sensor via the I/O interface board using one of the basic I/O mechanisms.
LDAA #$12
STAA $8018
LDAB $8016
To enhance the flying drone, a computer engineer named Chris decides to change the original micro-controller to a new micro-controller installed with a 32-bit ARM processor. All the relevant sensors and the I/O interface board remain unchanged, i.e. with the same set of commands for I/O and the same assignments of memory addresses. Modify the above program fragment for the new micro-controller, with the register r2 holding the input data. Also, with the aid of a diagram, clearly explain the working of your modified program fragment for the above I/O mechanism. (14 marks)
(b) The following ARM assembly program fragment shows an interrupt service routine (ISR) of the revamped robot to perform the data transfer for an interrupt-driven camera.
LDR r8, [r6, #Cdata] ; Cdata is an offset
STR r8, [r7, #0]
ADD r7, r7, #4
CMP r7, r5
BLT EXIT ; signal the transfer as complete
EXIT ; return to the caller
Copy Table 2b into your answer book and then fill in the meanings of the addresses/values held by the corresponding registers r5 - r8 in the above ISR during program execution. The first row of the table is provided as an example only and can be skipped in the copied version.
Table 2b
Registers | Meanings of Addresses/Values
rd | The final result computed by the ISR
r5 |
r6 |
r7 |
r8 |
(8 marks)
(c) The computer engineer Chris discovers that the camera module in (b) also supports a new I/O mechanism that can transfer 128 bytes of input data from the camera module to the main memory. In addition, using the new I/O mechanism is always faster than the complete ISR used in (b). Clearly state the new I/O mechanism and explain why it can always achieve a better efficiency of data transfer than the complete ISR in (b). (8 marks)
Q3.
(a) Suppose the miss rates of a data cache and an instruction cache of a processor are 6.5% and 8% respectively. If the processor has a CPI of 2.5 without any memory stalls, a miss penalty of 46 cycles for any miss of the data cache, and a miss penalty of 56 cycles for any miss of the instruction cache, determine how much faster (rounded to 2 decimal places) the processor would run with a perfect cache that never missed. In the calculation, the frequency of all loads and stores is assumed to be 43.5%. (15 marks)
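The usual model for this kind of calculation adds the memory-stall cycles per instruction to the base CPI. The sketch below assumes that every instruction is fetched through the instruction cache and that only loads and stores access the data cache, once each; `cpi_with_stalls` is a hypothetical helper.

```python
def cpi_with_stalls(base_cpi, i_miss_rate, i_penalty,
                    d_miss_rate, d_penalty, load_store_freq):
    """Effective CPI = base CPI + instruction-miss stalls + data-miss stalls."""
    # Every instruction is fetched, so the instruction miss rate applies to all.
    i_stalls = i_miss_rate * i_penalty
    # Only the load/store fraction of instructions touches the data cache.
    d_stalls = load_store_freq * d_miss_rate * d_penalty
    return base_cpi + i_stalls + d_stalls


base_cpi = 2.5
real_cpi = cpi_with_stalls(base_cpi, 0.08, 56, 0.065, 46, 0.435)
speedup = real_cpi / base_cpi  # same instruction count and clock rate assumed

print(f"CPI with stalls: {real_cpi:.5f}")
print(f"speedup with a perfect cache: {speedup:.2f}")
```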
(b) The following ARM assembly program fragment is used to generate 6 integers using an iterative procedure named myComp. The 6 newly generated integers should then overwrite the original 6 computer words of zeros (0, 0, …, 0) reserved right after the initial value of 1 at the address label my_tab, as below. Assume i is the loop variable ranging from 1 to 6. At each iteration, the generated integer is the sum of i and the previously generated value j in my_tab. Initially, when i = 1, the generated integer is 1 + the initial value in my_tab = 1 + 1 = 2. When i = 2, the generated integer is 2 + the previously stored value = 2 + 2 = 4. Similarly, when i = 3, the generated integer is 3 + 4 = 7, and so on.
my_tab DCI 1, 0, 0, 0, 0,0, 0 ;
Main LDR r1,=my_tab ;
MOV r0,#6 ;
MOV r2,#1 ;
BL myComp ;
END
Exit MOV pc, lr ; return to the caller
myComp CMP r2, r0 ;
******
B myComp
i. Complete the above program fragment so that the iterative procedure myComp generates the following 6 new integers and writes them to the target memory locations starting from 0x204.
Symbol | Address (Hex) | Value (Decimal)
my_tab | 0x200 | 1 [original]
 | 0x204 | 2
 | 0x208 | 4
 | 0x20C | 7
 | 0x210 | 11
 | 0x214 | 16
 | 0x218 | 22
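The expected table contents can be cross-checked with a short high-level model of the recurrence (new value = i + previous value). This is only a reference model in Python, not the requested ARM code; `generate` and `BASE_ADDR` are hypothetical names, and 4-byte words are assumed from the addresses in the table.

```python
BASE_ADDR = 0x200  # address of my_tab, taken from the table above


def generate(initial: int, count: int) -> list:
    """Return [initial, v1, ..., v_count] where v_i = i + v_(i-1)."""
    table = [initial]
    for i in range(1, count + 1):
        table.append(i + table[-1])
    return table


values = generate(1, 6)
for k, v in enumerate(values):
    # Each word occupies 4 bytes, so entry k lives at BASE_ADDR + 4 * k.
    print(f"0x{BASE_ADDR + 4 * k:03X}: {v}")
```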
ii. When the instruction 'MOV r0, #6' in the above program is changed to 'MOV r0, #11', clearly state the decimal value of the last integer generated by the iterative procedure myComp.
iii. Modify your completed program in part i. so that each of the 6 newly generated integers is the sum of ((the loop variable i)/2) and (4 × the previously generated value j in my_tab). Precisely, when i = 1, the generated integer is 1/2 + 4 × the initial value in my_tab = 0 + 4 = 4. When i = 2, the generated integer is 2/2 + 4 × the previously stored value = 1 + 4 × 4 = 17, and so on. Clearly show your code modification (i.e. highlight your modified instructions only, without copying the whole program), and also state the last two values of the 6 integers generated by the modified code into my_tab. (15 marks)
(c) Carefully examine the following diagram for the datapath design of the ARM7 processors. A student enrolled in ELEC2441 argued that the basic functions of the barrel shifter and the register bank are totally unimportant for the datapath of the ARM7 processors. Clearly state whether you would agree with the student or not, and justify your answer with detailed explanation(s).
(10 marks)