闪电代写 -代写CS作业_CS代写_Finance代写_Economic代写_Statistics代写_代码代做_IT代写_加急帮助

Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

G5035

BSc AND MComp SECOND YEAR EXAMINATION 2021

Compilers and Computer Architecture

1. This question is about lexing, parsing, and CFGs (= context free grammars).

(a) For each of the following statements about ASTs (= abstract syntax trees), state whether they are true or false. Correct answers are awarded 2 marks; wrong answers are penalised by 1 mark (to a minimum of 0 marks for this question). Omitted answers do not contribute positively or negatively to your grade. [14 marks]

i. The lexer takes as input a list of tokens, representing a program.

ii. The lexer produces an AST from the list of tokens.

iii. The token list is produced in the semantic analysis stage.

iv. The purpose of the lexing phase is to make it easier to adapt a compiler to new machine architectures (such as moving from Intel to ARM processors).

v. The purpose of constructing an AST is to represent the structure of the computer program, to simplify code generation.

vi. The purpose of constructing an AST is to allow the compiler to generate faster executable machine code.

vii. Dynamically typed languages like Python and Javascript don’t need to do lexing and parsing.

(b) In this question, we consider CFGs in which we take S to be the initial variable. For each of the following statements, state whether they are true or false. Correct answers are awarded 4 marks; wrong answers are penalised by 1 mark (to a minimum of 0 marks for this question). Omitted answers do not contribute positively or negatively to your grade.

[16 marks] i. The CFG over the alphabet {a, b, =, +, 1, ; } with reductions

S 二 A; S │ A 二 a=E │ A 二 b=E │ E 二 a │ E 二 b │ E 二 1 │ E 二 V+V

True or false: the language of this CFG is inﬁnite (meaning: contains inﬁnitely many strings).

ii. The CFG over the alphabet {(, )} with reductions S 二 (S) │ S 二 (SS) │ S 二 ()

True or false: the language of this CFG is precisely the set of non-empty strings over {(, )} with balanced brackets, including e.g. (), (()), (()(())), ...

iii. The CFG over the alphabet {(, )} with reductions

S 二 (T) │ S 二 T (T) │ S 二 (T ((T)T)) │ S 二 ε │ T 二 S

True or false: the language of this CFG is precisely the set of (possibly empty) strings over {(, )} with balanced brackets, including e.g. (), (()), (()(())), ....

iv. The CFG over the alphabet {a, !} with reductions

S 二 HSHS │ S 二 aH!! │ S 二 ε │ H 二 SHS True or false: the CFG is left-recursive.

(c) For this question, you are to provide a formal speciﬁcation of the structure of the reduction relations of a CFG, given that a CFG can itself be described using a sequence of symbols.

Consider a context-free grammar G = (A, V, S, X), with alphabet A = {a, b, c}, variables V = {P, Q, R, S, T}, and start variable S . Observe that one may specify the reduction relation X with a string of the following format:

❼ A single reduction rule has the format “〈var) ::=〈string)”, where

–〈var) e V , and

–〈string) is either a single symbol ‘epsilon’, or a non-empty string over the set A n V .

❼ The set of all reduction rules consists either of just a single reduction rule, or a semicolon-separated list “〈rule1 ) ;〈rule2 ) ; . . . ” in which each〈rulej ) is a single reduction rule.

Write a formal grammar (involving alphabet symbols ‘ ::= ’, ‘epsilon’, ‘;’ , ‘a’, ‘b’, ‘c’, ‘P ’, ‘Q’, etc.) which accepts precisely the strings with the above format, for a CFG G = (A, V, S, X) as set out above. (Hint: use notation which distinguishes the rules of the CFG G, from the rules of the formal grammar which you use to describe G.) [20 marks]

2. This question is about code generation. The ﬁrst two parts make reference to the following program fragment in a Java-like language:

class MyInt {

private int n;

public MyInt (int j) { n=j; }

public int f (int x) { n=n+1; return n*x; }

public MyInt g (int x) {

int j = x+n;

MyInt r = new MyInt(j);

return r;

}

(a) Consider an invocation of the form a .g(z) for variables a (of type MyInt) and z (of type int). Describe the locations in memory of the variables x, j, n, and r (making reference to the stack, activation frame, symbol table, or heap as appropriate), and explain the reason for your answer.

i. Brieﬂy describe the location of the variable x, and the reason for this. [4 marks]

ii. Describe the location of the variable j. Does it have any simple relationship to the location of x? Brieﬂy explain your answer.

[8 marks] iii. Describe the location of the variable n. Does it have any simple relationship to the location of x or j? Brieﬂy explain your answer.

[8 marks] iv. Describe the location of the object r. Does it have any simple relationship to the location of x, j, or n? Brieﬂy explain your answer.

[8 marks]

(b) Explain how the compilation of methods and method invocation in object-oriented languages, can be reduced to the compilation of procedures and procedure calls. Your answer should suppose an approach involving method tables.

i. Write pseudocode representing how MyInt::f would be realised as a procedure, and how an invocation a .f(x) would be realised as a procedure invocation for an object a of class MyInt. [6 marks]

ii. Explain brieﬂy how method bodies are stored in memory, and how these memory locations may be found at run-time in a method invocation. [10 marks]

iii. Explain brieﬂy what advantage there is to perform such a conversion, e.g., in a compiler accepting multiple source languages.

[6 marks]

3. This question is about computer architecture.

(a) In most common architectures, a memory access to an odd-valued address may either fail, or be slower than a memory access to an

address which is a multiple of 4 or 8. Brieﬂy explain why. [10 marks]

(b) Describe the two main issues which apply to the use and availability of the following for kinds of memory. Describe how these issues apply to each sort of memory (a very brief description for each). [10 marks]

i. Registers;

ii. Flash memory;

iii. Dynamic RAM (main memory);

iv. Static RAM (cache);

(c) In this question, you will write (short) RISC-V assembly programs for two different tasks. Your programs may only use the registers x0, a0, t1, t2, sp, and fp, and the following RISC-V assembly commands:

lw	register1	offset (register2 )	load a word of data at offset from an address stored in register2 , into register1 .
sw	register1	offset (register2 )	store the value of register1 into memory at offset from an address stored in register2 .
add	register1	register2 register3	add the values stored in register3 and register2 , and store the result in register1 .
addi	register1	register2 constant	add the signed 16-bit value constant to the value stored in register2 and and store the result in register1 .
b	address		jump to the instruction at address .
beq	register1	register2 address	if the contents of register1 and register2 are equal, jump to the instruction at address .
bne	register1	register2 address	if the contents of register1 and register2 are different, jump to the instruction at address .

Each instruction is to be written on a separate line. Use the convention that the stack grows “downwards” (i.e., that the front of the stack is at a lower address than the rest of the stack) and that sp points at an unused word in memory below the front of the stack.

i. Write a program that pushes the value 0 onto the front of the stack. [10 marks]

ii. Write a RISC-V assembly program to add a sequence of numbers which are stored on the stack. Assume the following:

❼ The last item on the stack to be added, is at the address which is pointed to by fp; and that the initial value stored by fp is an address in memory which is higher than that stored in sp.

❼ In particular: there is at least one item on the stack to be added.

The resulting sum is to be stored at the front of the stack. The ﬁrst instruction should have a symbolic address “start”, and the last instruction should jump to a symbolic address “done” . [20 marks]