闪电代写 -代写CS作业_CS代写_Finance代写_Economic代写_Statistics代写_代码代做_IT代写_加急帮助

Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

CS3342 – Assignment 1

2022

1. (20pt) We have discussed a simple calculator (textbook p.54, slide 2a.16), where the C-style com- ments are described using this regular expressions (\n is newline):

comment _→ /* (non-* | * non-/)* *+ /

| // (non-\n)* \n

Actually, the regular expression above for C comments is wrong! (Always think for yourself; there can be errors and incorrect information anywhere!) You are required to:

(a) (10pt) Prove there is an error; that means the regular expression either accepts a string which

is not a comment, or fails to accept a valid comment. You need to give an example of such a string. Explain why your example is correct.

(b) (10pt) Give a valid regular expression. This should be a nice, intuitive looking regular expres-

sion, not something horrible produced by jﬂap. (You can use jﬂap if you want, but the result must look readable.) You are allowed to use non-x (or x, or [^x]) to denote any character diferent from x ; x can also be a set of characters. Explain why your regular expression is correct.

2. (20pt) Keywords ﬁt the deﬁnition of identiﬁers and scanners identify them as such, and then look them up a table of keywords, because otherwise the deterministic ﬁnite automaton is unnecessarily larger. This question addresses the size of this DFA.

Assume that the identiﬁers and keywords are deﬁned as follows:

identifier _→ (_ | letter)(_ | letter | digit)*

letter _→ a | b | . . . | z

digit _→ 0 | 1 | . . . | 9

(a) (10pt) Draw a deterministic ﬁnite automaton that recognizes the following keywords directly: keyword _→ this | throw | throws | try

Each keyword will be identiﬁed in a separate accepting state, which is diﬀerent from any state recognizing identiﬁers. Label each accepting state accordingly. You are allowed to use non-x (or x, or [^x]) to denote any character diferent from x ; x can also be a set of characters.

(b) (10pt) Assuming the deﬁnition for identiﬁer stays the same as the one above and that all keywords belong to letter+ , what is the maximum number of states this DFA can have for a language with 40 keywords? Explain your answer. (Hint: the maximum number of states depends on the lengths of the keywords.)

3. (20pt) The driver for a table-driven scanner in Fig. 2.11 (textbook p.66, slide 2a.28) is built to handle the case when t1 <p w <p t2 , for some tokens t1 , t2 and a non-token string w, where the relation <p indicates a proper preﬁx. Consider the table-driven scanner in Fig. 2.12 (textbook p.67, slide 2a.29).

(a) (10pt) Identify such a situation t1 <p w <p t2 in the DFA associated with this scanner.

Indicate what strings t1 , t2 , and w are, as well as the token types for t1 , t2 .

(b) (10pt) Give an example of an error that causes the above scanner to unread more than two

characters in line 9 from bottom: unread remembered_chars. Indicate the string causing the error and the unread characters.

4. (40pt) Consider the following unambiguous grammar, G, for the dangling else problem:

1. program _→ stmt $$

2. stmt _→ balanced stmt

3. stmt _→ unbalanced stmt

4. balanced stmt _→ if cond then balanced stmt else balanced stmt

5. balanced stmt _→ other stmt

6. unbalanced stmt _→ if cond then stmt

7. unbalanced stmt _→ if cond then balanced stmt else unbalanced stmt

8. cond _→ ci , i > 1

9. other stmt _→ si , i > 1

Nonterminals: {program, stmt, balanced stmt, unbalanced stmt, other stmt, cond} Terminals: {if, then, else, ci , si , $$}

Starting nonterminal: program

(a) (5pt) Show the parse tree of G for the input:

if c1 then if c2 then s1 else if c3 then s2 $$ .

(b) (10pt) Compute all sets F1RsT(X), FoLLow(X), for all nonterminals X, as shown in the ex-

ample in Figure 2.23 (textbook p.87, slide 2b.21). The algorithm in Fig. 2.24 (textbook p.88, slides 2b.19-20) for computing these is adding symbols in four steps, three of which are of interest for our question (the 1..3 labels are added here for future reference):

1. add F1RsT(Yi ) to F1RsT(X)

2. add string F1RsT(β) to FoLLow(B)

3. add FoLLow(A) to FoLLow(B)

Each of these steps uses a production to add symbols to a set. For each symbol added to any of the sets you computed, indicate the step and the production used: (step, prod), 1 s step s 3, 1 s prod s 9. The meaning is that production prod was used to add he symbol at step step. If the same symbol is added multiple times, give the (step, prod) for the ﬁrst time it is added.

2.23 (textbook p.87, slide 2b.21).

(d) (5pt) Prove that G is not LL(1).

(e) (10pt) Show how you can employ on G the techniques we used for attempting to make a

grammar LL(1).

(f) (5pt) Is the new grammar LL(1)? Prove your answer.

READ ME! Submit your answers as a single pdf ﬁle in OWL. Solutions should be typed but readable (by others!)

hand-written solutions are acceptable. Source code, if required, is submitted as separate ﬁles.

JFLAP: You are allowed to use JFLAP to help you solve the assignment. Make sure you understand what it does; JFLAP will not be available during in-person exams!

LATEX: For those interested, the best program for scientiﬁc writing is LATEX. It is far superior to all the other programs, it is free, and you can start using it in minutes; here is an introduction: https://tobi.oetiker.ch/lshort/lshort.pdf