Consider the context-free grammar:
S -> S S + | S S * | a
aa+a*
can be generated by this grammar.S
-> S
S -> S
S + S -> a S
+ S -> a a + S
-> a a + a *What language is generated by the following grammars? In each case justify your answer.
Which of the grammars in Exercise 2.2.2 are ambiguous?
No
No
Yes
Yes
Yes
Construct unambiguous context-free grammars for each of the following languages. In each case show that your grammar is correct.
1. E -> E E op | num
2. list -> list , id | id
3. list -> id , list | id
4. expr -> expr + term | expr - term | term
term -> term * factor | term / factor | factor
factor -> id | num | (expr)
5. expr -> expr + term | expr - term | term
term -> term * unary | term / unary | unary
unary -> + factor | - factor | factor
factor - > id | num | (expr)
Show that all binary strings generated by the following grammar have values divisible by 3. Hint. Use induction on the number of nodes in a parse tree.
num -> 11 | 1001 | num 0 | num num
Does the grammar generate all binary strings with values divisible by 3?
Proof
Any string derived from the grammar can be considered to be a sequence consisting of 11 and 1001, where each sequence element is possibly suffixed with a 0.
Let n
be the set of positions where 11
is placed. 11
is said to be at position i
if the first 1
in 11
is at position i
, where i
starts at 0 and
grows from least significant to most significant bit.
Let m
be the equivalent set for 1001
.
The sum of any string produced by the grammar is:
sum
= Σn (21 + 20) 2 n + Σm (23 + 20) 2m
= Σn 3 2 n + Σm 9 2m
This is clearly divisible by 3.
No. Consider the string "10101", which is divisible by 3, but cannot be derived from the grammar.
Readers seeking a more formal proof can read about it below:
Proof:
Every number divisible by 3 can be written in the form 3k
. We will consider k > 0
(though it would be valid to consider k
to be an arbitrary integer).
Note that every part of num(11, 1001 and 0) is divisible by 3, if the grammar could generate all the numbers divisible by 3, we can get a production for binary k from num's production:
3k = num -> 11 | 1001 | num 0 | num num
k = num/3 -> 01 | 0011 | k 0 | k k
k -> 01 | 0011 | k 0 | k k
It is obvious that any value of k
that has more than 2 consecutive bits set to 1 can never be produced. This can be confirmed by the example given in the beginning:
10101 is 3*7, hence, k = 7 = 111 in binary. Because 111 has more than 2 consecutive 1's in binary, the grammar will never produce 21.
Construct a context-free grammar for roman numerals.
Note: we just consider a subset of roman numerals which is less than 4k.
via wikipedia, we can categorize the single roman numerals into 4 groups:
I, II, III | I V | V, V I, V II, V III | I X
then get the production:
digit -> smallDigit | I V | V smallDigit | I X
smallDigit -> I | II | III | ε
and we can find a simple way to map roman to arabic numerals. For example:
via the upper two rules, we can derive the production:
romanNum -> thousand hundred ten digit
thousand -> M | MM | MMM | ε
hundred -> smallHundred | C D | D smallHundred | C M
smallHundred -> C | CC | CCC | ε
ten -> smallTen | X L | L smallTen | X C
smallTen -> X | XX | XXX | ε
digit -> smallDigit | I V | V smallDigit | I X
smallDigit -> I | II | III | ε