Hints¶
Tip
You may use JFLAP
to help yourself work on these exercises.
You may wish to go through the tutorial sections: "Context-free Grammar" and "Pushdown Automata" available at http://www.jflap.org/modules/ (accessible through the left yellow navigation pane) or go through the relevant chapters in the JFLAP book.
Green¶
Q1¶
Q1
Consider the following PDA
-
Simulate the following strings: (For each step record: the state, the symbol just read and the stack contents)
\[\tt 0001\qquad 00001\qquad 001\qquad 0011\qquad 000011\] -
Use set notation to describe the language recognized by this PDA.
\[\{\quad \texttt{0}^{\text{___}}\texttt{1}^{\text{___}} \quad\mid\quad n\geq \text{___}\quad \}\] -
Produce the formal definition for the above PDA.
NB. I have simplified what happens at the start of the string: technically it starts in \(\{A,B\}\), and we have 2 stacks, but the one at \(A\) disappears as soon as we read an actual symbol from the input string.
I have shown the non-determinism at \(\{D\}\).
For every two zeros (00
) you get a one (1
).
-
The set of states \(Q=\{A,\text{___},\text{___},\text{___},E\}\)
-
The input alphabet \(\Sigma=\{{\tt 0},\text{___}\}\)
-
The stack alphabet \(\Gamma=\{\circ,\text{___}\}\)
-
The start state \(q_\text{start}=\text{___}\)
-
The set of accept states \(F=\{\text{___},\text{___}\}\)
-
The transition function, \(\delta\colon Q\times\Sigma_\varepsilon\times \Gamma_\varepsilon \to 2^{Q\times \Gamma_\varepsilon}\), in table form
\[\begin{array}{|r||c|c|c||c|c|c||c|c|c|} \hline \Sigma_\varepsilon\times\Gamma_\varepsilon: & (\texttt{0},\bullet) & (\texttt{0},\circ) & (\texttt{0},\varepsilon) & (\texttt{1},\bullet) & (\texttt{1},\circ) & (\texttt{1},\varepsilon) & (\varepsilon,\bullet) & (\varepsilon,\circ) & (\varepsilon,\varepsilon) \\ \hline\hline {\to *}A & & & & & & & & & \{(B,\circ)\} \\ B & & & \{(C,\bullet)\} & \{(D,\varepsilon)\} & & & & & \\ C & & & \{(\_\_\_,\varepsilon)\} & & & & & & \\ D & & & & \{(D,\varepsilon)\} & & & & \{(E,\_\_\_)\} & \\ {*}E & & & & & & & & & \\ \hline \end{array}\]The \(\emptyset\) entries have been left blank to make the table easier to read.
Q2¶
Q2
For each of the following Context-Free Grammars (CFGs), give answers to the accompanying questions (together with a brief justification where needed).
You are given the following CFG \(G\) defined by the productions
This grammar generates all the strings over \(\texttt{a}\) and \(\texttt{b}\) that are not palindromes.
Answer the following questions:
-
What are the variables (non-terminals)? \(V=\{\text{___},\text{___},\text{___},\text{___}\}\)
-
What are the terminals? \(\Sigma=\{\text{___},\text{___}\}\)
-
What is the start variable? \(\text{___}\)
-
Give three strings in \(L(G)\). (\(L(G)\) means: "the language of \(G\)".)
-
Give three strings not in \(L(G)\).
-
True or False:
(a) \(T \to \tt aba\)
(b) \(T \xrightarrow* \tt aba\)
© \(T \to T\)
(d) \(T \xrightarrow* T\)
(e) \(XXX \xrightarrow* \tt aba\)
(f) \(X \xrightarrow* \tt aba\)
(g) \(T \xrightarrow* XX\)
(h) \(T \xrightarrow* XXX\)
(i) \(S \xrightarrow* \varepsilon\)
Hints
(a) \(T \to \tt aba\): False, there is no such rule in the given set of rules.
(b) \(T \xrightarrow* \tt aba\): True, \(T \to XTX \to \texttt{a} TX \to \texttt{a} T \texttt{b} \to \texttt{a} X\texttt{b} \to {\tt aba}\)
© \(T \to T\): False, there is no such rule in the given set of rules.
(d) \(T \xrightarrow* T\): True, always possible in zero steps, i.e. no replacement.
(e) \(XXX \xrightarrow* \tt aba\): True, \(XXX \to \texttt{a} XX\to \texttt{a}\texttt{b} X \to \tt aba\)
(f) \(X \xrightarrow* \tt aba\): False, \(X\) can only be replaced by one terminal.
(g) \(T \xrightarrow* XX\): True, \(T \to XTX \to X\varepsilon X = XX\)
(h) \(T \xrightarrow* XXX\): True, \(T \to XTX \to XXX\)
(i) \(S \xrightarrow* \varepsilon\): False, only possible route to \(\varepsilon\) is \(T\to\varepsilon\), but from the starting variable there is no route to \(T\) only. (\(\texttt{a} T\texttt{b}\) or \(\texttt{b} T\texttt{a}\))
Use the grammar to derive the following strings
Hints
-
\(A\to \texttt{b}\texttt{b} A\texttt{b} \to \texttt{b}\texttt{b} B\texttt{b} \to \texttt{b}\texttt{b}\texttt{a} B\texttt{b} \to \texttt{b}\texttt{b}\texttt{a} \varepsilon\texttt{b} \to \tt bbab\)
-
\(A\to \texttt{b}\texttt{b} A\texttt{b} \to \texttt{b}\texttt{b} B\texttt{b} \to \texttt{b}\texttt{b}\varepsilon\texttt{b} \to \texttt{b}\texttt{b}\texttt{b}\)
-
\(A\to B\to \texttt{a} B \to \texttt{a}\texttt{a} B \to \texttt{a}\texttt{a}\texttt{a} B \to \texttt{a}^4 B \to \texttt{a}^5 B \to \texttt{a}^6 B \to\texttt{a}^6\varepsilon= \texttt{a}^6\)
-
\(A \to \texttt{b}\texttt{b} A\texttt{b} \to \texttt{b}\texttt{b} \texttt{b}\texttt{b} A\texttt{b}\texttt{b} \to \texttt{b}^4 B\texttt{b}^2 \to \texttt{b}^4 \texttt{a} B\texttt{b}^2 \to \texttt{b}^4 \texttt{a}^2 B\texttt{b}^2 \to \texttt{b}^4 \texttt{a}^3 B\texttt{b}^2 \to \texttt{b}^4 \texttt{a} \varepsilon\texttt{b}^2 = \texttt{b}^4\texttt{a}^3\texttt{b}^2\)
Use the grammar to derive the following strings (where possible):
Hints
-
\(S \to \texttt{a} A\texttt{b}\texttt{b} \to \texttt{a}\texttt{a} A\texttt{b}\texttt{b}\texttt{b}\texttt{b} \to \texttt{a}\texttt{a}\varepsilon\texttt{b}\texttt{b}\texttt{b}\texttt{b} = \tt aabbbb\)
-
\(S \to \texttt{b} B\texttt{a}\texttt{a} \to \texttt{b}\texttt{b} B\texttt{a}\texttt{a}\texttt{a}\texttt{a} \to \texttt{b}\texttt{b}\varepsilon\texttt{a}\texttt{a}\texttt{a}\texttt{a} = \tt bbaaaa\)
-
\(\texttt{aabb}\), not possible.
-
\(S \to \texttt{b} B\texttt{a}\texttt{a} \to \texttt{b}\varepsilon\texttt{a}\texttt{a} = \tt baa\)
Let \(\Sigma=\{\texttt{a},+,\times,(,)\}\).
Give parse trees for each of the following strings
Hints
Q3¶
Q3
Convert the following (G)NFAs into regular grammars.
Hint
Hint
Q4¶
Q4
Design a PDA and a CFG for the following language over \(\Sigma=\{\texttt{a},\texttt{b}\}\)
Do this in two steps:
-
Explain the idea used, i.e. how does the stack help you?
-
Design a state diagram for the PDA.
-
Design a CFG.
-
Idea: union of two languages \(\(L=\{(\texttt{a}\texttt{b})^{n}\mid n\geq 0\} \; \cup \; \{(\texttt{a}\texttt{a}\texttt{a}\texttt{a})^{n}(\texttt{b}\texttt{b}\texttt{b})^{n}\mid n\geq 0\}.\)\)
Here \(\{(\texttt{a}\texttt{b})^{n}\mid n\geq 0\}=(ab)^*\) is regular -- no need to use the stack for it.
For \(\{(\texttt{a}\texttt{a}\texttt{a}\texttt{a})^{n}(\texttt{b}\texttt{b}\texttt{b})^{n}\mid n\geq 0\}\): count the occurrences of the string Β then match it with the number of occurrences of .
-
Abbreviated PDA:
Expanded PDA:
-
CFG
\[\begin{array}{rcl} S &\to& A\mid B\\ A &\to& \texttt{a}\texttt{b} A\mid \varepsilon\\ B &\to& \texttt{a}\texttt{a}\texttt{a}\texttt{a} B \texttt{b}\texttt{b}\texttt{b} \mid \varepsilon\\ \end{array}\]
Q5¶
Q5
Design PDAs and CFGs for each of the following languages
-
\(\{w \mid w=\texttt{b}^n\texttt{a}\texttt{b}^n,\quad n\geq 0 \}\)
-
\(\{w\texttt{c} w^R\mid w\in\{\texttt{a},\texttt{b}\}^*\}\)(so it is defined over the alphabet \(\{\texttt{a},\texttt{b},\texttt{c}\}\))
-
\(\{ww^R\mid w\in\{\texttt{a},\texttt{b}\}^*\}\)
-
The language of palindromes over \(\{\texttt{a}, \texttt{b}\}\)
-
The language of palindromes over \(\{\texttt{a}, \texttt{b}\}\) whose length is a multiple of 3
Hint: Consider the even and odd length cases first.
-
\(\{w \mid w=\texttt{b}^n\texttt{a}\texttt{b}^n,\quad n\geq 0 \}\)
\[S \to \texttt{b} S\texttt{b}\mid \texttt{a}\] -
\(\{w\texttt{c} w^R\mid w\in\{\texttt{a},\texttt{b}\}^*\}\)
\[S \to \texttt{a} S\texttt{a} \mid \texttt{b} S\texttt{b} \mid \texttt{c}\] -
\(\{ww^R\mid w\in\{\texttt{a},\texttt{b}\}^*\}\)
\[S \to \texttt{a} S\texttt{a} \mid \texttt{b} S\texttt{b} \mid \varepsilon\] -
The language of palindromes over \(\{\texttt{a}, \texttt{b}\}\)
\[S \to \texttt{a} S\texttt{a} \mid \texttt{b} S\texttt{b} \mid \texttt{a} \mid \texttt{b} \mid \varepsilon\] -
The language of palindromes over \(\{\texttt{a}, \texttt{b}\}\) whose length is a multiple of 3
\[\begin{array}{rcl} S &\to& \texttt{a} A \texttt{a} \mid \texttt{b} A \texttt{b} \mid \varepsilon \\ A &\to& \texttt{a} B \texttt{a} \mid \texttt{b} B \texttt{b} \mid \texttt{a} \mid \texttt{b} \\ B &\to& \texttt{a} C \texttt{a} \mid \texttt{b} C \texttt{b} \mid \texttt{a}\texttt{a} \mid \texttt{b}\texttt{b} \\ C &\to& S \mid \varepsilon \end{array}\]
Orange¶
Q1 -- Ambiguity¶
Q1
Sometimes a grammar can generate the same string in several different ways, with several different parse trees, and likely several different meanings. If this happens, we say that the string is derived ambiguously in that grammar, which is then qualified as being an ambiguous grammar.
Consider the CFG
Derive the string \(a+a\times a\) in two different ways using parse trees, and explain their (different) meanings.
Now note that the following alternative CFG is not ambiguous:
What is the parse tree for the previous example string (\(a+a\times a\))?
What is the parse tree for \((a+a)\times a\)?
The first one: \(a+(a\times a)\).
The second one: \((a+a)\times a\).
Using the second grammar to parse \(a+a\times a\) gives
and for \((a+a)\times a\) we get
Q2 -- CFG design¶
Q2
Design CFGs generating the following languages.
-
The language of all strings over \(\{\texttt{a}, \texttt{b}\}\) with a single symbol '\(\texttt{b}\)' located exactly in the middle of the string.
\[\tt \{b,aba,abb,bba,bbb,aabaa,\ldots\}\] -
The language of strings over \(\{\texttt{a}, \texttt{b}\}\) containing an equal number of \(\texttt{a}\)'s and \(\texttt{b}\)'s.
-
The language of strings with twice as many \(\texttt{a}\)'s as \(\texttt{b}\)'s.
-
\(\{\texttt{a}^i \texttt{b}^j\mid i,j\geq 0 \text{ and } i\geq j \}\)
-
\(\{\texttt{a}^i\texttt{b}^j\mid i,j\geq 0 \text{ and } i\neq j\}\) (Complement of the language \(\{\texttt{a}^n\texttt{b}^n\mid n\geq 0\}\))
-
The language of strings over \(\{\texttt{a}, \texttt{b}\}\) containing more \(\texttt{a}\)'s than \(\texttt{b}\)'s. (e.g.
abaab
) -
\(\{w\#x \mid w, x \in \{\texttt{0},\texttt{1}\}^* \text{ and } w^R \text{ is a substring of } x\}\)
-
\(\{x_1\#x_2\#\cdots \#x_k \mid k\geq 1, \text{each } x_i \in \{\texttt{a}, \texttt{b}\}^*, \text{and for some } i \text{ and } j, x_i = x_j^R\}\)
Give informal descriptions of PDAs for the above languages. (How would you use the stack?)
-
The language of all strings over \(\{\texttt{a}, \texttt{b}\}\) with a single symbol '\(\texttt{b}\)' located exactly in the middle of the string.
\[\begin{array}{rccc} S &\to& A S A & \mid \texttt{b} \\ A &\to& \texttt{a} & \mid \texttt{b} \end{array}\]PDA: need to guess the middle of the string, so need non-deterministic transition after each symbol. State before the non-deterministic transition counts the number of prior symbols, and the one after it checks if the number of the remaining symbols matches.
-
The language of strings over \(\{\texttt{a}, \texttt{b}\}\) containing an equal number of \(\texttt{a}\)'s and \(\texttt{b}\)'s.
\[S \to SS \mid \texttt{a} S\texttt{b} \mid \texttt{b} S \texttt{a} \mid \varepsilon\]PDA: string made of sub-strings/chunks that have equal symbols: if chunk starts with Β then fill stack for 's and empty for 's, and vice versa.
-
The language of strings with twice as many \(\texttt{a}\)'s as \(\texttt{b}\)'s.
\[S \to S\texttt{a} S \texttt{a} S\texttt{b} S \mid S\texttt{a} S \texttt{b} S\texttt{a} S \mid S\texttt{b} S \texttt{a} S\texttt{a} S \mid \varepsilon\]PDA: string made of sub-strings/chunks that have the required property: if chunk starts with Β then fill stack for 's and empty twice for 's, and vice versa.
-
\(\{\texttt{a}^i \texttt{b}^j\mid i,j\geq 0 \text{ and } i\geq j \}\)
Think: \(\texttt{a}^* \texttt{a}^n\texttt{b}^n\)
\[\begin{array}{rccl} S &\to& AB \\ A &\to& \texttt{a} A &\mid \varepsilon \\ B &\to& \texttt{a} B\texttt{b} &\mid \varepsilon \end{array}\]PDA: fill stack one token for each , then remove one token for each . If stack is not empty at the end then accept.
-
\(\{\texttt{a}^i\texttt{b}^j\mid i,j\geq 0 \text{ and } i\neq j\} = \{\texttt{a}^i \texttt{b}^j\mid i>j \} \;\cup\; \{\texttt{a}^i \texttt{b}^j\mid i<j \}\):
Think: \(\texttt{a}^+\texttt{a}^n\texttt{b}^n\) or \(\texttt{a}^n\texttt{b}^n\texttt{b}^+\).
Using variable \(C\) for \(\texttt{a}^n\texttt{b}^n\), \(A\) for \(\texttt{a}^+\), and \(B\) for \(\texttt{b}^+\), we get:
\[\begin{array}{rccl} S &\to& AC \mid CB \\ A &\to& \texttt{a} A &\mid \texttt{a} \\ B &\to& B\texttt{b} &\mid \texttt{b} \\ C &\to& \texttt{a} C\texttt{b} &\mid \varepsilon \end{array}\]or
\[\begin{array}{rcc} S &\to& X\texttt{b} X\texttt{a} B \mid T \mid U\\ T &\to& \texttt{a} T\texttt{b} \mid T\texttt{b} \mid \texttt{b}\\ U &\to& \texttt{a} U\texttt{b} \mid \texttt{a} U \mid \texttt{a}\\ X &\to& \texttt{a} \mid \texttt{b} \end{array}\]PDA: non-deterministically create two branches at the beginning: branchΒ #1 for the \(i>j\) case where we use the idea from the previous case; branchΒ #2 for the \(i<j\) case where we fill stack one token for each , then remove one token for each . If stack is empty before the end of the string then accept.
-
The language of strings over \(\{\texttt{a}, \texttt{b}\}\) containing more \(\texttt{a}\)'s than \(\texttt{b}\)'s.
\[\begin{array}{rcl} S &\to& A\texttt{a} A \\ A &\to& AA \mid \texttt{a} A\texttt{b} \mid \texttt{b} A\texttt{a} \mid \texttt{a} A \mid \varepsilon \end{array}\]or
\[\begin{array}{rcl} S &\to& AS \mid \texttt{a} A \mid \texttt{a} S \\ A &\to& AA \mid \texttt{a} A\texttt{b} \mid \texttt{b} A\texttt{a} \mid \varepsilon \end{array}\]This is because if a string \(w\) contains more 's that 's, then it must be of one of the following forms:
-
"\(\texttt{a} x\)" such that \(x\) contains more 's than 's.
-
"\(\texttt{a} x\)" such that \(x\) contains equal number of 's and 's.
-
"\(xy\)" such that \(x\) contains equal number of 's and 's, and \(y\) contains more 's than 's.
-
-
\(\{w\#x \mid w, x \in \{\texttt{0},\texttt{1}\}^* \text{ and } w^R \text{ is a substring of } x\}\)
\[\begin{array}{rcl} S &\to& TX \\ T &\to& 0T0 \mid 1T1 \mid \#X \\ X &\to& 0X \mid 1X \mid \varepsilon \end{array}\] -
\(\{x_1\#x_2\#\cdots \#x_k \mid k\geq 1, \text{each } x_i \in \{\texttt{a}, \texttt{b}\}^*, \text{and for some } i \text{ and } j, x_i = x_j^R\}\)
\[\begin{array}{rcl} S &\to& UPV \\ P &\to& \texttt{a} P\texttt{a} \mid \texttt{b} P\texttt{b} \mid T \mid \varepsilon \\ T &\to& \#MT \mid \# \\ U &\to& M\#U \mid \varepsilon \\ V &\to& \#MV \mid \varepsilon \\ M &\to& \texttt{a} M \mid \texttt{b} M \mid \varepsilon \end{array}\]
Q3¶
Q3
Let \(\Sigma = \{\texttt{a},\texttt{b}\}\) and let \(B\) be the language of strings that contain at least one Β in their second half. In other words, \(B = \{uv\mid u\in\Sigma^*, v\in\Sigma^*\texttt{b}\Sigma^* \text{ and } |v| \leq |u|\}\).
-
Give a PDA that recognizes \(B\).
-
Give a CFG that generates \(B\).
We need to guess where to break the input string into \(uv\), so we will need non-determinism.
We need to compute the length of \(u\), then ensure that \(v\) is at most as long as \(v\) and that it contains a \(\texttt{b}\).
Q4¶
Q4
Let
Show that \(C\) and \(D\) are both CFLs by producing PDAs or CFGs for them.
Q5¶
Q5
Give a counter example to show that the following construction fails to prove that the class of context-free languages (CFLs) is closed under the star operation.
Incorrect construction
Let \(A\) be a CFL that is generated by the CFG \(G = (V,\Sigma,R, S)\).
Add the new rule \(S \to SS\) and call the resulting grammar \(G'\).
This grammar is supposed to generate \(A^*\).
\(S \to SS\) produces multiple copies but does not produce .......... It needs to be added (if it is not present in the given language), so we need: \(S \to SS \mid .......\).
"Advanced"¶
Code to simulate PDAs
Extend your class for simulating NFAs from lab 2 to simulate PDAs.