Assignment 3 (Regular Languages, Pumping Lemma, RE2DFA)

  1. Convert the following regular expressions to DFAs using the RE2DFA algorithm:

    • a*bb*a

    • (aa+bb)*aaa

    Show all of the intermediate steps: expression tree, leaf node numberings, nullable, firstpos, lastpos, followpos, and the final DFA.

  2. Using Arden's Lemma, derive the regular expression for the following DFA:
    start q1
    final q4
    trans q1:a:q2
    trans q1:b:q4
    trans q2:a:q3
    trans q2:b:q3
    trans q3:a:q2
    trans q3:b:q1
    trans q4:a:q4
    trans q4:b:q4
    

  3. Use the Pumping Lemma for regular languages to prove that the following language is not regular:

    L = { anba2n | n ≥ 0}

    Claim: L = { anba2n | n ≥ 0} is not regular.

    Proof: Let L be regular. Then, the Pumping Lemma for Regular Languages guarantees us a constant M.

    Pick w = aMba2M. Clearly w is in L and |w| ≥ M.

    So, the Pumping Lemma assures us that w = uvx with:

    |uv| ≤ M

    |v| ≥ 1

    uvix is in L for i ≥ 0.

    Since |uv| ≤ M, u = ak1, v = ak2, and x = ak3ba2M. Also, k1 + k2 + k3 = M.

    Pick i = 2.

    So, uv2x is in L. i.e., ak1a2k2 ak3ba2M is in L.

    So, k1 + 2 k2 + k3 = M.

    So, M + k2 = M

    But this is a contradiction because k2 ≥ 1.

    Hence, L is not regular.

  4. Implement the RE2DFA algorithm using the PLY package. The context-free grammar for the language of regular expressions is shown below.
    re : re PLUS term
    re : term
    term : term factor
    term : factor
    factor : factor STAR
    factor : niggle
    niggle : LETTER
    niggle : LPAREN re RPAREN
    
    Since the expression tree nodes have various attributes associated with them, such as nullable, firstpos, lastpos, etc., this problem can benefit from building a traditional tree data structure with nodes and pointers to children nodes. I am providing a skeletal file, RENode.py, below, which you should use to construct the tree data structure in REParser.py and also use in the main program to compute the various functions such as nullable, firstpos, and lastpos. For this assignment, we will assume that there are no ε or φ regular expressions.
    class RENode:
      def __init__(self):
        self.operator = ''     # '*', '.', '+', 'leaf'
        self.symbol = ''       # only for leaf nodes
        self.position = 0      # only for leaf nodes
        self.lchild = None
        self.rchild = None     # only for . and +
        self.nullable = False
        self.firstpos = set()
        self.lastpos = set()
    
      # Your methods go here
    
      def to_string(self,n):
        result = ' '*n
        if self.operator == 'leaf':
          result += 'SYMBOL: '+self.symbol
          result += ' NULLABLE='+str(self.nullable)
          result += ' FIRSTPOS='+str(self.firstpos)
          result += ' LASTPOS='+str(self.lastpos)+'\n'
        elif self.operator == '*':
          result += 'OPERATOR: STAR'
          result += ' NULLABLE='+str(self.nullable)
          result += ' FIRSTPOS='+str(self.firstpos)
          result += ' LASTPOS='+str(self.lastpos)+'\n'
          result += self.lchild.to_string(n+2)
        elif self.operator == '.':
          result += 'OPERATOR: DOT'
          result += ' NULLABLE='+str(self.nullable)
          result += ' FIRSTPOS='+str(self.firstpos)
          result += ' LASTPOS='+str(self.lastpos)+'\n'
          result += self.lchild.to_string(n+2)
          result += self.rchild.to_string(n+2)
        elif self.operator == '+':
          result += 'OPERATOR: PLUS'
          result += ' NULLABLE='+str(self.nullable)
          result += ' FIRSTPOS='+str(self.firstpos)
          result += ' LASTPOS='+str(self.lastpos)+'\n'
          result += self.lchild.to_string(n+2)
          result += self.rchild.to_string(n+2)
        else:
          result += 'SOMETHING WENT WRONG'
        return result
    
      def __str__(self):
        return self.to_string(0)
    
    Submit RE2DFA.py, RELexer.py, REParser.py, RENode.py, and DFA.py. A sample run is shown below:
    $ python3 RE2DFA.py (aa+bb)*aaa
    start 1_3_5
    final 2_6_8
    1_3_5:a:2_6
    1_3_5:b:4
    2_6:a:1_3_5_7
    2_6:b:EMPTY
    4:a:EMPTY
    4:b:1_3_5
    1_3_5_7:a:2_6_8
    1_3_5_7:b:4
    2_6_8:a:1_3_5_7
    2_6_8:b:EMPTY
    EMPTY:a:EMPTY
    EMPTY:b:EMPTY