Home » Compiler Design

Parsing in Compiler

In this article, we are going to learn about the parsing in compiler. Classification of Grammar based on derivation trees and Number of Strings.
Submitted by Anusha Sharma, on March 21, 2018

Parsing

The processes of constructing the parse tree for a given input string are called Parsing. To parse any input string the language must be defined by the context-free grammar(CFG).

Grammar

Finite set of rules that may generate infinite number of sentence is called grammar.

Grammar has a four tuple G= (V, T, P, S)

  • V = set of all non terminal
  • T = set of all terminals
  • P = set of all productions
  • S = start symbol

Classification of Grammar

There are two types of classification one is based on number of derivation trees and second is based on number of strings.

1) Classification based on Derivation Trees

  1. Ambiguous grammar: The Grammar 'G' is said to be Ambiguous if there exist more than one derivation tree for the given input string.
  2. Unambiguous grammar: The Grammar 'G' is said to be Unambiguous if there exist unique derivation tree for given input string.

DERIVATION

The process of deriving a string is called derivation and geometrical representation is called derivation tree.

There are two types of derivation,

  1. LMD: The process of deriving a string by expanding the left most non terminals is called LMD and geometrical representation of LMD is called LMDT.
  2. RMD: The process of deriving a string by expanding right most non terminals is called RMD and geometrical representation of RMD is called RMDT.

2) Classification based on Number of Strings

  1. Recursive Grammar: The Grammar 'G' is said to be recursive if there exist at least one production which is having same variable both at LHS and RHS.
  2. Non Recursive Grammar: The Grammar 'G' is said to be non recursive if no production contains same variable both at LHS and RHS.

Types of Recursion

  1. Left Recursion: The Grammar is said to be left recursive if left most variable of RHS is same as the variable at LHS.
  2. Right Recursion: The Grammar is said to be right recursive if right most variable of RHS is same as the variable at LHS.

Note: If the grammar is left recursive then parser go to infinite loop .So to avoid the looping we need to convert left recursive grammar into right grammar.

Non Deterministic Grammar

The grammar with common prefixes is known as non deterministic grammar.

Example: A → ab1|ab2|ab3

It is non deterministic because the prefix is same.

  1. The grammar with common prefixes requires backtracking and backtracking is costly.
  2. To avoid the backtracking we need to convert the non deterministic grammar into deterministic grammar. Thus we need to perform backtracking.

Left Factoring

The process of conversion of grammar with common prefixes into deterministic grammar is called left factoring.

Example:


    A → ab1|ab2|ab3
    A → aw
    w → b1|b2|b3

Classification of Parser

  1. Top Down Parser
    The process of constructing a parse tree starting from root and proceed to children is called TDP.
    • TDP internally uses left most derivation.
    • TDP constructed for grammar if it is free from ambiguity and left recursion.
    • TDP used for the grammar with less complexity.
    • Average time complexity is O(n^4).
  2. Bottom Up Parser
    The process of constructing a parse tree in a bottom up manner or starting with children and proceeds to root is called Bottom Up Parser.
    • BUP is also called shift reduce parser.
    • The BUP uses reverse of right most derivation.
    • BUP is constructed for Unambiguous grammar.


Comments and Discussions!

Load comments ↻





Copyright © 2024 www.includehelp.com. All rights reserved.