Home »
Compiler Design
Parsing in Compiler
In this article, we are going to learn about the parsing in compiler. Classification of Grammar based on derivation trees and Number of Strings.
Submitted by Anusha Sharma, on March 21, 2018
Parsing
The processes of constructing the parse tree for a given input string are called Parsing. To parse any input string the language must be defined by the contextfree grammar(CFG).
Grammar
Finite set of rules that may generate infinite number of sentence is called grammar.
Grammar has a four tuple G= (V, T, P, S)
 V = set of all non terminal
 T = set of all terminals
 P = set of all productions
 S = start symbol
Classification of Grammar
There are two types of classification one is based on number of derivation trees and second is based on number of strings.
1) Classification based on Derivation Trees
 Ambiguous grammar: The Grammar 'G' is said to be Ambiguous if there exist more than one derivation tree for the given input string.
 Unambiguous grammar: The Grammar 'G' is said to be Unambiguous if there exist unique derivation tree for given input string.
DERIVATION
The process of deriving a string is called derivation and geometrical representation is called derivation tree.
There are two types of derivation,
 LMD: The process of deriving a string by expanding the left most non terminals is called LMD and geometrical representation of LMD is called LMDT.
 RMD: The process of deriving a string by expanding right most non terminals is called RMD and geometrical representation of RMD is called RMDT.
2) Classification based on Number of Strings
 Recursive Grammar: The Grammar 'G' is said to be recursive if there exist at least one production which is having same variable both at LHS and RHS.
 Non Recursive Grammar: The Grammar 'G' is said to be non recursive if no production contains same variable both at LHS and RHS.
Types of Recursion
 Left Recursion: The Grammar is said to be left recursive if left most variable of RHS is same as the variable at LHS.
 Right Recursion: The Grammar is said to be right recursive if right most variable of RHS is same as the variable at LHS.
Note: If the grammar is left recursive then parser go to infinite loop .So to avoid the looping we need to convert left recursive grammar into right grammar.
Non Deterministic Grammar
The grammar with common prefixes is known as non deterministic grammar.
Example: A → ab1ab2ab3
It is non deterministic because the prefix is same.
 The grammar with common prefixes requires backtracking and backtracking is costly.
 To avoid the backtracking we need to convert the non deterministic grammar into deterministic grammar. Thus we need to perform backtracking.
Left Factoring
The process of conversion of grammar with common prefixes into deterministic grammar is called left factoring.
Example:
A → ab1ab2ab3
A → aw
w → b1b2b3
Classification of Parser

Top Down Parser
The process of constructing a parse tree starting from root and proceed to children is called TDP.
 TDP internally uses left most derivation.
 TDP constructed for grammar if it is free from ambiguity and left recursion.
 TDP used for the grammar with less complexity.
 Average time complexity is O(n^4).

Bottom Up Parser
The process of constructing a parse tree in a bottom up manner or starting with children and proceeds to root is called Bottom Up Parser.
 BUP is also called shift reduce parser.
 The BUP uses reverse of right most derivation.
 BUP is constructed for Unambiguous grammar.