# What is Chomsky Normal Form?

In this tutorial, we are going to learn about the chomsky normal form, why it is used, explanation with examples.

Submitted by **IncludeHelp**, on December 08, 2020

A grammar where each output is either of the form **A → BC** or **A → c** Example: **S → AS | an A → SA | b**

Where, the non-terminals are *A*, *B*, *C* and the terminal is a, we infer, from the above description, that in order to be in *CNF*, either two non-terminals or a single terminal must be derived from all production.

*CNF* limits the number of symbols to two on the right side of an output.

- Non-terminals or a single terminal must be the two symbols.

The standard form of Chomksy is a useful form for dealing with context-free grammar. This is a basic method of writing a CFG that is useful for understanding and explaining things about CFGs. It also allows a binary tree of the parse tree for derivations that use this form of the CFG.

What is the standard Chomsky form, then? When each rule is of the form A** → BC** and **A → a**, where an is a terminal, and *A*, *B* and *C* are variables, a *CFG* is natural in **Chomsky form**. Moreover the starting variable is not *B* or *C*. In addition, for technical reasons, we allow the **S → ε** rule where *S* is the starting variable. Notice that this means that, as one of several possible laws, we allow **S → ε**.

## Why Chomsky's Normal Form?

The key advantage is that every derivation of a string of n letters has exactly 2n-1 steps in the Chomsky Normal Form. Thus, by an exhaustive search of all derivations, one can determine if a string is in the language.

For instance—

S x AB AB

A → a

B → b

Free grammar in this context is in normal Chomsky form.

To standardize the grammar using CNF, the following steps are followed—

**Step 1:**

Absolutely minimize your grammar by-

Elimination of productivity

Eliminating unit manufacturing

Elimination of obsolete inventions

**Step 2:**

Replace each A → B1B2B3 production....Bn where n → 2 with A → B1C where C → B2B3....Bn.

Repeat this step for all productions that have more than two RHS variables.

**Step-3:**

Substitute A x XB and X x a for each output of type A x aB.

For all productions having the form A x aB, repeat this step.

**Practice Solution**

**Example 1**

Convert the grammar given to CNF-

S → aADD

A → aB / bAB

B → b

D →d

**ANSWER**

**Step 1:**

The grammar given is already completely diminished.

**Step 2:**

The productions that are already in normal Chomsky form are—

B → b .......... (1)

D → d .......... (2)

Such productions will stay as they are.

Productions that are not in the normal Chomsky form are—

S → aAD .......... (3)

A → aB / bAB .......... (4)

We're going to convert these productions to the normal Chomsky form.

**Step 3:**

Replace the terminal symbols a and b with the new Ca and Cb variables.

This is achieved by incorporating the grammar of the following two recent productions—

Ca → a .......... (5)

Cb b .......... (6)

The productions (3) and (4) are now adapted to—

S → CaAD .......... (7)

A → CaB / CbAB .......... (8)

**Step 4:**

Replace AD and AB with new CAD and CAB variables.

This is achieved by incorporating the grammar of the following two recent productions—

CAD → AD .......... (9)

CAB →AB .......... (10)

The productions (7) and (8) are now altered to—

S → CaCAD .......... (11)

A → CaB / CbCAB .......... (12)

**Step 5:**

The resulting grammar is—from (1), (2), (5), (6), (9), (10), (11) and (12),

S CaCAD CaCAD

A → Ta→i / CbCAB

B → b b

D to d

Ca an a

Cb → b b

CAD → ADD

CAB to AB

This grammar is in the normal Chomsky form.

**Example 2**

Consider the following CFG: S → aXbXX

X → aY | bY | ε

Y → X | c

**ANSWER**

**Step 1:**

The vector X is nullable, and thus Y is nullable. After ε has been deleted, we get:

S → aXbX | abX | aXb | ab XbX | ab

X → aY | bY | a | b

Y → X | c

**Step 2:**

After the development unit Y → X is deleted, we get:

S → aXbX | abX | aXb | ab XbX | ab

X → aY | bY | a | b

Y → aY | bY | a | b | c | c

**Step 3:**

Now, sever S's RHSs; and replace a with A, b with B and c with C, wherever the units are not:

S → EF | AF | EB | AB EF | AF | AB

X for AY | BY | a | b

Y → AY | BY | a | b | c

E → AX

F → BX

A → a

B → b

C → c

TOP Interview Coding Problems/Challenges

- Run-length encoding (find/print frequency of letters in a string)
- Sort an array of 0's, 1's and 2's in linear time complexity
- Checking Anagrams (check whether two string is anagrams or not)
- Relative sorting algorithm
- Finding subarray with given sum
- Find the level in a binary tree with given sum K
- Check whether a Binary Tree is BST (Binary Search Tree) or not
- 1[0]1 Pattern Count
- Capitalize first and last letter of each word in a line
- Print vertical sum of a binary tree
- Print Boundary Sum of a Binary Tree
- Reverse a single linked list
- Greedy Strategy to solve major algorithm problems
- Job sequencing problem
- Root to leaf Path Sum
- Exit Point in a Matrix
- Find length of loop in a linked list
- Toppers of Class
- Print All Nodes that don't have Sibling
- Transform to Sum Tree
- Shortest Source to Destination Path

Comments and Discussions