# What is Chomsky Normal Form?

In this tutorial, we are going to learn about the chomsky normal form, why it is used, explanation with examples.
Submitted by IncludeHelp, on December 08, 2020

A grammar where each output is either of the form A → BC or A → c Example: S → AS | an A → SA | b

Where, the non-terminals are A, B, C and the terminal is a, we infer, from the above description, that in order to be in CNF, either two non-terminals or a single terminal must be derived from all production.

CNF limits the number of symbols to two on the right side of an output.

• Non-terminals or a single terminal must be the two symbols.

The standard form of Chomksy is a useful form for dealing with context-free grammar. This is a basic method of writing a CFG that is useful for understanding and explaining things about CFGs. It also allows a binary tree of the parse tree for derivations that use this form of the CFG.

What is the standard Chomsky form, then? When each rule is of the form A → BC and A → a, where an is a terminal, and A, B and C are variables, a CFG is natural in Chomsky form. Moreover the starting variable is not B or C. In addition, for technical reasons, we allow the S → ε rule where S is the starting variable. Notice that this means that, as one of several possible laws, we allow S → ε.

## Why Chomsky's Normal Form?

The key advantage is that every derivation of a string of n letters has exactly 2n-1 steps in the Chomsky Normal Form. Thus, by an exhaustive search of all derivations, one can determine if a string is in the language.

For instance—

S x AB AB
A → a
B → b

Free grammar in this context is in normal Chomsky form.

To standardize the grammar using CNF, the following steps are followed—

Step 1:

Elimination of productivity

Eliminating unit manufacturing

Elimination of obsolete inventions

Step 2:

Replace each A → B1B2B3 production....Bn where n → 2 with A → B1C where C → B2B3....Bn.

Repeat this step for all productions that have more than two RHS variables.

Step-3:

Substitute A x XB and X x a for each output of type A x aB.

For all productions having the form A x aB, repeat this step.

Practice Solution

Example 1

Convert the grammar given to CNF-

A → aB / bAB
B → b
D →d

Step 1:

The grammar given is already completely diminished.

Step 2:

The productions that are already in normal Chomsky form are—

B → b .......... (1)
D → d .......... (2)

Such productions will stay as they are.

Productions that are not in the normal Chomsky form are—

A → aB / bAB .......... (4)

We're going to convert these productions to the normal Chomsky form.

Step 3:

Replace the terminal symbols a and b with the new Ca and Cb variables.

This is achieved by incorporating the grammar of the following two recent productions—

Ca → a .......... (5)
Cb b .......... (6)

The productions (3) and (4) are now adapted to—

A → CaB / CbAB .......... (8)

Step 4:

This is achieved by incorporating the grammar of the following two recent productions—

CAB →AB .......... (10)

The productions (7) and (8) are now altered to—

A → CaB / CbCAB .......... (12)

Step 5:

The resulting grammar is—from (1), (2), (5), (6), (9), (10), (11) and (12),

A → Ta→i / CbCAB
B → b b
D to d
Ca an a
Cb → b b
CAB to AB

This grammar is in the normal Chomsky form.

Example 2

Consider the following CFG: S → aXbXX

X → aY | bY | ε
Y → X | c

Step 1:

The vector X is nullable, and thus Y is nullable. After ε has been deleted, we get:

S → aXbX | abX | aXb | ab XbX | ab
X → aY | bY | a | b
Y → X | c

Step 2:

After the development unit Y → X is deleted, we get:

S → aXbX | abX | aXb | ab XbX | ab
X → aY | bY | a | b
Y → aY | bY | a | b | c | c

Step 3:

Now, sever S's RHSs; and replace a with A, b with B and c with C, wherever the units are not:

S → EF | AF | EB | AB EF | AF | AB
X for AY | BY | a | b
Y → AY | BY | a | b | c
E → AX
F → BX
A → a
B → b
C → c