# What is Chomsky Normal Form in DBMS?

In this tutorial, we are going to learn about the chomsky normal form, why it is used, explanation with examples.

By **IncludeHelp** Last updated : May 29, 2023

## Overview

A grammar where each output is either of the form **A → BC** or **A → c** Example: **S → AS | an A → SA | b**

Where, the non-terminals are *A*, *B*, *C* and the terminal is a, we infer, from the above description, that in order to be in *CNF*, either two non-terminals or a single terminal must be derived from all production.

*CNF* limits the number of symbols to two on the right side of an output.

- Non-terminals or a single terminal must be the two symbols.

## What is Chomsky Normal Form in DBMS?

The standard form of Chomksy is a useful form for dealing with context-free grammar. This is a basic method of writing a CFG that is useful for understanding and explaining things about CFGs. It also allows a binary tree of the parse tree for derivations that use this form of the CFG.

What is the standard Chomsky form, then? When each rule is of the form A** → BC** and **A → a**, where an is a terminal, and *A*, *B* and *C* are variables, a *CFG* is natural in **Chomsky form**. Moreover the starting variable is not *B* or *C*. In addition, for technical reasons, we allow the **S → ε** rule where *S* is the starting variable. Notice that this means that, as one of several possible laws, we allow **S → ε**.

## Why Chomsky's Normal Form?

The key advantage is that every derivation of a string of n letters has exactly 2n-1 steps in the Chomsky Normal Form. Thus, by an exhaustive search of all derivations, one can determine if a string is in the language.

For instance—

S x AB AB

A → a

B → b

Free grammar in this context is in normal Chomsky form.

## Algorithm

To standardize the grammar using CNF, the following steps are followed—

**Step 1:**

Absolutely minimize your grammar by-

Elimination of productivity

Eliminating unit manufacturing

Elimination of obsolete inventions

**Step 2:**

Replace each A → B1B2B3 production....Bn where n → 2 with A → B1C where C → B2B3....Bn.

Repeat this step for all productions that have more than two RHS variables.

**Step 3:**

Substitute A x XB and X x a for each output of type A x aB.

For all productions having the form A x aB, repeat this step.

### Practice Solution

#### Example 1

Convert the grammar given to CNF-

S → aADD

A → aB / bAB

B → b

D →d

**ANSWER**

**Step 1:**

The grammar given is already completely diminished.

**Step 2:**

The productions that are already in normal Chomsky form are—

B → b .......... (1)

D → d .......... (2)

Such productions will stay as they are.

Productions that are not in the normal Chomsky form are—

S → aAD .......... (3)

A → aB / bAB .......... (4)

We're going to convert these productions to the normal Chomsky form.

**Step 3:**

Replace the terminal symbols a and b with the new Ca and Cb variables.

This is achieved by incorporating the grammar of the following two recent productions—

Ca → a .......... (5)

Cb b .......... (6)

The productions (3) and (4) are now adapted to—

S → CaAD .......... (7)

A → CaB / CbAB .......... (8)

**Step 4:**

Replace AD and AB with new CAD and CAB variables.

This is achieved by incorporating the grammar of the following two recent productions—

CAD → AD .......... (9)

CAB →AB .......... (10)

The productions (7) and (8) are now altered to—

S → CaCAD .......... (11)

A → CaB / CbCAB .......... (12)

**Step 5:**

The resulting grammar is—from (1), (2), (5), (6), (9), (10), (11) and (12),

S CaCAD CaCAD

A → Ta→i / CbCAB

B → b b

D to d

Ca an a

Cb → b b

CAD → ADD

CAB to AB

This grammar is in the normal Chomsky form.

### Example 2

Consider the following CFG: S → aXbXX

X → aY | bY | ε

Y → X | c

**ANSWER**

**Step 1:**

The vector X is nullable, and thus Y is nullable. After ε has been deleted, we get:

S → aXbX | abX | aXb | ab XbX | ab

X → aY | bY | a | b

Y → X | c

**Step 2:**

After the development unit Y → X is deleted, we get:

S → aXbX | abX | aXb | ab XbX | ab

X → aY | bY | a | b

Y → aY | bY | a | b | c | c

**Step 3:**

Now, sever S's RHSs; and replace a with A, b with B and c with C, wherever the units are not:

S → EF | AF | EB | AB EF | AF | AB

X for AY | BY | a | b

Y → AY | BY | a | b | c

E → AX

F → BX

A → a

B → b

C → c

Related Tutorials

- Relational Calculus in DBMS
- DBMS SQL Joins (with Examples)
- Difference Between Inner and Outer Joins in DBMS SQL
- DBMS Transaction and ACID Properties
- Commit Point of Transaction | DBMS
- What is PJNF (Project-Join Normal Form)?
- Difference Between 1NF and 2NF in DBMS
- Difference Between 1NF and 3NF in DBMS
- Domain-key normal form | DBMS
- Fourth Normal Form (4NF) | DBMS
- Fifth Normal Form (5NF) | DBMS
- How to find the highest normal form of a relation in DBMS?
- DBMS | File Organization
- Physical Database Design Decisions
- Log-based Recovery in Database Management System

Comments and Discussions!