How to remove duplicate columns in Pandas DataFrame?

Given a Pandas DataFrame, we have to remove duplicate columns.

By Pranit Sharma. Last updated: September 21, 2023

Columns are the fields of a DataFrame; each one holds the values of a particular attribute across all rows. We can perform operations on both row and column values.

Duplication in a column of a pandas DataFrame occurs when the same value appears more than once.
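As a minimal sketch of what "more than one occurrence" means in practice, the snippet below flags repeated values with the duplicated() method. The sample values are only illustrative.

```python
import pandas as pd

# Illustrative data: "Frooti" appears twice in the column
s = pd.Series(["Frooti", "Krack-jack", "Hide&seek", "Frooti"], name="Parle")

# duplicated() marks each repeat of an earlier value as True
mask = s.duplicated()

print(mask.tolist())  # [False, False, False, True]
```

Only the second "Frooti" is flagged, because by default the first occurrence is treated as the original.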

Problem statement

Given a Pandas DataFrame, we have to remove duplicate columns.

Removing duplicate columns in Pandas DataFrame

For this purpose, we are going to use the pandas.DataFrame.drop_duplicates() method. This method is useful when a value occurs more than once in a column. It drops every row that repeats an earlier value in the columns given by subset and, by default, keeps only the first occurrence. Note that it removes rows, not columns.

Syntax:

DataFrame.drop_duplicates(
    subset=None, 
    keep='first', 
    inplace=False, 
    ignore_index=False
    )

Parameter(s):

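  • subset: A column label or a sequence of labels to consider when identifying duplicates. By default, all columns are used.
  • keep: Controls which duplicate is kept: 'first' (the default), 'last', or False to drop all duplicates.
  • inplace: A Boolean value. If True, the DataFrame is modified in place and the method returns None instead of a new DataFrame.
  • ignore_index: A Boolean value. If True, the rows of the result are relabeled 0, 1, ..., n - 1.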
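The effect of these parameters can be sketched on a small DataFrame. The column names brand and qty are hypothetical and chosen only for this illustration.

```python
import pandas as pd

# Hypothetical data: brand "A" appears in rows 0 and 2
df = pd.DataFrame({"brand": ["A", "B", "A", "C"], "qty": [1, 2, 3, 4]})

# keep='first' (default): row 2 is dropped, rows 0, 1, 3 remain
first = df.drop_duplicates(subset="brand")

# keep='last': row 0 is dropped, rows 1, 2, 3 remain
last = df.drop_duplicates(subset="brand", keep="last")

# keep=False: every row with a repeated brand is dropped, rows 1, 3 remain
none_kept = df.drop_duplicates(subset="brand", keep=False)

# ignore_index=True: the surviving rows are relabeled 0, 1, 2
renumbered = df.drop_duplicates(subset="brand", ignore_index=True)

# inplace=True: the copy itself is changed and the call returns None
df_copy = df.copy()
returned = df_copy.drop_duplicates(subset="brand", inplace=True)

print(first.index.tolist(), last.index.tolist(), none_kept.index.tolist())
print(renumbered.index.tolist(), returned, len(df_copy))
```

In all of these calls, duplicates are judged only on the brand column; the qty values play no part in the comparison.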

To work with pandas, we first need to import the pandas package. Below is the syntax:

import pandas as pd

Let us understand with the help of an example.

Python program to remove duplicate columns in Pandas DataFrame

# Importing pandas package
import pandas as pd

# Defining a DataFrame
df = pd.DataFrame(
    data={
        "Parle": ["Frooti", "Krack-jack", "Hide&seek", "Frooti"],
        "Nestle": ["Maggie", "Kitkat", "EveryDay", "Crunch"],
        "Dabur": ["Chawanprash", "Honey", "Hair oil", "Hajmola"],
    }
)

# Display DataFrame
print("Original DataFrame:\n", df, "\n")

# Removing rows that repeat a value in the "Parle" column
result = df.drop_duplicates(subset="Parle")

# Display result
print("DataFrame after removing duplicates:\n", result)

Output

The output of the above program is:

Original DataFrame:
         Parle    Nestle        Dabur
0      Frooti    Maggie  Chawanprash
1  Krack-jack    Kitkat        Honey
2   Hide&seek  EveryDay     Hair oil
3      Frooti    Crunch      Hajmola 

DataFrame after removing duplicates:
         Parle    Nestle        Dabur
0      Frooti    Maggie  Chawanprash
1  Krack-jack    Kitkat        Honey
2   Hide&seek  EveryDay     Hair oil
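The program above drops a row whose "Parle" value repeats. If the goal is instead to drop whole columns that duplicate another column, one possible approach is sketched below. This is not the method used in the program above; the column "Parle_copy" and the second DataFrame are hypothetical, added only for illustration.

```python
import pandas as pd

# Hypothetical data: "Parle_copy" holds the same values as "Parle"
df = pd.DataFrame(
    {
        "Parle": ["Frooti", "Krack-jack", "Hide&seek"],
        "Nestle": ["Maggie", "Kitkat", "EveryDay"],
        "Parle_copy": ["Frooti", "Krack-jack", "Hide&seek"],
    }
)

# Columns with identical content: transpose so columns become rows,
# drop the duplicate rows, then transpose back
by_content = df.T.drop_duplicates().T
print(by_content.columns.tolist())  # ['Parle', 'Nestle']

# Columns with a repeated label: keep the first column for each label
df2 = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["a", "b", "a"])
by_label = df2.loc[:, ~df2.columns.duplicated()]
print(by_label.columns.tolist())  # ['a', 'b']
```

The transpose approach compares column contents and can be slow on large DataFrames; with mixed column types it may also change dtypes. The label-based approach looks only at column names, not at the values.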
