How to remove duplicate columns in Pandas DataFrame?

Given a Pandas DataFrame, we have to remove duplicate columns.

By Pranit Sharma. Last updated: September 21, 2023

Columns are the named fields of a DataFrame, and each column holds the values for one field. We can perform operations on both row and column values.

Duplication in a column of a pandas DataFrame occurs when the same value appears more than once.
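Before removing anything, it can help to see which values are repeated. The following is a minimal sketch, assuming a small sample Series built for illustration; it uses Series.duplicated(), which marks each value that repeats an earlier one.

```python
import pandas as pd

# Illustrative sample data: "Frooti" appears twice
parle = pd.Series(["Frooti", "Krack-jack", "Hide&seek", "Frooti"], name="Parle")

# duplicated() returns True for every value that repeats an earlier one
mask = parle.duplicated()
print(mask.tolist())

# Number of values that repeat an earlier one
print(int(mask.sum()))
```

Here only the last entry is flagged, because the first "Frooti" counts as the original occurrence.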

Problem statement

Given a Pandas DataFrame, we have to remove duplicate columns.

Removing duplicate columns in Pandas DataFrame

For this purpose, we are going to use the pandas.DataFrame.drop_duplicates() method. It compares rows, optionally looking only at the columns named in subset, and removes every row that repeats an earlier one. By default, the first occurrence is kept and the later ones are dropped.

Syntax:

DataFrame.drop_duplicates(
    subset=None,
    keep='first',
    inplace=False,
    ignore_index=False
)

Parameter(s):

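  • subset: A column label or a list of labels to check for duplicates. By default, all columns are used.
  • keep: Determines which duplicates to keep: 'first' (the default) keeps the first occurrence, 'last' keeps the last occurrence, and False drops all duplicates.
  • inplace: A Boolean value. If True, the DataFrame is modified in place and the method returns None instead of a new DataFrame.
  • ignore_index: A Boolean value. If True, the rows of the result are relabeled from 0 to n - 1.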
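The effect of keep and ignore_index is easiest to see on a small DataFrame. The sketch below reuses two columns of this article's sample data and is only one way to illustrate the parameters; the variable names are chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Parle": ["Frooti", "Krack-jack", "Hide&seek", "Frooti"],
        "Nestle": ["Maggie", "Kitkat", "EveryDay", "Crunch"],
    }
)

# keep="last": the last "Frooti" row survives, the first is dropped
last_kept = df.drop_duplicates(subset="Parle", keep="last")
print(last_kept.index.tolist())  # [1, 2, 3]

# keep=False: every row with a repeated "Parle" value is dropped
none_kept = df.drop_duplicates(subset="Parle", keep=False)
print(none_kept.index.tolist())  # [1, 2]

# ignore_index=True: the surviving rows are relabeled starting from 0
reindexed = df.drop_duplicates(subset="Parle", keep="last", ignore_index=True)
print(reindexed.index.tolist())  # [0, 1, 2]
```

In each call the original df is left unchanged, because inplace keeps its default value of False.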

To work with pandas, we first need to import the pandas package:

import pandas as pd

Let us understand with the help of an example.

Python program to remove duplicate columns in Pandas DataFrame

# Importing pandas package
import pandas as pd

# Defining a DataFrame
df = pd.DataFrame(
    data={
        "Parle": ["Frooti", "Krack-jack", "Hide&seek", "Frooti"],
        "Nestle": ["Maggie", "Kitkat", "EveryDay", "Crunch"],
        "Dabur": ["Chawanprash", "Honey", "Hair oil", "Hajmola"],
    }
)

# Display DataFrame
print("Original DataFrame:\n", df, "\n")

# Removing rows that repeat a value in the "Parle" column
result = df.drop_duplicates(subset="Parle")

# Display result
print("DataFrame after removing duplicates:\n", result)

Output

The output of the above program is:

Original DataFrame:
         Parle    Nestle        Dabur
0      Frooti    Maggie  Chawanprash
1  Krack-jack    Kitkat        Honey
2   Hide&seek  EveryDay     Hair oil
3      Frooti    Crunch      Hajmola

DataFrame after removing duplicates:
         Parle    Nestle        Dabur
0      Frooti    Maggie  Chawanprash
1  Krack-jack    Kitkat        Honey
2   Hide&seek  EveryDay     Hair oil
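The program above drops rows that repeat a value in one column. If the goal is to drop whole columns that duplicate each other, the following sketch shows two common approaches. The sample data and the names Parle_copy, by_content, same_label, and by_label are assumptions made for illustration, not part of the original program.

```python
import pandas as pd

# Illustrative data: "Parle_copy" repeats the contents of "Parle"
df = pd.DataFrame(
    {
        "Parle": ["Frooti", "Krack-jack", "Hide&seek"],
        "Nestle": ["Maggie", "Kitkat", "EveryDay"],
        "Parle_copy": ["Frooti", "Krack-jack", "Hide&seek"],
    }
)

# Case 1: columns with identical contents.
# Transpose so that columns become rows, drop the duplicate rows,
# then transpose back.
by_content = df.T.drop_duplicates().T
print(by_content.columns.tolist())  # ['Parle', 'Nestle']

# Case 2: columns that share the same label.
same_label = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["a", "b", "a"])

# columns.duplicated() flags repeated labels; ~ inverts the mask
by_label = same_label.loc[:, ~same_label.columns.duplicated()]
print(by_label.columns.tolist())  # ['a', 'b']
```

Transposing a DataFrame with mixed column types converts the values to the object dtype, so the first approach is best suited to small frames or frames whose columns share one type.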


Copyright © 2025 www.includehelp.com. All rights reserved.