Combine duplicated columns within a DataFrame

Learn, how to combine duplicated columns within a DataFrame in Python? By Pranit Sharma Last updated : October 06, 2023

Pandas is a special tool that allows us to perform complex manipulations of data effectively and efficiently. Inside pandas, we mostly deal with a dataset in the form of DataFrame. DataFrames are 2-dimensional data structures in pandas. DataFrames consist of rows, columns, and data.

Problem statement

We are given the data frame that has columns that include the same name we need to find a way to combine the columns that have the same name with the help of some function.

Combining duplicated columns within a DataFrame

For this purpose, we will simply use the groupby() method which accepts a 'level' argument that we can specify in conjunction with the 'axis' argument. The groupby() is a simple but very useful concept in pandas. By using groupby, we can create a grouping of certain values and perform some operations on those values.

The groupby() method splits the object, applies some operations, and then combines them to create a group hence a large amount of data and computations can be performed on these groups.

Let us understand with the help of an example,

Python program to combine duplicated columns within a DataFrame

# Importing pandas
import pandas as pd

# Import numpy
import numpy as np

# Creating a dataframe
df = pd.DataFrame(np.random.choice(50, (5, 5)), columns=list('AABBB'))

# Display original DataFrame
print("Original DataFrame:\n",df,"\n")

# Grouping the df using level and 
# axis arguement
res = df.groupby(level=0, axis=1).sum()

# Display result
print("Result:\n",res)