How to merge multiple DataFrames on columns?

Given multiple DataFrames, we have to merge them on columns.

By Pranit Sharma. Last updated: September 20, 2023

A DataFrame is a 2-dimensional data structure in pandas. It consists of rows, columns, and the data itself. A DataFrame can be created from Python dictionaries or lists, but in real-world work the data is usually imported from CSV files and then loaded into DataFrames.

Problem statement

Given multiple DataFrames, we have to merge them on columns.

Merging multiple DataFrames on columns

We may be working with two or more DataFrames in pandas, with the information we need spread across all of them. The simplest way to deal with this situation is to join these DataFrames into a single DataFrame.

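To merge DataFrames, we use the pandas.merge() function. It joins two DataFrames at a time, so to merge three or more we call it repeatedly, each time merging the previous result with the next DataFrame.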

Syntax:

pandas.merge(
    left, 
    right, 
    how='inner', 
    on=None, 
    left_on=None, 
    right_on=None, 
    left_index=False, 
    right_index=False, 
    sort=False, 
    suffixes=('_x', '_y'), 
    copy=True, 
    indicator=False, 
    validate=None
    )
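The two parameters used most often are on (the column to match rows by) and how (the type of join). Here is a minimal sketch of both, assuming two small DataFrames that share an ID column. The data and variable names are made up for illustration and are not part of the example that follows.

```python
import pandas as pd

# Hypothetical data: both DataFrames share an 'ID' column
students = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Amit', 'Bhairav', 'Chirag']})
marks = pd.DataFrame({'ID': [2, 3, 4], 'Marks': [81, 67, 90]})

# Inner join (the default): keeps only the IDs present in both DataFrames
inner = pd.merge(students, marks, on='ID', how='inner')

# Outer join: keeps every ID and fills the missing values with NaN
outer = pd.merge(students, marks, on='ID', how='outer')

print("Inner join:\n", inner, "\n")
print("Outer join:\n", outer)
```

With how='inner', only the rows for IDs 2 and 3 remain. With how='outer', all four IDs are kept, and the cells with no matching data are left as NaN.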

Let us understand with the help of an example.

Python program to merge multiple DataFrames on columns

# Importing pandas package
import pandas as pd

# Creating the dictionaries
dict1 = {'Name':['Amit Sharma','Bhairav Pandey','Chirag Bharadwaj','Divyansh Chaturvedi','Esha Dubey']}
dict2 = {'Name':['Jatin prajapati','Rahul Shakya','Gaurav Dixit','Pooja Sharma','Mukesh Jha']}
dict3 = {'Name':['Ram Manohar','Sheetal Bhadoriya','Anand singh','Ritesh Arya','Aman Gupta']}

# Creating the DataFrames
df1 = pd.DataFrame(dict1)
df2 = pd.DataFrame(dict2)
df3 = pd.DataFrame(dict3)

# Display the DataFrames
print("DataFrame1:\n",df1,"\n")
print("DataFrame2:\n",df2,"\n")
print("DataFrame3:\n",df3,"\n")

# Merging the first two DataFrames, then merging the result with
# the third DataFrame. The Name columns share no common values,
# so the rows are matched on the index, not on the column Name
result = pd.merge(df1,df2,left_index=True, right_index=True)
result = pd.merge(result,df3,left_index=True, right_index=True)

# Display the result (show up to 3 columns when printing)
pd.set_option('display.max_columns', 3)

print("Merged DataFrames:\n",result)

Output

The output of the above program is:

DataFrame1:
                   Name
0          Amit Sharma
1       Bhairav Pandey
2     Chirag Bharadwaj
3  Divyansh Chaturvedi
4           Esha Dubey 

DataFrame2:
               Name
0  Jatin prajapati
1     Rahul Shakya
2     Gaurav Dixit
3     Pooja Sharma
4       Mukesh Jha 

DataFrame3:
                 Name
0        Ram Manohar
1  Sheetal Bhadoriya
2        Anand singh
3        Ritesh Arya
4         Aman Gupta 

Merged DataFrames:
                 Name_x           Name_y               Name
0          Amit Sharma  Jatin prajapati        Ram Manohar
1       Bhairav Pandey     Rahul Shakya  Sheetal Bhadoriya
2     Chirag Bharadwaj     Gaurav Dixit        Anand singh
3  Divyansh Chaturvedi     Pooja Sharma        Ritesh Arya
4           Esha Dubey       Mukesh Jha         Aman Gupta
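The program above matches rows by index, which is why the two overlapping Name columns receive the default _x and _y suffixes. When every DataFrame carries a common key column, the same repeated merge can be written once for any number of DataFrames. The following is a rough sketch that assumes a shared ID column. The data, the variable names, and the use of functools.reduce are illustrative choices, not part of the original program.

```python
from functools import reduce

import pandas as pd

# Hypothetical DataFrames that all share an 'ID' key column
names = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Amit', 'Bhairav', 'Chirag']})
ages = pd.DataFrame({'ID': [1, 2, 3], 'Age': [23, 25, 22]})
cities = pd.DataFrame({'ID': [1, 2, 3], 'City': ['Gwalior', 'Indore', 'Bhopal']})

frames = [names, ages, cities]

# pd.merge() joins two DataFrames at a time, so reduce() applies it
# pairwise: (names + ages) first, then that result + cities
merged = reduce(lambda left, right: pd.merge(left, right, on='ID'), frames)

print("Merged DataFrames:\n", merged)
```

Because ID is the only column the DataFrames have in common, no suffixes are added, and the result has one row per ID with the columns ID, Name, Age and City.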

Copyright © 2024 www.includehelp.com. All rights reserved.