Import multiple csv files into pandas and concatenate into one DataFrame

How to import multiple csv files into pandas and concatenate into one DataFrame?
Submitted by Pranit Sharma, on April 20, 2022

Importing CSV files in pandas is pretty simple. The pandas.read_csv() function reads a CSV file and returns its contents as a DataFrame for further analysis. Sometimes a single dataset does not contain all the required data; in that case, we might need to import multiple CSV files and then concatenate them into one DataFrame.

Note: To import a CSV file, you must have a CSV file on your computer.
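
If no CSV file is at hand, a small one can be written first and then read back. The following is a minimal sketch, not part of the original example: the column names and values are made up, and the file is placed in a temporary folder so nothing is left behind.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data; any small table would do
sample = pd.DataFrame({"name": ["Asha", "Ravi"], "score": [81, 74]})

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "mycsv.csv")
    # index=False keeps the row labels out of the file
    sample.to_csv(path, index=False)
    # pd.read_csv() returns a DataFrame directly
    df = pd.read_csv(path)

print(type(df).__name__)  # DataFrame
print(df.shape)           # (2, 2)
```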

To import multiple CSV files, we first need a list of the required files. The glob.glob() function returns a list of all the paths that match the pattern passed to it.

The glob.glob() Method

This function takes a single pathname pattern as a string. The pattern combines the path of the folder where the required files are located with a wildcard expression, such as *.csv, that identifies the required files.

Syntax:

req_files = glob.glob("C:/Users/hp/Desktop/Includehelp/*.csv")

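Here, glob.glob() returns a list of the paths of all the files in the folder whose names end with ".csv". The order of the list is not guaranteed, so sort it if the order of the rows matters.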
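
To see the pattern matching in isolation, here is a small sketch that does not depend on the folder used in this article. It creates three empty files in a temporary folder (the file names are only illustrative) and shows that only the ones ending in ".csv" are returned.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Create three empty files (hypothetical names) to match against
    for name in ("mycsv.csv", "mycsv1.csv", "notes.txt"):
        open(os.path.join(folder, name), "w").close()

    pattern = os.path.join(folder, "*.csv")
    # sorted() because glob.glob() returns matches in arbitrary order
    matches = sorted(glob.glob(pattern))
    names = [os.path.basename(p) for p in matches]

print(names)  # ['mycsv.csv', 'mycsv1.csv']
```

The text file is skipped because its name does not match the "*.csv" pattern.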

pandas.concat() Method

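This method combines the pandas objects passed to it, either along the rows (axis=0, the default) or along the columns (axis=1). It takes a list (or any iterable) of DataFrames, not file paths, so each CSV file must first be read with pd.read_csv(). Passing ignore_index=True discards the original row labels and renumbers the rows of the result from 0.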
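
As a quick illustration of the two directions, the sketch below concatenates two small DataFrames built in memory. The column names and values are invented for the demonstration.

```python
import pandas as pd

# Two hypothetical DataFrames with the same columns
df1 = pd.DataFrame({"id": [1, 2], "city": ["Pune", "Delhi"]})
df2 = pd.DataFrame({"id": [3, 4], "city": ["Agra", "Surat"]})

# Stack the rows and renumber the index 0..3
rows = pd.concat([df1, df2], ignore_index=True)

# Place the DataFrames side by side instead
cols = pd.concat([df1, df2], axis=1)

print(rows.shape)         # (4, 2)
print(cols.shape)         # (2, 4)
print(list(rows.index))   # [0, 1, 2, 3]
```

For combining CSV files that share the same columns, the row-wise form is usually the one that is wanted.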

To work with CSV files in pandas, we first need to import the pandas library. Below is the syntax:

import pandas as pd

Let us understand with the help of an example.

# Importing pandas package
import pandas as pd

# Importing glob library in order to get 
# a list of all the required files
import glob

# First importing the separate datasets;
# pd.read_csv() already returns a DataFrame,
# so no extra pd.DataFrame() call is needed
df1 = pd.read_csv('C:/Users/hp/Desktop/Includehelp/mycsv.csv')
df2 = pd.read_csv('C:/Users/hp/Desktop/Includehelp/mycsv1.csv')

# Printing separate DataFrames
print("First DataFrame:\n",df1,"\n\n")
print("Second DataFrame:\n",df2,"\n\n")

# Requesting a list of all the csv files present 
# in the specified folder (sorted, because glob
# returns matches in arbitrary order)
req_files = sorted(glob.glob("C:/Users/hp/Desktop/Includehelp/*.csv"))

# Importing multiple CSV files and concatenating
# them; the result is already a DataFrame
df3 = pd.concat(map(pd.read_csv, req_files), ignore_index=True)

# Printing the combined DataFrame
print("Combined Dataframe:\n",df3)

Output:

[Image: Import multiple csv files]
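
The example above depends on files in a specific folder. For readers who want to try the same idea without those files, here is a self-contained sketch under a few assumptions: it writes two tiny CSV files into a temporary folder, stops early if the pattern matches nothing (pd.concat() raises a ValueError on an empty list), and adds a "source_file" column, which is not in the original example, to record where each row came from.

```python
import glob
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as folder:
    # Two small sample files with hypothetical contents
    pd.DataFrame({"id": [1, 2]}).to_csv(os.path.join(folder, "mycsv.csv"), index=False)
    pd.DataFrame({"id": [3]}).to_csv(os.path.join(folder, "mycsv1.csv"), index=False)

    req_files = sorted(glob.glob(os.path.join(folder, "*.csv")))
    if not req_files:
        raise FileNotFoundError("no CSV files matched the pattern")

    frames = []
    for path in req_files:
        frame = pd.read_csv(path)
        # Illustrative extra column: remember the file each row came from
        frame["source_file"] = os.path.basename(path)
        frames.append(frame)

    combined = pd.concat(frames, ignore_index=True)

print(combined)
```

A plain loop is used here instead of map() only because each DataFrame is modified before it is concatenated.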

© https://www.includehelp.com some rights reserved.