Splitting dataframe into multiple dataframes based on column values and naming them with those values

Given a pandas DataFrame, we have to split it into multiple DataFrames based on the values of a column and name each one after its value.
Submitted by Pranit Sharma, on November 16, 2022

Pandas is a Python library for manipulating and analysing tabular data effectively and efficiently. In pandas, we mostly work with data in the form of a DataFrame, a 2-dimensional data structure made up of rows, columns, and the data they hold.

Splitting dataframe into multiple dataframes

We are given a DataFrame of some brands and their products in different regions. We need to split this DataFrame into multiple DataFrames based on the column Region, and we will use the column values to name the new DataFrames.

For this purpose, we will first apply groupby() on the column Region and then iterate over the groups with a for loop.

The groupby() method is a simple but very useful concept in pandas. By using groupby(), we can group the rows that share the same value and perform operations on each group.

The groupby() method splits the object into groups, applies an operation to each group, and then combines the results, so computations on large amounts of data can be carried out group by group.
In each iteration of the loop, we get the group's key (the region) together with the subset of the DataFrame that contains only the rows for that region.
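
The split, apply, and combine steps described above can be illustrated with a minimal sketch. The `sales` DataFrame and its `units` column are hypothetical and exist only for this illustration; they are not part of the example that follows.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration
sales = pd.DataFrame({
    'Region': ['A', 'B', 'A', 'B'],
    'units': [10, 20, 30, 40],
})

# Split the rows by Region, apply sum() to each group,
# and combine the per-group results into one Series
units_per_region = sales.groupby('Region')['units'].sum()
print(units_per_region)

# Iterating over the GroupBy object yields (key, sub-DataFrame) pairs
group_sizes = {}
for region, subset in sales.groupby('Region'):
    group_sizes[region] = len(subset)
print(group_sizes)
```

Here the loop only records the size of each subset, but the same pattern gives access to every sub-DataFrame in turn.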

Let us understand with the help of an example.

Python code to split dataframe into multiple dataframes based on column values and naming them with those values

# Importing pandas package
import pandas as pd

# Creating a dictionary
d= {
    'brand':['Nike','Nike','Nike','Puma','Puma','Puma','Reebok','Reebok','Reebok'],
    'Region':['A','B','C','A','B','C','A','B','C'],
    'product':['Tshirt','Shoes','Jacket','Tshirt','Shoes','Jacket','Tshirt','Shoes','Jacket']
}

# Creating DataFrame
df = pd.DataFrame(d)

# Display dataframe
print('Original DataFrame:\n',df,'\n')

# Using groupby and splitting df; each subset is labelled with its Region value
for region, df_region in df.groupby('Region'):
    print("Region "+region+"\n",df_region,"\n")

Output:

Example: Splitting dataframe into multiple dataframes based on column values and naming them with those values
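
To refer to each subset by its region name after the loop has finished, one possible approach is to keep the subsets in a dictionary keyed by the region value. This is a sketch rather than the only way to do it. The name `dfs_by_region` is an assumption introduced here, and the DataFrame is rebuilt so that the snippet runs on its own.

```python
import pandas as pd

# Same data as in the example above
df = pd.DataFrame({
    'brand': ['Nike', 'Nike', 'Nike', 'Puma', 'Puma', 'Puma', 'Reebok', 'Reebok', 'Reebok'],
    'Region': ['A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C'],
    'product': ['Tshirt', 'Shoes', 'Jacket', 'Tshirt', 'Shoes', 'Jacket', 'Tshirt', 'Shoes', 'Jacket'],
})

# One DataFrame per Region value, stored under that value as the key;
# reset_index(drop=True) renumbers the rows of each subset from 0
dfs_by_region = {
    region: subset.reset_index(drop=True)
    for region, subset in df.groupby('Region')
}

# Access a subset by its region name
print(dfs_by_region['B'])

# A single group can also be fetched directly, without a loop
region_a = df.groupby('Region').get_group('A')
print(region_a)
```

A dictionary is generally easier to work with than creating separate variables dynamically, because the set of region names is only known at run time.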


© https://www.includehelp.com some rights reserved.