    
    
    Pandas: Assign an index to each group identified by groupby
    
    
    
    
	    Learn how to assign an index to each group identified by groupby in Python pandas.
	    
		    Submitted by Pranit Sharma, on December 05, 2022
	    
    
    
    Pandas is a Python library for manipulating and analyzing data effectively and efficiently. In pandas, we mostly work with data in the form of a DataFrame, a 2-dimensional data structure made up of rows, columns, and the data they hold.
    Problem statement
    Given a pandas DataFrame, we have to assign an index to each group identified by groupby.
    Assigning an index to each group identified by groupby
    Assigning an index to each group means adding a new column in which every row holds the number of the group it belongs to. To do this, we call the df.groupby() method on the grouping columns and then call the ngroup() method on the result.
    groupby() Method
    groupby() is a simple but very useful method in pandas. With it, we can group rows that share the same values in one or more columns and perform operations on each group.
    The groupby() method splits the object into groups, applies an operation to each group, and then combines the results, so computations over large amounts of data can be expressed group by group.
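    The split, apply, and combine steps described above can be sketched as follows. The sales DataFrame and its region and units columns are hypothetical sample data chosen for illustration; they are not part of the original example.

```python
import pandas as pd

# Hypothetical sample data (an assumption for illustration only)
sales = pd.DataFrame({
    "region": ["east", "east", "west", "west", "west"],
    "units": [10, 5, 7, 3, 2],
})

# Split the rows by region, apply sum() to each group's units,
# and combine the per-group results into a single Series
totals = sales.groupby("region")["units"].sum()

print(totals)
```

    Here the result has one entry per group, indexed by the group key, rather than one entry per original row.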
    
    ngroup() method
    Pandas provides the ngroup() method, which is called on a groupby object and numbers each group, starting from 0. Every row receives the number of the group it belongs to, so we will use this method to solve the problem.
    Let's understand with the help of an example.
    Python program to assign an index to each group identified by groupby
# Importing pandas package
import pandas as pd

# Creating DataFrame
df = pd.DataFrame({'a': [1, 1, 1, 2, 2, 2], 'b': [1, 1, 2, 1, 1, 2]})

# Display original DataFrame
print("Original DataFrame:\n", df, "\n")

# Grouping by columns 'a' and 'b' and storing each row's group number
df['nums'] = df.groupby(['a', 'b']).ngroup()

# Print result
print("Result:\n", df)
    Output
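    The program prints the original DataFrame and then the result, in which the new nums column holds 0, 0, 1, 2, 2, 3. Rows with the same (a, b) pair share a number, and the groups are numbered in sorted order of their keys.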
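    The numbering can be adjusted in a couple of ways. The sketch below is an illustration rather than part of the original program: it uses a small hypothetical DataFrame whose rows are deliberately not in sorted order, the sort=False option of groupby() to number groups in order of first appearance, and a simple + 1 to get 1-based group numbers.

```python
import pandas as pd

# Hypothetical data with unsorted keys (an assumption for illustration)
df = pd.DataFrame({'a': [2, 2, 1, 1], 'b': [1, 2, 1, 1]})

# Default behaviour: groups are numbered in sorted key order,
# so (1, 1) -> 0, (2, 1) -> 1, (2, 2) -> 2
sorted_ids = df.groupby(['a', 'b']).ngroup()

# sort=False: groups are numbered in the order they first appear,
# so (2, 1) -> 0, (2, 2) -> 1, (1, 1) -> 2
seen_ids = df.groupby(['a', 'b'], sort=False).ngroup()

# Shift the numbers by one if the group index should start at 1
one_based = sorted_ids + 1

print(sorted_ids.tolist())  # [1, 2, 0, 0]
print(seen_ids.tolist())    # [0, 1, 2, 2]
print(one_based.tolist())   # [2, 3, 1, 1]
```

    Which variant fits best depends on whether the group numbers should follow the key values or the row order of the data.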
     