Python - How to calculate mean values grouped on another column in Pandas?

Given a pandas DataFrame, we have to calculate the mean of one column for each group defined by another column.
Submitted by Pranit Sharma, on July 28, 2022

Pandas DataFrame

Pandas is a Python library for manipulating and analyzing data effectively and efficiently. In pandas, we mostly work with data in the form of a DataFrame, a 2-dimensional labeled data structure made up of rows, columns, and the data they hold.

Calculate mean values grouped on another column in Pandas

To calculate mean values grouped on another column in pandas, we use groupby() to form the groups and then apply the mean() method to the column of interest.

The average of a particular set of values is called the mean of that set. Mathematically, it can be represented as:

mean = (x1 + x2 + ... + xn) / n

where x1, x2, ..., xn are the values in the set and n is the number of values.

Pandas provides a method called mean(), which calculates the average of the values it is called on.
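
As a small sketch of this, the snippet below computes the mean of a Series both by hand (sum divided by count) and with mean(). The Series name and its values are made up for illustration only.

```python
import pandas as pd

# Hypothetical values, chosen only for illustration
claims = pd.Series([0, 25000, 67000, 100000, 0, 24000], name="Claimed")

# Mean by definition: sum of the values divided by the number of values
manual_mean = claims.sum() / claims.count()

# The same calculation using the built-in method
builtin_mean = claims.mean()

print("Manual mean:  ", manual_mean)
print("Built-in mean:", builtin_mean)
```

Both approaches should agree. Note that mean() skips NaN values by default, just as count() does.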

pandas.DataFrame.groupby()

The pandas.DataFrame.groupby() method is simple but very useful. It lets us group rows that share the same value in one or more columns and perform an operation on each group. It splits the object into groups, applies an operation to each group, and then combines the results, so large amounts of data can be summarized group by group.
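
The split, apply, and combine steps can be sketched as follows. The column names and values here are hypothetical. The dictionary comprehension spells the steps out by hand, while the last statement lets groupby() do all three at once.

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({
    "Insurance": [0, 1, 1, 1, 0, 1],
    "Claimed": [0, 25000, 67000, 100000, 0, 24000],
})

# Split: iterating over a groupby object yields (key, sub-DataFrame) pairs
# Apply: take the mean of 'Claimed' within each sub-DataFrame
manual = {key: group["Claimed"].mean() for key, group in df.groupby("Insurance")}

# Combine: groupby(...).mean() performs all three steps and returns a Series
combined = df.groupby("Insurance")["Claimed"].mean()

print(manual)
print(combined)
```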

Syntax

DataFrame.groupby(
    by=None, 
    axis=0, 
    level=None, 
    as_index=True, 
    sort=True, 
    group_keys=True, 
    squeeze=NoDefault.no_default, 
    observed=False, 
    dropna=True
    )

Parameter(s)

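It takes several parameters. The most important one is 'by', which names the column (or columns) to group on. Another useful one is 'dropna': with the default 'dropna = True', rows whose group key is NaN are dropped while grouping; setting it to False keeps those rows as a separate group. The signature above is from pandas 1.x; the 'squeeze' parameter was removed in pandas 2.0.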
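
To see what dropna changes, here is a minimal sketch on a hypothetical DataFrame whose grouping column contains NaN. The column names 'Plan' and 'Claimed' and all values are assumptions made for this illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing group keys, for illustration only
df = pd.DataFrame({
    "Plan": ["A", "B", np.nan, "A", np.nan],
    "Claimed": [100, 200, 300, 500, 700],
})

# Default (dropna=True): rows whose key is NaN are left out of the result
default_means = df.groupby("Plan")["Claimed"].mean()

# dropna=False: rows whose key is NaN form a group of their own
with_nan_means = df.groupby("Plan", dropna=False)["Claimed"].mean()

print(default_means)
print(with_nan_means)
```

The first result should contain two groups (A and B), and the second should contain a third group for the NaN key.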

Let us understand this with the help of an example.

Python program to calculate mean values grouped on another column in Pandas

# Import pandas package
import pandas as pd

# Creating a dictionary
d = {
    'Name': ['Rajeev', 'Akhilesh', 'Sonu', 'Timak', 'Divyansh', 'Megha'],
    'Insurance': [0, 1, 1, 1, 0, 1],
    'Claimed':[0,25000,67000,100000,0,24000]
}

# Creating a Dataframe
df = pd.DataFrame(d)

# Display the dataframe
print('Created DataFrame:\n',df,"\n")

# Group the rows by 'Name' and calculate the mean of 'Claimed' in each group
result = df.groupby('Name')['Claimed'].mean()

# Display result
print("Result:\n",result)

Output

The output of the above program will be:

Created DataFrame:
        Name  ...  Claimed
0    Rajeev  ...        0
1  Akhilesh  ...    25000
2      Sonu  ...    67000
3     Timak  ...   100000
4  Divyansh  ...        0
5     Megha  ...    24000

[6 rows x 3 columns] 

Result:
 Name
Akhilesh     25000.0
Divyansh         0.0
Megha        24000.0
Rajeev           0.0
Sonu         67000.0
Timak       100000.0
Name: Claimed, dtype: float64
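
Because every name in this DataFrame is unique, each group holds a single row, so its mean equals the original value. A grouping column with repeated values shows the effect more clearly. The sketch below reuses the same data but groups on 'Insurance' instead. That choice of column, and the use of as_index=False, are illustrative variations and not part of the program above.

```python
import pandas as pd

# Same data as in the program above
df = pd.DataFrame({
    'Name': ['Rajeev', 'Akhilesh', 'Sonu', 'Timak', 'Divyansh', 'Megha'],
    'Insurance': [0, 1, 1, 1, 0, 1],
    'Claimed': [0, 25000, 67000, 100000, 0, 24000]
})

# Mean of 'Claimed' for each distinct value of 'Insurance'
by_insurance = df.groupby('Insurance')['Claimed'].mean()
print(by_insurance, "\n")

# as_index=False keeps 'Insurance' as a regular column instead of the index
as_table = df.groupby('Insurance', as_index=False)['Claimed'].mean()
print(as_table)
```

Here the four rows with Insurance equal to 1 are averaged together, and the two rows with Insurance equal to 0 form the other group.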
