Groupby Pandas DataFrame and calculate mean and stdev of one column and add the std as a new column with reset_index

Learn how to group a Pandas DataFrame, calculate the mean and standard deviation of one column, and add the standard deviation as a new column. By Pranit Sharma Last updated : October 06, 2023

Pandas is a Python library that allows us to perform complex manipulations of data effectively and efficiently. In pandas, we mostly work with datasets in the form of a DataFrame, a 2-dimensional data structure that consists of rows, columns, and data.

Problem statement

Suppose we are given a Pandas DataFrame with the columns a, b, c, and d.

Now, we need to group the rows by column 'a', replace the values in column 'c' with the mean of the values in each group, and add another column containing the standard deviation of those same values in column 'c'. The values in columns 'b' and 'd' are constant for all rows within a group.

Groupby Pandas DataFrame and calculate mean and stdev of one column

For this purpose, we will use the groupby() method on column 'a' and, on the resulting object, apply the aggregate function agg(). We will pass it a dictionary whose keys are the columns 'c', 'b', and 'd' and whose values are the aggregate functions to apply to each column.
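One detail worth noting: when a key in the dictionary maps to a list of functions, the resulting column labels have two levels (column name and function name), which is why the columns are usually renamed afterwards. A minimal sketch of this, using a made-up two-row frame named toy that is purely illustrative:

```python
import pandas as pd

# Hypothetical toy data, used only to show the column structure
toy = pd.DataFrame({'a': ['x', 'x'], 'b': [1, 1], 'c': [2.0, 4.0]})

# A list of functions under one key produces two-level column labels
out = toy.groupby('a').agg({'c': ['mean', 'std'], 'b': 'first'})

# Each label is a (column, function) pair
print(out.columns.tolist())
```

Each label comes back as a (column, function) tuple, so assigning a flat list of names to the columns attribute is one simple way to get single-level labels again.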

Let us understand with the help of an example.

Python program to Groupby Pandas DataFrame and calculate mean and stdev of one column and add the std as a new column with reset_index

# Importing pandas package
import pandas as pd

# Creating a DataFrame
df = pd.DataFrame({
    'a':['Ram','Shyam','Seeta','Ram'],
    'b':[3,4,7,3],
    'c':[5,4,1,4],
    'd':[7,8,3,7]
})

# Display DataFrame
print("Original DataFrame:\n",df,"\n")

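# Grouping by 'a': mean and std of 'c', first value of 'b' and 'd'
res = df.groupby(['a'], as_index=False).agg({'c':['mean','std'],'b':'first', 'd':'first'})

# Flattening the two-level column labels;
# 'e' holds the standard deviation of 'c'
res.columns = ['a','c','e','b','d']

# Assigning a new (alphabetical) order of columns;
# reindex() returns a new DataFrame, so the result is stored back
res = res.reindex(columns=sorted(res.columns))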

# Display result
print("Result:\n",res)
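As an alternative sketch, not part of the original program, pandas' named aggregation lets each output column be named directly, so no renaming or reordering step is needed. The variable name res_named is introduced here only for illustration, and this assumes a pandas version that supports named aggregation (0.25 or later):

```python
import pandas as pd

# Same data as in the example above
df = pd.DataFrame({
    'a': ['Ram', 'Shyam', 'Seeta', 'Ram'],
    'b': [3, 4, 7, 3],
    'c': [5, 4, 1, 4],
    'd': [7, 8, 3, 7]
})

# Each keyword is an output column: name=(input column, function)
res_named = df.groupby('a', as_index=False).agg(
    b=('b', 'first'),
    c=('c', 'mean'),
    d=('d', 'first'),
    e=('c', 'std'),
)

print(res_named)
```

The output columns appear in the order the keywords are written, after the grouping column 'a'.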

Output

The output of the above program is:

[Image: Example: Groupby Pandas DataFrame and calculate mean and stdev]
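When reading the result, it may help to know that pandas computes the sample standard deviation by default (ddof=1), so a group with a single row, such as 'Shyam' or 'Seeta' here, gets NaN in the new column. A small sketch of this behaviour using the 'c' values of the 'Ram' group; the variable names are illustrative only:

```python
import pandas as pd

# 'c' values of the two 'Ram' rows
ram_c = pd.Series([5, 4])

# Default: sample standard deviation (divides by n - 1)
sample_std = ram_c.std()

# Population standard deviation (divides by n)
population_std = ram_c.std(ddof=0)

# A single value has no sample standard deviation
single_std = pd.Series([4]).std()

print(sample_std, population_std, single_std)
```

If NaN is not wanted for single-row groups, one option is to replace it afterwards, for example with fillna(0), depending on what the analysis requires.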

Copyright © 2024 www.includehelp.com. All rights reserved.