Pandas Correlation Groupby

In this tutorial, we will learn how to find the correlation between specific columns within each group of a pandas DataFrame. By Pranit Sharma Last updated : October 05, 2023

Pandas is a Python library that allows us to manipulate and analyze data effectively and efficiently. In pandas, we mostly work with datasets in the form of DataFrames, which are 2-dimensional data structures consisting of rows, columns, and data.

Problem statement

Suppose that we are given a DataFrame with columns such as ID, value, name, and data, and we need to find the correlation between the value and data columns separately for each ID.

Finding the correlation between some specific columns

The pandas.DataFrame.corr() method is useful here, but on its own it computes the pairwise correlation between columns over the whole DataFrame, not within each group.
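As a minimal sketch of that behavior, the snippet below calls corr() on the two numeric columns of a small DataFrame without any grouping. The numeric columns are selected explicitly so that the non-numeric name column never reaches corr(); the variable name overall is only illustrative.

```python
import pandas as pd

# Same sample data as in the main example, without the 'name' column
df = pd.DataFrame({
    'ID': [1, 1, 1, 2, 2, 2, 3, 3, 3],
    'value': [5, 4, 6, 7, 4, 3, 4, 2, 4],
    'data': [6, 5, 4, 7, 6, 5, 3, 2, 6],
})

# Pairwise correlation over all nine rows; the ID column plays no role here
overall = df[['value', 'data']].corr()

# A single 2x2 matrix for the whole DataFrame, not one matrix per ID
print(overall)
```

The result is one symmetric matrix with 1.0 on the diagonal, which is why a grouping step is needed to get per-ID values.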

So, for this purpose, we will first apply the groupby() method on the column we want to group by, select the columns of interest, and then apply the corr() method to the resulting groupby object.

The groupby() method is a simple but very useful concept in pandas. By using groupby(), we can group rows that share certain values and perform operations on those groups.

The groupby() method splits the object into groups, applies an operation to each group, and then combines the results, so computations on large amounts of data can be performed group by group.
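The split, apply, and combine steps can be sketched by hand and compared with the built-in call. This is only an illustration on a small made-up DataFrame; the names pieces, means, manual, and built_in are hypothetical and not part of pandas.

```python
import pandas as pd

# Small illustrative DataFrame with two groups
df = pd.DataFrame({'ID': [1, 1, 2, 2], 'value': [5, 4, 7, 3]})

# Split: iterating over a groupby object yields (key, sub-DataFrame) pairs
pieces = {key: group for key, group in df.groupby('ID')}

# Apply: compute the mean of 'value' within each piece
means = {key: group['value'].mean() for key, group in pieces.items()}

# Combine: collect the per-group results into a single Series
manual = pd.Series(means, name='value')

# The built-in call performs the same three steps in one expression
built_in = df.groupby('ID')['value'].mean()

print(manual)
print(built_in)
```

Both Series hold the same per-group means, which shows that groupby() followed by an aggregation is shorthand for the manual loop.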

Let us understand with the help of an example.

Python program to find the correlation between some specific columns

# Importing pandas package
import pandas as pd

# Creating a dictionary
d = {
    'ID':[1,1,1,2,2,2,3,3,3],
    'value':[5,4,6,7,4,3,4,2,4],
    'name':['A','B','C','D','E','F','G','H','I'],
    'data':[6,5,4,7,6,5,3,2,6]
}

# Creating DataFrame
df = pd.DataFrame(d)

# Display dataframe
print('Original DataFrame:\n',df,'\n')

# Grouping and finding correlation
res = df.groupby('ID')[['value','data']].corr()

# Display result
print('Result:\n',res,'\n')

Output

The output of the above program is:

Example: Pandas Correlation Groupby
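The grouped corr() call returns a full 2x2 matrix for every ID. If only one coefficient per group is wanted, one possible approach (an alternative sketch, not part of the original program) is to apply Series.corr() inside each group. The name per_group is only illustrative.

```python
import pandas as pd

# Same sample data as in the program above
df = pd.DataFrame({
    'ID': [1, 1, 1, 2, 2, 2, 3, 3, 3],
    'value': [5, 4, 6, 7, 4, 3, 4, 2, 4],
    'name': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I'],
    'data': [6, 5, 4, 7, 6, 5, 3, 2, 6],
})

# For each ID, correlate 'value' with 'data' and keep a single number
per_group = df.groupby('ID')[['value', 'data']].apply(
    lambda g: g['value'].corr(g['data'])
)

# A Series indexed by ID, with one Pearson coefficient per group
print(per_group)
```

For ID 1, for example, value is [5, 4, 6] and data is [6, 5, 4], which gives a coefficient of -0.5.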

Copyright © 2025 www.includehelp.com. All rights reserved.