How to calculate intraclass correlation coefficient in Python?

In this tutorial, we will learn how to calculate the intraclass correlation coefficient in Python. By Shivang Yadav Last updated : October 26, 2023

Intraclass Correlation Coefficient

The Intraclass Correlation Coefficient (ICC) is a statistical measure used to assess the degree of similarity or agreement between multiple observations or measurements made on the same subjects or entities. It is commonly employed in various fields, including psychology, medicine, and social sciences, to evaluate the reliability or consistency of measurements taken by different individuals or under different conditions.

The ICC quantifies the proportion of the total variance in the data that can be attributed to differences between subjects (or entities) as opposed to differences within subjects. In other words, it helps researchers determine to what extent the variability in the measurements is due to real differences among the subjects being measured and how much is due to random error or measurement variability.
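
At the population level, this proportion is simply a variance ratio. Writing var_between for the variance of the true subject values and var_within for the measurement-error variance (names used here only for illustration), the ICC can be sketched as:

ICC = var_between / (var_between + var_within)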

There are several different formulas and models for calculating ICC, and the choice of which one to use depends on the specific research design and objectives. Here, we'll briefly discuss the two most commonly used types of ICC:

  1. One-Way Random Effects Model: In this model, each subject may be rated by a different set of raters or under different conditions. The ICC calculated from this model is based on the variance between subjects and the variance within subjects (measurement error); because the raters change from subject to subject, rater effects cannot be separated from the measurement error.
  2. Two-Way Mixed Effects Model: This model is used when the same fixed set of raters or conditions is used for all subjects, while the subjects themselves are treated as a random sample. The ICC calculated under this model separates rater effects from residual error and assesses the proportion of total variance attributed to differences between subjects rather than to measurement error.
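
In practice these models are estimated from ANOVA mean squares. As a rough sketch of the single-rater estimators (following the common Shrout-Fleiss conventions, which pingouin also reports; here k is the number of raters, and MS_between, MS_within, MS_residual are the between-subject, within-subject, and residual mean squares):

One-way random effects (single rater):  ICC(1)   = (MS_between - MS_within) / (MS_between + (k - 1) * MS_within)
Two-way mixed effects (single rater):   ICC(3,1) = (MS_between - MS_residual) / (MS_between + (k - 1) * MS_residual)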

The ICC value ranges between 0 and 1.

- An ICC of 0 indicates no agreement among measurements, suggesting that all observed variability is due to random error.

- An ICC of 1 implies perfect agreement, meaning that all variability in the measurements is due to true differences among subjects, and there is no measurement error.
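
To see these two endpoints concretely, here is a minimal sketch (the helper name icc1 and the toy arrays are illustrations, not part of any library) that applies the one-way ICC(1) formula sketched above to a small targets-by-raters matrix:

import numpy as np

def icc1(scores):
    # One-way, single-rater ICC for an (n_targets x k_raters) array:
    # ICC(1) = (MSB - MSW) / (MSB + (k - 1) * MSW)
    n, k = scores.shape
    grand_mean = scores.mean()
    target_means = scores.mean(axis=1)
    msb = k * ((target_means - grand_mean) ** 2).sum() / (n - 1)         # between-targets mean square
    msw = ((scores - target_means[:, None]) ** 2).sum() / (n * (k - 1))  # within-targets mean square
    return (msb - msw) / (msb + (k - 1) * msw)

# Perfect agreement: every rater gives each target the same score -> ICC is 1
perfect = np.tile([[1.0], [2.0], [3.0], [4.0]], (1, 3))
print(icc1(perfect))            # 1.0

# Pure noise: ratings carry no information about the targets -> ICC near 0
rng = np.random.default_rng(0)
noise = rng.normal(size=(50, 3))
print(round(icc1(noise), 3))    # close to 0 (sample estimates can even dip slightly below 0)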

Calculation of Intraclass Correlation Coefficient

Python provides libraries for virtually every statistical operation, and the intraclass correlation coefficient is no exception: the third-party pingouin library provides the intraclass_corr() function to perform this task.

Syntax:

Here is the syntax of the intraclass_corr() function:

pingouin.intraclass_corr(data, targets, raters, ratings)

Here,

  • data is the dataframe holding the ratings in long format (one row per single rating).
  • targets is the name of the column identifying the targets (the things being rated).
  • raters is the name of the column identifying the raters.
  • ratings is the name of the column containing the ratings (scores) themselves.

Python program to calculate intraclass correlation coefficient

import pandas as pand
import pingouin as ping

# Create a DataFrame in long format: one row per rating, with the
# target ('tests'), the rater ('grades'), and the rating itself ('score')
corrData = pand.DataFrame({
    'tests': [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6],
    'grades': ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C', 'C', 'D', 'D', 'D', 'D', 'D', 'D'],
    'score': [1, 1, 3, 6, 6, 7, 2, 3, 8, 4, 5, 5, 0, 4, 1, 5, 5, 6, 1, 2, 3, 3, 6, 4]})

print(f'data set is \n{corrData.head()}')

# Compute all the ICC variants for this data
intraCorrCoef = ping.intraclass_corr(data=corrData, targets='tests', raters='grades', ratings='score')

print(f"The intraCorrCoef is\n{intraCorrCoef.set_index('Type')}")

Output

The output of the above program is:

data set is 
   tests grades  score
0      1      A      1
1      2      A      1
2      3      A      3
3      4      A      6
4      5      A      6
The intraCorrCoef is
                   Description       ICC         F  df1  df2      pval         CI95%
Type                                                                                
ICC1    Single raters absolute  0.505252  5.084916    5   18  0.004430  [0.11, 0.89]
ICC2      Single random raters  0.503054  4.909385    5   15  0.007352   [0.1, 0.89]
ICC3       Single fixed raters  0.494272  4.909385    5   15  0.007352  [0.09, 0.88]
ICC1k  Average raters absolute  0.803340  5.084916    5   18  0.004430  [0.33, 0.97]
ICC2k    Average random raters  0.801947  4.909385    5   15  0.007352  [0.31, 0.97]
ICC3k     Average fixed raters  0.796309  4.909385    5   15  0.007352  [0.27, 0.97]

Description of the values in the result:

  • Description describes the type of ICC calculated (single or average raters; absolute, random, or fixed).
  • ICC is the estimated intraclass correlation coefficient for that model. It measures how strongly ratings of the same target resemble each other and is commonly used to assess the reliability of measurements.
  • F is the F-value of the significance test for the ICC.
  • df1, df2 are the degrees of freedom associated with the F-value.
  • pval is the p-value associated with the F-value.
  • CI95% is the 95% confidence interval for the ICC, i.e., the range of values within which we can be 95% confident that the true ICC lies.
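
As a sanity check, the ICC1 row above can be reproduced by hand from the one-way ANOVA mean squares of the same data. Here is a minimal sketch (variable names are just illustrations), assuming corrData from the program above is still in scope:

# Reproduce the ICC1 row from the one-way ANOVA mean squares
n = corrData['tests'].nunique()    # 6 targets
k = corrData['grades'].nunique()   # 4 raters

grand_mean = corrData['score'].mean()
target_means = corrData.groupby('tests')['score'].mean()

# Between-targets and within-targets mean squares
msb = k * ((target_means - grand_mean) ** 2).sum() / (n - 1)
msw = ((corrData['score'] - corrData['tests'].map(target_means)) ** 2).sum() / (n * (k - 1))

icc1 = (msb - msw) / (msb + (k - 1) * msw)
print(round(icc1, 6))       # 0.505252, matching the ICC1 row
print(round(msb / msw, 6))  # 5.084916, matching its F-value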
