How to perform equal frequency binning in Python?

By Shivang Yadav Last updated : November 21, 2023

Equal frequency binning

Equal frequency binning (also called quantile binning) is a data binning technique in which the data is divided into bins so that each bin contains approximately the same number of data points. It is useful for skewed distributions, where bins of equal width would leave most of the points in only a few bins.
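To make the point about skewed data concrete, here is a minimal sketch that bins the same sample twice: once with equal-width bins (pandas cut()) and once with equal-frequency bins (pandas qcut(), covered in the next section). The sample values are made up for illustration and are not from the original article.

```python
import pandas as pd

# Hypothetical skewed sample: mostly small values plus a few very large ones.
values = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 50, 200, 1000])

# Equal-width binning: 3 intervals of identical width across the value range.
width_bins = pd.cut(values, bins=3, labels=False)
width_counts = width_bins.value_counts().sort_index()

# Equal-frequency binning: 3 intervals chosen from the sample quantiles.
freq_bins = pd.qcut(values, q=3, labels=False)
freq_counts = freq_bins.value_counts().sort_index()

print("Equal width     :", width_counts.to_dict())
print("Equal frequency :", freq_counts.to_dict())
```

With this sample, the equal-width split puts 11 of the 12 values into the first bin and leaves the middle bin empty, while the quantile-based split places 4 values in each bin.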

Performing equal frequency binning

Equal frequency binning is performed in Python using the pandas qcut() function. It discretizes a variable into equal-sized buckets based on rank or on sample quantiles. For example, 1000 values for 10 quantiles would produce a Categorical object indicating quantile membership for each data point.
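The "1000 values for 10 quantiles" case can be sketched as follows. The data here is an assumption made for illustration: a shuffled range of 1000 distinct integers, with an arbitrary seed so the run is repeatable. The retbins=True argument asks qcut() to also return the bin edges it computed.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the integers 0..999 in random order (seed is arbitrary).
rng = np.random.default_rng(0)
values = pd.Series(rng.permutation(1000))

# Split into 10 quantile bins; retbins=True also returns the bin edges.
deciles, edges = pd.qcut(values, q=10, retbins=True)

# The result is categorical: each element is the interval its value fell into.
print(deciles.dtype)

# Number of data points per bin, in bin order.
print(deciles.value_counts().sort_index())

# The 11 edges that delimit the 10 bins.
print(edges)
```

Because all 1000 values are distinct, every bin should receive exactly 100 points. Passing labels=False instead, as the program below does, returns integer bin numbers rather than intervals.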

Python program to perform equal frequency binning

# Performing equal frequency binning
import pandas as pd


def equal_frequency_binning(data, num_bins):
    """Group the values of data into num_bins quantile-based bins."""
    df = pd.DataFrame(data, columns=["value"])
    # labels=False makes qcut return integer bin numbers (0, 1, 2, ...)
    df["bin"] = pd.qcut(df["value"], q=num_bins, labels=False)
    # Collect the original values that fall into each bin
    return df.groupby("bin")["value"].apply(list).to_dict()


data = [5, 2, 1, 5, 2, 7, 9, 5, 7, 1]
print(f"Data Set: {data}")

num_bins = 3
print("Equal Frequency Binning : ")

binned_data = equal_frequency_binning(data, num_bins)

for bin_num, values in binned_data.items():
    print(f"Bin {bin_num}: {values}")

Output

The output of the above program is:

Data Set: [5, 2, 1, 5, 2, 7, 9, 5, 7, 1]
Equal Frequency Binning : 
Bin 0: [2, 1, 2, 1]
Bin 1: [5, 5, 5]
Bin 2: [7, 9, 7]
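Note that the bins above hold 4, 3 and 3 values: repeated values always land in the same bin, so the counts are only approximately equal. With heavier ties, several quantile edges can coincide and qcut() raises a ValueError. The sketch below uses a made-up sample to show two possible ways around this: the duplicates="drop" argument of qcut(), which merges the coinciding edges and yields fewer bins, and ranking the values first, which keeps the bin count but splits tied values across bins. Which one is appropriate depends on the use case.

```python
import pandas as pd

# Hypothetical sample in which one value dominates.
tied = pd.Series([1, 1, 1, 1, 1, 1, 2, 3, 4, 5])

# With 4 bins, several quantile edges are equal, so qcut raises ValueError.
try:
    pd.qcut(tied, q=4, labels=False)
    raised = False
except ValueError as err:
    raised = True
    print(f"qcut failed: {err}")

# Option 1: drop the repeated edges; fewer bins than requested are returned.
dropped = pd.qcut(tied, q=4, labels=False, duplicates="drop")
print("duplicates='drop':", dropped.value_counts().sort_index().to_dict())

# Option 2: bin the ranks instead of the values; ties are broken by position,
# so equal values may end up in different bins.
ranked = pd.qcut(tied.rank(method="first"), q=4, labels=False)
print("rank first       :", ranked.value_counts().sort_index().to_dict())
```

In this sample, dropping duplicate edges leaves two bins with 7 and 3 values, while binning the ranks keeps four bins with 3, 2, 2 and 3 values.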

Copyright © 2024 www.includehelp.com. All rights reserved.