How to get topmost N records within each group of a Pandas DataFrame?
Given a Pandas DataFrame, we have to get topmost N records within each group.
Submitted by Pranit Sharma, on June 04, 2022
A row in a pandas DataFrame is a horizontal record that holds one value for each column. Different rows may hold the same or different values. Rows are labeled by an index, which is numeric by default, but pandas also lets us assign custom index labels as needed.
Here, we are going to learn how to get the topmost N records within each group. For this purpose, we first call pandas.DataFrame.groupby() and then chain the following piece of code onto the groupby result. head(N) keeps the first N rows of each group in their existing order, and reset_index(drop=True) renumbers the rows from 0:
head(N).reset_index(drop=True)
To work with pandas, we first need to import the pandas package. Below is the syntax:
import pandas as pd
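The groupby-then-head pattern described above can be wrapped in a small helper. This is only a minimal sketch: the function name first_n_per_group and the sample columns team and score are hypothetical and chosen purely for illustration.

```python
import pandas as pd

def first_n_per_group(df, key, n):
    """Return the first n rows of each group, keeping the existing row order.

    Hypothetical helper for illustration; not part of pandas itself.
    """
    # head(n) on a groupby keeps up to n rows per group;
    # reset_index(drop=True) discards the old row labels.
    return df.groupby(key).head(n).reset_index(drop=True)

# Small made-up sample: group "x" has three rows, group "y" has one
sample = pd.DataFrame({"team": ["x", "x", "x", "y"], "score": [3, 1, 2, 5]})
first_two = first_n_per_group(sample, "team", 2)
print(first_two)
```

A group with fewer than N rows (here, team "y") simply contributes all of its rows.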
Let us understand with the help of an example:
# Importing pandas package
import pandas as pd
# Create dictionary
d = {
'a':['A','A','A','A','B','B'],
'b':[1,2,1,2,1,2],
'c':[12,10,16,20,14,10]
}
# Create DataFrame
df = pd.DataFrame(d)
# Display DataFrame
print("Created DataFrame:\n",df)
# Groupby function
result = df.groupby('a', as_index=False)
# Selecting the first 2 rows of each group
final = result.head(2).reset_index(drop=True)
# Display final result
print("Final result:\n",final)
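Running the program prints the original six-row DataFrame, followed by the topmost N (here, 2) records of each group: (A, 1, 12) and (A, 2, 10) for group A, and (B, 1, 14) and (B, 2, 10) for group B, re-indexed from 0 to 3.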
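Note that head(N) returns the first N rows of each group in their current order, not the N largest values. If "topmost" should mean the highest values of a column, one possible approach is to sort before grouping. The sketch below assumes we want the 2 largest values of column 'c' within each group of 'a', using the same data as the example above; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'a': ['A', 'A', 'A', 'A', 'B', 'B'],
    'b': [1, 2, 1, 2, 1, 2],
    'c': [12, 10, 16, 20, 14, 10]
})

# Sort by group, then by 'c' in descending order within each group
ordered = df.sort_values(['a', 'c'], ascending=[True, False])

# Keep the first 2 rows of each group, which are now the 2 largest 'c' values
top = ordered.groupby('a').head(2).reset_index(drop=True)

print("Top 2 by 'c' within each group:\n", top)
```

With this data, group A keeps the rows where c is 20 and 16, and group B keeps the rows where c is 14 and 10.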