Best way to count the number of rows with missing values in a pandas DataFrame

Given a pandas DataFrame, we have to find an efficient way to count the number of rows that contain missing values.
Submitted by Pranit Sharma, on November 16, 2022

Pandas is a Python library for manipulating and analyzing data effectively and efficiently. In pandas, we mostly work with a dataset in the form of a DataFrame, a 2-dimensional labeled data structure made up of rows, columns, and data.

Counting the number of rows with missing values

When creating a DataFrame or importing a CSV file, some cells may contain NaN values. NaN stands for "Not a Number", and pandas uses it to mark a missing value in a cell. To deal with this type of data, you can either remove the affected rows (if the number of missing values is low) or handle the missing values in another way. Before handling them, you may need to count the NaN values or the non-NaN values.
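As a minimal sketch of the two counts mentioned above, the following uses a small hand-made DataFrame. The column names and values are illustrative assumptions, not data from this article. Here isna().sum() counts the NaN values in each column, and count() counts the non-NaN values in each column.

```python
import numpy as np
import pandas as pd

# Illustrative data (hypothetical values chosen for this sketch)
df = pd.DataFrame({
    'one': [1.0, np.nan, 3.0, np.nan],
    'two': [np.nan, 2.0, 3.0, 4.0],
})

# Number of NaN values in each column
nan_per_column = df.isna().sum()

# Number of non-NaN values in each column
non_nan_per_column = df.count()

print(nan_per_column)
print(non_nan_per_column)
```

For every column, the two counts add up to the total number of rows.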

We are given a DataFrame with multiple columns, where each column contains NaN values at some positions.

There are two things we can do: count the number of NaN values present in the DataFrame, or count the rows that contain at least one missing value.
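The difference between these two counts can be sketched as follows. The data is hypothetical and chosen only so that one row holds several NaN values; the variable names are assumptions made for this illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data: row 1 is missing in every column
df = pd.DataFrame({
    'one':   [1.0, np.nan, 3.0, np.nan],
    'two':   [np.nan, np.nan, 3.0, 4.0],
    'three': [1.0, np.nan, 3.0, 4.0],
})

# Total number of NaN cells across the whole DataFrame
total_nan_cells = int(df.isna().sum().sum())

# Number of rows that contain at least one NaN
rows_with_nan = int(df.isna().any(axis=1).sum())

print("NaN cells:", total_nan_cells)
print("Rows with NaN:", rows_with_nan)
```

A row with several missing cells adds several to the first count but only one to the second, so the cell count is never smaller than the row count.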

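A simple and concise way to count the rows with missing values is to subtract the number of rows returned by the dropna() method from the total number of rows. By default, dropna() removes every row that contains at least one NaN value, so the difference is the number of rows with missing values.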

Let us understand with the help of an example:

Python code to count the number of rows with missing values

# Importing pandas package
import pandas as pd

# Importing randn from numpy to generate random sample data
from numpy.random import randn

# Creating DataFrame
df = pd.DataFrame(randn(5, 3), index=['a', 'c', 'e', 'f', 'h'],
               columns=['one', 'two', 'three'])

# Display dataframe
print('Original DataFrame:\n',df,'\n')

# Reindexing adds the rows 'b', 'd' and 'g', which are filled with NaN
df = df.reindex(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])

# using dropna and counting rows
res = df.shape[0] - df.dropna().shape[0]

# Display Result
print("Rows with missing values:\n",res)

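Output:

The values in the original DataFrame are random, so they differ on every run. The final count is always 3, because reindexing adds the rows 'b', 'd', and 'g', which contain only NaN values:

Rows with missing values:
 3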
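As a further sketch, the dropna() subtraction can be wrapped in a small function and cross-checked against a boolean mask. The helper name count_rows_with_missing and the sample data are hypothetical and not part of the original example; subset is an existing parameter of dropna() that restricts the check to the listed columns.

```python
import numpy as np
import pandas as pd

def count_rows_with_missing(frame, subset=None):
    """Hypothetical helper: the rows removed by dropna() are the rows with missing values."""
    return len(frame) - len(frame.dropna(subset=subset))

# Illustrative data with fixed values, so the result is the same on every run
df = pd.DataFrame(
    {'one': [1.0, np.nan, 3.0], 'two': [np.nan, np.nan, 6.0]},
    index=['a', 'b', 'c'],
)

# Rows 'a' and 'b' contain at least one NaN
all_columns = count_rows_with_missing(df)

# Only row 'b' has a NaN in column 'one'
only_one = count_rows_with_missing(df, subset=['one'])

# Cross-check using a boolean mask instead of dropna()
mask_count = int(df.isna().any(axis=1).sum())

print(all_columns, only_one, mask_count)
```

Both approaches give the same count. The mask version avoids building the intermediate DataFrame that dropna() returns, which may matter for large datasets.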


© https://www.includehelp.com some rights reserved.