
Python Pandas – Missing Data

Python Pandas: In this tutorial, we learn how pandas represents missing data and how to drop or fill it.
Submitted by Sapna Deraje Radhakrishna, on January 09, 2020

When a data point is missing, pandas marks it with a sentinel value, usually NaN (np.nan). None is also treated as missing.

Let's first define a DataFrame using NumPy and pandas.

import numpy as np
import pandas as pd

d = {'A':[1,2,np.nan],'B':[3,np.nan,np.nan],'C':[4,5,6]}
df = pd.DataFrame(d)
print(df)

Output

     A    B  C
0  1.0  3.0  4
1  2.0  NaN  5
2  NaN  NaN  6
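
Before dropping or filling anything, it often helps to see where the missing values are. The following is a minimal sketch using the same data as above; the variable names mask and missing_per_column are illustrative only.

```python
import numpy as np
import pandas as pd

d = {'A': [1, 2, np.nan], 'B': [3, np.nan, np.nan], 'C': [4, 5, 6]}
df = pd.DataFrame(d)

# Boolean mask: True wherever a value is missing
mask = df.isna()
print(mask)

# Number of missing values in each column (A: 1, B: 2, C: 0)
missing_per_column = df.isna().sum()
print(missing_per_column)
```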

Pandas provides the following options for working with missing data:

Drop NaN values

# drops every row that contains at least one NaN value
print(df.dropna())

'''
     A    B  C
0  1.0  3.0  4
'''
# drops every column that contains at least one NaN value
print(df.dropna(axis=1))

'''
   C
0  4
1  5
2  6
'''
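
By default, dropna removes a row (or column) if any value in it is missing. As a hedged sketch, the how and subset parameters can narrow that behaviour; the variable names all_missing_dropped and a_present are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, np.nan], 'B': [3, np.nan, np.nan], 'C': [4, 5, 6]})

# how='all' drops a row only if every value in it is missing;
# no row here is entirely NaN, so all three rows survive
all_missing_dropped = df.dropna(how='all')
print(len(all_missing_dropped))      # 3

# subset limits the check to the listed columns:
# only the row with NaN in column 'A' is dropped
a_present = df.dropna(subset=['A'])
print(a_present.index.tolist())      # [0, 1]

# dropna returns a new DataFrame; the original is unchanged
print(df.shape)                      # (3, 3)
```

Because dropna returns a new object, assign the result to a name if you want to keep it.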

Specify a threshold: with thresh=n, a row is kept only if it has at least n non-NA values.

# Keeps the rows at index 0 and 1 because each has at least 2 non-NaN values;
# drops the row at index 2, which has only 1.
print(df.dropna(thresh=2))

'''
     A    B  C
0  1.0  3.0  4
1  2.0  NaN  5
'''
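
To see why thresh=2 keeps the first two rows, it can help to count the non-missing values per row. This is a small sketch on the same data; the names non_na_per_row, kept and same are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, np.nan], 'B': [3, np.nan, np.nan], 'C': [4, 5, 6]})

# Non-missing values per row: row 0 has 3, row 1 has 2, row 2 has 1
non_na_per_row = df.count(axis=1)
print(non_na_per_row)

# thresh=2 keeps exactly the rows whose non-missing count is at least 2
kept = df.dropna(thresh=2)
same = df[non_na_per_row >= 2]
print(kept.equals(same))
```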

Fill missing values

# replaces every NaN with the string 'empty'
print(df.fillna('empty'))

'''
       A      B  C
0      1      3  4
1      2  empty  5
2  empty  empty  6
'''
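
Filling with a string turns numeric columns into mixed-type columns, so a numeric fill value is often more practical. The sketch below shows three common variants on the same data; the variable names are illustrative, and using the column mean is just one possible choice, not a recommendation from the original tutorial.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, np.nan], 'B': [3, np.nan, np.nan], 'C': [4, 5, 6]})

# Same value for every missing cell
filled_zero = df.fillna(0)
print(filled_zero)

# A dict supplies a different fill value per column
filled_per_column = df.fillna({'A': 0, 'B': -1})
print(filled_per_column)

# Fill each column with its own mean (mean() skips NaN by default)
filled_mean = df.fillna(df.mean())
print(filled_mean)
```

Like dropna, fillna returns a new DataFrame and leaves df itself unchanged.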









© https://www.includehelp.com some rights reserved.