    Python Pandas – Missing Data
    
    
    
           
        Python Pandas: In this tutorial, we will learn how pandas represents missing data and how to drop or fill it.
        Submitted by Sapna Deraje Radhakrishna, on January 09, 2020
    
    
    When a data point is missing, pandas does not leave a gap: it stores a marker value instead. For numeric data that marker is NaN ("not a number"), and a None placed in a numeric column is converted to NaN as well.
    Let's first define a DataFrame using NumPy and pandas.
import numpy as np
import pandas as pd
d = {'A':[1,2,np.nan],'B':[3,np.nan,np.nan],'C':[4,5,6]}
df = pd.DataFrame(d)
print(df)
Output
     A    B  C
0  1.0  3.0  4
1  2.0  NaN  5
2  NaN  NaN  6
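    Before dropping or filling anything, it can help to see where the missing values are. The following is a minimal sketch that rebuilds the same DataFrame and counts the NaN entries per column with isna(); the variable names mask and missing_per_column are only illustrative.

```python
import numpy as np
import pandas as pd

# Same data as in the example above
d = {'A': [1, 2, np.nan], 'B': [3, np.nan, np.nan], 'C': [4, 5, 6]}
df = pd.DataFrame(d)

# Boolean mask: True wherever a value is missing
mask = df.isna()

# Summing the mask counts the missing values in each column
missing_per_column = mask.sum()
print(missing_per_column)
```

    For this DataFrame, column A has one missing value, column B has two, and column C has none.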
    pandas provides the following options for working with missing data.
    Drop NaN values
# drops every row that contains at least one NaN value
print(df.dropna())
'''
     A    B  C
0  1.0  3.0  4
'''
# drops every column that contains at least one NaN value
print(df.dropna(axis=1))
'''
   C
0  4
1  5
2  6
'''
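    By default, dropna() removes a row or column as soon as it contains a single NaN. If that is too aggressive, the how and subset parameters narrow the rule. The sketch below assumes the same DataFrame as above; the names all_missing and only_a are illustrative.

```python
import numpy as np
import pandas as pd

d = {'A': [1, 2, np.nan], 'B': [3, np.nan, np.nan], 'C': [4, 5, 6]}
df = pd.DataFrame(d)

# how='all' drops a row only when every value in it is NaN.
# No row here is entirely NaN, so all three rows are kept.
all_missing = df.dropna(how='all')
print(all_missing)

# subset limits the NaN check to the listed columns.
# Only the last row has NaN in column A, so only that row is dropped.
only_a = df.dropna(subset=['A'])
print(only_a)
```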
    Specify a threshold (thresh) to keep only the rows that have at least that many non-NaN values.
# Keeps the 2nd row (index 1) because it has 2 non-NaN values;
# drops the 3rd row (index 2) because it has only 1.
print(df.dropna(thresh=2))
'''
     A    B  C
0  1.0  3.0  4
1  2.0  NaN  5
'''
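    One way to see what thresh does is to count the non-NaN values in each row by hand. The sketch below assumes the same DataFrame and compares a manual filter with dropna(thresh=2); non_na and kept are illustrative names.

```python
import numpy as np
import pandas as pd

d = {'A': [1, 2, np.nan], 'B': [3, np.nan, np.nan], 'C': [4, 5, 6]}
df = pd.DataFrame(d)

# Count the non-NaN values in each row
non_na = df.notna().sum(axis=1)
print(non_na)

# Keep the rows with at least 2 non-NaN values
kept = df[non_na >= 2]

# The manual filter selects the same rows as thresh=2
print(kept.equals(df.dropna(thresh=2)))
```

    The rows hold 3, 2 and 1 non-NaN values respectively, so only the last row falls below the threshold.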
    Fill missing values
print(df.fillna('empty'))
'''
       A      B  C
0      1      3  4
1      2  empty  5
2  empty  empty  6
'''
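    Filling with a string such as 'empty' turns the affected numeric columns into object columns. When the columns need to stay numeric, a numeric fill value is a common alternative. The sketch below assumes the same DataFrame and fills with 0 and with each column's mean; filled_zero and filled_mean are illustrative names.

```python
import numpy as np
import pandas as pd

d = {'A': [1, 2, np.nan], 'B': [3, np.nan, np.nan], 'C': [4, 5, 6]}
df = pd.DataFrame(d)

# Replace every NaN with a constant
filled_zero = df.fillna(0)
print(filled_zero)

# df.mean() returns one mean per column (NaN values are skipped),
# and fillna uses each column's own mean as its fill value
filled_mean = df.fillna(df.mean())
print(filled_mean)
```

    Whether 0, the mean, or some other value is appropriate depends on the data; this only shows the mechanics.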
    
    
  