Tuesday, 9 April 2019

How to count missing values in each row and column using the apply function in pandas

The apply function passes each row or column of a DataFrame to a function and returns the result. The function can be a built-in, a user-defined function, or a lambda. We will write a user-defined function that counts missing values, then call it with apply, first on every column and then on every row.
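As a quick illustration of how apply behaves with different kinds of functions, here is a minimal sketch. The DataFrame `df` and the helper `value_range` are made up for this example and are not part of the loan dataset used below.

```python
import pandas as pd

# Hypothetical toy data, only for illustration
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Built-in function applied to each column (axis=0 is the default)
col_max = df.apply(max)

# User-defined function applied to each row (axis=1)
def value_range(row):
    return row.max() - row.min()

row_range = df.apply(value_range, axis=1)

print(col_max)
print(row_range)
```

In both cases apply returns a Series: one entry per column in the first call and one entry per row in the second.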

Consider a Loan Prediction dataset. We will count the missing values in each row and each column using the apply function.

Step 1: Import the required libraries

import pandas as pd
import numpy as np

Step 2: Load the dataset

dataset = pd.read_csv("C:/train_loan_prediction.csv")

Step 3: Create a function that returns the count of missing values

def num_missing(x):
    # x is a single column (axis=0) or a single row (axis=1), passed in as a Series
    return x.isnull().sum()
  
Step 4: Find the number of missing values in each column
  
print("Missing values per column:")
print(dataset.apply(num_missing, axis=0)) 

axis=0 means the function is applied to each column.
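To see the column-wise count on data you can check by eye, here is a small sketch. The frame `toy` and its column names are assumptions made up for this example; they stand in for the loan dataset, whose real columns are not shown in this post.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for the loan dataset
toy = pd.DataFrame({
    "Gender": ["Male", None, "Female", "Male"],
    "LoanAmount": [120.0, np.nan, np.nan, 66.0],
    "Loan_Status": ["Y", "N", "Y", "N"],
})

def num_missing(x):
    # Count the null entries in one column
    return x.isnull().sum()

per_column = toy.apply(num_missing, axis=0)
print(per_column)

# pandas also offers a vectorized route that gives the same counts
builtin_counts = toy.isnull().sum()
print(builtin_counts)
```

For this particular task the vectorized `isnull().sum()` is usually the simpler choice; apply becomes more useful when the per-column logic is something pandas does not provide directly.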

Step 5: Find the number of missing values in each row

print("Missing values per row:")
print(dataset.apply(num_missing, axis=1).head()) 

axis=1 means the function is applied to each row.
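The row-wise counts are often used to pick out incomplete records. The sketch below shows one way to do that; `toy` and its columns are again hypothetical, not taken from the loan dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame, only for illustration
toy = pd.DataFrame({
    "Gender": ["Male", None, "Female"],
    "LoanAmount": [120.0, np.nan, np.nan],
})

def num_missing(x):
    # Count the null entries in one row
    return x.isnull().sum()

per_row = toy.apply(num_missing, axis=1)
print(per_row)

# Keep only the rows that have at least one missing value
incomplete = toy[per_row > 0]
print(incomplete)
```

The result of apply shares the DataFrame's row index, so it can be used directly as a boolean mask after a comparison.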

You can also pass a lambda function to apply instead of a named function.
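A minimal sketch of the lambda form follows. It does the same counting as num_missing, written inline; the frame `toy` is a made-up stand-in for the loan dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame, only for illustration
toy = pd.DataFrame({
    "Gender": ["Male", None, "Female"],
    "LoanAmount": [120.0, np.nan, np.nan],
})

# Same logic as num_missing, expressed as a lambda
missing_per_column = toy.apply(lambda x: x.isnull().sum(), axis=0)
missing_per_row = toy.apply(lambda x: x.isnull().sum(), axis=1)

print("Missing values per column:")
print(missing_per_column)
print("Missing values per row:")
print(missing_per_row)
```

A lambda is convenient for a one-line function like this; a named function is easier to reuse and test once the logic grows.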
