Sunday, 7 April 2019

How to print Frequency Table for all categorical variables using value_counts() function?

A frequency table shows how often each category occurs in a feature. This is very useful information when analyzing categorical variables. The pandas library provides the value_counts() function for this.
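Before moving to the real dataset, here is a minimal sketch of what value_counts() returns. The toy column below is made up for illustration and is not taken from the loan dataset.

```python
import pandas as pd

# Hypothetical toy column, used only to illustrate value_counts()
education = pd.Series(["Graduate", "Not Graduate", "Graduate", "Graduate"])

# Count how many times each category occurs; by default the result
# is sorted with the most frequent category first
counts = education.value_counts()
print(counts)
```

The result is a Series whose index holds the categories and whose values hold the counts.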

Consider a Loan Prediction dataset. We will find the frequency of occurrence of each category in all the categorical variables.

Step 1: Import the required libraries

import pandas as pd
import numpy as np

Step 2: Load the dataset

dataset = pd.read_csv("C:/train_loan_prediction.csv")

Step 3: Find datatype of all variables

dataset.info()
dataset.dtypes

We find that columns Loan_ID, Gender, Married, Dependents, Education, Self_Employed are of object type (categorical variables).
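As an aside, pandas can also pick out object-typed columns directly with select_dtypes(). The sketch below uses a small made-up frame; ApplicantIncome is a hypothetical numeric column added only for contrast, and the text columns are created with an explicit object dtype so that the example does not depend on how a given pandas version infers string columns.

```python
import pandas as pd

# Hypothetical toy frame; the text columns are forced to object dtype
toy = pd.DataFrame({
    "Gender": pd.Series(["Male", "Female", "Male"], dtype="object"),
    "Married": pd.Series(["Yes", "No", "Yes"], dtype="object"),
    "ApplicantIncome": [5000, 3000, 4000],  # hypothetical numeric column
})

# Keep only the columns whose dtype is object
object_columns = toy.select_dtypes(include="object").columns.tolist()
print(object_columns)
```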

Step 4: Draw Frequency Table

#Find all the categorical variables
categorical_columns = [x for x in dataset.dtypes.index if dataset.dtypes[x]=='object']

#Exclude Loan_ID column
categorical_columns = [x for x in categorical_columns if x not in ['Loan_ID']]

#Print frequency of categories
for col in categorical_columns:
    print('\nFrequency of Categories for variable %s'%col)
    print(dataset[col].value_counts())

Run the above code and observe the results. It displays the frequency of every category in each categorical variable (how many times a particular category occurs in a feature).
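By default value_counts() leaves missing values out of the table and reports raw counts. The sketch below, again on a made-up toy frame rather than the loan data, shows two optional parameters that may be useful here: dropna=False to include missing values in the table, and normalize=True to get proportions instead of counts.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with one missing value in Gender
toy = pd.DataFrame({
    "Gender": ["Male", "Female", "Male", np.nan],
    "Married": ["Yes", "No", "Yes", "Yes"],
})

# Include the missing value as its own row in the frequency table
with_missing = toy["Gender"].value_counts(dropna=False)
print(with_missing)

# Report the share of each category instead of the raw count
shares = toy["Married"].value_counts(normalize=True)
print(shares)
```

Including the missing values makes it easier to see how many rows would need imputation before modelling.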
