The **describe()** function returns summary statistics for every numeric column in a dataset, such as **count**, **mean**, **standard deviation**, **minimum**, **maximum**, and **median** (reported as the 50% row). Let's explore it in more detail.

Consider a Loan Prediction dataset. We will view the summary statistics of all its numeric variables, and we will also calculate the mean and median explicitly.

**Step 1: Import the required libraries**

import pandas as pd

import numpy as np

**Step 2: Load the dataset**

dataset = pd.read_csv("C:/train_loan_prediction.csv")

**Step 3: Calculate mean and median**

Run the statement below and observe the results yourself:

**dataset.describe()**

This returns summary statistics for all the numeric columns and skips the non-numeric ones. If you want the statistics for only a single column such as ApplicantIncome, use the statement below:

**dataset['ApplicantIncome'].describe()**
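If you do not have the CSV file at hand, the behaviour can be tried on a small stand-in. The sketch below is only an illustration: the DataFrame `df`, the `Gender` and `LoanAmount` columns, and all the values are made up, not taken from the real loan dataset.

```python
import pandas as pd

# Hypothetical stand-in for the loan dataset; the values are invented.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Male", "Male"],
    "ApplicantIncome": [5000, 3000, 4000, 6000],
    "LoanAmount": [120.0, 100.0, None, 140.0],
})

# By default describe() summarises only the numeric columns,
# so the text column "Gender" does not appear in the result.
summary = df.describe()

# Calling describe() on one column returns a Series of statistics.
income_summary = df["ApplicantIncome"].describe()

print(summary)
print(income_summary)
```

Note that the count for `LoanAmount` is 3 rather than 4, because missing values are left out of the statistics.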

You can also get the mean and median explicitly using the following statements:

**dataset['ApplicantIncome'].mean()**

**dataset['ApplicantIncome'].median()**
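As a quick check, the explicit values should match the corresponding rows of describe(). The sketch below uses a made-up `income` Series (not the real ApplicantIncome column) to show that the median is the same number as the 50% row.

```python
import pandas as pd

# Hypothetical income values, with one large value to separate mean and median.
income = pd.Series([5000, 3000, 4000, 6000, 20000], name="ApplicantIncome")

stats = income.describe()
mean_value = income.mean()      # 7600.0
median_value = income.median()  # 5000.0

# The explicit results agree with the "mean" and "50%" rows of describe().
print(mean_value == stats["mean"])
print(median_value == stats["50%"])
```

In this toy data the single large income pulls the mean well above the median, which is worth keeping in mind when choosing between the two.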

You can use these mean and median values to impute missing values in the variable.
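One common way to do the imputation is with pandas' fillna(). The sketch below is a minimal example on invented data; the DataFrame `df` and its values are assumptions, and whether the mean or the median is the better choice depends on your data.

```python
import pandas as pd

# Hypothetical column with two missing values.
df = pd.DataFrame({"ApplicantIncome": [5000.0, None, 3000.0, 6000.0, None]})

# median() skips missing values by default, so this is the median of 3000, 5000, 6000.
median_value = df["ApplicantIncome"].median()

# Replace every missing value with the median.
df["ApplicantIncome"] = df["ApplicantIncome"].fillna(median_value)

print(df)
```

To impute with the mean instead, replace median() with mean() in the sketch above.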
