Most machine learning algorithms, including the estimators in scikit-learn, expect numeric inputs, so categorical variables have to be converted to numeric variables by encoding their categories. Before encoding, make sure that you have imputed the missing values in every categorical variable. We will use **LabelEncoder**, which is available in the scikit-learn library, to encode and transform categorical variables.

Consider a Loan Prediction dataset. We will encode and transform all of its categorical variables to numeric variables.
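To make the encoding concrete before working with the dataset, here is a minimal sketch of what LabelEncoder does to a single column. The sample values and the variable names `property_area`, `encoder`, `codes` and `restored` are made up for illustration and are not taken from the loan data.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample values, not read from the loan dataset
property_area = ["Urban", "Rural", "Semiurban", "Urban", "Rural"]

encoder = LabelEncoder()
codes = encoder.fit_transform(property_area)

# classes_ holds the distinct categories in sorted order;
# each category's code is its position in that array
print(encoder.classes_.tolist())  # ['Rural', 'Semiurban', 'Urban']
print(codes.tolist())             # [2, 0, 1, 2, 0]

# inverse_transform maps the codes back to the original labels
restored = encoder.inverse_transform(codes)
print(restored.tolist())
```

Note that the codes follow the sorted order of the category names, so they carry no meaning beyond identifying each category.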

**Step 1: Import the required libraries**

    import pandas as pd
    import numpy as np
    from sklearn.preprocessing import LabelEncoder

**Step 2: Load the dataset**

    dataset = pd.read_csv("C:/train_loan_prediction.csv")
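As noted earlier, missing values in the categorical variables should be imputed before encoding. One common option, shown here only as a sketch, is to fill each gap with the most frequent category of its column. The small `sample` DataFrame below is a hypothetical stand-in for the loan data, and filling with the mode is an assumption about the imputation strategy, not something this post prescribes.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a few categorical columns of the loan data
sample = pd.DataFrame({
    "Gender": ["Male", "Female", np.nan, "Male"],
    "Married": ["Yes", np.nan, "No", "Yes"],
})

for col in ["Gender", "Married"]:
    # mode() ignores missing values and returns the most frequent value(s);
    # take the first one and use it to fill the gaps
    most_frequent = sample[col].mode()[0]
    sample[col] = sample[col].fillna(most_frequent)

# Count the remaining missing values per column (should all be zero)
print(sample.isnull().sum())
```

The same loop could be run over `categorical_vars` on the real dataset before Step 3.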

**Step 3: Encode categorical variables using LabelEncoder**

The categorical variables are Gender, Married, Dependents, Education, Self_Employed, Property_Area and Loan_Status. Let's encode and transform all of them to numeric variables in one go using the following Python code.

    categorical_vars = ['Gender','Married','Dependents','Education','Self_Employed','Property_Area','Loan_Status']
    label_encoder = LabelEncoder()
    # Fit the encoder on each column in turn and replace the column with its integer codes
    for i in categorical_vars:
        dataset[i] = label_encoder.fit_transform(dataset[i])

Now, look at the data types of the variables:

    dataset.dtypes

You will see that the data type of every categorical variable has changed from object to an integer type such as int32 or int64 (the exact width depends on your platform and library versions). The dataset is now ready for machine learning algorithms.
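One thing to keep in mind: a single LabelEncoder that is refitted on every column only remembers the categories of the last column it saw. If you need to look up the mappings later or convert codes back to labels, a possible variation is to keep one encoder per column. The sketch below assumes a small made-up DataFrame named `sample` and a hypothetical `encoders` dictionary; neither is part of the original code.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for the loan data (no missing values)
sample = pd.DataFrame({
    "Education": ["Graduate", "Not Graduate", "Graduate"],
    "Loan_Status": ["Y", "N", "Y"],
    "ApplicantIncome": [5849, 4583, 3000],
})
categorical_vars = ["Education", "Loan_Status"]

# Keep one fitted encoder per column so every mapping stays available
encoders = {}
for col in categorical_vars:
    encoders[col] = LabelEncoder()
    sample[col] = encoders[col].fit_transform(sample[col])

# The encoded columns now have an integer dtype
print(sample.dtypes)

# Recover the original labels of one column from its codes
original_status = encoders["Loan_Status"].inverse_transform(sample["Loan_Status"])
print(original_status.tolist())  # ['Y', 'N', 'Y']
```

The scikit-learn documentation describes LabelEncoder as intended for target labels, so for input features you may also want to look at OrdinalEncoder or OneHotEncoder.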

**Related**: Difference between Label Encoder and One Hot Encoder