The astype function changes the datatype of variables (columns) in a pandas DataFrame. When we load data into a pandas DataFrame, a datatype is automatically assigned to each variable based on the data it contains.
Consider a Loan Prediction dataset. We will change the datatype of the Credit_History variable.
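To illustrate the automatic assignment, here is a minimal sketch on a small hand-made DataFrame. The name toy and the values in it are hypothetical stand-ins for a loaded file, not rows from the real dataset. It shows that a column of whole numbers is inferred as an integer type, while a numeric column with a missing value is inferred as float64.

```python
import pandas as pd

# Hypothetical toy data standing in for a loaded CSV file
toy = pd.DataFrame({
    "Loan_ID": ["LP001", "LP002", "LP003"],
    "ApplicantIncome": [5849, 4583, 3000],
    "Credit_History": [1.0, 0.0, None],  # None is stored as NaN
})

# pandas infers one datatype per column from the values it sees
inferred = toy.dtypes
print(inferred)
```

The missing value is one reason a 0/1 flag such as Credit_History tends to be read as float64 rather than as an integer.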
Step 1: Import the required libraries
import pandas as pd
import numpy as np
Step 2: Load the dataset
dataset = pd.read_csv("C:/train_loan_prediction.csv")
Step 3: Find datatype of all the variables
dataset.info()
dataset.dtypes
Step 4: Change the datatype of a variable
We observe that Credit_History is a nominal (categorical) variable, yet it is identified as float64 because it contains numbers. Ideally it should be of object type, since it is a categorical variable.
So, we can change the datatype of this variable using the following Python code:
dataset['Credit_History'] = dataset['Credit_History'].astype(object)  # np.object was removed in NumPy 1.24; use the built-in object
Now print the datatypes and examine the results.
dataset.info()
dataset.dtypes
You will see the datatype of Credit_History has been changed from float64 to object.
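The same conversion can be tried without the CSV file. The sketch below uses a hypothetical one-column DataFrame named toy and also shows pandas' dedicated category dtype, which is an alternative to object for nominal variables and is not what the steps above use.

```python
import pandas as pd

# Hypothetical stand-in for the Credit_History column
toy = pd.DataFrame({"Credit_History": [1.0, 0.0, None, 1.0]})

# Conversion used in this post: float64 -> object
as_object = toy["Credit_History"].astype(object)

# Alternative: pandas' category dtype for nominal variables
as_category = toy["Credit_History"].astype("category")

print(as_object.dtype)
print(as_category.dtype)
print(list(as_category.cat.categories))
```

Either result marks the column as non-numeric. The category version additionally records the set of distinct levels, and the missing value is kept as missing instead of becoming a level.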