We can easily select any subset of data from a pandas DataFrame. What if we want to filter the values of one column based on conditions on another set of columns? Boolean indexing is very useful here.
Consider a Loan Prediction dataset. We will filter the data on some conditions using boolean indexing.
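As a minimal sketch of the idea, assuming a small made-up DataFrame (the names `df`, `score`, `group` and `mask` are hypothetical and not part of the loan dataset), comparing a column to a value produces a boolean Series, and passing that Series to `.loc` keeps only the rows where it is True:

```python
import pandas as pd

# Hypothetical toy data, used only to illustrate the mechanism
df = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "score": [10, 25, 40, 5],
    "group": ["x", "y", "x", "y"],
})

# Comparing a column to a value yields a boolean Series (the mask)
mask = df["score"] > 8

# .loc keeps the rows where the mask is True; the second argument picks the column
high_names = df.loc[mask, "name"].tolist()
```

The condition is evaluated on one column (`score`) while the values returned come from another (`name`), which is the pattern used in the steps below.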
Step 1: Import the required libraries
import pandas as pd
import numpy as np
Step 2: Load the dataset
dataset = pd.read_csv("C:/train_loan_prediction.csv")
Step 3: Filter data using boolean indexing
Suppose we want a list of all females who are not graduates and whose loan was approved. Let's use boolean indexing to select them. You can use the following code:
dataset.loc[(dataset["Gender"]=="Female") & (dataset["Education"]=="Not Graduate") & (dataset["Loan_Status"]=="Y"), ["Gender","Education","Loan_Status"]]
The code above selects all females who are not graduates and whose loan status is approved ("Y"). Each condition is wrapped in parentheses and the conditions are combined with &, so a row is kept only if all three are true. Only three columns are displayed: "Gender", "Education" and "Loan_Status". You can list any number of columns in the second argument of .loc, depending on your requirement. Try other conditions to filter the data for practice.
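For practice with other conditions, here is a rough sketch on a few made-up rows. The DataFrame `sample` and its values are hypothetical; only the column names are taken from the example above. It shows the same AND filter next to an OR filter (`|`) and a negated filter (`~`). The parentheses around each comparison are required because `&` and `|` bind more tightly than `==` in Python.

```python
import pandas as pd

# Hypothetical rows that reuse the column names from the walkthrough
sample = pd.DataFrame({
    "Gender": ["Female", "Male", "Female", "Male", "Female"],
    "Education": ["Not Graduate", "Graduate", "Graduate", "Not Graduate", "Not Graduate"],
    "Loan_Status": ["Y", "Y", "N", "N", "Y"],
})

cols = ["Gender", "Education", "Loan_Status"]

# AND: females who are not graduates and whose loan was approved
approved_female_non_grads = sample.loc[
    (sample["Gender"] == "Female")
    & (sample["Education"] == "Not Graduate")
    & (sample["Loan_Status"] == "Y"),
    cols,
]

# OR: anyone who is a graduate or whose loan was approved
grad_or_approved = sample.loc[
    (sample["Education"] == "Graduate") | (sample["Loan_Status"] == "Y"),
    cols,
]

# NOT: everyone whose loan was not approved
not_approved = sample.loc[~(sample["Loan_Status"] == "Y"), cols]
```

The same expressions should work on the loaded `dataset`, provided its columns contain these exact string values.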