Tuesday, 2 July 2019

Building a simple sequential neural network with dense layers in Keras

Let's see how to create a simple neural network in Keras. We will build a simple sequential model from dense (fully connected) layers, using relu as the activation function in the hidden layers, softmax in the output layer, and Adam as the optimizer.

You can download my Jupyter notebook containing the code below from here.

Step 1: Import required libraries

import numpy as np
from random import randint
from sklearn.preprocessing import MinMaxScaler

import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

from sklearn.metrics import confusion_matrix, accuracy_score, classification_report

Step 2: Create training and test dataset

We will create hypothetical medical data and try to predict whether a drug has a side effect on people in different age groups.

People are divided into two age groups: 
1. 13 years to 64 years and 
2. 65 years to 100 years. 

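A label of 1 means that the drug has a side effect, and 0 means no side effect.

We will create 2100 training observations. One array holds the ages, which act as the samples, and the other holds the 0 and 1 values, which act as the labels. Most of the observations (2000) follow the pattern that older people experience the side effect and younger people do not; the remaining 100 deliberately go against it.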

train_samples = []
train_labels = []
  
for i in range(50):
    random_younger = randint(13,64)
    train_samples.append(random_younger)
    train_labels.append(1)
    
    random_older = randint(65,100)
    train_samples.append(random_older)
    train_labels.append(0)
    
for i in range(1000):
    random_younger = randint(13,64)
    train_samples.append(random_younger)
    train_labels.append(0)
  
    random_older = randint(65,100)
    train_samples.append(random_older)
    train_labels.append(1)
    
Convert the above lists into numpy arrays as Keras expects samples and labels in the form of numpy arrays.

train_samples = np.array(train_samples)
train_labels = np.array(train_labels)

Similarly, create a test dataset.

test_samples = []
test_labels = []
  
for i in range(10):
    random_younger = randint(13,64)
    test_samples.append(random_younger)
    test_labels.append(1)
    
    random_older = randint(65,100)
    test_samples.append(random_older)
    test_labels.append(0)
    
for i in range(200):
    random_younger = randint(13,64)
    test_samples.append(random_younger)
    test_labels.append(0)
    
    random_older = randint(65,100)
    test_samples.append(random_older)
    test_labels.append(1)
    
test_samples = np.array(test_samples)
test_labels = np.array(test_labels)
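The training and test loops above follow the same pattern, so they can be folded into one helper. The following is only a sketch: the name make_age_dataset, its parameters and the seed are assumptions for illustration, not part of the original notebook. It returns plain lists, which would still need np.array() before being passed to Keras.

```python
from random import Random

def make_age_dataset(n_outliers, n_typical, seed=0):
    """Build (samples, labels) the same way as the loops above.

    Hypothetical helper: the name, parameters and seed are illustrative.
    """
    rng = Random(seed)  # seeded so that repeated runs give the same data
    samples, labels = [], []
    # Outliers go against the usual pattern; typical rows follow it.
    for count, young_label, old_label in ((n_outliers, 1, 0), (n_typical, 0, 1)):
        for _ in range(count):
            samples.append(rng.randint(13, 64))    # younger person
            labels.append(young_label)
            samples.append(rng.randint(65, 100))   # older person
            labels.append(old_label)
    return samples, labels

train_samples, train_labels = make_age_dataset(50, 1000)
test_samples, test_labels = make_age_dataset(10, 200, seed=1)

# Count observations that go against "65 and older => side effect".
outliers = sum(1 for age, label in zip(train_samples, train_labels)
               if (age >= 65) != (label == 1))
print(len(train_samples), len(test_samples), outliers)
```

Counting the outliers this way is a quick sanity check that roughly 5% of the training data is intentionally noisy.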

Step 3: Scale the training and test data

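scaler = MinMaxScaler(feature_range=(0,1))
# Fit the scaler on the training data only
scaled_train_samples = scaler.fit_transform(train_samples.reshape(-1,1))
# Reuse the training minimum and maximum for the test data instead of refitting
scaled_test_samples = scaler.transform(test_samples.reshape(-1,1))

This is a preprocessing step. We need to scale our sample data into the range 0 to 1. This is called feature scaling. The scaler is fitted on the training data only, and the same scaling is then applied to the test data, so that both sets are transformed consistently. For more details on feature scaling, you can go through this post of mine.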
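To make the scaling step concrete, here is a minimal pure-Python sketch of the min-max formula, (x - min) / (max - min). The helper names and the sample ages are made up for illustration; in the article itself this work is done by MinMaxScaler.

```python
def fit_min_max(values):
    """Learn the minimum and maximum from the training values."""
    return min(values), max(values)

def apply_min_max(values, lo, hi):
    """Scale values using a previously learned minimum and maximum."""
    return [(v - lo) / (hi - lo) for v in values]

train_ages = [13, 40, 65, 100]   # made-up ages, for illustration only
test_ages = [20, 90]

lo, hi = fit_min_max(train_ages)              # learned from training data
scaled_train = apply_min_max(train_ages, lo, hi)
scaled_test = apply_min_max(test_ages, lo, hi)  # reuses the training lo/hi
print(scaled_train)
print(scaled_test)
```

Because the test ages are scaled with the training minimum and maximum, the same age always maps to the same scaled value in both sets.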

Step 4: Create a model

We will create a sequential model which is a linear stack of layers. We can create a sequential model by passing a list of layer instances to the constructor like this:

model = Sequential([
    Dense(16, input_shape=(1,), activation='relu'),
    Dense(32, activation='relu'),
    Dense(2, activation='softmax'),
])

We can also add the layers one at a time using the .add() method:

model = Sequential()
model.add(Dense(16, input_shape=(1,), activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(2, activation='softmax'))

The dense layers in the above Keras code are the fully connected layers of the neural network: every unit in a layer receives input from every unit in the previous layer.
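As a rough illustration of what "fully connected" means, the sketch below computes the output of a single dense layer by hand, before any activation is applied. The function name and the weight values are made up; Keras learns the real weights during training.

```python
def dense(inputs, weights, biases):
    """Fully connected layer without activation.

    weights[i][j] connects input i to output unit j.
    """
    return [
        sum(x * weights[i][j] for i, x in enumerate(inputs)) + biases[j]
        for j in range(len(biases))
    ]

# One input feature (a scaled age) feeding three units; values are made up.
x = [0.5]
W = [[2.0, -1.0, 0.0]]
b = [0.0, 1.0, 0.25]
outputs = dense(x, W, b)
print(outputs)
```

Each output unit is a weighted sum of all inputs plus its own bias, which is why the first layer in the model needs only one weight per unit.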

For the hidden layers we use the relu activation function, and for the output layer we use the softmax activation function. To learn the difference between the relu and softmax activation functions, please see this post of mine.
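The two activation functions can be sketched in a few lines of plain Python. This is only an illustration of the formulas, not how Keras implements them: relu replaces negative values with zero, and softmax turns a list of scores into probabilities that sum to 1.

```python
import math

def relu(values):
    """Replace negative values with zero, keep the rest."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(values)
    # Subtracting the maximum does not change the result but avoids overflow.
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

hidden = relu([-2.0, 0.0, 3.0])
probs = softmax([1.0, 3.0])
print(hidden)
print(probs, sum(probs))
```

With two output units, as in the model above, softmax yields one probability for "no side effect" and one for "side effect".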

Step 5: Model Summary

model.summary()

This prints a description of every layer along with its number of parameters.
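The parameter counts in the summary can be checked by hand: a dense layer has one weight per input-unit pair plus one bias per unit. The sketch below applies that rule to the layer sizes used in this post (1 input, then 16, 32 and 2 units); the helper name is made up.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

layer_sizes = [1, 16, 32, 2]   # input feature, two hidden layers, output
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)
print(per_layer, total)
```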

Step 6: Compile a model

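model.compile(Adam(learning_rate=0.0001), loss='sparse_categorical_crossentropy', metrics=['accuracy'])

We need to pass the optimizer we want to use along with its learning rate, the loss function and the metrics. We are using Adam as the optimizer, which is an adaptive variant of SGD (Stochastic Gradient Descent). Older Keras releases spell the learning rate argument lr instead of learning_rate. The loss is sparse_categorical_crossentropy because our labels are plain integers (0 and 1) rather than one-hot vectors. There are a lot of other optimizers; for details, you can visit this post of mine.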
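To give a feel for the loss function, here is a simplified sketch of sparse categorical crossentropy for a single sample: the negative log of the probability that the model assigned to the true class. The probabilities are made up, and Keras additionally averages the loss over the batch.

```python
import math

def sparse_categorical_crossentropy(true_class, probs):
    """Loss for one sample: -log of the probability of the true class."""
    return -math.log(probs[true_class])

# The model says: 10% "no side effect" (class 0), 90% "side effect" (class 1).
predicted = [0.1, 0.9]
loss_if_correct = sparse_categorical_crossentropy(1, predicted)
loss_if_wrong = sparse_categorical_crossentropy(0, predicted)
print(loss_if_correct, loss_if_wrong)
```

A confident correct prediction gives a loss close to zero, while a confident wrong prediction is penalized heavily.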

Step 7: Train a model

model.fit(scaled_train_samples, train_labels, validation_split=0.1, batch_size=10, epochs=20, shuffle=True, verbose=2)

We pass the training samples and labels along with the validation split, batch size, number of epochs, shuffle and verbose parameters. The validation set does not remove overfitting by itself, but it lets us detect it: if the training accuracy keeps improving while the validation accuracy does not, the network is overfitting. Note that Keras takes the validation data from the last 10% of the samples, before any shuffling. The shuffle parameter defaults to True. Settings such as batch size and epochs are called hyperparameters and need to be tuned. You can try different batch sizes and epochs and observe the change in the results.
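The numbers involved in this call can be worked out by hand. The sketch below assumes the 2100 training observations generated earlier, in their original order (100 outliers first, then 2000 typical observations), and that the validation data is taken from the end of the arrays.

```python
import math

n_samples = 2100
validation_split = 0.1
batch_size = 10

n_val = round(n_samples * validation_split)   # held out for validation
n_train = n_samples - n_val                   # actually used for training
steps_per_epoch = math.ceil(n_train / batch_size)

# Flags mirroring the generation order: 100 outliers, then 2000 typical rows.
is_outlier = [True] * 100 + [False] * 2000
val_outliers = sum(is_outlier[-n_val:])       # outliers in the held-out tail
print(n_train, n_val, steps_per_epoch, val_outliers)
```

Under these assumptions the validation set contains none of the outliers, so it is somewhat easier than the training data. Shuffling the samples and labels together before calling fit is one way to get a more representative validation set.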

Step 8: Predict from the model

predictions = model.predict(scaled_test_samples, batch_size=10, verbose=0)
for i in predictions:
    print(i)

The above code gives us the predictions as probabilities, one per class. To get the predicted class itself, we take the index of the largest probability in each row. Older Keras versions offered a predict_classes function for this, but it has been removed from recent releases, so np.argmax is the portable choice.

rounded_predictions = np.argmax(model.predict(scaled_test_samples, batch_size=10, verbose=0), axis=-1)
for i in rounded_predictions:
    print(i)
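Turning a row of class probabilities into a class label amounts to picking the index of the largest entry. A minimal pure-Python sketch, with made-up probabilities standing in for the model output:

```python
def argmax(row):
    """Index of the largest value in the row."""
    return max(range(len(row)), key=lambda i: row[i])

# Each row: [probability of class 0, probability of class 1]; values made up.
probabilities = [[0.9, 0.1], [0.3, 0.7], [0.45, 0.55]]
predicted = [argmax(row) for row in probabilities]
print(predicted)
```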

Step 9: Check accuracy

We are going to use the confusion matrix, accuracy score and classification report to check the accuracy of our neural network.

confusionMatrix = confusion_matrix(test_labels, rounded_predictions)
accuracyScore = accuracy_score(test_labels, rounded_predictions)
classificationReport = classification_report(test_labels, rounded_predictions)
print(confusionMatrix)
print(accuracyScore * 100)
print(classificationReport)
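To see what the confusion matrix and accuracy score measure, the sketch below computes both by hand on a tiny made-up example. Rows are the true classes and columns are the predicted classes, which matches the layout scikit-learn uses; the function name and data are illustrative.

```python
def confusion_counts(y_true, y_pred, n_classes=2):
    """matrix[t][p] = number of samples with true class t predicted as p."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

y_true = [0, 0, 1, 1, 1, 0]   # made-up labels
y_pred = [0, 1, 1, 1, 0, 0]   # made-up predictions

matrix = confusion_counts(y_true, y_pred)
# Correct predictions sit on the diagonal of the matrix.
accuracy = (matrix[0][0] + matrix[1][1]) / len(y_true)
print(matrix)
print(accuracy * 100)
```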

Hyperparameter Tuning: In steps 6, 7 and 8, we use a lot of hyperparameters. The network does not learn these parameters by itself, so we need to tune them explicitly in order to improve the performance and accuracy of the network. For more information on hyperparameters, you can go through this post of mine.
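One simple way to organize tuning is to list every combination of candidate settings and train one model per combination. The sketch below only enumerates the combinations. The first value in each list comes from this post; the second is a hypothetical alternative chosen for illustration.

```python
from itertools import product

# Candidate values: first entry from this post, second entry hypothetical.
grid = {
    "learning_rate": [0.0001, 0.001],
    "batch_size": [10, 32],
    "epochs": [20, 50],
}

# Every combination of the candidate values, as one dict per configuration.
configs = [dict(zip(grid, values)) for values in product(*grid.values())]
for config in configs:
    print(config)
```

Each configuration would then be passed to the compile and fit steps above, and the one with the best validation accuracy kept.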

Related: Build a CNN model using Keras framework
