
Sunday, 9 June 2019

Implement a simple Gradient Descent Optimizer in TensorFlow

Let's implement a simple Gradient Descent Optimizer in TensorFlow for a linear model. We will use the GradientDescentOptimizer class that ships with TensorFlow to find the optimal values of the weight and bias, i.e. the values for which the loss is minimized. You can download my Jupyter notebook containing the code below on Gradient Descent from here.

Step 1: Import TensorFlow library

import tensorflow as tf
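
Note: this post uses the TensorFlow 1.x graph API (tf.placeholder, tf.Session). If you are on TensorFlow 2.x, one way to run the same code unchanged (a sketch, not the only option) is to import the compatibility module instead:

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()  # turn off eager execution so tf.Session and tf.placeholder work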

Step 2: Declare all the variables and placeholders

W = tf.Variable([0.3], dtype=tf.float32)
b = tf.Variable([-0.3], dtype=tf.float32)

We have initialized the weight and bias to the arbitrary starting values 0.3 and -0.3 respectively. The task of the Gradient Descent Optimizer is to find optimal values for both of these variables so that our loss is minimized. Let's see how that happens in the next steps.

x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)
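
Note the difference between the two kinds of nodes: variables hold state that the optimizer will update during training, while placeholders are empty input slots that we fill at run time through a feed_dict. A minimal sketch of how a placeholder gets its value (run in its own throwaway session):

# A placeholder has no value until the graph is run with a feed_dict.
with tf.Session() as s:
    print(s.run(x + 1.0, {x: [1.0, 2.0]}))  # prints [2. 3.]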

Step 3: Create a linear model

linear_model = W * x + b

Step 4: Create a loss function

squared_delta = tf.square(linear_model - y)
loss = tf.reduce_sum(squared_delta)

1. We are using the sum of squared errors as our loss function. (A plain-NumPy version of this computation is sketched after this list.)

2. The expression "linear_model - y" computes the errors: "linear_model" holds the predicted values and "y" holds the actual values.

3. The "tf.square" function squares each of these errors.

4. The "tf.reduce_sum" function adds up all the squared errors.
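
To make the math concrete, here is a plain-NumPy sketch of the same loss computation (the function name sum_squared_error is just illustrative):

import numpy as np

def sum_squared_error(w, b, x, y):
    predictions = w * x + b       # the linear model
    errors = predictions - y      # predicted minus actual values
    return np.sum(errors ** 2)    # square and add up, like tf.square + tf.reduce_sum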

Step 5: Create a Gradient Descent Optimizer

optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)

We are passing 0.01 as the learning rate. The minimize() call builds a training operation that, each time it runs, adjusts W and b in the direction that reduces the loss.
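
Under the hood, each training step moves W and b a small distance against the gradient of the loss. For our particular loss the gradients can be derived by hand; below is an illustrative NumPy sketch of what one update step does (not TensorFlow's actual implementation, and gradient_step is just a made-up name):

import numpy as np

def gradient_step(w, b, x, y, learning_rate=0.01):
    error = (w * x + b) - y        # prediction error for each sample
    dw = np.sum(2 * error * x)     # d(loss)/dW for the sum of squared errors
    db = np.sum(2 * error)         # d(loss)/db
    return w - learning_rate * dw, b - learning_rate * db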

Step 6: Initialize all the variables

init = tf.global_variables_initializer()

Step 7: Create a session and run the graph

session = tf.Session()
session.run(init)
print(session.run(loss, {x:[1,2,3,4], y:[0,-1,-2,-3]}))

Output: 23.66

So our loss is 23.66, which is quite high. It means the initial values of the weight and bias that we chose in Step 2 (0.3 and -0.3) are not the optimal values. We need the help of Gradient Descent to optimize our weight and bias.
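
This number is easy to verify by hand with plain Python (no TensorFlow needed):

predictions = [0.3 * xi - 0.3 for xi in [1, 2, 3, 4]]             # [0.0, 0.3, 0.6, 0.9]
errors = [p - yi for p, yi in zip(predictions, [0, -1, -2, -3])]  # [0.0, 1.3, 2.6, 3.9]
print(sum(e ** 2 for e in errors))  # 0 + 1.69 + 6.76 + 15.21 = 23.66 (up to float rounding)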

In the next step, we will run the Gradient Descent Optimizer for 1000 iterations (with the learning rate of 0.01 set above) and try to minimize this loss.

for _ in range(1000):
    session.run(train, {x:[1,2,3,4], y:[0,-1,-2,-3]})
print(session.run([W,b]))
session.close()

Output: [array([-0.9999969], dtype=float32), array([0.9999908], dtype=float32)]

Now we get W as -0.9999969 (approximately -1) and b as 0.9999908 (approximately 1). So the final conclusion is that the optimized value of W is -1 and the optimized value of b is 1. If we had initialized W and b to -1 and 1 in Step 2, we would have had zero loss from the start (see the quick check below). So our Gradient Descent Optimizer has done a pretty decent job for us.
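
As a quick check that W = -1 and b = 1 really do give zero loss, here is a plain-Python verification (no TensorFlow needed):

predictions = [-1 * xi + 1 for xi in [1, 2, 3, 4]]
print(predictions)  # [0, -1, -2, -3] -- exactly the actual y values, so the loss is 0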
