Let's implement a simple Gradient Descent Optimizer in TensorFlow for a linear model. We will use the **GradientDescentOptimizer** function present in TensorFlow to find optimal values of the weight and bias so that the loss is minimized.

**Step 1: Import TensorFlow library**

import tensorflow as tf

**Step 2: Declare all the variables and placeholders**

W = tf.Variable([0.3], tf.float32)

b = tf.Variable([-0.3], tf.float32)

We have initialized the weight and bias with the arbitrary values 0.3 and -0.3 respectively. The task of the Gradient Descent Optimizer is to find optimal values for both of these variables so that our loss is minimal. Let's see how this happens in the next steps.

x = tf.placeholder(tf.float32)

y = tf.placeholder(tf.float32)

**Step 3: Create a linear model**

linear_model = W * x + b

**Step 4: Create a loss function**

squared_delta = tf.square(linear_model - y)

loss = tf.reduce_sum(squared_delta)

1. We are using the sum of squared errors as the loss function.

2. The expression "linear_model - y" computes the errors: "linear_model" contains the predicted values and "y" contains the actual values.

3. The "tf.square" function squares each error.

4. The "tf.reduce_sum" function sums all the squared errors.
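As a sanity check on this loss, we can compute the same sum of squared errors by hand for the initial W = 0.3 and b = -0.3 on the training data used below. This is a plain-Python sketch, independent of the TensorFlow graph:

```python
# Sum of squared errors for the initial parameters, computed by hand.
W, b = 0.3, -0.3
xs = [1, 2, 3, 4]
ys = [0, -1, -2, -3]

# linear_model = W * x + b for each input
preds = [W * x + b for x in xs]              # [0.0, 0.3, 0.6, 0.9]
errors = [p - y for p, y in zip(preds, ys)]  # [0.0, 1.3, 2.6, 3.9]
loss = sum(e * e for e in errors)

print(round(loss, 2))  # 23.66
```

This matches the value TensorFlow reports in step 7 below.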

**Step 5: Create a Gradient Descent Optimizer**

optimizer = tf.train.GradientDescentOptimizer(0.01)

train = optimizer.minimize(loss)

We are passing 0.01 as the learning rate.
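Under the hood, each run of the train op computes the gradients of the loss with respect to W and b and moves both against their gradients, scaled by the learning rate. Here is a plain-Python sketch of a single such update; the gradient formulas come from differentiating the sum of squared errors:

```python
# One manual gradient-descent step, mirroring what optimizer.minimize(loss)
# does per iteration (plain-Python sketch, not TensorFlow code).
W, b, lr = 0.3, -0.3, 0.01
xs = [1, 2, 3, 4]
ys = [0, -1, -2, -3]

errors = [(W * x + b) - y for x, y in zip(xs, ys)]

# Gradients of sum((W*x + b - y)^2) with respect to W and b:
grad_W = sum(2 * e * x for e, x in zip(errors, xs))  # ~52.0
grad_b = sum(2 * e for e in errors)                  # ~15.6

# Update rule: parameter -= learning_rate * gradient
W -= lr * grad_W   # 0.3  - 0.52  = -0.22
b -= lr * grad_b   # -0.3 - 0.156 = -0.456
```

Repeating this update many times is exactly what the loop in the next step does via session.run(train, ...).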

**Step 6: Initialize all the variables**

init = tf.global_variables_initializer()

**Step 7: Create a session and run the graph**

session = tf.Session()

session.run(init)

print(session.run(loss, {x:[1,2,3,4], y:[0,-1,-2,-3]}))

Output: 23.66

So our loss is 23.66, which is quite high. This means that the initial values of the weight and bias, which we set to 0.3 and -0.3 in step 2, are not optimal. We need the help of gradient descent to optimize our weight and bias.

In the next step, we will run the Gradient Descent Optimizer for 1000 iterations, with a learning rate of 0.01, and try to minimize this loss.

for _ in range(1000):
    session.run(train, {x:[1,2,3,4], y:[0,-1,-2,-3]})

print(session.run([W,b]))

session.close()

Output: [array([-0.9999969], dtype=float32), array([0.9999908], dtype=float32)]

Now we get W as -0.9999969 (approximately -1) and b as 0.9999908 (approximately 1). So the final conclusion is that the optimized value of W is -1 and the optimized value of b is 1. If we had initialized W as -1 and b as 1 in step 2, we would have started with zero loss. So our Gradient Descent Optimizer has done a pretty decent job for us.
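For intuition, the same optimization can be reproduced without TensorFlow. The following plain-Python sketch runs the same 1000 update steps with a learning rate of 0.01 and converges to roughly the same values:

```python
# Full gradient-descent loop in plain Python, mirroring the TensorFlow
# program above: 1000 iterations, learning rate 0.01.
W, b, lr = 0.3, -0.3, 0.01
xs = [1, 2, 3, 4]
ys = [0, -1, -2, -3]

for _ in range(1000):
    errors = [(W * x + b) - y for x, y in zip(xs, ys)]
    grad_W = sum(2 * e * x for e, x in zip(errors, xs))
    grad_b = sum(2 * e for e in errors)
    W -= lr * grad_W
    b -= lr * grad_b

print(W, b)  # roughly -1 and 1, matching the TensorFlow result
```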

