Thursday 20 June 2019

Transfer Learning and Fine Tuning a model in Deep Learning

The terms transfer learning and fine-tuning are closely related and are often used almost interchangeably.

Fine-tuning: Suppose you already have a deep learning model which performs task A well. Now you have to perform task B, which is quite similar to task A. You don't need to create a separate model from scratch for task B. Instead, fine-tune the existing model which already performs task A well.

Example: You have a well-trained model which identifies all types of cars. The car model has already learned many features such as edges, shapes, textures, headlights, door handles, tyres and windshields. Now you have to create a model which can identify trucks. We know that many features of cars and trucks are similar. So why create a new model for trucks from scratch? Let's just tweak the existing car model to create a new model for trucks.

Transfer Learning: We transfer what the existing car model has learned to the new truck model. In other words, transfer learning is what happens when we fine-tune an existing model.

Advantages of Transfer Learning and Fine Tuning:

Creating a new model is a difficult and time-consuming task. We need to decide a lot of things while creating a model, such as:

1. Which types of layers to use (fully connected, convolutional, capsule, LSTM etc.)
2. How many layers to use?
3. How many nodes in a layer?
4. Which activation function to use in which layer?
5. Which regularization techniques to use?
6. Which optimizer to use?
7. Tuning various hyperparameters like weight initialization, learning rate, momentum, batch size, number of epochs etc.

So, if we can fine-tune an existing model, we can skip most of the above decisions and save time and effort.

How to fine-tune a model? 

We need to make some reasonable changes and tweaks to our existing model to create a new model. Below are some basic steps to fine-tune an existing model:

1. Replace the output layer: First of all, remove the output layer which was identifying cars. Add a new output layer which will now identify trucks.

2. Add and remove hidden layers: Trucks have some features that differ from cars. So, add some hidden layers which will learn the new features of trucks, and remove those hidden layers which are not required for trucks.

3. Freeze the unchanged layers: Freeze the layers which are kept (not changed) so that no weight update happens on them when we train this model again on the new truck data. Weights should be updated only in the newly added layers (the new hidden layers and the new output layer).
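
The steps above can be sketched in plain Python. This is only a toy illustration under stated assumptions, not a recipe for a real car or truck model: the Dense class, the train_step helper, the random "pretrained" car weights and the four-sample truck dataset are all hypothetical stand-ins. Step 2 (adding or removing hidden layers) is left out to keep the sketch short. In a framework such as Keras, the same freezing effect is usually obtained by setting layer.trainable = False before compiling the model.

```python
import math
import random

random.seed(0)  # reproducible toy weights


class Dense:
    """Tiny fully connected layer: out = activation(W.x + b)."""

    def __init__(self, n_in, n_out, activation="tanh"):
        self.W = [[random.uniform(-0.5, 0.5) for _ in range(n_in)]
                  for _ in range(n_out)]
        self.b = [0.0] * n_out
        self.activation = activation
        self.trainable = True  # set to False to freeze the layer

    def forward(self, x):
        self.x = x
        z = [sum(w * xi for w, xi in zip(row, x)) + b
             for row, b in zip(self.W, self.b)]
        self.out = [math.tanh(v) for v in z] if self.activation == "tanh" else z
        return self.out

    def backward(self, grad_out, lr):
        # Gradient with respect to the pre-activation values.
        if self.activation == "tanh":
            grad_z = [g * (1.0 - o * o) for g, o in zip(grad_out, self.out)]
        else:
            grad_z = list(grad_out)
        # Gradient passed to the previous layer (uses weights before any update).
        grad_x = [sum(row[i] * gz for row, gz in zip(self.W, grad_z))
                  for i in range(len(self.x))]
        if self.trainable:  # frozen layers skip the weight update
            for row, gz in zip(self.W, grad_z):
                for i, xi in enumerate(self.x):
                    row[i] -= lr * gz * xi
            self.b = [b - lr * gz for b, gz in zip(self.b, grad_z)]
        return grad_x


def train_step(layers, x, target, lr=0.05):
    """One SGD step on squared error; returns the loss before the update."""
    out = x
    for layer in layers:
        out = layer.forward(out)
    grad = [2.0 * (o - t) for o, t in zip(out, target)]
    for layer in reversed(layers):
        grad = layer.backward(grad, lr)
    return sum((o - t) ** 2 for o, t in zip(out, target))


# Stand-in for the trained car model (random weights, purely illustrative).
car_model = [Dense(2, 4, "tanh"), Dense(4, 1, "linear")]

# Step 1: drop the car output layer and attach a new truck output layer.
truck_model = car_model[:-1] + [Dense(4, 1, "linear")]

# Step 3: freeze the layers kept from the car model.
for layer in truck_model[:-1]:
    layer.trainable = False

frozen_before = [row[:] for row in truck_model[0].W]
head_before = [row[:] for row in truck_model[-1].W]

# Hypothetical truck data: (inputs, target) pairs.
truck_data = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
              ([1.0, 0.0], [1.0]), ([1.0, 1.0], [2.0])]

losses = []
for epoch in range(500):
    losses.append(sum(train_step(truck_model, x, t) for x, t in truck_data))

print("frozen layer unchanged:", truck_model[0].W == frozen_before)
print("new output layer changed:", truck_model[-1].W != head_before)
print("loss: first epoch %.4f, last epoch %.4f" % (losses[0], losses[-1]))
```

Because the kept layer is frozen, training only has to fit the small new output layer, which is why fine-tuning is usually cheaper than training a whole model from scratch.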


About the Author

I have more than 10 years of experience in the IT industry. LinkedIn Profile

I am currently experimenting with neural networks in deep learning. I am learning Python, TensorFlow and Keras.

Author: I am an author of a book on deep learning.

Quiz: I run an online quiz on machine learning and deep learning.