Tuesday, 26 February 2019

What are Hyperparameters in Machine Learning Algorithms? How is Cross Validation used for Hyperparameter Tuning?

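Hyperparameters are the settings we choose for a Machine Learning algorithm before training begins. Unlike ordinary model parameters, they are not learned from the training data, yet their values strongly affect the performance and accuracy of the resulting model.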

For example, in the KNN algorithm we need to choose a value of K that gives good accuracy without underfitting or overfitting, so that the model generalizes well to unseen data. Choosing good hyperparameter values is therefore crucial in Machine Learning algorithms.

Related: How to choose optimal value of K in KNN Algorithm? 

Examples of Hyperparameters

1. K in KNN (Number of nearest neighbors in KNN algorithm)

2. K in K-Means Clustering (Number of clusters in K-Means Clustering algorithm)

3. Depth of a Decision Tree

4. Number of Leaf Nodes in a Decision Tree

5. Number of Trees in a Random Forest

6. Learning Rate (also called Step Size) in Gradient Descent (or Stochastic Gradient Descent)

7. Regularization Penalty (Lambda) in Ridge and Lasso Regression

Hyperparameters are also called "meta-parameters" and "free parameters".
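To make the difference between a hyperparameter and a learned parameter concrete, here is a minimal sketch based on item 7 above. It assumes a one-feature Ridge Regression with no intercept, where the learned weight has the closed form w = sum(x*y) / (sum(x*x) + lambda). The function name fit_ridge_1d and the toy data are hypothetical, chosen only for illustration.

```python
def fit_ridge_1d(xs, ys, lam):
    """Fit y ~ w * x with an L2 penalty lam * w**2 (no intercept).

    lam is the hyperparameter: we pick it before fitting.
    w is the model parameter: it is computed from the data.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

# Toy data lying exactly on the line y = 2x (illustrative only).
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w_no_penalty = fit_ridge_1d(xs, ys, lam=0.0)     # 28 / 14 = 2.0
w_with_penalty = fit_ridge_1d(xs, ys, lam=14.0)  # 28 / 28 = 1.0
print(w_no_penalty, w_with_penalty)
```

The same data gives a different learned weight for each value of lambda, which is why lambda has to be chosen by some procedure outside of the training itself.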

Hyperparameters and Cross Validation

Selecting the optimal values of hyperparameters is usually not an easy task. They are usually chosen using cross validation. Once chosen, the values remain fixed throughout the training of the model.

In K Fold Cross Validation, we split the training data into K folds (this K is the number of folds, not the K of KNN) and set the hyper-parameters to some candidate values. We then train our model K times, each time holding out a different fold as the test fold and training on the remaining K-1 folds. We note down the average performance of the model over all the test folds and repeat the whole process for another set of hyper-parameters. Finally, we choose the set of hyper-parameters that corresponds to the best average performance during cross-validation.
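The procedure above can be sketched in plain Python, here used to pick K for KNN. This is only an illustrative sketch, not a reference implementation: the helper names (knn_predict, k_fold_indices, cv_accuracy, make_toy_data), the synthetic two-class data, and the candidate values are all assumptions made for the example.

```python
import random
from collections import Counter

def knn_predict(train_X, train_y, x, k):
    """Classify x by majority vote among its k nearest training points."""
    # Rank training points by squared Euclidean distance to x.
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def k_fold_indices(n, n_folds, seed=0):
    """Shuffle the indices 0..n-1 and split them into n_folds disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]

def cv_accuracy(X, y, k, n_folds=5):
    """Average accuracy of KNN with the given k over all held-out folds."""
    scores = []
    for held_out in k_fold_indices(len(X), n_folds):
        held = set(held_out)
        # Train on every point that is not in the held-out fold.
        train_X = [X[i] for i in range(len(X)) if i not in held]
        train_y = [y[i] for i in range(len(X)) if i not in held]
        correct = sum(
            knn_predict(train_X, train_y, X[i], k) == y[i] for i in held_out
        )
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)

def make_toy_data(n=100, seed=1):
    """Two overlapping Gaussian blobs in 2D (synthetic, for illustration)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 2.0
        X.append((rng.gauss(center, 1.0), rng.gauss(center, 1.0)))
        y.append(label)
    return X, y

X, y = make_toy_data()
candidates = [1, 3, 5, 7, 9]  # candidate values of the hyperparameter K

# One cross-validated score per candidate, then keep the best one.
results = {k: cv_accuracy(X, y, k) for k in candidates}
best_k = max(results, key=results.get)
print(results, best_k)
```

Every candidate is scored on the same folds (the shuffle uses a fixed seed), so the averages are directly comparable.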

As you can see, the computation cost of this process depends heavily on the number of hyper-parameter sets that need to be considered, because every candidate set requires K training runs.
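A quick way to see how fast the cost grows is to count the training runs for a small grid. The grid below is hypothetical (the hyperparameter names and values are made up for illustration); the point is only that the number of runs is the number of combinations multiplied by the number of folds.

```python
from itertools import product

# Hypothetical search grid for a Random Forest style model.
grid = {
    "max_depth": [3, 5, 10],
    "n_leaf_nodes": [8, 16],
    "n_trees": [50, 100, 200],
}
n_folds = 5

# Every combination of values is one hyper-parameter set to evaluate.
combos = [dict(zip(grid, values)) for values in product(*grid.values())]

# Each set is trained once per fold.
total_fits = len(combos) * n_folds
print(len(combos), total_fits)  # 18 combinations, 90 training runs
```

Adding one more hyperparameter with four candidate values would multiply both numbers by four.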

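Finding good hyper-parameters helps to avoid, or at least reduce, overfitting. Keep in mind, however, that the hyper-parameters themselves can overfit the data used to select them, so the final model should be evaluated on a separate test set that played no part in the tuning.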
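The following small simulation is one way to see how hyper-parameters can overfit. It is a deliberately artificial setup, assumed only for illustration: the labels are coin flips and every "configuration" predicts at random, so no configuration has any real predictive power. Picking the configuration with the best validation score still tends to produce a score well above chance, while the same configuration on fresh test data stays close to 50%.

```python
import random

rng = random.Random(0)
n_val, n_test, n_configs = 40, 2000, 50

# Labels are pure coin flips, so there is nothing to learn.
val_labels = [rng.randint(0, 1) for _ in range(n_val)]
test_labels = [rng.randint(0, 1) for _ in range(n_test)]

def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == t for p, t in zip(preds, labels)) / len(labels)

val_scores, test_scores = [], []
for _ in range(n_configs):
    # Each configuration guesses at random on both data sets.
    val_preds = [rng.randint(0, 1) for _ in range(n_val)]
    test_preds = [rng.randint(0, 1) for _ in range(n_test)]
    val_scores.append(accuracy(val_preds, val_labels))
    test_scores.append(accuracy(test_preds, test_labels))

# Select the configuration purely by its validation score.
best = max(range(n_configs), key=lambda i: val_scores[i])
best_val_acc = val_scores[best]
best_test_acc = test_scores[best]
print(best_val_acc, best_test_acc)
```

The gap between the two printed numbers comes only from the selection step, which is why the score used for choosing hyper-parameters should not be reported as the final performance.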
