Monday, 11 February 2019

Why is Dimensionality Reduction required in Machine Learning?

Dimensionality Reduction is an important preprocessing step in Machine Learning. Its main advantages are:

1. Reduction in Computation Time: Fewer dimensions mean less computation, so the algorithm takes less time to train.

2. Improves Algorithm Performance: Some algorithms do not perform well when the dataset has a large number of dimensions. Reducing the number of dimensions can improve the performance of such algorithms.

3. Removes Multicollinearity and Correlated Variables: Multicollinearity occurs when independent variables in a model are correlated with each other. This correlation is a problem because independent variables should be independent. Dimensionality Reduction takes care of multicollinearity by removing redundant features.

For example, suppose you have two variables: 'time spent on treadmill in minutes' and 'calories burnt'. These variables are highly correlated, because the more time you spend running on a treadmill, the more calories you burn. Hence, there is little point in storing both, as one of them carries most of the information you require.

4. Better Data Visualization: It is very difficult to visualize data in higher dimensions, so reducing the space to 2D or 3D allows us to plot the data and observe patterns more clearly.

5. Less Storage Required: The space required to store the data decreases as the number of dimensions comes down.
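A few of the points above can be illustrated with a small sketch. The data here is made up: the three columns (minutes on the treadmill, calories burnt and heart rate), their values and the helper name pca_project are all assumptions chosen to mirror the treadmill example, not measurements or a standard API. The sketch uses only NumPy. It checks the correlation between two features, drops the redundant one, compares the storage before and after, and projects the data to 2D with a simple PCA computed through SVD.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 treadmill sessions (all values are invented for illustration).
minutes = rng.uniform(10, 60, size=200)
calories = 10.0 * minutes + rng.normal(0, 5, size=200)  # almost a linear function of minutes
heart_rate = rng.normal(130, 15, size=200)              # generated independently of the others

X = np.column_stack([minutes, calories, heart_rate])

# Point 3: measure how strongly 'minutes' and 'calories' are correlated.
corr = np.corrcoef(X, rowvar=False)
r = corr[0, 1]
print(f"correlation(minutes, calories) = {r:.3f}")

# The two columns are nearly redundant, so keep only one of them.
X_selected = np.delete(X, 1, axis=1)

# Point 5: fewer columns need less memory.
print("bytes before:", X.nbytes, "bytes after:", X_selected.nbytes)


def pca_project(data, n_components):
    """Project data onto its first n_components principal components (via SVD)."""
    # Standardize so that no feature dominates only because of its units.
    standardized = (data - data.mean(axis=0)) / data.std(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(standardized, full_matrices=False)
    return standardized @ vt[:n_components].T


# Point 4: reduce the original three features to 2D so they can be plotted.
X_2d = pca_project(X, 2)
print("shape for plotting:", X_2d.shape)
```

With 200 rows of 64-bit floats, the three-column array takes 4800 bytes and the two-column array takes 3200 bytes. The X_2d array can be passed to any plotting library as x and y coordinates. Dropping a column is a form of feature selection, while the PCA projection is a form of feature extraction.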

Related: Dimensionality Reduction: Feature Selection and Feature Extraction
