Online Machine Learning Quiz

100+ objective Machine Learning questions. Let's see how many you can answer!

Start Quiz

Friday, 10 May 2019

Online Machine Learning Quiz (Objective Questions)

Machine Learning is a revolutionary technology that has changed our lives to a great extent. Machines now learn from data much like humans do. Scientists and researchers are exploring many opportunities in this field, and businesses are reaping huge profits from it.

Keeping that in mind, I have created an online Machine Learning quiz that will help you sharpen your ML skills. This quiz contains a lot of multiple choice (objective) questions on Machine Learning.


This ML quiz contains objective questions on the following Machine Learning concepts:

1. Data Exploration and Visualization: Hypothesis Generation, Seaborn, Matplotlib, Bar Plot, Box Plot, Histogram, Heatmap, Scatter Plot, Regression Plot, Joint Plot, Distribution Plot, Strip Plot, Violin Plot, KDE, Pair Plot, Pair Grid, Facet Grid etc.

2. Data Wrangling: Missing values, Invalid and corrupted values, Outliers, Skewed data, Feature Scaling, Standardization, Normalization, Binning, Feature Encoding, Label Encoder, One Hot Encoder etc.

3. Dimensionality Reduction: Finding correlation, Feature Selection and Feature Extraction, PCA, t-SNE, SVD, LDA, MDS, ICA etc.

4. Algorithms: Supervised and Unsupervised Learning, Linear Regression, Logistic Regression, KNN, SVM, Naive Bayes, Decision Tree, K-Means Clustering etc.

5. Overfitting: Overfitting, Underfitting, Bias, Variance, Cross-validation etc.

6. Ensemble Learning: Bagging, Boosting, Random Forest, Adaboost, GBM (Gradient Boosting Machine), XGBoost (Extreme Gradient Boosting) etc.

7. Regularization: Ridge Regression (L2 Regularization), Lasso Regression (L1 Regularization), Elastic Net Regression etc.

8. Accuracy Measurement: Confusion Matrix, Classification Report, Accuracy Score, F1 Score, Mean Absolute Error, Mean Square Error, Root Mean Square Error etc.

9. Python: Basic data structures, and libraries like Scikit-learn, Pandas, NumPy, SciPy, Seaborn, Matplotlib etc.

Rules and Guidelines

1. All questions are objective type with 4 options, of which only one is correct.

2. 20 seconds are allotted for each question.

3. A correct answer gives you 4 marks and a wrong answer takes away 1 mark (25% negative marking).

4. We will take short breaks during the quiz after every 10 questions.

5. The passing score is 75%. The quiz contains very simple Machine Learning objective questions, so I think 75% can be scored easily.

6. Please don't refresh the page or click any other link during the quiz.

7. Please don't use Internet Explorer to run this quiz.

Helplines

There are 4 helplines given in this quiz:

1. Weed Out

2. Blink

3. Magic Wand

4. Hands Up

You can use one helpline per question, except "Hands Up". Below is a description of each of these helplines:

1. Weed Out

"Weed Out" helpline weeds out two incorrect options. So, now you have to guess the answer only from 2 options from which one is the right answer.

2. Blink

Keep your eyes wide open while using the "Blink" helpline. "Blink" first lights the bulb against the correct option, and then, within a fraction of a second (100 milliseconds), it goes on lighting the bulbs against the wrong options. You have to identify which option's bulb was lit first.

3. Magic Wand

This is the most flexible helpline, in which you have nothing to do. Just click on the "Magic Wand" and you magically get the right answer.

4. Hands Up

By using "Hands Up" helpline, you are not adding up score but saving your quiz time. You can use it as many times you want. I would suggest you to use this helpline when you have exhausted all your other helplines. If you find a question whose answer is not clear to you, and you don’t have any helpline left, please don’t waste time on that question and just raise your hands to save your time.

Quit Quiz

The quiz contains a lot of objective questions on Machine Learning, which will take a lot of time and patience to complete. If you feel tired at any point and don't want to continue, you can simply quit the quiz, and your results will be displayed based on the number of questions you went through.

Quiz Results

At the end of the quiz, you will see your score and the time taken to complete the quiz. You can take a screenshot of the result for future reference.

Contribute

Please email me more Machine Learning questions that can be included in this quiz.
Please also email me your feedback and suggestions to improve this quiz.

Difference between Decision Tree and Random Forest in Machine Learning

Random Forest is a collection of Decision Trees. A Decision Tree makes its final decision based on the output of a single tree, whereas a Random Forest combines the outputs of a large number of small trees when making its final prediction. Following is a detailed list of differences between Decision Tree and Random Forest:

1. Random Forest is an Ensemble Learning (Bagging) technique, unlike Decision Tree: In a Decision Tree, only one tree is grown using all the features and observations. In a Random Forest, the features and observations are split into multiple random subsets, and many small trees (instead of one big tree) are grown on those subsets. So, instead of one full tree like a Decision Tree, Random Forest uses multiple trees. The larger the number of trees, the better the accuracy and generalization capability; but at some point, increasing the number of trees no longer contributes to accuracy, and one should stop growing the forest there.

2. Random Forest uses a voting system, unlike Decision Tree: All the trees grown in a Random Forest are called weak learners. Each weak learner casts a vote as per its prediction, and the class which gets the maximum votes is taken as the final prediction. You can think of it like a democracy. On the other hand, there is no voting system in a Decision Tree: only one tree predicts the outcome. No democracy at all!

3. Random Forest rarely overfits, unlike Decision Tree: A Decision Tree is very prone to overfitting, as there is only one tree responsible for predicting the outcome. If there is a lot of noise in the dataset, the tree starts fitting the noise while building the model, which leads to very low bias (or no bias at all) on the training data but a lot of variance in its predictions on real-world data. This scenario is called overfitting. In a Random Forest, noise plays very little role in spoiling the model, since there are many trees and the noise cannot affect all of them.

4. Random Forest reduces variance instead of bias: Random Forest reduces the variance part of the error rather than the bias part, so on a given training dataset, a Decision Tree may be more accurate than a Random Forest. But on an unseen validation dataset, the Random Forest usually wins in terms of accuracy.

5. Performance: The downside of Random Forest is that it can be slow in a single process, since many trees must be grown; however, the work parallelizes easily because the trees are independent of each other.

6. Decision Tree is easier to understand and interpret: A Decision Tree is simple and easy to interpret. You know which variable, and which value of that variable, is used to split the data and predict the outcome. A Random Forest, on the other hand, is like a black box. You can specify the number of trees you want in your forest (n_estimators) and the maximum number of features to be used in each tree, but you cannot control the randomness: you cannot control which feature is part of which tree in the forest, or which data point is part of which tree.
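The bias-variance contrast above can be seen in a small sketch (a minimal example using scikit-learn's built-in iris dataset; the dataset choice and parameter values are just for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

# Load a small sample dataset and hold out a test set
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# A single, fully grown tree: fits the training data perfectly (low bias)
tree = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# A bagged ensemble of 100 small trees: majority voting reduces variance
forest = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)

print('Tree   train/test accuracy:', tree.score(X_train, y_train), tree.score(X_test, y_test))
print('Forest train/test accuracy:', forest.score(X_train, y_train), forest.score(X_test, y_test))
```

The single tree typically scores 100% on its own training data, a symptom of the overfitting discussed in point 3, while the forest's test accuracy holds up better on unseen data.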

Thursday, 9 May 2019

Advantages and Disadvantages of Linear Regression in Machine Learning

Linear Regression is a supervised machine learning algorithm that is very easy to learn and implement. Following are the advantages and disadvantages of Linear Regression:

Advantages of Linear Regression

1. Linear Regression performs well when the relationship in the dataset is linear. We can use it to find the nature of the relationship among the variables.

2. Linear Regression is easy to implement and interpret, and very efficient to train.

3. Linear Regression is prone to overfitting, but this can easily be avoided using dimensionality reduction techniques, regularization (L1 and L2) and cross-validation.

Disadvantages of Linear Regression

1. The main limitation of Linear Regression is its assumption of linearity between the dependent variable and the independent variables. In the real world, relationships are rarely perfectly linear, so assuming a straight-line relationship between the dependent and independent variables is often incorrect.

2. Prone to noise and overfitting: If the number of observations is less than the number of features, Linear Regression should not be used; otherwise it may overfit, because it starts fitting noise in this scenario while building the model.

3. Prone to outliers: Linear Regression is very sensitive to outliers (anomalies), so outliers should be analyzed and removed before applying Linear Regression to the dataset.

4. Prone to multicollinearity: Before applying Linear Regression, multicollinearity should be removed (using dimensionality reduction techniques) because the model assumes that there is no relationship among the independent variables.

In summary, Linear Regression is a great tool for analyzing relationships among variables, but it isn't recommended for many practical applications because it over-simplifies real-world problems by assuming a linear relationship among the variables.
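To make the interpretability advantage concrete, here is a minimal sketch of fitting a linear model with scikit-learn on synthetic data (the data-generating relationship y = 3x + 5 is made up for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data with a known linear relationship: y = 3x + 5, plus noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3 * X[:, 0] + 5 + rng.normal(0, 0.5, size=100)

model = LinearRegression().fit(X, y)

# The fitted slope and intercept should come out close to the true values (3 and 5),
# which is exactly why the model is so easy to interpret
print('slope:', model.coef_[0])
print('intercept:', model.intercept_)
```

Each learned coefficient directly tells you how much the target changes per unit change in that feature, something a Random Forest cannot offer.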

Tuesday, 30 April 2019

Data Exploration and Visualization Techniques in Python

Data Exploration and Visualization is the first step in the process of creating a robust Machine Learning model. We need to understand and explore the data using various graphs and plots present in matplotlib and seaborn libraries. This step takes a lot of time and patience. 

Plots and graphs help us to analyze relationships among various variables present in the dataset. We can visualize and analyze missing values, outliers, skewed data, correlation among variables etc. 

Main Python libraries used in data exploration and visualization are pandas, matplotlib and seaborn.

There are mainly three types of analysis: Univariate, Bivariate and Multivariate.

Some commonly used plots and graphs are: Joint Plot, Distribution Plot, Box Plot, Bar Plot, Regression Plot, Strip Plot, Heatmap, Violin Plot, Pair Plot, Pair Grid and Facet Grid.

Visualize missing values
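As a minimal sketch, missing values can be counted with pandas and visualized with a seaborn heatmap (the sample DataFrame and its column names are made up for illustration):

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# A small sample DataFrame with some missing values
df = pd.DataFrame({'age':    [25, np.nan, 30, 41],
                   'salary': [50000, 60000, np.nan, np.nan],
                   'city':   ['Delhi', 'Mumbai', 'Pune', None]})

# Count missing values per column
print(df.isnull().sum())

# Heatmap of df.isnull(): each highlighted cell marks a missing value
sns.heatmap(df.isnull(), cbar=False)
plt.show()
```

The `isnull()` mask is a DataFrame of booleans, so the heatmap gives a quick visual map of where the gaps in the data are.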

Wednesday, 24 April 2019

Tuples in Python: Indexing, Slicing, Packing, Unpacking, Concatenation, Repetition, Comparison, Membership, Iteration

Tuples are one of the basic data types in Python. Tuples are widely used in Python programming and have very simple syntax. In this article, we will see various Tuple operations like indexing, slicing, packing, unpacking, comparison, concatenation, repetition, updating and deleting a tuple, in-built functions for tuples, membership, iteration etc.

You can also download my Jupyter notebook containing the code below.

Declaration

tup1 = ()
tup2 = (50, )
tup3 = (50, 8)
tup4 = 'a', 'b', 'c', 'd'
x, y = 1, 2
print(tup1, tup2, tup3, tup4)
print(x, y)

Output

() (50,) (50, 8) ('a', 'b', 'c', 'd')
1 2

Indexing and Slicing

tup5 = ('a', 'b', 100, 'abc', 'xyz', 2.5)
tup6 = (1, 2, 3, 4, 5, 6, 7)
print(tup5[0], tup5[1], tup5[-1], tup5[-2])
print(tup6[0:4])
print(tup6[:])
print(tup6[:4])
print(tup6[1:])
print(tup6[1:4])
print(tup6[1:-1])
print(tup6[1:-2])

Output

a b 2.5 xyz
(1, 2, 3, 4)
(1, 2, 3, 4, 5, 6, 7)
(1, 2, 3, 4)
(2, 3, 4, 5, 6, 7)
(2, 3, 4)
(2, 3, 4, 5, 6)
(2, 3, 4, 5)

Packing and Unpacking

In packing, we place values into a new tuple, while in unpacking we extract those values back into variables.

x = ('Google', 208987, 'Software Engineer')
print(x[1])
print(x[-1])
(company, emp_no, profile) = x
print(company)
print(emp_no)
print(profile)

Output

208987
Software Engineer
Google
208987
Software Engineer

Comparison

Tuples are compared element by element (lexicographically): Python compares the first elements of both tuples, and only if they are equal does it move on to the next elements.

a = (5, 6)
b = (1, 4)
if(a > b): print('a is bigger')
else: print('b is bigger')

Output: a is bigger

a = (5, 6)
b = (5, 4)
if(a > b): print('a is bigger')
else: print('b is bigger')

Output: a is bigger

a = (5, 6)
b = (6, 4)
if(a > b): print('a is bigger')
else: print('b is bigger')

Output: b is bigger

Concatenation

a = (1, 1.5)
b = ('abc', 'xyz')
c = a + b
print(c)

Output: (1, 1.5, 'abc', 'xyz')

Repetition

a = (1, 1.5)
b = a * 3
print(b)

Output: (1, 1.5, 1, 1.5, 1, 1.5)

Update Tuple

Tuples are immutable, which means you cannot update or change the values of tuple elements. Item assignment is not supported.

a = (1, 1.5)
b = ('abc', 'xyz')
a[0] = 2  # TypeError: 'tuple' object does not support item assignment

Delete Tuple

Individual tuple elements cannot be deleted, since tuples are immutable, but deleting a tuple entirely is possible using the keyword "del".

a = (5, 6)
print(a)
del a
print(a)  #NameError: name 'a' is not defined

In-built Functions

a = (5, 2, 8, 3, 6, 2, 5, 5)
print('Length:', len(a))
print('Min:', min(a))
print('Max:', max(a))
print('Count of 5:', a.count(5))
print('Index of 2:', a.index(2))
print('Sorted:', sorted(a))
print('Tuple:', tuple(a))
print('List:', list(a))

Output

Length: 8
Min: 2
Max: 8
Count of 5: 3
Index of 2: 1
Sorted: [2, 2, 3, 5, 5, 5, 6, 8]
Tuple: (5, 2, 8, 3, 6, 2, 5, 5)
List: [5, 2, 8, 3, 6, 2, 5, 5]

Membership

3 in (1, 2, 3)

Output: True

tuple_alphabets = ('a', 'b', 'c', 'd', 'e')
if 'c' in tuple_alphabets:
    print('Found')
else:
    print('Not Found')

Output: Found

Iteration

Iterating through a tuple is faster than iterating through a list, since tuples are immutable.

for x in (1, 2, 3):
    print(x)

Output

1
2
3

Tuple in Dictionary

Calling the items() method on a dictionary returns its key-value pairs, where each pair is a tuple.

a = {'x':100, 'y':200}
b = a.items()
c = list(a.items())
print(a)
print(b) 
print(c)

Output

{'x': 100, 'y': 200}
dict_items([('x', 100), ('y', 200)])
[('x', 100), ('y', 200)]