Mean, median, and mode are three types of "averages". They are very useful in the data wrangling step, when we have to impute missing values in numeric variables.

**Mean**: obtained by adding up all the values in the list of numbers and then dividing the resulting sum by the number of values in the list.

**Median**: the middle value in the list of numbers. To find the median, the list of numbers must first be sorted.

**Mode**: the value that occurs most often. If no number in the list is repeated, then there is no mode for the list.

**Example**

Let's say we have a list of 7 numbers as follows:

72, 2, 6, 32, 55, 15, 9

**Mean**: (72+2+6+32+55+15+9) / 7 = **27.2857**

**Median**: Arranging the above list in ascending order, we get:

2, 6, 9, **15**, 32, 55, 72

We see that the middle element of this ordered list is the number 15. Therefore, 15 is the median of the list.
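The same calculation can be sketched in Python. This is only an illustrative example; the variable names are made up for this post, and the simple middle-index lookup assumes the list has an odd number of elements.

```python
import statistics

values = [72, 2, 6, 32, 55, 15, 9]

# Mean: sum of all values divided by how many values there are.
mean_value = sum(values) / len(values)

# Median: sort the list, then take the middle element.
# This direct lookup assumes an odd number of elements.
ordered = sorted(values)
median_value = ordered[len(ordered) // 2]

print(round(mean_value, 4))  # 27.2857
print(median_value)          # 15

# The standard library gives the same median.
print(statistics.median(values) == median_value)  # True
```

For a list with an even number of elements, `statistics.median` returns the average of the two middle values.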

**Difference between Mean and Median**

In the ordered list above, if we change the last number, 72, to 99999, the mean becomes:

(2+6+9+15+32+55+99999) / 7 = 14302.5714

but the median of this list remains the same, because in this ordered list

2, 6, 9, **15**, 32, 55, 99999

the middle element is still 15.
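A quick way to see this effect is to compute both averages before and after the change. The snippet below is a small sketch using Python's standard `statistics` module; the variable names are only for illustration.

```python
import statistics

original = [2, 6, 9, 15, 32, 55, 72]
with_outlier = [2, 6, 9, 15, 32, 55, 99999]  # 72 replaced by 99999

# How far each average moves when the single extreme value is introduced.
mean_shift = statistics.mean(with_outlier) - statistics.mean(original)
median_shift = statistics.median(with_outlier) - statistics.median(original)

print(round(statistics.mean(with_outlier), 4))  # 14302.5714
print(statistics.median(with_outlier))          # 15
print(median_shift)                             # 0
```

The mean moves by thousands while the median does not move at all, which is why the median is often the safer choice when a variable contains extreme values.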

**Relation between Mean and Median**

If we have a list where every number is equidistant from its predecessor (or successor), like:

3, 6, 9, 12, 15

Then the mean

(3+6+9+12+15) / 5 = 9

and the median (middle element is 9) are equal.
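This can be checked for other evenly spaced lists as well. The helper `evenly_spaced` below is a hypothetical function written just for this check, not part of any library.

```python
import statistics


def evenly_spaced(start, step, count):
    """Build a list where each number is `step` away from the previous one."""
    return [start + step * i for i in range(count)]


seq = evenly_spaced(3, 3, 5)
print(seq)                     # [3, 6, 9, 12, 15]
print(statistics.mean(seq))    # 9
print(statistics.median(seq))  # 9

# The same holds when the list has an even number of elements.
even_seq = evenly_spaced(1, 2, 4)  # [1, 3, 5, 7]
print(statistics.mean(even_seq) == statistics.median(even_seq))  # True
```

The two averages agree because the values are spread symmetrically around the middle of the list.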

**Mode**: In the lists above, there is no mode, as all the numbers are unique (no number is repeated). But consider the following list:

2, 5, 3, 2, 2, 8, 7, 2, 5, 2

In the above list, the number **2** occurs five times, more often than any other value. So, the mode of this list is 2.

So, during data wrangling, we can use any of the above three methods, depending on how the values of a particular numeric variable are distributed.
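As a rough illustration of how these averages can be used for imputation, here is a minimal sketch in plain Python. The function names `mode_or_none` and `impute`, and the sample column with `None` standing in for missing values, are assumptions made for this example; in practice a library such as pandas would typically be used.

```python
import statistics
from collections import Counter


def mode_or_none(values):
    """Return the most frequent value, or None if no value is repeated."""
    if not values:
        return None
    # On ties, Counter returns the value that was encountered first.
    value, count = Counter(values).most_common(1)[0]
    return value if count > 1 else None


def impute(values, strategy="median"):
    """Replace None entries with the chosen average of the observed values."""
    observed = [v for v in values if v is not None]
    if strategy == "mean":
        fill = statistics.mean(observed)
    elif strategy == "median":
        fill = statistics.median(observed)
    elif strategy == "mode":
        # If there is no mode, fill is None and the gaps are left as they are.
        fill = mode_or_none(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


print(mode_or_none([2, 5, 3, 2, 2, 8, 7, 2, 5, 2]))  # 2
print(mode_or_none([72, 2, 6, 32, 55, 15, 9]))       # None

# Made-up column with two missing values.
column = [2, 5, None, 2, 2, 8, None, 2, 5, 2]
print(impute(column, "mode"))  # [2, 5, 2, 2, 2, 8, 2, 2, 5, 2]
print(impute(column, "mean"))  # [2, 5, 3.5, 2, 2, 8, 3.5, 2, 5, 2]
```

Which strategy fits best depends on the variable: the mean suits roughly symmetric data, the median is more robust to extreme values, and the mode suits variables where a few values repeat often.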
