
The mean is often referred to as the average. To calculate the mean, add all the scores in a data set, then divide the total by the number of scores.

To find the mean from a graphical representation, we can use a frequency table to list the values shown on the graph. Consider the histogram below:

We can construct a frequency table like the one below:

Score ($x$) | Frequency ($f$) | $xf$ |
---|---|---|
$1$ | $3$ | $3$ |
$2$ | $8$ | $16$ |
$3$ | $5$ | $15$ |
$4$ | $3$ | $12$ |
$5$ | $1$ | $5$ |

The mean is calculated by dividing the sum of the last column ($xf$) by the sum of the second column ($f$): $\frac{51}{20}=2.55$.
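The calculation above can be sketched in a few lines of Python. This is only an illustration: the function name `mean_from_frequency_table` and the dictionary layout are assumptions made for the example, while the scores and frequencies come from the table above.

```python
# Frequency table from the lesson, stored as {score: frequency}.
frequency_table = {1: 3, 2: 8, 3: 5, 4: 3, 5: 1}


def mean_from_frequency_table(table):
    """Return sum(x * f) / sum(f) for a {score: frequency} mapping."""
    # Sum of the xf column: each score multiplied by how often it occurs.
    total_of_scores = sum(score * freq for score, freq in table.items())
    # Sum of the f column: the total number of scores.
    number_of_scores = sum(table.values())
    return total_of_scores / number_of_scores


mean = mean_from_frequency_table(frequency_table)
print(mean)  # 2.55
```

Multiplying each score by its frequency gives the same total as adding every individual score one at a time, which is why the $xf$ column is useful.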

The median is one way of describing the middle or the centre of a data set using a single value. The median is the **middle score** in a data set.

Suppose we have five numbers in our data set: $4$, $11$, $15$, $20$ and $24$.

The median would be $15$ because it is the value right in the middle. There are two numbers on either side of it.

$4,11,\boxed{15},20,24$

If we have an even number of terms, we will need to find the average of the middle two terms. Suppose we wanted to find the median of the set $2,3,6,9$. We want the value halfway between $3$ and $6$. The average of $3$ and $6$ is $\frac{3+6}{2}=\frac{9}{2}$, or $4.5$, so the median is $4.5$.

$2,3,\boxed{4.5},6,9$
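Both cases (an odd or an even number of terms) can be handled by sorting the scores and looking at the middle position. The sketch below is one possible way to write this; the function name `median` is chosen for the example, and the two data sets are the ones used above.

```python
def median(scores):
    """Return the middle score, or the average of the two middle scores."""
    ordered = sorted(scores)  # the data must be in ascending order first
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of terms: exactly one score sits in the middle.
        return ordered[middle]
    # Even number of terms: average the two middle scores.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([4, 11, 15, 20, 24]))  # 15
print(median([2, 3, 6, 9]))  # 4.5
```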

If we have a larger data set, however, we may not be able to see right away which term is in the middle. We can use the "cross out" method.

Once a data set is ordered, we can cross out numbers in pairs (one high number and one low number) until there is only one number left. Let's check out this process using an example. Here is a data set with nine numbers:

- Check that the data is sorted in ascending order (i.e. in order from smallest to largest).

- Cross out the smallest and the largest number, like so:

- Repeat step 2, working from the outside in and crossing out the smallest and the largest remaining number each time, until there is only one term left. We can see in this example that the median is $7$:

Note that this process will only leave one term if there is an odd number of terms to start with. If there is an even number of terms, this process will leave two terms instead. If you cross them all out, you've gone too far! To find the median of a set with an even number of terms, we take the mean of these two remaining middle terms.
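The cross out method can also be written as a short loop. The sketch below is an illustration only: the function name `median_by_crossing_out` is an assumption, and the nine-number list is a made-up example (not the data set pictured in the lesson), chosen so that its median is also $7$.

```python
def median_by_crossing_out(scores):
    """Find the median by repeatedly removing the smallest and largest score."""
    if not scores:
        raise ValueError("need at least one score")
    remaining = sorted(scores)  # step 1: ascending order
    # Steps 2 and 3: cross out one low and one high number,
    # stopping while one or two terms are still left.
    while len(remaining) > 2:
        remaining = remaining[1:-1]
    if len(remaining) == 1:
        return remaining[0]  # odd number of terms: one term left
    # Even number of terms: take the mean of the two terms left.
    return (remaining[0] + remaining[1]) / 2


# Hypothetical nine-number data set, not the one from the lesson's figure.
example = [1, 3, 4, 6, 7, 8, 9, 12, 15]
print(median_by_crossing_out(example))  # 7
print(median_by_crossing_out([2, 3, 6, 9]))  # 4.5
```

The loop condition `len(remaining) > 2` is what stops us from going too far and crossing out every term.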

The idea behind the cross out method can also be used with graphical representations, by crossing off data points from each side.

The mode describes the most frequently occurring score.

Suppose that $10$ people were asked how many pets they had. $2$ people said they didn't own any pets, $6$ people had one pet and $2$ people said they had two pets.

In this data set, the most common number of pets that people have is one pet, and so the mode of this data set is $1$.

A data set can have more than one mode if two or more scores tie as the most frequently occurring.
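Counting how often each score occurs makes the mode easy to read off, including the case of a tie. The sketch below uses Python's `collections.Counter`. The function name `modes` is chosen for the example, the pets data comes from the survey above, and the second list is a made-up data set that shows a tie.

```python
from collections import Counter


def modes(scores):
    """Return every score that occurs with the highest frequency."""
    counts = Counter(scores)  # maps each score to its frequency
    highest = max(counts.values())
    return sorted(score for score, count in counts.items() if count == highest)


# Survey from the lesson: 2 people with no pets, 6 with one pet, 2 with two pets.
pets = [0] * 2 + [1] * 6 + [2] * 2
print(modes(pets))  # [1]

# Hypothetical data set where two scores tie for most frequent.
print(modes([1, 1, 2, 2, 3]))  # [1, 2]
```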

The range of a numerical data set is the difference between the smallest and largest scores in the set. The range is one type of measure of spread.

For example, at one school the ages of students in Year $7$ vary between $11$ and $14$. So the range for this set is $14-11=3$.

As a different example, if we looked at the ages of people waiting at a bus stop, the youngest person might be a $7$ year old and the oldest person might be a $90$ year old. The range of this set of data is $90-7=83$, which is a much larger range of ages.
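The two examples above can be checked with a short sketch. The function name `data_range` is an assumption made for the example. The lesson only gives the youngest and oldest ages, so each list below contains just those two values; any ages in between would not change the result.

```python
def data_range(scores):
    """Return the difference between the largest and smallest score."""
    return max(scores) - min(scores)


year_7_ages = [11, 14]  # only the extremes are given in the lesson
bus_stop_ages = [7, 90]  # only the extremes are given in the lesson

print(data_range(year_7_ages))  # 3
print(data_range(bus_stop_ages))  # 83
```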

Measures of spread

The range of a numerical data set is given by:

$\text{Range}=\text{maximum score}-\text{minimum score}$

Measures of center

Mean

- The numerical average of a data set: the sum of the data values divided by the number of data values
- Appropriate for sets of data where there are no values much higher or lower than those in the rest of the data set

Median

- The middle value of a data set ranked in order
- A good choice when data sets have a couple of values much higher or lower than most of the others

Mode

- The data value that occurs most frequently
- A good descriptor to use when the set of data has some identical values, when data is non-numeric (categorical) or when data reflects the most popular item
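To see why the median can be a better choice than the mean when a data set has a value much higher than the rest, we can compare the two measures before and after such a value appears. The sketch below uses Python's built-in `statistics` module. Both data sets are made up for illustration, and `compare_centres` is a hypothetical helper name.

```python
import statistics


def compare_centres(scores):
    """Return the (mean, median) pair for a list of scores."""
    return statistics.mean(scores), statistics.median(scores)


# Hypothetical data: the second set swaps the largest score for an extreme one.
typical = [3, 4, 5, 6, 7]
with_extreme_value = [3, 4, 5, 6, 70]

mean_before, median_before = compare_centres(typical)
mean_after, median_after = compare_centres(with_extreme_value)

print("mean:", mean_before, "then", mean_after)
print("median:", median_before, "then", median_after)
```

In this example the mean rises from $5$ to $17.6$, while the median stays at $5$, so the median still describes a typical score.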

Find the range of the following set of scores:

$20,19,3,19,18,3,16,3$

Answer the following given this set of scores:

$9,4,14,19,20,15,12$

Sort the scores in ascending order.

Find the total number of scores.

Find the median.

Find the range.

Assess how various changes to data sets alter their characteristics.

Consider the set of data:

$1,2,2,4,4,5,6,6,8,8,8,9,9$

If one score of $8$ is changed to a $9$, which of the following would be altered? There may be more than one correct option.

- A: Median
- B: Mean
- C: Range
- D: Mode

Consider this set of data that represents the number of apps on six people’s phones.

$11,12,15,17,19,19$

If each person downloads another $7$ apps, which of the following would change? There may be more than one correct option.

- A: Mode
- B: Mean
- C: Range
- D: Median