# Measures of relative standing - z-score, quartiles, percentiles

##### Introduction
###### Lessons
1. Z-Score
2. Quartiles
3. InterQuartile Range
4. Percentiles
##### Examples
###### Lessons
1. Using Z-score to Compare the Variation in Different Populations
Charlie got a mark of 85 on a math test which had a mean of 75 and a standard deviation of 5. Daisy got a mark of 75 on an English test which had a mean of 69 and a standard deviation of 2. Relative to their respective mean and standard deviation, who got the better grade?
2. Determining the Quartiles
Find the quartiles for each data set:
a) {9, 3, 7, 5, 2, 8, 12}
b) {2, 3, 5, 7, 8, 9, 12, 15}
c) {2, 3, 5, 7, 8, 9, 12, 15, 35}
3. Interquartile Range & Box-and-Whisker Plot
For the data set: {8, 2, 20, 4, 9, 5, 6, 12, 10, 1}
a) Determine the quartiles.
b) Find the interquartile range.
c) Construct a box-and-whisker plot.
d) Which data points, if any, are outliers?
4. Determining the Percentile
Sidney is taking a biology course in university. She got a mark of 78% and the list of all marks from her class (including her mark) is given by {56, 83, 74, 67, 47, 54, 82, 78, 86, 90}.
a) What percentile did she score in?
b) Sidney's friend Billy knows he scored in the 70th percentile. What was his mark?

## Measures of relative standing

A measure of relative standing describes where a specific value sits in relation to the rest of the values in its data set, and it also lets you compare values that come from different data sets. In practice, these measures re-scale a data set and its distribution so that the data can be compared meaningfully, either within the set itself or against other data sets scaled in the same way. Because they focus on the relative position of a data value within the data set, they are also called measures of location or measures of position.
The three basic measures of relative standing are the z-score (also called the standard score), the percentiles (and their percentile rank) and quartiles.

#### What is a z score?

The z score tells us how far a data point is from the mean of its set, measured in units of the set's standard deviation. In other words, once you have calculated the mean and the standard deviation of a data set, you can calculate how many standard deviations separate each data point from the mean; that number is the z score of the value.

#### • $\quad$ What does z score mean

The z score definition above may seem simple, but the idea behind it is quite powerful, so let us expand on it. The z score, also called the standard score or the standardized score, is used to re-scale a data set and its distribution so that we can meaningfully compare it with others. What does that mean? Imagine you are collecting statistical data about the people of the city of Richmond from records kept by official government agencies or specialized companies. After some research you obtain data on the ages of the city's population, the rate of car ownership, how many residents have a professional degree and how many own a house. For each of these data sets (which we assume are normally distributed) you can create frequency distributions and histograms, but when you try to compare them you run into a problem: no matter how well made your distribution graphs are, you cannot compare them accurately, because each data set is based on a different sample of the population, and so their proportions do not match. This is where normalization and the z-score come into play! By calculating the z score of the values in each data set, you produce re-scaled distributions that can be laid on top of each other for comparison.

The process can get quite complicated, so let us start with the basic calculation of the z score. Once we have learned more about the normal distribution, we can come back to using the z score for more difficult comparisons between unrelated data sets.

#### • $\quad$ How to calculate a z score

To calculate the z score of a value in a population, we use the following formula (equation 1):

$\large Z_{x} = \frac{x- \mu}{\sigma}$

Where:
$Z_x$ = Z score
$x$ = the data value
$\mu$ = mean of the data set
$\sigma$ = standard deviation of the data set (which is a population in this case)

Equation 1 is also called the standard score formula, and it represents the mathematical definition of the z score.
Accordingly, the z score equation for a sample (equation 2) is defined as:

$\large Z_{x} = \frac{x- \bar{x}}{s}$

Where:
$Z_x$ = Z score
$x$ = the data value
$\bar{x}$ = mean of the data set
$s$ = standard deviation of the data set (which is a sample in this case)
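
As a rough illustration of the difference between the population and sample formulas, the sketch below computes both z scores for the same value. The data set and the helper name `z_score` are hypothetical and chosen only for this example; the standard deviations come from Python's standard `statistics` module (`pstdev` for a population, `stdev` for a sample).

```python
import statistics


def z_score(x, mean, std_dev):
    """Return how many standard deviations x lies from the mean."""
    if std_dev == 0:
        raise ValueError("standard deviation must be non-zero")
    return (x - mean) / std_dev


# Hypothetical data set, used only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Treating the data as a whole population: mu and sigma.
mu = statistics.mean(data)          # 5
sigma = statistics.pstdev(data)     # 2.0
z_population = z_score(9, mu, sigma)

# Treating the same data as a sample: x-bar and s.
x_bar = statistics.mean(data)       # 5
s = statistics.stdev(data)          # about 2.14, divides by n - 1
z_sample = z_score(9, x_bar, s)

print(z_population)        # 2.0
print(round(z_sample, 3))  # 1.871
```

The sample standard deviation is slightly larger than the population one for the same values, so the sample z score comes out slightly smaller.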

Let us look at the usage of the z score in the next example:

#### Example 1

Let us use the z-score to compare the variation in different populations with the following case:
Charlie got a mark of 85 on a math test which had a mean of 75 and a standard deviation of 5. Daisy got a mark of 75 on an English test which had a mean of 69 and a standard deviation of 2. Relative to their respective mean and standard deviation, who got the better grade?

We need to calculate the z-score for the grades of Charlie and Daisy and see who (if any) was among the best in their class. We have the following information:

$Z_{Ch}$ = Z score for Charlie
$x$ = 85
$\mu$ = 75
$\sigma$ = 5

$Z_D$ = Z score for Daisy
$x$ = 75
$\mu$ = 69
$\sigma$ = 2

Therefore, using the z score formula from equation 1, we calculate the z scores for each student and find:

$\large Z_{Ch} = \frac{x- \mu}{\sigma} = \frac{85-75}{5} = \frac{10}{5} = 2$

$\large Z_{D} = \frac{x- \mu}{\sigma} = \frac{75-69}{2} = \frac{6}{2} = 3$

So now that we have the corresponding z scores, how do we know which grade is better? The results above (equation 3) tell us that Charlie's mark is 2 standard deviations above the mean of his class, while Daisy's mark is 3 standard deviations above the mean of her class. Therefore, relatively speaking, Daisy did better within her class than Charlie did within his.

NOTICE: Daisy did better WITHIN her class than Charlie did WITHIN his class; the z score tells us how each of them did relative to their own class (meaning that Daisy was probably among the people with the highest marks on that test in her class). This does not mean that Charlie's grade is worse than Daisy's in absolute terms. Taken as an absolute value, Charlie's mark is still higher than Daisy's; relatively speaking, however, it appears that other people in Charlie's class got high marks too, and so his mark did not stand out in his class as much as Daisy's did in hers.
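
The comparison in Example 1 can be checked with a few lines of Python. This is only a minimal sketch: the function name `z_score` and the variable names are assumptions made for illustration, while the marks, means and standard deviations are the ones given in the example.

```python
def z_score(x, mean, std_dev):
    """Population z score: (x - mu) / sigma."""
    return (x - mean) / std_dev


# Values taken from Example 1.
z_charlie = z_score(85, 75, 5)   # math test
z_daisy = z_score(75, 69, 2)     # English test

print(z_charlie)  # 2.0
print(z_daisy)    # 3.0

# The larger z score marks the better result relative to the class.
better = "Daisy" if z_daisy > z_charlie else "Charlie"
print(better)     # Daisy
```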

#### What is a percentile?

Now let us talk about another measure of relative standing, the percentile. Percentiles indicate the percentage of data outcomes in a set which fall under a certain value.

#### • $\quad$ How do percentiles work

Percentiles divide the whole data set into a hundred equal parts. On a distribution graph, this produces 99 division marks, each of which denotes the percentage of data located up to a certain value; each of these 99 marks is what we call a percentile. A percentile mark at a specific data value shows the percentage of data found below (or up to) that value. Because each part contains the same share of the data rather than the same range of values, percentiles are not necessarily equally spaced on a distribution (look at the bottom of figure 1 to see for yourself).

#### • $\quad$ How to calculate percentiles

To calculate the percentile of a certain value $X$ in the data set, we use the following equation (equation 4):

Percentile of$\; X = \frac{number\;of\;data\;points\;less\;than\;X}{total\;number\;of\;data\;points} \times 100$

Let us look at an example so you see the process of finding percentiles in action:

#### Example 2

Sidney is taking a biology course in university. She got a mark of 78% and the list of all marks from her class (including her mark) is given by {56, 83, 74, 67, 47, 54, 82, 78, 86, 90}.
1. What percentile did she score in?
2. Sidney's friend Billy knows he scored in the 70th percentile. What was his mark?

First we order the scores from lowest to highest: {47, 54, 56, 67, 74, **78**, 82, 83, 86, 90}. Notice that Sidney's score is in bold. Now, to find the percentile Sidney scored in, we use the percentile formula shown in equation 4, counting the 5 marks that are lower than hers:

Percentile for Sidney's score$= \frac{5}{10} \times 100 = 50$

So Sidney scored in the 50th percentile (that is, her mark is above 50% of the marks in the class).
Now, to answer the second question, let us see what Billy's mark is if he is in the 70th percentile. Using the percentile equation (equation 4), we solve for the number of data points less than $X$, and then check which score in the set meets this condition:

$\frac{Percentile\;of\;X\; \times\; total\;number\;of\;data\;points}{100} =$number of data points less than$\; X$

$\frac{70 \times10}{100} = 7$

Therefore, there are 7 data values in the set below Billy's score, which means his mark is the 8th value in the ordered list: Billy got 83% in his biology course.
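
Both parts of Example 2 can be sketched in Python as follows. The function names `percentile_of` and `value_at_percentile` are hypothetical; the second one simply inverts the percentile formula and assumes that the data has no repeated values and that the percentile corresponds to a whole number of data points, as it does in this example.

```python
def percentile_of(value, data):
    """Percentage of data points that are strictly less than value."""
    below = sum(1 for d in data if d < value)
    return below / len(data) * 100


def value_at_percentile(percentile, data):
    """Data value that has the given percentage of data points below it.

    Assumes distinct values and a percentile that maps to a whole
    number of data points; otherwise a ValueError is raised.
    """
    ordered = sorted(data)
    count_below = percentile * len(ordered) / 100
    if count_below != int(count_below) or not 0 <= count_below < len(ordered):
        raise ValueError("no data value sits exactly at this percentile")
    return ordered[int(count_below)]


# Marks from Example 2 (Sidney's mark is 78).
marks = [56, 83, 74, 67, 47, 54, 82, 78, 86, 90]

print(percentile_of(78, marks))        # 50.0
print(value_at_percentile(70, marks))  # 83
```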

#### What are quartiles

Just as the name indicates, quartiles divide the data distribution into four parts: each quartile is the specific point marking the division between the first quarter and the second, the second quarter and the third, or the third quarter and the fourth. In simple words, quartiles are values that divide a data set into quarters after the data set has been ordered. The quartiles are named $Q_1, Q_2$ and $Q_3$.

Where:
$Q_1$ = splits off the lowest 25% of the sorted data
$Q_2$ = Median = splits off the lowest 50% of the sorted data
$Q_3$ = splits off the lowest 75% of the sorted data

The middle 50% of the data lies in the interval between the first and third quartiles. The length of this interval is called the interquartile range, and it is obtained by subtracting the first quartile from the third quartile.
Do not confuse a quartile with a quarter: each quarter is a whole portion of the data representing 25% of it, whereas a quartile is the point that marks the division between one quarter and the next.

#### • $\quad$ How to calculate quartiles

Let us explain the method to calculate quartiles with the next example:

#### Example 3

Find the quartiles for each data set:

a) $\quad$ {9, 3, 7, 5, 2, 8, 12}
We first find the median: order the data values from lowest to highest, and then find the value at the midpoint.

{9, 3, 7, 5, 2, 8, 12} = {2, 3, 5, 7, 8, 9, 12}

The median represents the second (or middle) quartile; in this case $Q_2 = 7$.
Then we just obtain the median of the data values on each side of 7 (the left half and the right half), and so:

{2, 3, 5, 7, 8, 9, 12} = {2, 3, 5}, {7}, {8, 9, 12}

And we obtain that $Q_1=3$ and $Q_3 = 9$.

b) $\quad$ {2, 3, 5, 7, 8, 9, 12, 15}
This particular data set has its values already ordered from lowest to highest, therefore, we just find the median:

{2, 3, 5, 7, 8, 9, 12, 15}

Since the data set has an even number of values, we obtain the median by averaging the two center values of the set:

$Q_2 = \frac{7\;+\;8} {2} = 7.5$

Therefore $Q_2 = 7.5.$
And then we find the median for the range of values on each half of the data set:

{2, 3, 5, 7, 8, 9, 12, 15} = {2, 3, 5, 7},{ 8, 9, 12, 15}

Calculating first and third quartiles:

$Q_1 = \frac{3\;+\;5}{2} = 4$

$Q_3 = \frac{9\;+\;12}{2} = 10.5$

Therefore $Q_1 = 4$ and $Q_3 = 10.5.$

c) $\quad$ {2, 3, 5, 7, 8, 9, 12, 15, 35}
Data set $c$ is already ordered too, and given that it has an odd number of values we can easily find its median:

{2, 3, 5, 7, 8, 9, 12, 15, 35}

And so $Q_2 = 8.$
Now we get the median of each half of the data set at each side of the median we just got:

{2, 3, 5, 7, 8, 9, 12, 15, 35} = {2, 3, 5, 7}, {8}, {9, 12, 15, 35}

Calculate the first and third quartiles:

$Q_1 = \frac{3\;+\;5}{2} = 4$

$Q_3 = \frac{12\;+\;15}{2} = 13.5$

Therefore $Q_1=4$ and $Q_3=13.5.$

In summary, the steps for finding quartiles are:
• Find the median of the data set; this is the second quartile.
  • If the set has an odd number of values, then the median is the value in the middle.
  • If the set has an even number of values, then the median is obtained by averaging the two middle values.

• If the set has an odd number of values, then the first and third quartiles are the median of the values before the middle value and the median of the values after the middle value, respectively.
  • If there is an odd number of values on each side of the middle value, then the middle value of each side is the first or third quartile.
  • If there is an even number of values on each side of the middle value, then the median of each side is obtained by averaging the pair of middle values on that side. These are the first and third quartiles.

• If the set has an even number of values, then the set is divided at its midpoint. The whole first half is used to obtain the first quartile, and the whole second half is used to obtain the third quartile.
  • If each half has an odd number of values, then the middle value of each half is the first or third quartile.
  • If each half has an even number of values, then the median of each half is obtained by averaging the pair of middle values in that half. These are the first and third quartiles.
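
The steps above can be sketched in Python as a single function. The name `quartiles` is an assumption made for this illustration, and the function follows the median-of-halves method described in this lesson; other conventions (for example, the interpolation methods offered by `statistics.quantiles`) may give slightly different values for the same data.

```python
from statistics import median


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    For an odd number of values the middle value is excluded from both
    halves; for an even number the set is split at its midpoint.
    """
    ordered = sorted(data)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values to find quartiles")
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)


# The three data sets from Example 3.
print(quartiles([9, 3, 7, 5, 2, 8, 12]))             # (3, 7, 9)
print(quartiles([2, 3, 5, 7, 8, 9, 12, 15]))         # (4.0, 7.5, 10.5)
print(quartiles([2, 3, 5, 7, 8, 9, 12, 15, 35]))     # (4.0, 8, 13.5)
```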

The process has been summarized in the next diagram for each type of data you might find:

***

In summary, the measures of relative standing are the marks or calculations that let you see where a particular data value sits within the complete data set (or its distribution). The z-score tells you how many standard deviations a certain value is away from the mean (either above or below it). The percentiles tell you at which of the 99 points that divide the data set into 100 equal parts your data point is located, and even provide you with a rank showing how much of the data is above or below it. The quartiles do the same as the percentiles, but divide the data into four equal parts only.

Now, we recommend that you take a look at the following links so you can continue your independent studies of what you learned today. This lesson covers the most important measure of relative standing, the z-score; this short article explains what percentile rank is and how it differs from percentage; and this page talks about other locations in a distribution, describing not only quartiles but deciles too! We suggest you take a look at them so you can see more example problems.

This is it for today's lesson; see you in the next one!
$\cdot$ $z_x$: z-score, a measure of how many standard deviations a data item $x$ is from the mean.

population: $z_x= \frac{x- \mu}{\sigma}$

sample: $z_x= \frac{x- \overline{x}}{s}$

z-score allows comparison of the variation in different populations/samples.

$\cdot$ Quartiles: values that divide the data set into quarters.

$Q_1=$ bottom 25% of data
$Q_2=$ Median $=$ bottom 50% of data
$Q_3=$ bottom 75% of data

$\cdot$ InterQuartile Range (IQR): represents the middle 50% of the data set.

$IQR= Q_3-Q_1$

$\cdot$ Percentiles: indicate what percentage of the data falls below a certain value

$Percentile\;of\;X= \frac{number\;of\;data\;points\;less\;than\;X}{total\;number\;of\;data\;points} \times 100$

$\cdot$ Outliers: an outlier is a data point which lies an abnormal distance from all other data points.

Outliers are either,

a) above $Q_3+1.5(IQR)$
or
b) below $Q_1- 1.5(IQR)$
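
As a sketch of the interquartile range and outlier rules above, the following Python code applies them to the data set from the interquartile range example in the lesson outline. The function names `quartiles` and `find_outliers` are hypothetical, and the quartiles are found with the median-of-halves method used in this lesson.

```python
from statistics import median


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)


def find_outliers(data):
    """Return the IQR and the points outside Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    q1, _, q3 = quartiles(data)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [x for x in data if x < low_fence or x > high_fence]
    return iqr, outliers


data = [8, 2, 20, 4, 9, 5, 6, 12, 10, 1]

print(quartiles(data))       # (4, 7.0, 10)
iqr, outliers = find_outliers(data)
print(iqr)                   # 6
print(outliers)              # [20]
```

Here the fences are $4 - 1.5(6) = -5$ and $10 + 1.5(6) = 19$, so 20 is the only outlier.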