Systematic sampling

In order to access this I need to be confident with:

Calculator skills
Sequences
Fraction of an amount
Percentage of an amount

Systematic Sampling

Here we will learn about systematic sampling, including what systematic sampling is, how to take a systematic sample and the advantages and disadvantages of systematic sampling.

There are also systematic sampling worksheets based on Edexcel, AQA and OCR exam questions, along with further guidance on where to go next if you’re still stuck.

What is systematic sampling?

Systematic sampling is a type of probability sampling that selects items of data at regular intervals from a population. Systematic sampling is not examined in GCSE Mathematics; however it is on the GCSE Statistics specification and will also be examined in A level Mathematics.

Every data entry for the population must be given in a list (a sampling frame) so that they have an equal chance of being selected.

We select the first item of data using a random number generator and then select the rest at regular intervals.

Sampling method	Description	Example
Systematic sampling	Every member in the population is given a number. After the first member is chosen, the remaining members are chosen at a fixed interval.	A list of people, with their first names in alphabetical order, is numbered. The 5th person is chosen randomly, followed by every 8th person after that.

The sampling interval

To calculate the interval required to select the sample data, we calculate the population size divided by the sample size.

$\text{Interval}=\frac{\text{Population size}}{\text{Sample size}}$

E.g.

If the population size is $1200$ and the desired sample size is $400$ items of data, we divide $1200$ by $400$ to get an interval of $3$ .

This means that every $3^\text{rd}$ item of data in the ordered list is selected for the sample.
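The interval calculation can be sketched in a few lines of Python. The function name `sampling_interval` is our own, and rounding to the nearest whole number is an assumption based on how the worked examples on this page treat non-integer intervals.

```python
def sampling_interval(population_size: int, sample_size: int) -> int:
    """Interval = population size / sample size, to the nearest whole number."""
    if not 0 < sample_size <= population_size:
        raise ValueError("sample size must be between 1 and the population size")
    # Assumption: a non-integer interval is rounded to the nearest whole number
    return round(population_size / sample_size)


print(sampling_interval(1200, 400))  # 3
```

Here a population of $1200$ and a sample of $400$ gives an interval of $3$ , matching the calculation above.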

E.g.

A factory that manufactures cars must regularly assess the quality of production. In one month, $5\%$ of cars are selected using a systematic sample to be rigorously tested for quality purposes. As $5\%$ is equivalent to $\frac{1}{20}$ , the first car is chosen at random from the first $20$ cars, then every $20^\text{th}$ car that follows.

This systematic sample helps the company to ensure the quality of their car manufacture is maintained. Testing each vehicle would be costly and take too much time.

Advantages and disadvantages of systematic sampling

Following a systematic sampling methodology has advantages and disadvantages:

Advantages	Disadvantages
It is more time efficient than collecting data from the entire population. The sample is spread evenly across the ordered population.	Every member of the population must be listed. The first member of the sample must be chosen at random to avoid bias. If a data entry is missing/empty, the entry is not included.

How to take a systematic sample

In order to take a systematic sample:

Order the population and give each data entry a unique reference number.
Calculate the number of items of data in the sample.
Calculate the interval.
Use a random number generator to select the first item of data.
Select the remaining items of data following the given sequence.
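The steps above can be sketched as a short Python function. This is an illustration rather than a definitive method: the name `systematic_sample` and its parameters are our own, the population is assumed to be an already ordered list, and the interval is assumed to be rounded to the nearest whole number.

```python
import random


def systematic_sample(population, sample_size, rng=random):
    """Return a systematic sample from an ordered list."""
    if not 0 < sample_size <= len(population):
        raise ValueError("sample size must be between 1 and the population size")
    # Calculate the interval (assumed rounded to the nearest whole number)
    interval = round(len(population) / sample_size)
    # Randomly choose the first item from the first `interval` items (1-based)
    start = rng.randint(1, interval)
    # Select every interval-th item that follows
    positions = range(start, len(population) + 1, interval)
    # Stop once sample_size items are selected (a rounded-down interval can give extras)
    return [population[p - 1] for p in positions][:sample_size]


biscuits = list(range(1, 301))  # reference numbers 1 to 300
sample = systematic_sample(biscuits, 30)
print(len(sample), sample[:3])
```

The printed reference numbers change on each run because the starting point is random, but consecutive items are always one interval apart.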

Systematic sampling is part of our series of lessons to support revision on types of sampling methods. You may find it helpful to start with the main types of sampling methods lesson for a summary of what to expect, or use the step by step guides below for further detail on individual topics.

Systematic sampling examples

Example 1: systematic sampling – production line

A company produces biscuits at $100$ per minute. A machine checks the weight of $10\%$ of the biscuits. The biscuits pass through the machine one at a time. Use systematic random sampling to select the biscuits for the sample over $3$ minutes.

Order the population and give each data entry a unique reference number.

As each biscuit passes through the machine one at a time, we can assume that the first biscuit is number $1$ , the second is biscuit number $2$ , etc.

Calculate the number of items of data in the sample.

In $3$ minutes, there will be $3\times100=300$ biscuits. As the company checks $10\%$ of the biscuits, we need a sample of: $\frac{300}{100}\times{10}=30$

Calculate the interval.

As we need $30$ biscuits, and we are using a systematic sample, we need to choose the biscuits using a sequence. We determine the interval in the sequence by dividing the population size by the sample size: $\text{Interval}=\frac{\text{Population size}}{\text{Sample size}}=\frac{300}{30}=10$

So we need to pick every $10^\text{th}$ term in the data.

Use a random number generator to select the first item of data.

As we need to pick every $10^\text{th}$ term, the first number in the sample (starting point of the sequence) must be randomly chosen from the first $10$ terms. Using a random number generator, we get the number $6$ , so we choose the first item of data in the sample to be the $6^\text{th}$ biscuit.

Below we have used a table to show how the sequence develops*:

Select the remaining items of data following the given sequence.

As we are selecting every $10^\text{th}$ item, we can select the following biscuits from the data:

The sample will therefore contain $30$ biscuits with the following numbers:

$6, 16, 26, 36, 46, 56, 66, 76, 86, 96, 106, 116, 126, 136, 146, 156, 166, 176, 186,$ $196, 206, 216, 226, 236, 246, 256, 266, 276, 286$ and $296$ .

Note: These numbers are in the sequence $10n-4$ .
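As a quick check, the biscuit numbers and the nth term can be reproduced in Python. This is a sketch only: the variable names are our own, and the nth term is built as interval $\times\ n$ plus (first item minus interval).

```python
first, interval, population_size = 6, 10, 300

# Start at the randomly chosen biscuit and step through the list by the interval
selected = list(range(first, population_size + 1, interval))

# nth term of the sequence: interval * n + (first - interval)
constant = first - interval
assert all(interval * n + constant == term
           for n, term in enumerate(selected, start=1))

print(len(selected), selected[:3], selected[-1])  # 30 [6, 16, 26] 296
print(f"nth term: {interval}n{constant:+d}")      # nth term: 10n-4
```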

Example 2: systematic sampling – sample as percentage of population

Luke is looking at the beats per minute of tracks in his music player. He has $1200$ tracks. He decides to take a systematic sample of $25\%$ of his tracks. Determine the tracks that should be chosen.

Order the population and give each data entry a unique reference number.

Let’s use the number of plays to sort the data into an order. The first track in the list will be number $1$ , the second track number $2$ , etc.

Calculate the number of items of data in the sample.

As Luke wants a sample of $25\%$ , we need to calculate $25\%$ of $1200$ :

$\frac{1200}{100}\times{25}=300$

Calculate the interval.

As we need $300$ tracks, and we are using a systematic sample, we need to choose the tracks using a sequence. The interval is:

$\text{Interval}=\frac{\text{Population size}}{\text{Sample size}}=\frac{1200}{300}=4$

So we need to pick every $4^\text{th}$ term in the data.

Use a random number generator to select the first item of data.

As we need to pick every $4^\text{th}$ term, the first number in the sample must be randomly chosen from the first $4$ terms. Using a random number generator, we get the number $1$ , so we choose the first item of data in the sample to be the $1^\text{st}$ track.

Below we have used a table to show how the sequence develops*:

Select the remaining items of data following the given sequence.

As we are selecting every $4^\text{th}$ item, we can select the following tracks from the data:

The sample will therefore contain $300$ tracks that belong to the sequence $4n-3$ .

Example 3: systematic sampling – small sample size

A traffic management company is researching the proportion of lorries that use a single carriageway between $8$ am and $9$ am. A traffic camera records the details of every vehicle and produces a list of data in the order of the time that the vehicle passes the camera. $960$ vehicles are recorded within the hour on one day. The company uses a systematic sample to select a random sample of $5\%$ of the data for their research. Determine which vehicles will be in the sample.

Order the population and give each data entry a unique reference number.

The population data is in order given their time stamp and so we can list the first vehicle in the list as number $1$ , second vehicle is number $2$ , etc.

Calculate the number of items of data in the sample.

The sample size is $5\%$ so we need to calculate $5\%$ of $960$ :

$\frac{960}{100}\times{5}=48$ vehicles

Calculate the interval.

As we need $48$ vehicles, the interval is:

$\text{Interval}=\frac{\text{Population size}}{\text{Sample size}}=\frac{960}{48}=20$

So we need to pick every $20^\text{th}$ item of data.

Use a random number generator to select the first item of data.

As we need to pick every $20^\text{th}$ term, the first number in the sample must be randomly chosen from the first $20$ terms. Using a random number generator, we get the number $17$ , so we choose the first item of data in the sample to be the $17^\text{th}$ vehicle.

Below we have used a table to show how the sequence develops*:

Select the remaining items of data following the given sequence.

As we are selecting every $20^\text{th}$ item, we can select the following vehicles from the data:

The sample will therefore contain $48$ vehicles that belong to the sequence $20n-3$ .

Example 4: systematic sampling – large population

A local council is researching the distribution of voters in $15000$ homes. They take a systematic sample of $2\%$ of homes, listed in order of their postcode and house number. Determine which homes will be asked to participate in the survey.

Order the population and give each data entry a unique reference number.

The population data is in order given their postcode and house number and so we can assume that the first home on the list is number $1$ , the second home number $2$ , etc.

Calculate the number of items of data in the sample.

The sample size is $2\%$ , so we need to calculate $2\%$ of $15000$ :

$\frac{15000}{100}\times{2}=300$

Calculate the interval.

As we need $300$ homes, and we are using a systematic sample, we need to choose the homes using a sequence. The interval for this set of data is equal to:

$\text{Interval}=\frac{\text{Population size}}{\text{Sample size}}=\frac{15000}{300}=50$

So we need to pick every $50^\text{th}$ term in the data.

Use a random number generator to select the first item of data.

As we need to pick every $50^\text{th}$ term, the first number in the sample must be randomly chosen from the first $50$ terms. Using a random number generator, we get the number $8$ , so we choose the first item of data in the sample to be the $8^\text{th}$ home.

Below we have used a table to show how the sequence develops*:

Select the remaining items of data following the given sequence.

As we are selecting every $50^\text{th}$ item, we can select the following homes from the data:

The sample will therefore contain $300$ homes that belong to the sequence $50n-42$ .

Example 5: systematic sampling – small sample

An online clothing company is researching the average customer spend over the previous month. There were $12664$ orders purchased, and each order has a unique reference number. The company takes a systematic sample of $2\%$ of orders. Determine which orders will be chosen for the sample.

Order the population and give each data entry a unique reference number.

As each order has a unique reference number, we can order the numbers from smallest to largest, and then number each item of data from $1$ to $12664$ .

Calculate the number of items of data in the sample.

As the company is taking a sample of $2\%$ , we need to calculate $2\%$ of $12664$ :

$(12664\div100)\times2=253.28$

Rounding to the nearest whole number, the sample size is $253$ orders.

Calculate the interval.

The interval is equal to $\text{Interval}=\frac{\text{Population size}}{\text{Sample size}}=\frac{12664}{253}=50\text{ (0dp)}$

Use a random number generator to select the first item of data.

Using a random number generator, we need to select the first item of data from the first $50$ orders. The random number chosen is $13$ .

The first item of data in the list is the $13^\text{th}$ order.

Select the remaining items of data following the given sequence.

As the interval is $50$ , the next order will be $13+50=63$ , then $63+50=113$ , then $113+50=163$ … and so on until we have selected the $253$ items of data.

The sample will therefore contain $253$ items of data that belong to the sequence $50n-37$ .
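When the percentage does not give a whole number, both the sample size and the interval need rounding. The sketch below reruns the calculations from this example, assuming rounding to the nearest whole number as in the working above, and checks that the last selected order still falls inside the list. The variable names are our own.

```python
population_size = 12664

# 2% of the population, rounded to the nearest whole order
sample_size = round(population_size * 2 / 100)   # 253.28 rounds to 253

# Interval, rounded to the nearest whole number
interval = round(population_size / sample_size)  # 50.05... rounds to 50

first = 13  # the randomly chosen starting order from the example
last = first + (sample_size - 1) * interval

print(sample_size, interval, last)  # 253 50 12613
assert last <= population_size      # the final order is within the list
```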

Example 6: systematic sampling – deciding the population size

A café is carrying out some market research. Out of $1240$ customers that entered the café during a weekend, $950$ allowed the café to email them a questionnaire. The café takes a systematic sample size of $12\%$ of those who received the questionnaire. Determine which customers will be part of the sample.

Order the population and give each data entry a unique reference number.

Despite there being $1240$ customers, the population size is $950$ as these customers received a questionnaire. As they provided an email address, the population can be listed using their email address, in alphabetical order.

Calculate the number of items of data in the sample.

The café is taking a sample size of $12\%$ .

$(950\div100)\times12=114$

The sample will contain $114$ items of data.

Calculate the interval.

The interval is $\text{Interval}=\frac{\text{Population size}}{\text{Sample size}}=\frac{950}{114}=8\text{ (0dp)}$

Use a random number generator to select the first item of data.

As every $8^\text{th}$ customer is being selected, the first customer must be randomly chosen from the first $8$ items of data. Using a random number generator, the $4^\text{th}$ customer is chosen.

Select the remaining items of data following the given sequence.

As the first customer is number $4$ , and every $8^\text{th}$ customer is being selected after, we continue to add $8$ to the previous value in the sequence until we have selected the $114$ customers (items of data).

This follows the sequence $8n-4$ .

Common misconceptions

Mixing up a sampling method

Using the incorrect sampling method to select data (such as using stratified sampling or non-random sampling when a systematic sample is required).

The first item in the sample

If every $10^\text{th}$ item is being chosen for a sample, the first item of data must be generated using a random number generator from the first $10$ items of data in the ordered list.

If every $7^\text{th}$ value is chosen for a sample, the first item of data must be generated using a random number generator from the first $7$ items of data in the ordered list. If the first item is chosen from further down the list, fewer items remain to be selected and the sample size is reduced.
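The effect of the starting point on the sample size can be seen with a small Python sketch. The population of $100$ items and the helper name `selected_positions` are hypothetical, chosen only for illustration.

```python
def selected_positions(population_size, interval, first):
    """List the positions selected when starting at `first` and stepping by `interval`."""
    return list(range(first, population_size + 1, interval))


# Starting within the first 10 items gives the full sample of 10
print(len(selected_positions(100, 10, 6)))   # 10

# Starting at item 25 leaves only 8 items to select
print(len(selected_positions(100, 10, 25)))  # 8
```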

Data not in order

When you are finding the median value in a set of data, the data must be in order, otherwise the number being picked is not the median, it’s just the middle number in a random list. This is the same for a systematic sample. Every item of data is structured in an order from a sampling frame (age, postcode, alphabetical order etc), and then the sample is taken.

Calculating the interval for the sample

Let’s assume we have $1000$ items of data. $5\%$ of $1000$ is $50$ and so we need $50$ items of data. Those $50$ items of data must be spread equally across all of the ordered data, so we divide the population size by the sample size to find the interval between each item of data: $\frac{1000}{50}=20$ . We would therefore choose every $20^\text{th}$ item in the list of $1000$ items to get a sample of $50$ .

Practice systematic sampling questions

Order and number the items in the list. Find $20\%$ of the total population. Calculate the interval. Select the first number using a random number generator. Select every $5^\text{th}$ item in the list afterwards.

Split the total population into smaller categories. Calculate $20\%$ of each category. Use a random number generator to select items in each category, proportional to the total.

Order the population and assign each item of data a unique number. Use a random number generator to select every $20^\text{th}$ item in the list.

Select the first $20\%$ of items of data in the list.

For a sample of $20\%$ , we need to calculate $20\%$ of the population. Here, $20\%$ of $1350$ is $270$ and so we need $270$ listed items.

As the sampling method is systematic, we need to calculate the interval (the sequence) for which the items in the list will be selected. Here, as we want $20\%$ of the population, this is equivalent to every fifth item of data in the list ( $20\%=\frac{20}{100}=\frac{1}{5}$ ).

To determine which item in the list is first, we need to use a random number generator to select one of the first five items in the list only. Here, a random number generator selected the first item in the list to be the $3^\text{rd}$ item listed.

So, by starting at the randomly selected $3^\text{rd}$ item in the ordered population list, and selecting every $5^\text{th}$ item in the population as we want a sample size of $20\%$ ( $270$ items), we generate a systematic sample.

$12, 21,$ and $30$

$4, 8, 12, 16, 20, 24, 28,$ and $32$

$7, 11, 15, 19, 23, 27, 31$ and $35$

$4,5,6,7,8,9,10,$ and $11$

$25\%$ of $36=9$ so the sample contains $9$ members of staff.

$\frac{9}{36}=\frac{1}{4}$ so every $4^\text{th}$ person is chosen after person $3$ .

20

9

18

6

$5\%$ of $120=6$ items of data

$\frac{120}{6}=20$ so every $20^\text{th}$ item of data is chosen

$118\div{20}=5.9$ so we can subtract $20$ from $118$ five times, leaving us with the number $18$ as the first in the list.

4. Jodie records the number of steps she takes per day over $30$ days. She wants to take a sample of data to find out how she has progressed over the month. She decides to take a systematic sample of half of the data. The first item of data is randomly chosen as day $1$ . What day is the last item of data in the sample? Use the table below to help you.

Mon	Tues	Weds	Thurs	Fri	Sat	Sun
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30

Tuesday

Saturday

Thursday

Monday

Half of the data means every other day, which gives us every odd number in the month. The last odd number in the month is $29$ , which is a Monday.

$3n$

$n+6$

$6n-3$

$6n$

The first number in the sequence is $3$ .

The number of students in the sample is $16\%$ of $2243$ , which is equal to:
$(2243\div100)\times16=359\text{ (0dp)}$

The interval is equal to $\text{Interval}=\frac{\text{Population size}}{\text{Sample size}}=\frac{2243}{359}=6\text{ (0dp)}$

The first $5$ terms in the sequence are therefore: $3, 9, 15, 21, 27, \dots$

The common difference in the sequence is $+6$ , so the nth term is based on $6n$ .

The first term in the sequence $6n$ is $6\times1=6$ . We need the first term to equal $3$ , so we have to subtract $3$ from $6n$ , giving us the nth term $6n-3$ .

6. A hotel has $18$ floors. Each floor has $4$ apartments, except for the ground floor which has $2$ apartments, and the top floor which is one single apartment. Each apartment is given a unique reference number according to the floor level and the apartment number (e.g. apartment $1$ on floor $18$ is number $181$ ).

A hotel inspector is required to inspect $4\%$ of apartments, chosen using a systematic sample. The first apartment that is inspected is randomly selected to be number $002$ . What is the number of the last apartment to be inspected?

124

132

159

181

There are $16$ floors with $4$ apartments $(16\times4=64)$ .

There is $1$ floor with $2$ apartments $(1\times2=2)$ .

There is $1$ floor with $1$ apartment $(1)$ .

Adding these together, we have the total number of apartments to be $64+2+1=67$ .

We need a sample of $4\%$ of $67$ : $(67\div100)\times4=3 (0dp)$ or $3$ rooms.

The interval is equal to $\text{Interval}=\frac{\text{Population size}}{\text{Sample size}}=\frac{67}{3}=22\text{ (0dp)}$

The positions of the $3$ rooms in the list are: $2$ (this was given)

$2+22=24$

$24+22=46$

The $67^\text{th}$ room of the hotel is number $181$ .

The $66^\text{th}$ room of the hotel will be number $174$ .

Every $4^\text{th}$ room in the list is $1$ floor below.

Position in list	66	62	58	54	50	46
Apartment number	174	164	154	144	134	124

The last room that will be inspected is apartment $4$ on floor $12$ .

Systematic sampling GCSE questions

1. Explain what is meant by a systematic sample

(3 marks)

Answer:

First item of data selected at random

(1)

Following items in the sample selected follow a sequence

(1)

The interval is equal to the population size divided by the sample size

(1)

2. (a) The Paddles rowing club has $20$ members. The coach wants to find out about how many hours members spend in the gym per week. He decides to take a systematic sample of $10$ members for his research. Each person is written in a list in order of how long they have been a member at the club. Given that the first item of data is person number $2$ on the list, determine the reference number of the other members in the sample. Use the table below to help you.

1	6	11	16
2	7	12	17
3	8	13	18
4	9	14	19
5	10	15	20

(b) The Boatyard is another rowing club. Their coach carries out the same research study, sampling $10$ members from their list of $100$ members. Which rowing club would achieve a better estimate of their population data? Explain your answer.

(5 marks)

Answer:

(a)

$\frac{10}{20}\times{100}=50\%$ of members

(1)

Every even number selected $(2,4,6,8,10,12,14,16,18,20)$

(1)

(b)

The Paddles

(1)

They have a sample size of $50\%$ whereas The Boatyard has a sample size of $10\%$

(1)

The larger the sample size, the more representative the sample is of the population

(1)

3. (a) A population contains $1600$ people. Zairah wants to take a sample of $80$ people for a market research study using systematic sampling. Calculate the interval size.

(b) If the first person chosen at random is person number $6$ in the list, determine the nth term of the sequence for the sample selection.

(4 marks)

Answer:

(a)

$1600\div80$

(1)

Every $20^\text{th}$ person

(1)

(b)

$20n$

(1)

$20n-14$

(1)

Learning checklist

You have now learned how to:

Infer properties of populations or distributions from a sample, whilst knowing the limitations of sampling
