In order to access this I need to be confident with:
Calculator skills, sequences, fraction of an amount, percentage of an amount
Here we will learn about systematic sampling, including what systematic sampling is, how to take a systematic sample and the advantages and disadvantages of systematic sampling.
There are also systematic sampling worksheets based on Edexcel, AQA and OCR exam questions, along with further guidance on where to go next if you're still stuck.
Systematic sampling is a type of probability sampling that selects items of data at regular intervals from a population. Systematic sampling is not examined in GCSE Mathematics; however it is on the GCSE Statistics specification and will also be examined in A level Mathematics.
Every data entry for the population must be given in a list (a sampling frame) so that they have an equal chance of being selected.
We select the first item of data using a random number generator and then select the rest at regular intervals.
Sampling method | Description | Example |
---|---|---|
Systematic sampling | Every member in the population is given a number. After the first member is chosen, the remaining members are chosen from a given interval. | A list of people with their first names in alphabetical order are numbered. The 5th person is chosen randomly, followed by every subsequent 8th person |
To calculate the interval required to select the sample data, we calculate the population size divided by the sample size.
\text{Interval}=\frac{\text{Population size}}{\text{Sample size}}
E.g.
If the population size is 1200 and the desired sample size is 400 items of data, we divide 1200 by 400 to get an interval of 3 .
This means that every 3^\text{rd} item of data in the ordered list is selected for the sample.
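The interval calculation above can be sketched in a few lines of Python. The function name `sampling_interval` is a hypothetical helper chosen for illustration, and rounding to the nearest whole number is an assumption based on the later examples where the division is not exact (Python's built-in `round` sends exact halves to the nearest even number).

```python
def sampling_interval(population_size: int, sample_size: int) -> int:
    """Return the gap between selected items, rounded to the nearest whole number."""
    if sample_size <= 0 or sample_size > population_size:
        raise ValueError("sample size must be between 1 and the population size")
    # Interval = population size / sample size
    return round(population_size / sample_size)


print(sampling_interval(1200, 400))  # prints 3
```

With a population of 1200 and a sample of 400 this gives 3, matching the example above.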
E.g.
A factory that manufactures cars must regularly assess the quality of production. In one month, 5\% of cars are selected using a systematic sample to be rigorously tested for quality purposes. The first car is chosen at random from the first 20 cars, then every 20^\text{th} car that follows is selected.
This systematic sample helps the company to ensure the quality of their car manufacture is maintained. Testing each vehicle would be costly and take too much time.
Following a systematic sampling methodology has advantages and disadvantages:
Advantages | Disadvantages |
---|---|
It is more time efficient than asking the entire population. The sample can be selected proportional to the total population (stratified) | Every member of the population must be listed. The first member of the population must be chosen at random to avoid bias. If a data entry is missing/empty, the entry is not included. |
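In order to take a systematic sample:
1. Order the population and give each data entry a unique reference number.
2. Calculate the number of items of data in the sample.
3. Calculate the interval.
4. Use a random number generator to select the first item of data.
5. Select the remaining items of data following the given sequence.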
Systematic sampling is part of our series of lessons to support revision on types of sampling methods. You may find it helpful to start with the main types of sampling methods lesson for a summary of what to expect, or use the step by step guides below for further detail on individual topics.
A company produces biscuits at 100 per minute. A machine checks the weight of 10\% of the biscuits. The biscuits pass through the machine one at a time. Use systematic random sampling to select the biscuits for the sample over 3 minutes.
Order the population and give each data entry a unique reference number.
As each biscuit passes through the machine one at a time, we can assume that the first biscuit is number 1 , the second is biscuit number 2 , etc.
Calculate the number of items of data in the sample.
In 3 minutes, there will be 3\times100=300 biscuits. As the company checks 10\% of the biscuits, we need a sample of: \frac{300}{100}\times{10}=30
Calculate the interval.
As we need 30 biscuits, and we are using a systematic sample, we need to choose the biscuits using a sequence. We determine the interval in the sequence by dividing the population size by the sample size: \text{Interval}=\frac{\text{Population size}}{\text{Sample size}}=\frac{300}{30}=10
So we need to pick every 10^\text{th} term in the data.
Use a random number generator to select the first item of data.
As we need to pick every 10^\text{th} term, the first number in the sample (starting point of the sequence) must be randomly chosen from the first 10 terms. Using a random number generator, we get the number 6 , so we choose the first item of data in the sample to be the 6^\text{th} biscuit.
Select the remaining items of data following the given sequence.
As we are selecting every 10^\text{th} item, we add 10 to the previous biscuit number each time.
The sample will therefore contain 30 biscuits with the following numbers:
6, 16, 26, 36, 46, 56, 66, 76, 86, 96, 106, 116, 126, 136, 146, 156, 166, 176, 186, 196, 206, 216, 226, 236, 246, 256, 266, 276, 286 and 296 .
Note: These numbers are in the sequence 10n-4 .
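The biscuit example can be reproduced with a short Python sketch. The function `systematic_sample` and its parameter names are hypothetical, introduced only for illustration; the starting point 6 is the value the random number generator gave in the example, not a value the code generates.

```python
def systematic_sample(population_size: int, interval: int, start: int) -> list[int]:
    """Return the reference numbers chosen when every `interval`-th item is taken from `start`."""
    if not 1 <= start <= interval:
        raise ValueError("start must be one of the first `interval` items")
    # Reference numbers run from 1 to population_size inclusive.
    return list(range(start, population_size + 1, interval))


biscuits = systematic_sample(population_size=300, interval=10, start=6)
print(len(biscuits))               # prints 30
print(biscuits[:3], biscuits[-1])  # prints [6, 16, 26] 296
```

Every value in the list satisfies 10n-4 for n from 1 to 30 , which agrees with the note above.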
Luke is looking at the beats per minute of tracks in his music player. He has 1200 tracks. He decides to take a systematic sample of 25\% of his tracks. Determine the tracks that should be chosen.
Order the population and give each data entry a unique reference number.
Let's use the number of plays to sort the data into an order. The first track in the list will be number 1 , the second track number 2 , etc.
Calculate the number of items of data in the sample.
As Luke wants a sample of 25\% , we need to calculate 25\% of 1200 :
\frac{1200}{100}\times{25}=300
Calculate the interval.
As we need 300 tracks, and we are using a systematic sample, we need to choose the tracks using a sequence. The interval is:
\text{Interval}=\frac{\text{Population size}}{\text{Sample size}}=\frac{1200}{300}=4
So we need to pick every 4^\text{th} term in the data.
Use a random number generator to select the first item of data.
As we need to pick every 4^\text{th} term, the first number in the sample must be randomly chosen from the first 4 terms. Using a random number generator, we get the number 1 , so we choose the first item of data in the sample to be the 1^\text{st} track.
Select the remaining items of data following the given sequence.
As we are selecting every 4^\text{th} item, we add 4 to the previous track number each time, giving tracks 1, 5, 9, 13 and so on, up to track 1197.
The sample will therefore contain 300 tracks that belong to the sequence 4n-3 .
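The link between the starting point, the interval and the nth term can be checked with a small Python sketch. The helper names `nth_term` and `term` are hypothetical and exist only for this illustration.

```python
def nth_term(start: int, interval: int) -> str:
    """Describe the sequence start, start + interval, ... as an nth-term rule."""
    constant = start - interval  # value the rule would take at n = 0
    if constant == 0:
        return f"{interval}n"
    sign = "+" if constant > 0 else "-"
    return f"{interval}n{sign}{abs(constant)}"


def term(start: int, interval: int, n: int) -> int:
    """Return the n-th reference number in the sample (n starts at 1)."""
    return start + (n - 1) * interval


print(nth_term(1, 4))   # prints 4n-3
print(term(1, 4, 300))  # prints 1197
```

The 300th track selected is number 1197 , which is still within Luke's 1200 tracks.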
A traffic management company is researching the proportion of lorries that use a single carriageway between 8am and 9am . A traffic camera records the details of every vehicle and produces a list of data in the order of the time that the vehicle passes the camera. 960 vehicles are recorded within the hour on one day. The company uses a systematic sample to select a random sample of 5\% of the data for their research. Determine which vehicles will be in the sample.
Order the population and give each data entry a unique reference number.
The population data is in order given their time stamp and so we can list the first vehicle in the list as number 1 , second vehicle is number 2 , etc.
Calculate the number of items of data in the sample.
The sample size is 5\% so we need to calculate 5\% of 960 :
\frac{960}{100}\times{5}=48 vehicles
Calculate the interval.
As we need 48 vehicles, the interval is:
\text{Interval}=\frac{\text{Population size}}{\text{Sample size}}=\frac{960}{48}=20
So we need to pick every 20^\text{th} item of data.
Use a random number generator to select the first item of data.
As we need to pick every 20^\text{th} term, the first number in the sample must be randomly chosen from the first 20 terms. Using a random number generator, we get the number 17 , so we choose the first item of data in the sample to be the 17^\text{th} vehicle.
Select the remaining items of data following the given sequence.
As we are selecting every 20^\text{th} item, we add 20 to the previous vehicle number each time, giving vehicles 17, 37, 57, 77 and so on, up to vehicle 957.
The sample will therefore contain 48 vehicles that belong to the sequence 20n-3 .
A local council is researching the distribution of voters in 15000 homes. They take a systematic sample of 2\% of homes, listed in order of their postcode and house number. Determine which homes will be asked to participate in the survey.
Order the population and give each data entry a unique reference number.
The population data is in order given their postcode and house number and so we can assume that the first home on the list is number 1 , the second home number 2 , etc.
Calculate the number of items of data in the sample.
The sample size is 2\% so we need to calculate 2\% of 15000 :
\frac{15000}{100}\times{2}=300
Calculate the interval.
As we need 300 homes, and we are using a systematic sample, we need to choose the homes using a sequence. The interval for this set of data is equal to:
\text{Interval}=\frac{\text{Population size}}{\text{Sample size}}=\frac{15000}{300}=50
So we need to pick every 50^\text{th} term in the data.
Use a random number generator to select the first item of data.
As we need to pick every 50^\text{th} term, the first number in the sample must be randomly chosen from the first 50 terms. Using a random number generator, we get the number 8 , so we choose the first item of data in the sample to be the 8^\text{th} home.
Select the remaining items of data following the given sequence.
As we are selecting every 50^\text{th} item, we add 50 to the previous home number each time, giving homes 8, 58, 108, 158 and so on, up to home 14958.
The sample will therefore contain 300 homes that belong to the sequence 50n-42 .
An online clothing company is researching the average customer spend over the previous month. There were 12664 orders purchased, and each order has a unique reference number. The company takes a systematic sample of 2\% of orders. Determine which orders will be chosen for the sample.
Order the population and give each data entry a unique reference number.
As each order has a unique reference number, we can order the numbers from smallest to largest, and then number each item of data from 1-12664 .
Calculate the number of items of data in the sample.
As the company is taking a sample of 2\% , we need to calculate 2\% of 12664 :
(12664\div100)\times2=253.28
The sample size is 253 orders.
Calculate the interval.
The interval is equal to \text{Interval}=\frac{\text{Population size}}{\text{Sample size}}=\frac{12664}{253}=50\text{ (0dp)}
Use a random number generator to select the first item of data.
Using a random number generator, we need to select the first item of data from the first 50 orders. The random number chosen is 13 .
The first item of data in the sample is the 13^\text{th} order.
Select the remaining items of data following the given sequence.
As the interval is 50 , the next order will be 13+50=63 , then 63+50=113 , then 113+50=163 , and so on until we have selected the 253 items of data.
The sample will therefore contain 253 items of data that belong to the sequence 50n-37 .
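When the percentage does not give whole numbers, both the sample size and the interval need rounding. The sketch below follows the rounding used in this example; `plan_sample` is a hypothetical helper name, and rounding to the nearest whole number with Python's built-in `round` is an assumption that matches the working above.

```python
def plan_sample(population_size: int, percent: float) -> tuple[int, int]:
    """Return (sample size, interval), each rounded to the nearest whole number."""
    sample_size = round(population_size * percent / 100)
    interval = round(population_size / sample_size)
    return sample_size, interval


sample_size, interval = plan_sample(12664, 2)
start = 13  # the randomly generated starting point used in the example
orders = [start + k * interval for k in range(sample_size)]

print(sample_size, interval)  # prints 253 50
print(orders[:4])             # prints [13, 63, 113, 163]
print(orders[-1])             # prints 12613
```

The last order selected, 12613 , is still within the 12664 orders, so rounding the interval down to 50 does not run past the end of the list.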
A café is carrying out some market research. Out of 1240 customers that entered the café during a weekend, 950 allowed the café to email them a questionnaire. The café takes a systematic sample size of 12\% of those who received the questionnaire. Determine which customers will be part of the sample.
Order the population and give each data entry a unique reference number.
Despite there being 1240 customers, the population size is 950 as these customers received a questionnaire. As they provided an email address, the population can be listed using their email address, in alphabetical order.
Calculate the number of items of data in the sample.
The café is taking a sample size of 12\% .
(950\div100)\times12=114
The sample will contain 114 items of data.
Calculate the interval.
The interval is \text{Interval}=\frac{\text{Population size}}{\text{Sample size}}=\frac{950}{114}=8\text{ (0dp)}
Use a random number generator to select the first item of data.
As every 8^\text{th} customer is being selected, the first customer must be randomly chosen from the first 8 items of data. Using a random number generator, the 4^\text{th} customer is chosen.
Select the remaining items of data following the given sequence.
As the first customer is number 4 , and every 8^\text{th} customer is being selected after, we continue to add 8 to the previous value in the sequence until we have selected the 114 customers (items of data).
This follows the sequence 8n-4 .
Using the incorrect sampling method to select data (such as using a different probability sampling method, or non-random sampling, when a systematic sample is required)
If every 10^\text{th} item is being chosen for a sample, the first item of data must be generated using a random number generator from the first 10 items of data in the ordered list.
If every 7^\text{th} value is chosen for a sample, the first item of data must be generated using a random number generator from the first 7 items of data in the ordered list. If this doesn't happen, you will reduce the sample size.
When you are finding the median value in a set of data, the data must be in order, otherwise the number being picked is not the median, it's just the middle number in a random list. This is the same for a systematic sample. Every item of data is structured in an order from a sampling frame (age, postcode, alphabetical order etc), and then the sample is taken.
Let's assume we have 1000 items of data. 5\% of 1000 is 50 and so we need 50 items of data. Those 50 items of data must be spread equally across all of the ordered data and so by dividing the number in the population by the sample size, we find the interval between each item of data. Here we would have \frac{1000}{50}=20 . Here we would choose every 20^\text{th} item in the list of 1000 items to get a sample of 50 .
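The effect of the starting point on the sample size can be seen in a short Python sketch using the 1000 item list above. The helper names `random_start` and `sample_from` are hypothetical, and the seed is there only to make the sketch repeatable.

```python
import random


def random_start(interval: int, rng: random.Random) -> int:
    """Pick the starting point at random from the first `interval` items."""
    return rng.randint(1, interval)  # randint includes both end points


def sample_from(start: int, interval: int, population_size: int) -> list[int]:
    """Take every `interval`-th item beginning at `start`."""
    return list(range(start, population_size + 1, interval))


rng = random.Random(0)  # seeded only so the sketch is repeatable
start = random_start(20, rng)
print(len(sample_from(start, 20, 1000)))  # prints 50 for any start from 1 to 20
print(len(sample_from(37, 20, 1000)))     # prints 49: starting too late loses an item
```

Any starting point from 1 to 20 gives the full sample of 50 , whereas starting at item 37 leaves only 49 .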
1. The total items of data in a list is equal to 1350 . Describe how you would take a systematic sample of 20\% of the population.
Order and number the items in the list. Find 20\% of the total population. Calculate the interval. Select the first number using a random number generator. Select every 5^\text{th} item in the list afterwards.
Split the total population into smaller categories. Calculate 20\% of each category. Use a random number generator to select items in each category, proportional to the total.
Order the population and assign each item of data a unique number. Use a random number generator to select every 20^\text{th} item in the list.
Select the first 20\% of items of data in the list.
For a sample of 20%, we need to calculate 20% of the population. Here, 20% of 1350 is 270 and so we need 270 listed items.
As the sampling method is systematic, we need to calculate the interval (the sequence) for which the items in the list will be selected. Here, as we want 20% of the population, this is equivalent to every fifth item of data in the list ( 20\%=\frac{20}{100}=\frac{1}{5} ).
To determine which item in the list is first, we need to use a random number generator to select one of the first five items in the list only. Here, a random number generator selected the first item in the list to be the 3rd item listed.
So, by starting at the randomly selected 3rd item in the ordered population list, and selecting every 5th item in the population as we want a sample size of 20% (270 items), we generate a systematic sample.
2. A company wants to survey 25\% of its staff members. The company has 36 employees which are listed in alphabetical order of their surnames. If the first member in the list chosen is number 3 , what other members will be chosen for a systematic sample?
12, 21, and 30
4, 8, 12, 16, 20, 24, 28, and 32
7, 11, 15, 19, 23, 27, 31 and 35
4,5,6,7,8,9,10, and 11
25% of 36 = 9 so the sample size contains 9 members of staff.
\frac{9}{36}=\frac{1}{4} so every 4^\text{th} person is chosen after person 3 .
3. A field is divided into equal sized squares. Each square is numbered from 1-120 . The farmer would like to study the amount of weeds in the crop and so he takes a systematic sample of 5\% of the population. Given that the final square in the sample is square 118 , determine the number of the first square in the sample.
5% of 120 = 6 items of data
\frac{120}{6}=20 so every 20^\text{th} item of data is chosen
118\div{20}=5.9 so we can subtract 20 from 118 five times, leaving us with the number 18 as the first in the list.
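Working backwards from the last item to the first can be sketched in Python as follows. The function name `first_in_sample` is hypothetical and used only for illustration.

```python
def first_in_sample(last: int, interval: int) -> int:
    """Step back from the last selected item in whole intervals to reach the first."""
    # Largest number of whole intervals we can subtract while staying at 1 or above.
    steps_back = (last - 1) // interval
    return last - steps_back * interval


print(first_in_sample(118, 20))  # prints 18
```

Here five steps of 20 are subtracted from 118 , leaving 18 , which lies within the first 20 squares as required.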
4. Jodie records the number of steps she takes per day over 30 days. She wants to take a sample of data to find out how she has progressed over the month. She decides to take a systematic sample of half of the data. The first item of data is randomly chosen as day 1 . What day is the last item of data in the sample? Use the table below to help you.
Mon | Tues | Weds | Thurs | Fri | Sat | Sun |
---|---|---|---|---|---|---|
1 | 2 | 3 | 4 | 5 | 6 | 7 |
8 | 9 | 10 | 11 | 12 | 13 | 14 |
15 | 16 | 17 | 18 | 19 | 20 | 21 |
22 | 23 | 24 | 25 | 26 | 27 | 28 |
29 | 30 | | | | | |
Tuesday
Saturday
Thursday
Monday
Half of the data means every other day, which gives us every odd number in the month. The last odd number in the month is 29 , which is a Monday.
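This answer can be checked with a short Python sketch. It assumes, as in the table, that day 1 falls on a Monday; the variable names are chosen only for illustration.

```python
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Half of 30 days is a sample of 15, so the interval is 30 / 15 = 2.
sampled_days = list(range(1, 31, 2))  # start at day 1, then every 2nd day
last_day = sampled_days[-1]

# Day 1 is a Monday in the table, so day d falls on DAYS[(d - 1) % 7].
print(len(sampled_days), last_day, DAYS[(last_day - 1) % 7])  # prints 15 29 Monday
```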
5. A school contains 2243 students. A research group takes a systematic sample of 16\% of students. They order the list by the students' ages. If the first student randomly selected from the list is number 3 , determine the nth term of the sequence that selects the remaining sample of students.
The first number in the sequence is 3 .
The number of students in the sample is 16\% of 2243 , which is equal to:
(2243\div100)\times16=359 (0dp)
The interval is equal to \text{Interval}=\frac{\text{Population size}}{\text{Sample size}}=\frac{2243}{359}=6\text{ (0dp)}
The first 5 terms in the sequence are therefore: 3, 9, 15, 21, 27, ...
The common difference in the sequence is +6 , so we have the sequence 6n .
The first term in the sequence 6n is 6\times1=6 . We need the first term to equal 3 , so we have to subtract 3 from 6n , giving us the nth term
6n-3.
6. A hotel has 18 floors. Each floor has 4 apartments, except for the ground floor which has 2 apartments, and the top floor which is one single apartment. Each apartment is given a unique reference number according to the floor level and the apartment number (e.g. apartment 1 on floor 18 is number 181 ).
A hotel inspector is required to inspect 4\% of apartments, chosen using a systematic sample. The first apartment that is inspected is randomly selected to be number 002 . What is the number of the last apartment to be inspected?
There are 16 floors with 4 apartments (16\times4=64) .
There is 1 floor with 2 apartments (1\times2=2) .
There is 1 floor with 1 apartment (1) .
Adding these together, we have the total number of apartments to be 64+2+1=67 .
We need a sample of 4\% of 67 : (67\div100)\times4=3 (0dp) or 3 rooms.
The interval is equal to \text{Interval}=\frac{\text{Population size}}{\text{Sample size}}=\frac{67}{3}=22\text{ (0dp)}
The position of the 3 rooms in the list are: 2 (this was given)
2+22=24
24+22=46
The 67^\text{th} room of the hotel is number 181 .
The 66^\text{th} room of the hotel will be number 174 .
Every 4^\text{th} room in the list is 1 floor below.
Position in list | 66 | 62 | 58 | 54 | 50 | 46 |
---|---|---|---|---|---|---|
Apartment number | 174 | 164 | 154 | 144 | 134 | 124 |
The last room that will be inspected is apartment 4 on floor 12 .
1. Explain what is meant by a systematic sample
(3 marks)
First item of data selected at random
(1)
Following items in the sample selected follow a sequence
(1)
The interval is equal to the population size divided by the sample size
(1)
2. (a) The Paddles rowing club has 20 members. The coach wants to find out about how many hours members spend in the gym per week. He decides to take a systematic sample of 10 members for his research. Each person is written in a list in order of how long they have been a member at the club. Given that the first item of data is person number 2 on the list, determine the reference number of the other members in the sample. Use the table below to help you.
1 | 6 | 11 | 16 |
2 | 7 | 12 | 17 |
3 | 8 | 13 | 18 |
4 | 9 | 14 | 19 |
5 | 10 | 15 | 20 |
(b) The Boatyard is another rowing club. Their coach carries out the same research study, sampling 10 members from their list of 100 members. Which rowing club would achieve a better estimate of their population data? Explain your answer.
(5 marks)
(a)
\frac{10}{20}\times{100}=50\% of members
(1)
Every even number selected (2,4,6,8,10,12,14,16,18,20)
(1)
(b)
The Paddles
(1)
They have a sample size of 50\% whereas The Boatyard has a sample size of 10\%
(1)
The larger the sample size, the more representative the data is of the population
(1)
3. (a) A population contains 1600 people. Zairah wants to take a sample of 80 people for a market research study using systematic sampling. Calculate the interval size.
(b) If the first person chosen at random is person number 6 in the list, determine the nth term of the sequence for the sample selection.
(4 marks)
(a)
1600\div80
(1)
Every 20^\text{th} person
(1)
(b)
20n
(1)
-14
(1)