
In order to access this I need to be confident with:

- Arithmetic
- Calculator skills
- Sequences
- Fraction of an amount
- Percentages of an amount
- Rearranging equations

Here we will learn about sampling methods, including random sampling, non-random sampling, stratified sampling, systematic sampling and capture recapture.

There are also sampling methods worksheets based on Edexcel, AQA and OCR exam questions, along with further guidance on where to go next if you’re still stuck.

**Sampling methods** are ways to select a sample of data from a given **population** (every individual in the whole group).

It is unrealistic to collect data from the entire population because it:

- is too big
- takes too much time
- costs too much money

We therefore take an appropriately sized **sample** as a way of representing the population.

Depending on the situation, some sampling methods will be more suitable than others. Whichever sampling method is used it is important to **justify** why that method has been used.

**Note:** Understanding sampling is important for GCSE Mathematics, but not all of the methods on this page will be examined. They do, however, come up in GCSE Statistics and A level Maths. Capture Recapture will be examined in GCSE Maths.

In order to collect data there are several types of probability sampling methods and non-probability sampling methods we can use:

- Random sampling
- Stratified sampling
- Systematic sampling
- Non-random sampling
- Capture recapture

Below is a brief summary of each sampling method.

Sampling method | Description | Example |
---|---|---|
Random sampling (aka simple random sampling) | Gathering a representative sample from a population where each member in the population has an equal chance of being selected. | Using a random number generator to select students in a class to complete a task. |
Stratified sampling | Smaller groups or strata within the sample are represented proportionally to the population. | Finding out a favourite soap opera from different age categories of people in a year group. |
Systematic sampling | Every member in the population is given a number. After the first member is chosen at random, the remaining members are chosen from a given interval. | A list of people with their first names in alphabetical order are numbered. The 5th person is chosen randomly, followed by every subsequent 8th person. |
Non-random sampling | Convenience sampling is used for ease of data collection. Volunteers usually collect data. | Asking people at a given location about how long their commute to work is. |
Capture recapture | Collecting a sample of data from one location at different points in time, marking the individuals to estimate a population size. | A sample of woodlice were captured, marked and released. Another sample of woodlice was captured 5 days later and the number of marked woodlice was counted. |

**See also:** Collecting data

Following any particular sampling methodology has a variety of advantages and disadvantages:

Sampling method | Advantages | Disadvantages |
---|---|---|
Random sampling | Random selection means the results can be generalised for a population. It is more time efficient than asking the entire population. Reduced bias. | Expensive. Time consuming. Not always possible if there is no sampling frame or list to sample from. |
Stratified sampling | Proportional representation means the sample is representative of the population so the results can be generalised. It is more time efficient than asking the entire population. Minorities given fair representation. | Requires another sampling method to select individual items of data from a list (random / systematic etc.) |
Systematic sampling | It is more time efficient than asking the entire population. The sample can be selected proportional to the total population (stratified). | Every member of the population must be listed. The first member of the population must be chosen at random to avoid bias. |
Non-random sampling | Volunteers are accessible and quick to return data. It is more time efficient than asking the entire population. | Subject to bias (skewed results) leading to an unfair representation of the population. |
Capture recapture | Estimate population size. Track changes over time (e.g. seasonal, health, climate change). | Individuals have to remain local to the area of research with a definite boundary (no radical changes in the population due to births / deaths / migration). Markers must not be lost or removed. |

A sample should be a **representation of a population**, so the **more individuals** that are in the sample, the **more accurately it will represent the population**.

E.g.

According to the 2011 UK Census, there were 53,012,456 people living in England. 157,743 of these people lived in a “*rural village in a sparse setting*” and 43,668,600 people lived in an “*urban dwelling*”.

If we were to gather data from every resident in the “*rural village*” subgroup, we would be asking only 0.30\% of the total population of England.

On the other hand, if we gathered data from everyone who lived in the “*urban dwelling*” subgroup, we would be asking 82.37\% of the population of England.

Asking a greater number of people will give a more accurate representation of the population. However, it will take a very long time, can be very expensive and, depending on what you are researching, can still hold a significant bias if everyone in the sample comes from the same subgroup. For example, you are more likely to find a good wi-fi signal in an urban area because there is higher demand for the service, but you are more likely to find taller trees and a wider range of wildlife in rural areas.

There is **no specific value** that gives you a representative sample size compared to the population. We need to be able to justify why a particular size of sample is good or bad.

As a rule of thumb, the **bigger the sample**, the more **reliable** it will be.

**Non-random sampling** is a non-probability sampling method. There are two types of non-random sampling: quota sampling and opportunity sampling.

In quota sampling, the person collecting the data selects a sample that reflects the characteristics of the whole population.

To do this, participants are assessed and then allocated into the appropriate quota. Once the quotas are full, no more items of data can be added to the quota, and so they are ignored.

Quota sampling relies on a fundamental assumption: that we know (or can estimate) the proportion of the total population that is in each category.

E.g.

It is estimated that 10\% of students in a school are left-handed (and hence 90\% of students are right-handed). We want a sample of 100 students and so we ask students as they enter the school canteen at lunch whether they were left- or right-handed. Once we have 10 left-handed students, and 90 right-handed students, the two quotas are full and the sample is selected.
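The quota process in this example can be sketched in Python. This is only an illustration: the function name `fill_quotas` and the list of arriving students are made up for the sketch (here every 8th student is arbitrarily labelled left-handed), and a real survey would record actual answers.

```python
def fill_quotas(arrivals, quotas):
    """Accept people in arrival order until every quota is full.

    arrivals: iterable of (person_id, category) pairs.
    quotas:   dict mapping category -> number of people wanted.
    """
    counts = {category: 0 for category in quotas}
    sample = []
    for person_id, category in arrivals:
        # Only accept someone if their quota still has space.
        if category in counts and counts[category] < quotas[category]:
            counts[category] += 1
            sample.append((person_id, category))
        # Stop as soon as every quota is full.
        if counts == quotas:
            break
    return sample


# Hypothetical data: students numbered in the order they enter the canteen.
arrivals = [(i, "left" if i % 8 == 0 else "right") for i in range(1, 301)]
quotas = {"left": 10, "right": 90}

sample = fill_quotas(arrivals, quotas)
left_count = sum(1 for _, hand in sample if hand == "left")
right_count = sum(1 for _, hand in sample if hand == "right")
print(len(sample), left_count, right_count)
```

Anyone who arrives after their quota is full is skipped, which is why the sample depends on who happens to turn up first and is not random.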

Opportunity sampling (or convenience sampling) takes a sample from one place at one time, for example, sampling the age of cars in a car park at 1pm on a Tuesday.


Hayley wants to carry out some research on her class. She wants a sample of 12 people out of the 30 in her class. Use a random sampling technique to determine the reference number of the students in the class who should be included in the sample. Do not include duplicated data.

1. **List every member of the population.**

As there are 30 students in the class, we have the numbers 1 to 30 .

1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|
6 | 7 | 8 | 9 | 10 |
11 | 12 | 13 | 14 | 15 |
16 | 17 | 18 | 19 | 20 |
21 | 22 | 23 | 24 | 25 |
26 | 27 | 28 | 29 | 30 |

2. **Associate each member of the population with a unique reference number.**

Each student is already numbered from 1 to 30 .

3. **Use a random number generator to select the number of data points in the sample.**

Using a Casio FX series calculator, we can generate random numbers between 1 and 30 by pressing the key combination: 3, 0, ×, shift, ., =. This should show the expression 30 × RAN#. Round any decimal to the nearest integer (whole number). Ignore any duplicates, as stated in the question, and ignore any result that rounds to 0 because no student has that reference number. Repeat until you have 12 different numbers:

30 × RAN# | 8.38 | 5.33 | 18.2 | 7.49 | 13.1 | 2.93 |
---|---|---|---|---|---|---|
Rounded | 8 | 5 | 18 | 7 | 13 | 3 |

30 × RAN# | 28 | 17.3 | 21.4 | 1.05 | 13.9 | 1.92 |
---|---|---|---|---|---|---|
Rounded | 28 | 17 | 21 | 1 | 14 | 2 |

The 12 people in the sample are numbers: 1, 2, 3, 5, 7, 8, 13, 14, 17, 18, 21, and 28 .
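If a computer is available instead of a calculator, the same selection can be sketched in Python. This is a minimal sketch, not the method used above: it swaps the RAN# button for Python's standard `random` module, and the function name `simple_random_sample` and the `seed` argument are illustrative choices.

```python
import random


def simple_random_sample(population_size, sample_size, seed=None):
    """Pick sample_size different reference numbers from 1 to population_size."""
    rng = random.Random(seed)  # a fixed seed only makes the run repeatable
    reference_numbers = range(1, population_size + 1)
    # rng.sample draws without replacement, so there are no duplicates.
    return sorted(rng.sample(reference_numbers, sample_size))


sample = simple_random_sample(30, 12, seed=1)
print(sample)
```

Because the numbers are drawn without replacement, there is no need to discard duplicates by hand, and every student has the same chance of being chosen.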

**Step-by-step guide:** Random sampling

A drinks company produces 1200 bottles of pop every 30 minutes. For quality control purposes, 12 bottles are selected and checked. Each bottle passes through the machine in a single file. Using a systematic sampling technique, determine the bottles that will be selected for the sample.

1. **Order the population and give each data entry a unique reference number.**

As each bottle passes through the machine in a single file, we can assume that the first bottle has a reference number 1 , the second number 2 , etc.

2. **Calculate the number of items of data in the sample.**

As we want a sample of 12 bottles and we are using a systematic sample, we need to choose the bottles using a sequence. The interval for the data selection is:

\text{Interval}=\frac{\text{Population size}}{\text{Sample size}}=\frac{1200}{12}=100

So we need to pick every 100^\text{th} term in the data.

3. **Use a random number generator to select the first item of data.**

As we need to pick every 100^\text{th} term, the first number that will form the starting point in the sample selection must be randomly chosen from the first 100 terms. Using a random number generator, we get the number 27 , so we choose the first item of data in the sample to be the 27^\text{th} bottle.

4. **Select the remaining items of data following the given sequence.**

As we are selecting every 100^\text{th} item at regular intervals, the next bottles will be numbers 127, 227, 327, and so on, until all 1200 bottles produced in the 30 minutes have passed through the machine.

The sample will therefore contain the bottles with the reference numbers:

27, 127, 227, 327, 427, 527, 627, 727, 827, 927, 1027, and 1127.

These numbers are in the sequence 100n-73 .
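The steps above can be sketched in Python. This is a minimal sketch under the assumption that the population size divides exactly by the sample size, as it does here; the function name `systematic_sample` is illustrative, and the starting value 27 is passed in to match the worked example instead of being generated afresh.

```python
import random


def systematic_sample(population_size, sample_size, start=None):
    """Return the reference numbers chosen by a systematic sample."""
    interval = population_size // sample_size  # 1200 // 12 = 100
    if start is None:
        # The first item must be chosen at random from the first interval.
        start = random.randint(1, interval)
    if not 1 <= start <= interval:
        raise ValueError("start must lie within the first interval")
    # Every interval-th item after the starting point.
    return [start + interval * k for k in range(sample_size)]


bottles = systematic_sample(1200, 12, start=27)
print(bottles)
# [27, 127, 227, 327, 427, 527, 627, 727, 827, 927, 1027, 1127]
```

Each value equals 100n - 73 for n from 1 to 12, matching the sequence given above.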

**Step-by-step guide**: Systematic sampling

The Student Council is carrying out a survey. They want to collect a stratified sample of 120 students in Years 7-11 . Calculate the number of students in each Year Group that will take part in the survey.

Year 7 | Year 8 | Year 9 | Year 10 | Year 11 | Total |
---|---|---|---|---|---|
342 | 330 | 316 | 346 | 318 | 1652 |

1. **Calculate how many items of data will be selected for the sample.**

The question states that the sample size is 120 .

2. **Calculate how many items of data will be selected in each subcategory.**

We need to calculate the proportion of each category that will be used in the sample.

We do this by dividing the number in the sample by the population size and multiplying by the number in the group.

\frac{\text{Sample size}}{\text{Population size}}=\frac{120}{1652}=\frac{30}{413}

Year group | Year 7 | Year 8 | Year 9 | Year 10 | Year 11 | Total |
---|---|---|---|---|---|---|
Number of students | 342 | 330 | 316 | 346 | 318 | 1652 |
\frac{30}{413}\times group size | 24.84 | 23.97 | 22.95 | 25.13 | 23.10 | |
Rounded to the nearest integer | 25 | 24 | 23 | 25 | 23 | |

3. **Check that the number of items of data matches the sample size.**

Adding the number of students in each year group, we get:

25+24+23+25+23=120

which matches the sample size.
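The same allocation can be checked with a short Python sketch, using the year group sizes given in the question. The function name `stratified_allocation` is illustrative. Note that Python's built-in `round` sends exact halves to the nearest even number, which does not matter here because none of the values end in exactly .5.

```python
def stratified_allocation(group_sizes, sample_size):
    """Share a sample between groups in proportion to their sizes."""
    population_size = sum(group_sizes.values())
    # Exact (unrounded) share for each group: sample/population x group size.
    exact = {
        group: sample_size * size / population_size
        for group, size in group_sizes.items()
    }
    rounded = {group: round(value) for group, value in exact.items()}
    return exact, rounded


year_groups = {
    "Year 7": 342,
    "Year 8": 330,
    "Year 9": 316,
    "Year 10": 346,
    "Year 11": 318,
}

exact, rounded = stratified_allocation(year_groups, 120)
for group in year_groups:
    print(group, f"{exact[group]:.2f}", rounded[group])
print("Total:", sum(rounded.values()))
```

Rounding each group separately will not always give a total equal to the sample size, which is why the final check is worth doing every time.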

**Step-by-step guide:** Stratified sampling

Jacob is completing a research study on the number of whales in a pod. During one day, photos of individual whales are taken and their markings are recorded to identify them in sample 2 .

3 days later, Jacob repeats the study to identify how many whales in sample 2 were in sample 1 . His results are shown in the table below:

Quantity | Frequency |
---|---|
Sample size 1 | 38 |
Sample size 2 | 44 |
Identified whales in sample 2 | 35 |

1. **Substitute the values of M (total marked), R (number of marked recaptured), and T (total recaptured on second visit) into the formula.**

N=\frac{MT}{R}

We know that M=38, R=35, and T=44.

Substituting these values into the formula, we get:

N=\frac{MT}{R}=\frac{38\times{44}}{35}

2. **Solve for N.**

N=\frac{1672}{35}=47.77...

The estimated population size of whales in the pod is 48 .
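As a quick check, the calculation can be written as a small Python function. This is a sketch only, and the function and parameter names are illustrative; they stand for M, T and R in the formula above.

```python
def capture_recapture_estimate(marked, second_sample_size, marked_in_second):
    """Estimate population size N = (M x T) / R."""
    if marked_in_second == 0:
        # With no marked individuals recaptured, the formula cannot be used.
        raise ValueError("at least one marked individual must be recaptured")
    return marked * second_sample_size / marked_in_second


estimate = capture_recapture_estimate(38, 44, 35)
print(round(estimate, 2))  # 47.77
print(round(estimate))     # 48
```

The estimate is rounded to a whole number at the end because the answer counts individual whales.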

**Step-by-step guide:** Capture recapture

**Mixing up sampling methods**

A common error is to use the incorrect sampling method to select data (such as using systematic sampling or non-random sampling when the question asks for a random sample).

**Incorrect percentages / fraction of an amount (stratified sampling)**

When finding a sample of 60\% of the population, you need to find 60\% of each category, and not 60\% of the population and divide it equally between the number of categories. The larger the frequency in the category, the larger the sample taken. This is proportional representation.

**Incorrect number of items in the sample**

Remember to check that the sample size matches the total sum of the samples from each category in the population.

**Using the RAN# button incorrectly (random sampling)**

The RAN# button returns a random number between 0 and 1, given to 3 decimal places. Different calculators will have different ways to generate random numbers over an interval, so make sure you know how your calculator performs this function.

**Starting with a random number (systematic sampling)**

If every 10^{th} item is being chosen for a sample, the first item of data must be generated using a random number generator from the first 10 items of data in the ordered list.

If every 7^{th} value is chosen for a sample, the first item of data must be generated using a random number generator from the first 7 items of data in the ordered list. If the first item is chosen from further down the list, the sample may end up with fewer items than required.

**Not in order (systematic / random sampling)**

When you are finding the median value in a set of data, the data must be in order, otherwise the number being picked is not the median, it’s just the middle number in a random list. This is the same for a systematic sample. Every item of data is structured in an order from a sampling frame (age, postcode, alphabetical order etc), and then the values are taken.

1. Kirsty would like to survey what people enjoy eating for lunch in the canteen. She lists all the members in the population and gives them a unique reference number. After selecting the first member at random, she then selects the remainder of the sample over a specified interval. What type of sampling method is Kirsty using?

Stratified sampling

Systematic sampling

Random sampling

Capture recapture

Systematic sampling requires an ordered list of data with members of the sample chosen over a given interval.

2. Darren would like to find out how many foxes live in the woodland on his farm. He captures 3 foxes in one night, marks them and releases them. The next night, one of the 4 foxes captured is marked. Which sampling method is he using?

Stratified sample

Systematic sample

Capture recapture

Non-random sample

The foxes are being captured, marked and recaptured.

3. Peter is carrying out a survey. He would like to find out how many miles people drive per day. He divides the population into a two way table, and calculates the proportion of each category required for his sample. Which sampling technique is Peter using?

Stratified sampling

Capture recapture

Random sampling

Non-random sampling

Stratified sampling is proportional to the population.

4. Laura has a sampling frame of all the people who are members of a club. She gives each member a unique reference number and uses a random number generator to select the sample. Which sampling method is Laura using?

Stratified sampling

Systematic sampling

Non-random sampling

Random sampling

A simple random sample requires a sampling frame and a random number generator to select the sample.

1. James takes a systematic sample of 30 people from a population of 150 . What is the interval size for this sample?

**(1 mark)**

Answer:

5

**(1)**

2. (a) Elliott wants to collect some market research about the products in his bakery. He would like 40 customers to participate in his sample. Describe how Elliott should take a random sample.

(b) He decides to take a systematic sample instead. How does this alter his sampling process compared with part (a)?

**(5 marks)**

Answer:

(a)

List every member of the population

**(1)**

Give each member in the population a unique reference number

**(1)**

Use a random number generator to select 40 numbers, with no duplicates

**(1)**

(b)

The first number is selected using a random number generator only

**(1)**

The remaining 39 items in the sample are selected at a given regular interval

**(1)**

3. Below is a two way table describing the population of people who are competing at a university athletics championship:

Event | Male | Female |
---|---|---|
Track | 462 | 388 |
Field | 326 | 285 |

Explain how to take a stratified sample of 500 competitors from the population.

**(5 marks)**

Answer:

Total in population = 1461

**(1)**

**(1)**

**(1)**

158 [Male Track], 133 [Female Track]

**(1)**

112 [Male Field], 98 [Female Field]

**(1)**

You have now learned how to:

- Infer properties of populations or distributions from a sample, whilst knowing the limitations of sampling
