Sampling in Quan/Qualitative Research


Yuxin Zhang


Research Design Lab, Week 9,
02 Dec. 2024

Recap & exercises on:

Non-probability sampling

Probability sampling

Sampling Weights

Recap 1

Recap 2

Exercises 1: What type of sample?

1.1 A student council surveys 100 students by taking random samples of 25 freshmen, 25 sophomores, 25 juniors, and 25 seniors.

A) Simple random sample
B) Stratified random sample
C) Cluster random sample
D) Systematic random sample
E) None of the above
  • B). This is a stratified random sample because the population was divided into strata (class levels), and random samples were taken from each.
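As a rough illustration of the stratified design in 1.1, the sketch below groups a roster by class level and draws a simple random sample of 25 within each group. The roster, the 200 students per level, and the helper name `stratified_sample` are assumptions made up for this example.

```python
import random


def stratified_sample(population, stratum_of, n_per_stratum, seed=0):
    """Draw a simple random sample of fixed size within each stratum."""
    rng = random.Random(seed)
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        # Sampling without replacement inside one stratum
        sample.extend(rng.sample(members, n_per_stratum))
    return sample


# Hypothetical roster: 4 class levels with 200 students each
levels = ["freshman", "sophomore", "junior", "senior"]
roster = [(f"{level}_{i}", level) for level in levels for i in range(200)]

sample = stratified_sample(roster, stratum_of=lambda s: s[1], n_per_stratum=25)
counts = {level: sum(1 for _, lv in sample if lv == level) for level in levels}
print(len(sample), counts)
```

Every stratum is guaranteed to appear in the sample with exactly 25 students, which a simple random sample of 100 would not guarantee.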

1.2 A quality control worker at a factory selects the first 5 items she sees as her sample for the day.

A) Simple random sample
B) Stratified random sample
C) Cluster random sample
D) Systematic random sample
E) None of the above
  • E). This is a convenience sample.

1.3 A principal orders t-shirts and wants to check some of them to make sure they were printed properly. She randomly selects 3 of the 10 boxes of shirts and checks every shirt in those 3 boxes.

A) Simple random sample
B) Stratified random sample
C) Cluster random sample
D) Systematic random sample
E) None of the above
  • C). The shirts were divided into clusters (boxes) and the principal sampled every shirt from some of the clusters.
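The cluster design in 1.3 can be sketched in a few lines: the random draw happens at the level of boxes, and every shirt inside a chosen box is checked. The box size of 20 shirts and the shirt labels are assumptions for illustration only.

```python
import random

rng = random.Random(1)

# Hypothetical shipment: 10 boxes with 20 shirts each
boxes = {box: [f"box{box}_shirt{i}" for i in range(20)] for box in range(1, 11)}

# Randomness enters only here: 3 of the 10 boxes (clusters) are selected
chosen_boxes = rng.sample(sorted(boxes), 3)

# Every shirt in a selected box is inspected
checked = [shirt for box in chosen_boxes for shirt in boxes[box]]
print(chosen_boxes, len(checked))
```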

1.4 A researcher recruits participants from his social media followers who are willing to complete a questionnaire.

A) Quota sample
B) Purposive sample
C) Convenience sample
D) Snowball sample
E) None of the above
  • C). This is a subtype of convenience sample: voluntary (self-selection) sample.

1.5 A teacher puts her students’ names in a box and picks two without looking to choose which students will answer a question in class.

A) Simple random sample
B) Convenience sample
C) Voluntary sample
D) None of the above
  • A). No grouping, and all students in the class have an equal chance of being selected.

1.6 A conference organizer gathers feedback from every participant at the conference.

A) Simple random sample
B) Stratified random sample
C) Cluster random sample
D) Systematic random sample
E) Purposive sample
F) Convenience sample
G) Voluntary sample
H) Quota sample
I) None of the above
  • I). This is not a sample, but the population itself, as everyone is selected.

1.7 A restaurant manager implements a discount program by randomly selecting one of the first 100 clients to receive a special discount. Subsequently, every 100th client is offered the same discount.

A) Simple random sample
B) Stratified random sample
C) Cluster random sample
D) Systematic random sample
E) None of the above
  • D). This is an example of systematic random sampling. The manager uses a fixed interval (every 100th client) after randomly selecting a starting point within the first 100 clients.

1.8 A middle school principal wants to know how students are experiencing school life this semester, so he selects some students at random from each school grade.

A) Simple random sample
B) Stratified random sample
C) Cluster random sample
D) Systematic random sample
E) None of the above
  • B). This is a stratified random sample because the population was divided into strata (school grades), and a random sample was taken from each group (students per grade).

1.9 Which sampling strategy achieves the intended design?

A college professor wants to survey a sample of students taking her large lecture course. There are about 150 students in this course, and 10 of those students are graduate students. She wants to take a systematic random sample of approximately 30 students.

A) Randomly select 15 men and 15 women from the class for the survey.
B) Randomly select 2 graduate students and 28 other students for the survey.
C) Randomly select one of the first 5 students to arrive to class, and every 5th student thereafter to take the survey.
D) Randomly select one of the first 15 students to arrive to class, and every 15th student thereafter to take the survey.
E) Assign each student a number and use a computer to randomly select 30 students for the survey.
  • C). This is a systematic sample: the students arrive in some order, she randomly picks a starting student among the first \(k\), and then selects every \(k^{th}\) student thereafter. With \(k=5\), this gives \(\frac{150}{5}=30\) students.
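A minimal sketch of the systematic design in 1.9, assuming students are numbered 1 to 150 in order of arrival. The function name `systematic_sample` is hypothetical.

```python
import random


def systematic_sample(n_units, k, seed=0):
    """Pick a random start among the first k units, then every k-th unit."""
    rng = random.Random(seed)
    start = rng.randint(1, k)  # inclusive on both ends
    return list(range(start, n_units + 1, k))


picked = systematic_sample(n_units=150, k=5)
print(len(picked))  # 30 students, as in option C

too_few = systematic_sample(n_units=150, k=15)
print(len(too_few))  # 10 students, which is why option D falls short
```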

1.10 Why use a simple random sample instead of a systematic random sample in the following example?

A large bakery produces cakes in an assembly line. Sergio noticed a potential defect that seems to be showing in every other cake. He decides that in the next production run, he’ll take a random sample of cakes to get a better idea of how many cakes might have this defect.

A) If the defect is in every other cake, a systematic random sample could miss it completely depending on what Sergio chooses for k.
B) A simple random sample can save time since it is less involved than a systematic random sample.
C) A simple random sample guarantees that Sergio will get some cakes from each hour of production.
  • A). Suppose he starts with the \(8^{th}\) cake and \(k=10\). He would pick cakes 8, 18, 28, 38, …. If the defect is in every other cake starting with cake 1 (the odd-numbered cakes), then none of the cakes in Sergio’s sample would have the defect.
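The periodicity problem can be checked with a small simulation. The run length of 1000 cakes and the random seed are assumptions; the defect pattern (odd-numbered cakes) and the systematic rule (start at cake 8, \(k=10\)) follow the answer above.

```python
import random

n_cakes = 1000  # hypothetical length of the production run

# Defect appears in every other cake, starting with cake 1
defective = {i for i in range(1, n_cakes + 1) if i % 2 == 1}

# Systematic sample: start at cake 8, then every 10th cake
systematic = list(range(8, n_cakes + 1, 10))
sys_rate = sum(c in defective for c in systematic) / len(systematic)

# Simple random sample of the same size
rng = random.Random(42)
srs = rng.sample(range(1, n_cakes + 1), len(systematic))
srs_rate = sum(c in defective for c in srs) / len(srs)

print(sys_rate)  # 0.0: the systematic sample misses the defect entirely
print(srs_rate)  # should land near the true defect rate of 0.5
```

Because the sampling interval is a multiple of the defect's period, the systematic sample only ever sees even-numbered cakes.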

1.11 Why use a cluster random sample instead of a simple random sample? (Choose all that apply)

An airline offers a certain flight once per day that usually contains about 250 passengers. The flight offers seats in first class (most expensive), business class, and economy class (least expensive). The airline wants to survey 500 passengers of this flight about their overall satisfaction. The passengers will be selected using a cluster random sample where each flight is a cluster.

A) A cluster sample chooses some passengers from each flight, so the airline will be more likely to get a representative sample.
B) Contacting all of the passengers on a few flights will probably be more efficient than contacting passengers spread out across many flights.
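C) A simple random sample might result in a disproportionate number of passengers from one or more classes, but a cluster sample is less likely to suffer this issue.
  • B and C. A is incorrect: a cluster sample surveys all passengers on a few randomly chosen flights, not some passengers from each flight.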

1.12 Which sampling strategy is used in the following design?

A researcher aims to study school performance among high school students in the USA. They define a sample size of 1000 respondents. To achieve this, the researcher randomly selects 10 states out of 50. From each selected state, they randomly choose 5 high schools. Within each school, they survey 2 entire classes picked at random.

  • Multistage cluster sampling.

    Stage-1: Random selection of states: ensures geographical diversity by selecting 10 states randomly, reducing bias in regional representation.

    Stage-2: Random selection of high schools: narrows the sample by randomly selecting 5 schools within each chosen state, which ensures variation within states.

    Stage-3: Random selection of classes within schools: involves randomly choosing 2 entire classes per school, ensuring diverse student representation within each school.

    In this design, clusters (states/schools/classes) are the primary focus of selection, rather than individual students directly.
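The three stages can be sketched as nested random draws. The sampling frame here is entirely hypothetical: 20 high schools per state, 8 classes per school, and 10 students per class are assumptions chosen so that the totals match the target of 1000 respondents.

```python
import random

rng = random.Random(7)

# Hypothetical frame sizes (not from the exercise)
N_STATES, SCHOOLS_PER_STATE, CLASSES_PER_SCHOOL, CLASS_SIZE = 50, 20, 8, 10

respondents = []
for state in rng.sample(range(N_STATES), 10):              # Stage 1: states
    for school in rng.sample(range(SCHOOLS_PER_STATE), 5):  # Stage 2: schools
        for cls in rng.sample(range(CLASSES_PER_SCHOOL), 2):  # Stage 3: classes
            # Entire class is surveyed: no sampling at the student level
            respondents.extend(
                (state, school, cls, student) for student in range(CLASS_SIZE)
            )

print(len(respondents))  # 10 * 5 * 2 * 10 = 1000
```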

1.13 Which of the following best describes how the chosen sampling is applied in this context?

A researcher aims to study the healthcare challenges faced by sex workers in a metropolitan city. Given the sensitive nature of the topic and the difficulty of accessing this population, they decide to use snowball sampling.

A) The researcher randomly selects sex workers from a city directory and invites them to participate.
B) The researcher begins with a few participants from NGOs and asks them to refer other sex workers willing to participate.
C) The researcher divides the city into zones and surveys sex workers found in specific areas.
D) The researcher conducts a public survey and asks respondents if they know any sex workers who can be contacted.
  • B). Snowball sampling starts by identifying an initial small pool of participants who are accessible. In this case, the researcher connects with sex workers through NGOs or community organizations that already have established trust and relationships with the target population.


Exercises 2: In-group discussion (40 mins)

Please gather in the groups you formed over the past few weeks for the research design lab.

Discuss and determine your unit of analysis, population, sample, sampling technique and data collection method for each example below. Then submit your answers to Moodle (Research Design), in-class assignment in our lab section. ♡

There is no unique solution, but the submission should adopt the following structure:

Example 0:

  • Unit of analysis: individual tourists in Trento
  • Population: all tourists visiting Trento in a specified period (e.g., 2024)
  • Sample: 100 tourists randomly encountered in different areas of Trento
  • Sampling method: convenience sampling
  • Data collection method: structured/unstructured interviews in the areas

0. You want to know details about tourists’ experiences in Trento.

1. A teacher is interested in the mean math score of their students in the class.

2. A politician is interested in the social characteristics of their voters. 

3. A researcher wants to conduct a study on mothers’ knowledge about child abuse in Saudi Arabia.

4. A researcher is interested in studying the local gangs in their city.

5. A researcher is interested in societal influences on body image perceptions and healthcare interventions.

6. A researcher is interested in studying how poverty is related to social issues such as discrimination, immigration, and crime.

7. A researcher is interested in the social struggles of undocumented immigrants in Sicily.

8. A researcher is interested in how socioeconomic factors influence fertility rates in urban areas.

Exercises 3: Sampling & population weights

  1. Sampling weights are used to adjust for the unequal probabilities of selection in the sample. In many surveys, some individuals or groups may be more likely to be selected than others.

  2. Population weights map the participant composition to that of the larger population. They are used to adjust the sample to better reflect the true distribution of the population (post-stratification).

Exercise 3.1: Calculate the sampling weights.

Weight = inverse of the selection probability (\(1/p\))

ID   Value   Selection Probability   Weight
1    3       0.5
2    6       0.8
3    1       0.6
4    8       0.1
5    5       0.25

Exercise 3.2: Calculate the simple mean and weighted mean of Value using the calculated Weight.

ID   Value   Selection Probability   Weight
1    3       0.5                     2
2    6       0.8                     1.25
3    1       0.6                     1.67
4    8       0.1                     10
5    5       0.25                    4

Simple mean = \((3+6+1+8+5)/5 = 4.6\)

Weighted mean = \((3 \times 2 + 6 \times 1.25 + 1 \times 1.67 + 8 \times 10 + 5 \times 4)/(2+1.25+1.67+10+4) = 115.17/18.92 \approx 6.087\)
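The calculations in Exercises 3.1 and 3.2 can be reproduced with a short script. This is only a sketch; the variable names are made up. Note that the slide rounds the weight \(1/0.6\) to 1.67, so the result with unrounded weights differs slightly in the third decimal.

```python
values = [3, 6, 1, 8, 5]
probs = [0.5, 0.8, 0.6, 0.1, 0.25]

# Sampling weight = inverse of the selection probability
weights_exact = [1 / p for p in probs]
weights_rounded = [round(w, 2) for w in weights_exact]  # as shown in the table


def weighted_mean(xs, ws):
    return sum(x * w for x, w in zip(xs, ws)) / sum(ws)


simple_mean = sum(values) / len(values)
wm_rounded = weighted_mean(values, weights_rounded)
wm_exact = weighted_mean(values, weights_exact)

print(weights_rounded)       # [2.0, 1.25, 1.67, 10.0, 4.0]
print(simple_mean)           # 4.6
print(round(wm_rounded, 4))  # about 6.0872, matching the slide
print(round(wm_exact, 4))    # about 6.0881 with unrounded weights
```

The weighted mean sits well above the simple mean because observation 4 (value 8) had only a 10% chance of selection and therefore stands in for ten units.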

Exercise 3.3: Calculate population weights

Scenario: Let’s say you have a population of 1000 people, and you take a sample of 100 people out of it.

The population consists of:

•   Female: 700 people (70% of the population)
•   Male: 300 people (30% of the population)

In your sample of 100 people:

•   Female: 50 people (50% of the sample)
•   Male: 50 people (50% of the sample)

  1. Determine the population proportions:
  • Female: \(700/1000 = 0.7\)
  • Male: \(300/1000 = 0.3\)
  2. Determine the sample proportions:
  • Female: \(50/100 = 0.5\)
  • Male: \(50/100 = 0.5\)


  3. Calculate the weight for each group as the ratio of its population proportion to its sample proportion:

\(Weight_{female} = \frac{\text{population proportion of female}}{\text{sample proportion of female}} = \frac{0.7}{0.5} = 1.4\)

\(Weight_{male} = \frac{\text{population proportion of male}}{\text{sample proportion of male}} = \frac{0.3}{0.5} = 0.6\)

  4. Assign the weights to the individuals: for females, the weight is 1.4, while for males, the weight is 0.6.

Group    Expected Distribution   Actual Distribution   Weighting Factor
Female   70%                     50%                   1.4
Male     30%                     50%                   0.6

  5. Next, you can apply these weights to the sample to get the weighted counts for each group: \(50 \times 1.4 = 70\) females and \(50 \times 0.6 = 30\) males, which matches the 70/30 split in the population.
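The whole of Exercise 3.3 fits in a few lines of code. This is a sketch using the numbers from the scenario above; the dictionary and variable names are made up for illustration.

```python
population = {"Female": 700, "Male": 300}
sample = {"Female": 50, "Male": 50}

pop_total = sum(population.values())
n = sum(sample.values())

# Population weight = population proportion / sample proportion
weights = {
    g: (population[g] / pop_total) / (sample[g] / n) for g in population
}

# Weighted counts: each respondent counts as `weight` people in the sample
weighted_counts = {g: sample[g] * weights[g] for g in sample}

for g in population:
    print(f"{g}: weight {weights[g]:.1f}, weighted count {weighted_counts[g]:.1f}")
```

After weighting, the sample still sums to 100 respondents, but its gender composition now mirrors the population.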