b. Non-probability sampling, by contrast, is a method of selecting units from a population using a subjective (i.e., non-random) method. Because it does not require a complete survey frame, it is a fast, easy, and inexpensive way of obtaining data.
In simple random sampling, all members of the population have an equal chance of being selected and the selection is done randomly.
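As a minimal sketch of simple random sampling in R, assume a hypothetical population of 1,000 numbered units. The population size, sample size, and seed are illustrative and are not taken from the assignment data.

```r
# Hypothetical population: unit IDs 1..1000 (illustrative, not assignment data)
set.seed(42)                 # fixed seed so the draw is reproducible
population <- 1:1000
n <- 50                      # illustrative sample size

# Drawing without replacement gives every unit the same chance of selection (n / N)
srs <- sample(population, size = n, replace = FALSE)

length(srs)                  # number of units selected
anyDuplicated(srs)           # 0 means no unit was drawn twice
```

Because the draw is without replacement, no unit can appear in the sample more than once.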
Many populations can be divided into smaller groups (strata) based on specific characteristics that don't overlap but represent the entire population when put together. Stratified sampling draws a random sample from each of these groups.
Systematic sampling is similar to simple random sampling, though it’s usually a bit easier to conduct. Each member of the population is assigned a number, then selected at regular intervals to form a sample.
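The numbering-and-interval idea can be sketched as follows, assuming a hypothetical frame of 1,000 units and a target sample of 50. The names `N`, `n`, `k`, and `start` are illustrative choices, not part of the assignment.

```r
# Hypothetical frame of N numbered units (illustrative values)
set.seed(42)
N <- 1000
n <- 50
k <- N %/% n                       # sampling interval: every k-th unit is taken

# Pick a random starting point within the first interval, then step by k
start <- sample(1:k, 1)
sys_sample <- seq(from = start, by = k, length.out = n)

head(sys_sample)                   # first few selected unit numbers
```

Only the starting point is random here; every later selection is determined by the interval.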
Like stratified sampling, cluster sampling also involves separating the population into subgroups, or clusters. But that’s where the two probability sampling methods diverge.
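One way to see the difference is the sketch below, which assumes a hypothetical frame of 200 units in four equal groups (the frame, group labels, and sample sizes are made up for illustration). Stratified sampling draws some units from every group, while cluster sampling draws a few whole groups and keeps all of their units.

```r
set.seed(42)
# Hypothetical frame: 200 units in 4 groups of 50 (illustrative, not assignment data)
frame <- data.frame(id = 1:200, group = rep(LETTERS[1:4], each = 50))

# Stratified: draw 5 units at random from within every group
strat_ids <- unlist(lapply(split(frame$id, frame$group),
                           function(ids) sample(ids, 5)))
stratified <- frame[frame$id %in% strat_ids, ]

# Cluster: randomly choose 2 whole groups and keep all of their units
chosen <- sample(unique(frame$group), 2)
cluster <- frame[frame$group %in% chosen, ]

table(stratified$group)   # every group contributes 5 units
table(cluster$group)      # only the 2 chosen groups appear, with 50 units each
```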
Convenience sampling is a non-probability sampling technique in which units are selected from the population only because they are conveniently available to the researcher.
Quota sampling is one of the most common methods for collecting data in surveys and research studies.
Purposive sampling selects units from the population based on the judgment of the researcher.
Snowball sampling builds a sample by having existing participants recruit additional participants.
Part II
The target population is the civilian, non-institutionalized population of the United States aged 16 or older. People who live in institutions or are on active duty in the armed forces are not included in the sample. There is no upper age limit.
The U.S. Census Bureau conducts the Current Population Survey (CPS), a sample survey of about 60,000 eligible households. The CPS is a probability sample that uses cluster sampling to select geographic areas.
3) Do you think the CPS is a representative sample of the entire US population after reading about its methodology or doing your own online research?
Because it selects a multistage probability-based sample of American households, the CPS is a representative sample. The sample size is also set by specific parameters that make it a reliable source for estimating the unemployment rate at the national and state levels.
# NOTE: To load data, you must download both the extract's data and the DDI
# and also set the working directory to the folder with these files (or change the path below).
if (!require("ipumsr")) stop("Reading IPUMS data into R requires the ipumsr package. It can be installed using the following command: install.packages('ipumsr')")
## Loading required package: ipumsr
ddi <- read_ipums_ddi("cps_00007.xml")
data <- read_ipums_micro(ddi)
## Use of data from IPUMS CPS is subject to conditions including that users should cite the data appropriately. Use command `ipums_conditions()` for more details.
library(readr)
library(psych)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.3 ✔ purrr 1.0.2
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.3 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ ggplot2::%+%() masks psych::%+%()
## ✖ ggplot2::alpha() masks psych::alpha()
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
# Drop records where INCWAGE is 99999999 (assumed here to be the not-in-universe code)
data1 <- data %>% filter(INCWAGE != 99999999)
library(ggplot2)
ggplot(data = data1,
       mapping = aes(x = LABFORCE, y = INCWAGE)) +
  geom_point()
Yes, the graph makes sense: respondents who reported being in the labor force have higher wage income than respondents who did not.
describe(data1$INCWAGE)
## vars n mean sd median trimmed mad min max range skew
## X1 1 790984 33857.95 62954.22 15000 33857.95 22239 0 2099999 2099999 7.66
## kurtosis se
## X1 105.19 70.78
describe(data1$LABFORCE)
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 790984 1.62 0.5 2 1.62 0 0 2 2 -0.58 -1.38 0
data1 %>%
  group_by(LABFORCE) %>%
  summarize(mean_income = mean(INCWAGE),
            median_income = median(INCWAGE))
## # A tibble: 3 × 3
## LABFORCE mean_income median_income
## <int+lbl> <dbl> <dbl>
## 1 0 [NIU] 57966. 50002
## 2 1 [No, not in the labor force] 2279. 0
## 3 2 [Yes, in the labor force] 52796. 38000
# Set the values for N, x, and p
N <- 100 # Total number of procedures performed
x <- 10 # Number of procedures resulting in death within 30 days
p <- 0.05 # National proportion of deaths in these cases
# Binomial distribution
binom_prob <- dbinom(x, N, p) # P(X = x)
binom_cum_prob <- pbinom(x, N, p, lower.tail = FALSE) # P(X > x); the upper tail excludes x itself
# Poisson distribution (approximation to the binomial for large N and small p)
lambda <- N * p # Expected number of deaths
poisson_prob <- dpois(x, lambda) # P(X = x)
poisson_cum_prob <- ppois(x, lambda, lower.tail = FALSE) # P(X > x)
# Print the probabilities
cat("Binomial Probability:", binom_prob, "\n")
## Binomial Probability: 0.01671588
cat("Binomial Cumulative Probability:", binom_cum_prob, "\n")
## Binomial Cumulative Probability: 0.01147241
cat("Poisson Probability:", poisson_prob, "\n")
## Poisson Probability: 0.01813279
cat("Poisson Cumulative Probability:", poisson_cum_prob, "\n")
## Poisson Cumulative Probability: 0.01369527
# Perform hypothesis test
alpha <- 0.05 # Significance level
# Binomial test
binom_test <- binom.test(x, N, p, alternative = "greater")
cat("Binomial Test p-value:", binom_test$p.value, "\n")
## Binomial Test p-value: 0.02818829
cat("Binomial Test Conclusion:", ifelse(binom_test$p.value < alpha, "Reject Null Hypothesis", "Fail to Reject Null Hypothesis"), "\n")
## Binomial Test Conclusion: Reject Null Hypothesis
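The one-sided p-value above is P(X >= 10), which is the point probability P(X = 10) plus the strict upper tail P(X > 10) printed earlier (0.01671588 + 0.01147241 = 0.02818829). A short check of that identity, reusing the same N, x, and p (the variable names `point`, `strict_upper`, and `p_exact` are illustrative):

```r
N <- 100   # procedures performed
x <- 10    # deaths within 30 days
p <- 0.05  # national proportion

point <- dbinom(x, N, p)                              # P(X = 10)
strict_upper <- pbinom(x, N, p, lower.tail = FALSE)   # P(X > 10)
p_exact <- binom.test(x, N, p, alternative = "greater")$p.value  # P(X >= 10)

# The two pieces should add up to the exact one-sided p-value
all.equal(point + strict_upper, p_exact)
```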
# Poisson test (r = p sets the null rate; the default r = 1 would test against one death per procedure)
poisson_test <- poisson.test(x, T = N, r = p, alternative = "greater")
cat("Poisson Test p-value:", poisson_test$p.value, "\n")
## Poisson Test p-value: 0.03182806
cat("Poisson Test Conclusion:", ifelse(poisson_test$p.value < alpha, "Reject Null Hypothesis", "Fail to Reject Null Hypothesis"), "\n")
## Poisson Test Conclusion: Reject Null Hypothesis