Problem 1

This problem asks us to verify a mathematical statement in R by checking whether the expression equals the expected value, 29.50556.

1. Substitute \( a = 2.3 \) into the expression

The expression is:

\[ 6a + \frac{42}{34.2} - 3.62 \]

a <- 2.3
result <- 6*a + 42/34.2 - 3.62
result
## [1] 11.40807

result == 29.50556
## [1] FALSE

With \( a = 2.3 \), the left side evaluates to 11.40807, which is not equal to 29.50556, so the comparison returns FALSE.
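One caveat about the check above: `==` tests exact floating-point equality, so a constant rounded to seven significant digits can fail to match even a correctly computed value. The sketch below compares within a tolerance using `all.equal()` instead. The helper name `matches_expected` and the tolerance `1e-6` are illustrative choices. The second expression is only an assumed alternative reading of the problem statement, \( (6a + 42) / 3^{4.2 - 3.62} \), which the source does not confirm.

```r
a <- 2.3
expected <- 29.50556

# Hypothetical helper: TRUE when x equals 'expected' within a relative tolerance
matches_expected <- function(x, expected, tolerance = 1e-6) {
  isTRUE(all.equal(x, expected, tolerance = tolerance))
}

# Expression as coded in the solution above
result_as_coded <- 6 * a + 42 / 34.2 - 3.62
matches_expected(result_as_coded, expected)

# Assumed alternative reading (not confirmed by the source):
# (6a + 42) / 3^(4.2 - 3.62)
result_alt <- (6 * a + 42) / 3^(4.2 - 3.62)
matches_expected(result_alt, expected)
```

With the expression as coded, the tolerant comparison is still FALSE. Under the assumed alternative reading it is TRUE, which suggests the original statement may have contained an exponent that was lost when the expression was transcribed.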

Problem 2

a. Create a vector named numbers with 100 random values from 1 to 200.

# Create a vector named 'numbers' with 100 random values between 1 and 200

# Set a seed for reproducibility (optional)
set.seed(123)

# Generate the random numbers
numbers <- sample(1:200, size = 100, replace = TRUE)

# View the first few values
head(numbers)
## [1] 159 179  14 195 170  50
  b. Create a new vector squares that contains the squared value of each element in numbers.
# Square each element in the 'numbers' vector
squares <- numbers^2

# View the first few squared values
head(squares)
## [1] 25281 32041   196 38025 28900  2500
  c. What does numbers + squares create? What does v <- c(numbers, squares) create?

numbers + squares performs element-wise addition.

Each element in numbers is added to the corresponding element in squares.

The result is a new vector of the same length (100), where each value is numbers[i] + squares[i].

# Add the two vectors element-wise
sum_vector <- numbers + squares

# View the first few values
head(sum_vector)
## [1] 25440 32220   210 38220 29070  2550
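To make the element-wise behaviour concrete, here is a small sketch on a three-element vector rather than the 100-element one. The names `x` and `x_sq` are illustrative only. Because each element is paired with its own square, the entry at position i equals x[i] * (x[i] + 1).

```r
# Small illustrative vector (not the 'numbers' vector from the solution)
x <- c(2, 5, 10)
x_sq <- x^2

# Element-wise addition pairs x[i] with x_sq[i]
elementwise <- x + x_sq
elementwise
# values: 6 30 110

# Same length as the inputs, and each entry equals x * (x + 1)
length(elementwise) == length(x)
all(elementwise == x * (x + 1))
```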

v <- c(numbers, squares), by contrast, concatenates the two vectors.

The result is a single vector of length 200.

The first 100 elements are from numbers, and the next 100 elements are from squares.

This does not perform any arithmetic; it simply joins the two vectors end-to-end.

# Concatenate the two vectors
v <- c(numbers, squares)

# View the length and first few values
length(v)
## [1] 200
head(v)
## [1] 159 179  14 195 170  50
tail(v)
## [1] 39601  4489 22801 14884  6241  7225
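The same point can be sketched on small vectors, where the result is easy to read in full. The names `x`, `x_sq`, and `joined` are illustrative and not part of the solution.

```r
# Small illustrative vectors (not the solution's data)
x <- c(2, 5, 10)
x_sq <- x^2

# c() joins the vectors end-to-end without doing any arithmetic
joined <- c(x, x_sq)
joined
# values: 2 5 10 4 25 100

# The length is the sum of the input lengths
length(joined) == length(x) + length(x_sq)

# The two halves are the original vectors, unchanged
identical(joined[seq_along(x)], x)
identical(joined[length(x) + seq_along(x_sq)], x_sq)
```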
  d. Write a for loop that calculates the cumulative sum of both vectors combined. Store all iterations in a separate vector. What is the cumulative sum (last element)?

# d. Calculate the cumulative sum of the combined vectors

# Combine the two vectors into one
v <- c(numbers, squares)

# Initialize a vector to store the cumulative sums
cumulative_sum <- numeric(length(v))

# Calculate the cumulative sum using a for loop
for (i in seq_along(v)) {
  if (i == 1) {
    cumulative_sum[i] <- v[i]
  } else {
    cumulative_sum[i] <- cumulative_sum[i - 1] + v[i]
  }
}

# View the first few cumulative sums
head(cumulative_sum)
## [1] 159 338 352 547 717 767
# Find the final cumulative sum (last element)
final_cumulative_sum <- cumulative_sum[length(cumulative_sum)]
final_cumulative_sum
## [1] 1261570
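Base R's `cumsum()` computes the same running total, so it makes a convenient cross-check for the loop. The sketch below uses a short stand-in vector, `v_demo`, holding the first six values printed by `head(numbers)` above, because the full `sample()` output depends on the seed and the R version.

```r
# Short stand-in for the combined vector (first six values of 'numbers')
v_demo <- c(159, 179, 14, 195, 170, 50)

# Loop-based running total, following the same logic as the solution
running <- numeric(length(v_demo))
for (i in seq_along(v_demo)) {
  running[i] <- if (i == 1) v_demo[i] else running[i - 1] + v_demo[i]
}

# Built-in equivalent
builtin <- cumsum(v_demo)

# Both approaches should agree, and the last element is the overall sum
all.equal(running, builtin)
running[length(running)] == sum(v_demo)
```

The loop remains what the exercise asks for; `cumsum()` is only a way to confirm that the last element equals `sum(v)`.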

Problem 3

  a. Create a vector named scores with the following values:
scores <- c(88, 62, 90, 75, 58, 83, 92, 67, 77, 81)
scores
##  [1] 88 62 90 75 58 83 92 67 77 81

  b. Write a function score_summary() that returns the mean, median, range, min, max, and sd as a named vector. Test it with scores and numbers.

score_summary <- function(x) {
  c(
    mean = mean(x),
    median = median(x),
    range = diff(range(x)),
    min = min(x),
    max = max(x),
    sd = sd(x)
  )
}

score_summary(scores)
##     mean   median    range      min      max       sd 
## 77.30000 79.00000 34.00000 58.00000 92.00000 11.81383
score_summary(numbers)
##      mean    median     range       min       max        sd 
##  97.49000  91.00000 196.00000   4.00000 200.00000  55.17566
  c. Modify the function to include the number of students who passed (score > 75).
score_summary <- function(x, pass_mark = 75) {
  # 'stats' avoids shadowing the base function summary()
  stats <- c(
    mean = mean(x),
    median = median(x),
    range = diff(range(x)),
    min = min(x),
    max = max(x),
    sd = sd(x),
    passed = sum(x > pass_mark)  # count of scores strictly above pass_mark
  )
  return(stats)
}

score_summary(scores)
##     mean   median    range      min      max       sd   passed 
## 77.30000 79.00000 34.00000 58.00000 92.00000 11.81383  6.00000
score_summary(numbers)
##      mean    median     range       min       max        sd    passed 
##  97.49000  91.00000 196.00000   4.00000 200.00000  55.17566  63.00000
  d. Create a new vector quiz_scores with 12 random scores between 30 and 100. Apply your function and compare the results with scores.
set.seed(456)
quiz_scores <- sample(30:100, 12, replace = TRUE)
quiz_scores
##  [1] 64 67 56 54 60 72 37 43 42 98 81 59
score_summary(quiz_scores)
##     mean   median    range      min      max       sd   passed 
## 61.08333 59.50000 61.00000 37.00000 98.00000 17.25456  2.00000
score_summary(scores)
##     mean   median    range      min      max       sd   passed 
## 77.30000 79.00000 34.00000 58.00000 92.00000 11.81383  6.00000
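For the comparison itself, it can help to put both summaries side by side. The sketch below does this with `sapply()`. The names `groups`, `comparison`, and `pass_rate` are illustrative, and `quiz_scores` is copied from the printed output above rather than re-sampled.

```r
scores <- c(88, 62, 90, 75, 58, 83, 92, 67, 77, 81)
# Copied from the printed output above, not re-sampled
quiz_scores <- c(64, 67, 56, 54, 60, 72, 37, 43, 42, 98, 81, 59)

# Same summary logic as the solution's score_summary()
score_summary <- function(x, pass_mark = 75) {
  c(mean = mean(x), median = median(x), range = diff(range(x)),
    min = min(x), max = max(x), sd = sd(x), passed = sum(x > pass_mark))
}

# One column per vector, one row per statistic
groups <- list(scores = scores, quiz_scores = quiz_scores)
comparison <- sapply(groups, score_summary)
round(comparison, 2)

# Pass rate is easier to compare than a raw count when lengths differ
pass_rate <- comparison["passed", ] / lengths(groups)
round(pass_rate, 3)
```

Read this way, scores has the higher mean (77.3 against 61.08) and the smaller spread, and 6 of 10 scores pass compared with 2 of 12 quiz scores.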
  e. Modify your function to take the passing score as an argument. Test it with thresholds of 60 and 85 (score > 60 and score > 85).
score_summary(scores, pass_mark = 60)
##     mean   median    range      min      max       sd   passed 
## 77.30000 79.00000 34.00000 58.00000 92.00000 11.81383  9.00000
score_summary(scores, pass_mark = 85)
##     mean   median    range      min      max       sd   passed 
## 77.30000 79.00000 34.00000 58.00000 92.00000 11.81383  3.00000

Problem 4

Write a function fibonacci that generates the first n Fibonacci numbers.

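fibonacci <- function(n) {
  # No terms requested: return an empty vector
  if (n == 0) return(numeric(0))
  # numeric(n) is all zeros, so the first term (0) is already in place
  fibs <- numeric(n)
  if (n > 1) fibs[2] <- 1
  # Guard the loop: 3:n would count downward when n < 3
  if (n > 2) {
    for (i in 3:n) {
      fibs[i] <- fibs[i - 1] + fibs[i - 2]
    }
  }
  return(fibs)
}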

fibonacci(15)
##  [1]   0   1   1   2   3   5   8  13  21  34  55  89 144 233 377
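Small values of n are where a loop over `3:n` is easy to get wrong, since `3:2` counts downward in R. The sketch below uses a hypothetical stand-alone function, `fib_seq`, which assumes the sequence starts at 0, and checks the edge cases along with the defining recurrence.

```r
# Hypothetical stand-alone version, assuming the sequence starts at 0
fib_seq <- function(n) {
  stopifnot(is.numeric(n), length(n) == 1, n >= 0, n == floor(n))
  fibs <- numeric(n)            # all zeros, so fibs[1] is already 0
  if (n >= 2) fibs[2] <- 1
  if (n >= 3) {
    for (i in 3:n) fibs[i] <- fibs[i - 1] + fibs[i - 2]
  }
  fibs
}

fib_seq(0)   # empty numeric vector
fib_seq(1)   # 0
fib_seq(2)   # 0 1

# Check the defining recurrence on a longer run
f <- fib_seq(15)
all(f[3:15] == f[1:13] + f[2:14])
```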

Problem 5

This problem uses Table 11-20 from the 2021 Guam Statistical Yearbook, which shows the number of active Government of Guam employees by age group and gender.

a. Create vectors for Males and Females by age group
b. Give these vectors names corresponding to age groups
c. Create a total column and verify

Data Entry

# Age groups
age_groups <- c("Under 20", "20-24", "25-29", "30-34", "35-39", "40-44", "45-49", "50-54", "55-59", "60-64", "65-69", "70 and over")

# Males and Females by age group (from the Excel data).
# The data as entered has 11 values per gender. This solution assumes that
# females have no "Under 20" entry and males have no "70 and over" entry,
# and fills those two positions with 0.
males <- c(6, 75, 193, 188, 225, 270, 197, 207, 113, 65, 20, 0)
females <- c(0, 12, 144, 355, 354, 419, 506, 453, 386, 215, 104, 46)

names(males) <- age_groups
names(females) <- age_groups

# Total employees by age group
total <- males + females
total
##    Under 20       20-24       25-29       30-34       35-39       40-44 
##           6          87         337         543         579         689 
##       45-49       50-54       55-59       60-64       65-69 70 and over 
##         703         660         499         280         124          46
  c. Verify the total column
total
##    Under 20       20-24       25-29       30-34       35-39       40-44 
##           6          87         337         543         579         689 
##       45-49       50-54       55-59       60-64       65-69 70 and over 
##         703         660         499         280         124          46
sum(total) # Total employees
## [1] 4553
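One way to verify the total column is to check it against the two gender vectors directly. The sketch below repeats the data as entered in this solution, including the assumed zero-filled groups, and uses an illustrative matrix named `counts`.

```r
age_groups <- c("Under 20", "20-24", "25-29", "30-34", "35-39", "40-44",
                "45-49", "50-54", "55-59", "60-64", "65-69", "70 and over")
# Data as entered in the solution (zeros are the solution's assumption)
males <- c(6, 75, 193, 188, 225, 270, 197, 207, 113, 65, 20, 0)
females <- c(0, 12, 144, 355, 354, 419, 506, 453, 386, 215, 104, 46)
names(males) <- age_groups
names(females) <- age_groups

total <- males + females

# The grand total should equal the sum of the two gender totals
sum(total) == sum(males) + sum(females)

# Row sums of a two-column matrix should reproduce 'total'
counts <- cbind(Male = males, Female = females)
all(rowSums(counts) == total)

# Append column totals for a quick visual check
rbind(counts, Total = colSums(counts))
```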
  d. Which age group had the most employees? Are there more males or females?
max_group <- age_groups[which.max(total)]
max_group
## [1] "45-49"
if (males[which.max(total)] > females[which.max(total)]) {
  "More males"
} else {
  "More females"
}
## [1] "More females"
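The second question could also be read as asking about the whole workforce rather than only the largest age group. Under that reading, a minimal sketch compares the overall counts, using the data as entered in this solution. The name `overall` is illustrative.

```r
# Data as entered in the solution (zeros are the solution's assumption)
males <- c(6, 75, 193, 188, 225, 270, 197, 207, 113, 65, 20, 0)
females <- c(0, 12, 144, 355, 354, 419, 506, 453, 386, 215, 104, 46)

# Overall head count by gender
overall <- c(Male = sum(males), Female = sum(females))
overall
# values: Male 1559, Female 2994

# Label of the larger group
names(overall)[which.max(overall)]
```

With these figures, females outnumber males overall as well as within the 45-49 group.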
  e. Find the age group with the largest difference between males and females
diffs <- abs(males - females)
age_groups[which.max(diffs)]
## [1] "45-49"
  f. Proportion of male and female employees within each age group
prop_male <- males / total
prop_female <- females / total
prop_male
##    Under 20       20-24       25-29       30-34       35-39       40-44 
##   1.0000000   0.8620690   0.5727003   0.3462247   0.3886010   0.3918723 
##       45-49       50-54       55-59       60-64       65-69 70 and over 
##   0.2802276   0.3136364   0.2264529   0.2321429   0.1612903   0.0000000
prop_female
##    Under 20       20-24       25-29       30-34       35-39       40-44 
##   0.0000000   0.1379310   0.4272997   0.6537753   0.6113990   0.6081277 
##       45-49       50-54       55-59       60-64       65-69 70 and over 
##   0.7197724   0.6863636   0.7735471   0.7678571   0.8387097   1.0000000
  g. Are proportions roughly the same across all age groups?
# Print proportions for visual inspection
data.frame(AgeGroup = age_groups, MaleProp = prop_male, FemaleProp = prop_female)
##                AgeGroup  MaleProp FemaleProp
## Under 20       Under 20 1.0000000  0.0000000
## 20-24             20-24 0.8620690  0.1379310
## 25-29             25-29 0.5727003  0.4272997
## 30-34             30-34 0.3462247  0.6537753
## 35-39             35-39 0.3886010  0.6113990
## 40-44             40-44 0.3918723  0.6081277
## 45-49             45-49 0.2802276  0.7197724
## 50-54             50-54 0.3136364  0.6863636
## 55-59             55-59 0.2264529  0.7735471
## 60-64             60-64 0.2321429  0.7678571
## 65-69             65-69 0.1612903  0.8387097
## 70 and over 70 and over 0.0000000  1.0000000
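The table suggests the answer is no: the male share falls steadily with age. A short sketch can quantify the spread, using the data as entered in this solution. The names `inner` and `overall_male` are illustrative, and the two zero-filled groups are excluded in one of the checks because their 0 and 1 proportions follow from the solution's own assumption about missing values.

```r
age_groups <- c("Under 20", "20-24", "25-29", "30-34", "35-39", "40-44",
                "45-49", "50-54", "55-59", "60-64", "65-69", "70 and over")
# Data as entered in the solution (zeros are the solution's assumption)
males <- c(6, 75, 193, 188, 225, 270, 197, 207, 113, 65, 20, 0)
females <- c(0, 12, 144, 355, 354, 419, 506, 453, 386, 215, 104, 46)
names(males) <- age_groups
names(females) <- age_groups

prop_male <- males / (males + females)

# Spread across all groups, including the zero-filled ones
range(prop_male)

# Spread excluding the two groups that depend on the zero-fill assumption
inner <- prop_male[!(names(prop_male) %in% c("Under 20", "70 and over"))]
round(range(inner), 3)

# Deviation of each group from the overall male share
overall_male <- sum(males) / sum(males + females)
round(prop_male - overall_male, 3)
```

Even without the two zero-filled groups, the male share ranges from about 0.16 (65-69) to about 0.86 (20-24), so the proportions are not roughly constant across age groups.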
  h. Proportion of employees aged over 40 years for males, females, and total
# Find indices for age groups over 40
over_40_idx <- which(age_groups %in% c("40-44", "45-49", "50-54", "55-59", "60-64", "65-69", "70 and over"))
prop_male_over40 <- sum(males[over_40_idx]) / sum(males)
prop_female_over40 <- sum(females[over_40_idx]) / sum(females)
prop_total_over40 <- sum(total[over_40_idx]) / sum(total)
prop_male_over40
## [1] 0.5593329
prop_female_over40
## [1] 0.7110888
prop_total_over40
## [1] 0.6591259