Homework1

Problem 1

This problem asks us to evaluate an expression in R and check whether the result equals the expected value, 29.50556.

1. Substitute \( a = 2.3 \) into the expression.

The expression, as entered in the code below, is:

\[ 6a + \frac{42}{34.2} - 3.62 \]

a <- 2.3
result <- 6*a + 42/34.2 - 3.62
result

## [1] 11.40807

result == 29.50556

## [1] FALSE

With \( a = 2.3 \), the expression as coded evaluates to 11.40807, which is not equal to 29.50556, so the comparison returns FALSE.
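Two caveats about this check can be sketched in code. First, `==` compares doubles exactly, so a value that merely prints as 29.50556 can still compare unequal to the literal; `all.equal()` with a tolerance is a common alternative. Second, the grouping of the typeset expression is not fully recoverable from the text, so the grouping `(6a + 42) / 3^(4.2 - 3.62)` below is purely an assumption, shown only because it lands close to the target. The names `as_coded`, `alt_grouping`, and `target` are illustrative.

```r
a <- 2.3
target <- 29.50556

# Grouping used in the solution above
as_coded <- 6 * a + 42 / 34.2 - 3.62

# ASSUMED alternative grouping (not confirmed by the source)
alt_grouping <- (6 * a + 42) / 3^(4.2 - 3.62)

# Exact comparison: sensitive to digits beyond those printed
as_coded == target
alt_grouping == target

# Tolerance-based comparison: TRUE only if the relative difference is tiny
isTRUE(all.equal(as_coded, target, tolerance = 1e-6))
isTRUE(all.equal(alt_grouping, target, tolerance = 1e-6))
```

If the assumed grouping is the intended one, the exact comparison would still be expected to fail while the tolerance-based one succeeds, which is a reason to prefer `all.equal()` when checking a computed value against a rounded reference.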

Problem 2

a. Create a vector named numbers with 100 random values from 1 to 200.

# Create a vector named 'numbers' with 100 random values between 1 and 200

# Set a seed for reproducibility (optional)
set.seed(123)

# Generate the random numbers
numbers <- sample(1:200, size = 100, replace = TRUE)

# View the first few values
head(numbers)

## [1] 159 179  14 195 170  50

Create a new vector squares that contains the squared value of each element in numbers.

# Square each element in the 'numbers' vector
squares <- numbers^2

# View the first few squared values
head(squares)

## [1] 25281 32041   196 38025 28900  2500

What does numbers + squares create? What does v <- c(numbers, squares) create? numbers + squares performs element-wise addition.

Each element in numbers is added to the corresponding element in squares.

The result is a new vector of the same length (100), where each value is numbers[i] + squares[i].

# Add the two vectors element-wise
sum_vector <- numbers + squares

# View the first few values
head(sum_vector)

## [1] 25440 32220   210 38220 29070  2550

Explanation continued

What does v <- c(numbers, squares) create? c(numbers, squares) concatenates the two vectors into one.

The result is a single vector of length 200.

The first 100 elements are from numbers, and the next 100 elements are from squares.

This does not perform any arithmetic; it simply joins the two vectors end-to-end.

# Concatenate the two vectors
v <- c(numbers, squares)

# View the length and first few values
length(v)

## [1] 200

head(v)

## [1] 159 179  14 195 170  50

tail(v)

## [1] 39601  4489 22801 14884  6241  7225
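The difference between the two operations is easier to see on a vector short enough to read in full. The following is a minimal sketch with made-up values; `x`, `x_sq`, `added`, and `joined` are illustrative names, not part of the assignment.

```r
x    <- c(1, 2, 3)
x_sq <- x^2              # 1 4 9

added  <- x + x_sq       # element-wise sum: 2 6 12
joined <- c(x, x_sq)     # end-to-end join: 1 2 3 1 4 9

length(added)            # same length as x
length(joined)           # twice the length of x
```

Addition keeps the length of the inputs, while `c()` produces a vector whose length is the sum of the input lengths.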

Write a for loop that calculates the cumulative sum of both vectors combined. Store all iterations in a separate vector. What is the cumulative sum (last element)?

# c. Calculating the Cumulative Sum of Combined Vectors

# Combine the two vectors into one
v <- c(numbers, squares)

# Initialize a vector to store the cumulative sums
cumulative_sum <- numeric(length(v))

# Calculate the cumulative sum using a for loop
for (i in seq_along(v)) {
  if (i == 1) {
    cumulative_sum[i] <- v[i]
  } else {
    cumulative_sum[i] <- cumulative_sum[i - 1] + v[i]
  }
}

# View the first few cumulative sums
head(cumulative_sum)

## [1] 159 338 352 547 717 767

# Find the final cumulative sum (last element)
final_cumulative_sum <- cumulative_sum[length(cumulative_sum)]
final_cumulative_sum

## [1] 1261570
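As a cross-check on the loop, base R's `cumsum()` computes the same running totals in one call. The sketch below rebuilds the vectors with the same seed and compares a loop against the built-in; the names `running`, `loop_sums`, and `builtin_sums` are illustrative, and the exact values drawn depend on the R version's random number generator.

```r
set.seed(123)
numbers <- sample(1:200, size = 100, replace = TRUE)
v <- c(numbers, numbers^2)

# Loop version: carry a running total forward and record it at each step
running <- 0
loop_sums <- numeric(length(v))
for (i in seq_along(v)) {
  running <- running + v[i]
  loop_sums[i] <- running
}

# Built-in equivalent
builtin_sums <- cumsum(v)

# Both approaches should agree, and the last element should equal sum(v)
all.equal(loop_sums, builtin_sums)
tail(loop_sums, 1) == sum(v)
```

Carrying a running total also removes the need for the special case at `i == 1`.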

Problem 3

Create a vector named scores with the following values:

scores <- c(88, 62, 90, 75, 58, 83, 92, 67, 77, 81)
scores

##  [1] 88 62 90 75 58 83 92 67 77 81

Write a function score_summary() that returns mean, median, range, min, max, and sd. Name the vector elements. Test with scores and numbers.

score_summary <- function(x) {
  c(
    mean = mean(x),
    median = median(x),
    range = diff(range(x)),
    min = min(x),
    max = max(x),
    sd = sd(x)
  )
}

score_summary(scores)

##     mean   median    range      min      max       sd 
## 77.30000 79.00000 34.00000 58.00000 92.00000 11.81383

score_summary(numbers)

##      mean    median     range       min       max        sd 
##  97.49000  91.00000 196.00000   4.00000 200.00000  55.17566

Modify the function to include number of students that passed (score > 75).

score_summary <- function(x, pass_mark = 75) {
  # 'stats' avoids masking base R's summary() inside the function
  stats <- c(
    mean = mean(x),
    median = median(x),
    range = diff(range(x)),
    min = min(x),
    max = max(x),
    sd = sd(x),
    passed = sum(x > pass_mark)  # scores strictly above pass_mark
  )
  return(stats)
}

score_summary(scores)

##     mean   median    range      min      max       sd   passed 
## 77.30000 79.00000 34.00000 58.00000 92.00000 11.81383  6.00000

score_summary(numbers)

##      mean    median     range       min       max        sd    passed 
##  97.49000  91.00000 196.00000   4.00000 200.00000  55.17566  63.00000

Create a new vector quiz_scores with 12 random scores between 30-100. Use your function and compare.

set.seed(456)
quiz_scores <- sample(30:100, 12, replace = TRUE)
quiz_scores

##  [1] 64 67 56 54 60 72 37 43 42 98 81 59

score_summary(quiz_scores)

##     mean   median    range      min      max       sd   passed 
## 61.08333 59.50000 61.00000 37.00000 98.00000 17.25456  2.00000

score_summary(scores)

##     mean   median    range      min      max       sd   passed 
## 77.30000 79.00000 34.00000 58.00000 92.00000 11.81383  6.00000

Compared with scores, quiz_scores has a lower mean (61.08 vs. 77.30) and median (59.5 vs. 79), a wider spread (sd 17.25 vs. 11.81, range 61 vs. 34), and fewer passing scores (2 of 12 vs. 6 of 10).

Modify your function to set passing score as an argument. Test with >60 and >85.

score_summary(scores, pass_mark = 60)

##     mean   median    range      min      max       sd   passed 
## 77.30000 79.00000 34.00000 58.00000 92.00000 11.81383  9.00000

score_summary(scores, pass_mark = 85)

##     mean   median    range      min      max       sd   passed 
## 77.30000 79.00000 34.00000 58.00000 92.00000 11.81383  3.00000
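The passed count relies on R treating TRUE as 1 when a logical vector is summed. The sketch below spells that step out and shows how a strict versus inclusive threshold changes the count for a score sitting exactly on the pass mark. The variable names are illustrative, and the inclusive variant is an assumption for comparison only, since the assignment specifies `score > 75`.

```r
scores <- c(88, 62, 90, 75, 58, 83, 92, 67, 77, 81)

passed_flags <- scores > 75           # logical vector, one entry per score
n_passed     <- sum(passed_flags)     # TRUE counts as 1, FALSE as 0
share_passed <- mean(passed_flags)    # proportion of TRUE values

# A score of exactly 75 is excluded by > but included by >=
n_passed_inclusive <- sum(scores >= 75)

n_passed
share_passed
n_passed_inclusive
```

With these scores the strict rule counts 6 students and the inclusive rule counts 7, because one score is exactly 75.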

Problem 4

Write a function fibonacci that generates the first n Fibonacci numbers.

fibonacci <- function(n) {
  # No terms requested: return an empty vector
  if (n <= 0) return(numeric(0))
  # numeric(n) is filled with zeros, so the first term (0) is already in place
  fibs <- numeric(n)
  if (n > 1) fibs[2] <- 1
  if (n > 2) {
    for (i in 3:n) {
      fibs[i] <- fibs[i - 1] + fibs[i - 2]
    }
  }
  return(fibs)
}

fibonacci(15)

##  [1]   0   1   1   2   3   5   8  13  21  34  55  89 144 233 377
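The edge cases and the defining recurrence can be checked directly. The sketch below uses a hypothetical variant named `fib_seq` (not the function from the solution) that starts the sequence at 0, as in the output above, and validates its input. Returning an empty vector for `n = 0` and a single 0 for `n = 1` is an assumption about the desired behaviour.

```r
# Hypothetical variant; the name fib_seq is illustrative
fib_seq <- function(n) {
  stopifnot(is.numeric(n), length(n) == 1, n >= 0, n == floor(n))
  fibs <- numeric(n)            # zeros, so the first term is already set
  if (n >= 2) fibs[2] <- 1
  if (n >= 3) {
    for (i in 3:n) fibs[i] <- fibs[i - 1] + fibs[i - 2]
  }
  fibs
}

fib_seq(0)   # empty vector
fib_seq(1)   # only the first term
f <- fib_seq(15)

# Recurrence check: each term from the third on is the sum of the two before it
all(f[3:15] == f[1:13] + f[2:14])
```

Guarding `3:n` with `n >= 3` matters because `3:2` in R counts downward rather than producing an empty sequence.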

Problem 5

This problem uses Table 11-20 from the 2021 Guam Statistical Yearbook, which shows the number of active Government of Guam employees by age group and gender.

a. Create vectors for Males and Females by age group.
b. Give these vectors names corresponding to the age groups.
c. Create a total column and verify.

Data Entry

# Age groups
age_groups <- c("Under 20", "20-24", "25-29", "30-34", "35-39", "40-44", "45-49", "50-54", "55-59", "60-64", "65-69", "70 and over")

# Male and female counts by age group.
# The entered columns hold 11 values each; a 0 is assumed for the missing
# "70 and over" entry for males and the missing "Under 20" entry for females.
males <- c(6, 75, 193, 188, 225, 270, 197, 207, 113, 65, 20, 0)
females <- c(0, 12, 144, 355, 354, 419, 506, 453, 386, 215, 104, 46)

names(males) <- age_groups
names(females) <- age_groups

# Total employees by age group
total <- males + females
total

##    Under 20       20-24       25-29       30-34       35-39       40-44 
##           6          87         337         543         579         689 
##       45-49       50-54       55-59       60-64       65-69 70 and over 
##         703         660         499         280         124          46

Create a total column and verify

total

##    Under 20       20-24       25-29       30-34       35-39       40-44 
##           6          87         337         543         579         689 
##       45-49       50-54       55-59       60-64       65-69 70 and over 
##         703         660         499         280         124          46

sum(total) # Total employees

## [1] 4553
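One way to make the "total column" literal and the verification explicit is to bind the counts into a matrix and compare sums. This is a sketch only: the matrix name `employees` is illustrative, and the counts are the zero-padded values from the data entry step, including its assumed zeros.

```r
age_groups <- c("Under 20", "20-24", "25-29", "30-34", "35-39", "40-44",
                "45-49", "50-54", "55-59", "60-64", "65-69", "70 and over")
males   <- c(6, 75, 193, 188, 225, 270, 197, 207, 113, 65, 20, 0)
females <- c(0, 12, 144, 355, 354, 419, 506, 453, 386, 215, 104, 46)

# One row per age group, one column per gender
employees <- cbind(Males = males, Females = females)
rownames(employees) <- age_groups

# Append the row totals as a third column
employees <- cbind(employees, Total = rowSums(employees))

# Verification: the Total column must add up to the two column sums
sum(employees[, "Total"]) == sum(males) + sum(females)
colSums(employees)
```

The column sums also answer the overall gender question: the female column sums to 2994 and the male column to 1559.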

Which age group had the most employees? Are there more males or females?

max_group <- age_groups[which.max(total)]
max_group

## [1] "45-49"

if (males[which.max(total)] > females[which.max(total)]) {
  "More males"
} else {
  "More females"
}

## [1] "More females"

Find the age group with the largest difference between males and females

diffs <- abs(males - females)
age_groups[which.max(diffs)]

## [1] "45-49"

Proportion of male and female employees within each age group

prop_male <- males / total
prop_female <- females / total
prop_male

##    Under 20       20-24       25-29       30-34       35-39       40-44 
##   1.0000000   0.8620690   0.5727003   0.3462247   0.3886010   0.3918723 
##       45-49       50-54       55-59       60-64       65-69 70 and over 
##   0.2802276   0.3136364   0.2264529   0.2321429   0.1612903   0.0000000

prop_female

##    Under 20       20-24       25-29       30-34       35-39       40-44 
##   0.0000000   0.1379310   0.4272997   0.6537753   0.6113990   0.6081277 
##       45-49       50-54       55-59       60-64       65-69 70 and over 
##   0.7197724   0.6863636   0.7735471   0.7678571   0.8387097   1.0000000
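The same within-group proportions can be produced in one step with base R's `prop.table()`, which divides each row of a matrix by its row total when `margin = 1`. This is a sketch using the zero-padded counts from the data entry step; `counts` and `props` are illustrative names.

```r
age_groups <- c("Under 20", "20-24", "25-29", "30-34", "35-39", "40-44",
                "45-49", "50-54", "55-59", "60-64", "65-69", "70 and over")
males   <- c(6, 75, 193, 188, 225, 270, 197, 207, 113, 65, 20, 0)
females <- c(0, 12, 144, 355, 354, 419, 506, 453, 386, 215, 104, 46)

counts <- cbind(Males = males, Females = females)
rownames(counts) <- age_groups

# Row-wise proportions: each age group's row sums to 1
props <- prop.table(counts, margin = 1)
round(props, 3)

# Sanity check on the row sums
rowSums(props)
```

Keeping both genders in one matrix guarantees that the male and female shares come from the same denominator.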

Are proportions roughly the same across all age groups?

# Print proportions for visual inspection
data.frame(AgeGroup = age_groups, MaleProp = prop_male, FemaleProp = prop_female)

##                AgeGroup  MaleProp FemaleProp
## Under 20       Under 20 1.0000000  0.0000000
## 20-24             20-24 0.8620690  0.1379310
## 25-29             25-29 0.5727003  0.4272997
## 30-34             30-34 0.3462247  0.6537753
## 35-39             35-39 0.3886010  0.6113990
## 40-44             40-44 0.3918723  0.6081277
## 45-49             45-49 0.2802276  0.7197724
## 50-54             50-54 0.3136364  0.6863636
## 55-59             55-59 0.2264529  0.7735471
## 60-64             60-64 0.2321429  0.7678571
## 65-69             65-69 0.1612903  0.8387097
## 70 and over 70 and over 0.0000000  1.0000000

No. The male share trends downward from 1.00 in the youngest group to 0.00 in the oldest, and the female share rises to match. Females are the majority in every group from 30-34 upward. The 1.00 and 0.00 values at the two extremes follow from the zeros assumed during data entry.

Proportion of employees aged over 40 years for males, females, and total

# Indices of the age groups from 40-44 upward (the closest match to "over 40" that the bins allow)
over_40_idx <- which(age_groups %in% c("40-44", "45-49", "50-54", "55-59", "60-64", "65-69", "70 and over"))
prop_male_over40 <- sum(males[over_40_idx]) / sum(males)
prop_female_over40 <- sum(females[over_40_idx]) / sum(females)
prop_total_over40 <- sum(total[over_40_idx]) / sum(total)
prop_male_over40

## [1] 0.5593329

prop_female_over40

## [1] 0.7110888

prop_total_over40

## [1] 0.6591259

PadmoreHillR_hw1

2025-08-29
