Intro to Data Science - HW 1

Attribution statement:

I did this homework with help from the book, the professor, and this Internet source: https://www.youtube.com/watch?v=o_ldRKmgvHo

Define a variable:

x <- 5

Define the following vectors, which represent the population (in thousands) and number of colleges in each of the five counties in Central New York (CNY) – Cayuga, Cortland, Madison, Onondaga, and Oswego, in this order:

population <- c(80, 49, 73, 467, 122)
colleges <- c(2, 2, 3, 9, 2)
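
The two vectors carry no labels, so the county order given above is the only link between a value and its county. As a minimal sketch, assuming a helper vector called counties and the names popNamed and collegesNamed (none of which are part of the assignment), setNames() can attach the county names without changing the original vectors:

```r
# Hypothetical helper vector; the order follows the assignment text
counties <- c("Cayuga", "Cortland", "Madison", "Onondaga", "Oswego")
population <- c(80, 49, 73, 467, 122)
colleges <- c(2, 2, 3, 9, 2)

# setNames() returns a labeled copy and leaves the originals untouched
popNamed <- setNames(population, counties)
collegesNamed <- setNames(colleges, counties)

# Values can now be looked up by county name instead of by position
popNamed["Onondaga"]
collegesNamed["Madison"]
```

Working on labeled copies keeps the printed results of the later steps identical to what the assignment expects.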

Part 1: Calculating statistics using R

  1. Show the number of observations in the population vector with the length() function:
length(population)
## [1] 5
#Population length is 5
  1. Show the number of observations in the colleges vector with the length() function:
length(colleges)
## [1] 5
#Colleges length is 5
  1. Calculate the average CNY population using the mean() function:
mean(population)
## [1] 158.2
#Mean population is 158.2
  1. Calculate the average number of colleges in CNY using the mean() function:
mean(colleges)
## [1] 3.6
#Mean colleges is 3.6
  1. Calculate the total CNY population using the sum() function:
sum(population)
## [1] 791
#Sum of population is 791
  1. Calculate the total number of colleges in CNY using the sum() function:
sum(colleges)
## [1] 18
#Sum of colleges is 18
  1. Calculate the average CNY population again, this time using the results of the length() and sum() steps above:
sum(population)/length(population)
## [1] 158.2
#The average population is 158.2
  1. Calculate the average number of colleges in CNY again, this time using the results of the length() and sum() steps above:
sum(colleges)/length(colleges)
## [1] 3.6
#Average number of colleges is 3.6
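
The last two steps work because the arithmetic mean is the total divided by the number of observations. The sketch below checks that identity for both vectors. The names manualMeanPop and manualMeanCol are illustrative only, and all.equal() is used instead of == because it compares doubles with a small tolerance:

```r
population <- c(80, 49, 73, 467, 122)
colleges <- c(2, 2, 3, 9, 2)

# Mean computed by hand: total divided by number of observations
manualMeanPop <- sum(population) / length(population)
manualMeanCol <- sum(colleges) / length(colleges)

# Compare the hand-computed means with mean(), allowing for rounding error
popMatches <- isTRUE(all.equal(manualMeanPop, mean(population)))
colMatches <- isTRUE(all.equal(manualMeanCol, mean(colleges)))
popMatches
colMatches
```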

Part 2: Using the max/min and range functions in R

  1. How many colleges does the county with the most colleges have? Hint: Use the max() function:
max(colleges)
## [1] 9
#Onondaga County has the most colleges
  1. What is the population of the least populous county in CNY? Hint: Use the min() function:
min(population)
## [1] 49
#Cortland has the lowest population
  1. Display the populations of the least populous and most populous counties in the dataset together. Hint: Use the range() function:
range(population)
## [1]  49 467
#Cortland is the least populous, Onondaga is the most
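
max(), min(), and range() return only values, so the county names in the comments above come from reading the vector order by hand. A minimal sketch of recovering the names in code, assuming a helper vector called counties (not part of the assignment) listed in the same order as the data:

```r
# Hypothetical helper vector; the order follows the assignment text
counties <- c("Cayuga", "Cortland", "Madison", "Onondaga", "Oswego")
population <- c(80, 49, 73, 467, 122)
colleges <- c(2, 2, 3, 9, 2)

# which.max() and which.min() return the position of the first extreme value
mostColleges <- counties[which.max(colleges)]
leastPopulous <- counties[which.min(population)]
mostPopulous <- counties[which.max(population)]

mostColleges
leastPopulous
mostPopulous
```

Note that which.max() and which.min() report only the first position when there is a tie, which does not matter here because each extreme is unique.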

Part 3: Vector Math

  1. Create a new vector called extraPop, which is the current population of a county + 50 (each county has 50,000 more people):
extraPop <- population + 50
  1. Calculate the average of extraPop:
mean(extraPop)
## [1] 208.2
#Mean of extraPop is 208.2
  1. In a variable called bigCounties, store all the population numbers from the original population vector which are greater than 120 (using subsetting in R):
bigCounties <- population[population > 120]
  1. Report the length of bigCounties:
length(bigCounties)
## [1] 2
#Length of bigCounties is 2
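
Both tasks in this part rely on R applying an operation to every element of a vector at once. The sketch below walks through the intermediate values. The names popPlus50 and isBig are illustrative only and are not required by the assignment:

```r
population <- c(80, 49, 73, 467, 122)

# Adding a single number recycles it across every element
popPlus50 <- population + 50
popPlus50

# A comparison returns one TRUE/FALSE per element
isBig <- population > 120
isBig

# Indexing with a logical vector keeps only the TRUE positions
population[isBig]

# sum() on a logical vector counts the TRUE values
sum(isBig)
```

Subsetting by condition rather than by fixed positions keeps the result correct if the data are reordered or extended.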