###Start of Assignment###
#1. Use the summary function to gain an overview of the data set. Then display the mean and median for at least two attributes.
#set the working directory
setwd("C:/Users/magnu/Documents/R")
#import the data and look at the first 6 rows
mydata <- read.csv("bigcity.csv")
head(mydata)
#display summary of dataset
summary(mydata)
#calculate / display mean of 2 attributes
col1mean <- mean(mydata$u, na.rm=TRUE)
col2mean <- mean(mydata$x, na.rm=TRUE)
print(c(col1mean, col2mean))
#calculate / display median of 2 attributes
col1median <- median(mydata$u, na.rm=TRUE)
col2median <- median(mydata$x, na.rm=TRUE)
print(c(col1median, col2median))
#2. Create a new data frame with a subset of the columns and rows. Make sure to rename it.
new_df <- data.frame(X = mydata$u[1:10], Y = mydata$x[1:10])
#3. Create new column names for the new data frame.
names(new_df)[1] <- "2010.population"
names(new_df)[2] <- "2020.population"
#4. Use the summary function to create an overview of your new data frame. Then print the mean and median for the same two attributes. Please compare.
#display summary of NEW dataset
summary(new_df)
#calculate / display mean and median of same 2 attributes
#(backticks are needed because the column names start with a digit)
new_col1mean <- mean(new_df$`2010.population`, na.rm=TRUE)
new_col2mean <- mean(new_df$`2020.population`, na.rm=TRUE)
new_col1median <- median(new_df$`2010.population`, na.rm=TRUE)
new_col2median <- median(new_df$`2020.population`, na.rm=TRUE)
print(c(new_col1mean, new_col2mean))
print(c(new_col1median, new_col2median))
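#One possible way to handle the "Please compare" part of step 4 (a sketch only).
#compare_stats is a hypothetical helper name, not part of base R: it takes two numeric
#vectors and returns their means and medians side by side, plus the difference.
compare_stats <- function(full_col, sub_col) {
  stats <- data.frame(
    statistic = c("mean", "median"),
    full = c(mean(full_col, na.rm=TRUE), median(full_col, na.rm=TRUE)),
    sub = c(mean(sub_col, na.rm=TRUE), median(sub_col, na.rm=TRUE))
  )
  #positive difference means the subset value is larger than the full-data value
  stats$difference <- stats$sub - stats$full
  stats
}
#example call, assuming mydata and new_df exist as created in steps 1 and 2:
#compare_stats(mydata$u, new_df[[1]])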
#5. For at least 3 values in a column please rename so that every value in that column is renamed. For example, suppose I have 20 values of the letter “e” in one column. Rename those values so that all 20 would show as “excellent”.
new_df$`2010.population`[new_df$`2010.population` == 138] <- 135
new_df$`2010.population`[new_df$`2010.population` == 93] <- 95
new_df$`2010.population`[new_df$`2010.population` == 61] <- 60
#6. Display enough rows to see examples of all of steps 1-5 above.
head(new_df) #display first 6 rows to show value and header changes
#7. BONUS – place the original .csv in a github file and have R read from the link. This will be a very useful skill as you progress in your data science education and career.
#Reference: https://stackoverflow.com/questions/14441729/read-a-csv-from-github-into-r
library(RCurl)
x <- getURL("https://raw.githubusercontent.com/Magnus-PS/CUNY-Bridge/master/bigcity.csv")
y <- read.csv(text = x)
###End of Assignment###