1 Use the summary function to gain an overview of the data set. Then display the mean and median for at least two attributes.

futbolscorelink <- "https://raw.github.com/vincentarelbundock/Rdatasets/master/csv/vcd/UKSoccer.csv"
futbolscore <- read.csv(file = futbolscorelink, header = TRUE, sep = ",")
summary(futbolscore)
##        X           Home        Away        Freq     
##  Min.   : 1   Min.   :0   Min.   :0   Min.   : 0.0  
##  1st Qu.: 7   1st Qu.:1   1st Qu.:1   1st Qu.: 4.0  
##  Median :13   Median :2   Median :2   Median :10.0  
##  Mean   :13   Mean   :2   Mean   :2   Mean   :15.2  
##  3rd Qu.:19   3rd Qu.:3   3rd Qu.:3   3rd Qu.:19.0  
##  Max.   :25   Max.   :4   Max.   :4   Max.   :59.0

2 Create a new data frame with a subset of the columns and rows. Make sure to rename it.

nogoalsathome <- subset(futbolscore, Home == 0)
nogoalsathome
##     X Home Away Freq
## 1   1    0    0   27
## 6   6    0    1   29
## 11 11    0    2   10
## 16 16    0    3    8
## 21 21    0    4    2
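The call above filters rows only, so all four columns are kept. As a minimal sketch of trimming rows and columns in one step, `subset()` also accepts a `select` argument. The stand-in data frame below is rebuilt from the five rows printed above so the example runs offline; the names `rows_shown` and `away_two_plus` are hypothetical.

```r
# Stand-in for the Home == 0 rows printed above (values copied from the output).
rows_shown <- data.frame(
  X    = c(1, 6, 11, 16, 21),
  Home = rep(0, 5),
  Away = c(0, 1, 2, 3, 4),
  Freq = c(27, 29, 10, 8, 2)
)

# Keep only rows with at least two away goals, and only two of the columns.
away_two_plus <- subset(rows_shown, Away >= 2, select = c(Away, Freq))
print(away_two_plus)
```

The same pattern should apply to the full data, for example `subset(futbolscore, Home == 0, select = c(Away, Freq))`.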

3 Create new column names for the new data frame.

library(plyr)
rename(nogoalsathome, c("X" = "Games Played", "Home" = "Home Goals scored", "Away" = "Away Goals scored", "Freq" = "Bets Won"))
##    Games Played Home Goals scored Away Goals scored Bets Won
## 1             1                 0                 0       27
## 6             6                 0                 1       29
## 11           11                 0                 2       10
## 16           16                 0                 3        8
## 21           21                 0                 4        2
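Note that `rename()` returns a modified copy, so the call above prints the new headers without storing them. The sketch below shows one way to keep the new names, using base R and a lookup vector instead of plyr. The labels chosen here are assumptions (in particular, that `X` is a row index and `Freq` counts matches with that scoreline), and `renamed` and `new_names` are hypothetical names.

```r
# Stand-in for the Home == 0 rows printed above (values copied from the output).
nogoalsathome <- data.frame(
  X    = c(1, 6, 11, 16, 21),
  Home = rep(0, 5),
  Away = c(0, 1, 2, 3, 4),
  Freq = c(27, 29, 10, 8, 2)
)

# Old name -> new name. The labels are illustrative assumptions.
new_names <- c(X = "Row", Home = "Home goals", Away = "Away goals", Freq = "Matches")

# Work on a copy so the original data frame keeps its headers.
renamed <- nogoalsathome
names(renamed) <- unname(new_names[names(renamed)])
print(names(renamed))
```

With plyr loaded, the equivalent would be to assign the result of `rename()` to a variable rather than only printing it.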

4 Use the summary function to create an overview of your new data frame. Then print the mean and median for the same two attributes. Please compare.

summary(nogoalsathome)
##        X           Home        Away        Freq     
##  Min.   : 1   Min.   :0   Min.   :0   Min.   : 2.0  
##  1st Qu.: 6   1st Qu.:0   1st Qu.:1   1st Qu.: 8.0  
##  Median :11   Median :0   Median :2   Median :10.0  
##  Mean   :11   Mean   :0   Mean   :2   Mean   :15.2  
##  3rd Qu.:16   3rd Qu.:0   3rd Qu.:3   3rd Qu.:27.0  
##  Max.   :21   Max.   :0   Max.   :4   Max.   :29.0
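`summary()` reports the mean and median among other statistics; to print just those two values for chosen attributes, a small helper can be used. This is a sketch only: `mean_median` is a hypothetical helper, and the data frame is rebuilt from the five rows printed under step 2 so the example runs offline.

```r
# Stand-in for the Home == 0 rows printed under step 2.
nogoalsathome <- data.frame(
  X    = c(1, 6, 11, 16, 21),
  Home = rep(0, 5),
  Away = c(0, 1, 2, 3, 4),
  Freq = c(27, 29, 10, 8, 2)
)

# Hypothetical helper: one column per attribute, rows "mean" and "median".
mean_median <- function(df, cols) {
  sapply(df[cols], function(x) c(mean = mean(x), median = median(x)))
}

stats <- mean_median(nogoalsathome, c("Away", "Freq"))
print(stats)
```

On these rows the helper gives a mean and median of 2 for `Away`, and a mean of 15.2 and median of 10 for `Freq`, which agrees with the summary output above. Those figures also match the `Away` and `Freq` columns of the step 1 summary, while `Home` drops to 0 because only rows with `Home == 0` were kept. Calling `mean_median(futbolscore, c("Away", "Freq"))` would cover the step 1 request in the same way.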

5 For at least 3 values in a column, rename them so that every occurrence of each value in that column is renamed. For example, suppose a column contains 20 values of the letter “e”. Rename those values so that all 20 show as “excellent”.

names(nogoalsathome) <- gsub("e", "X", names(nogoalsathome))
nogoalsathome
##     X HomX Away FrXq
## 1   1    0    0   27
## 6   6    0    1   29
## 11 11    0    2   10
## 16 16    0    3    8
## 21 21    0    4    2
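The `gsub()` call above acts on `names(nogoalsathome)`, so it changes column headers rather than the values stored in a column. A minimal sketch of recoding the values themselves is shown below, using a named lookup vector. The labels and the names `scores`, `away_labels` and `AwayLabel` are hypothetical; the `Away` and `Freq` values are copied from the rows printed above.

```r
# Two columns from the Home == 0 rows printed above.
scores <- data.frame(
  Away = c(0, 1, 2, 3, 4),
  Freq = c(27, 29, 10, 8, 2)
)

# Lookup table: original value (as text) -> replacement label (illustrative).
away_labels <- c("0" = "none", "1" = "one", "2" = "two", "3" = "three", "4" = "four")

# Every occurrence of a value is replaced by its label.
scores$AwayLabel <- unname(away_labels[as.character(scores$Away)])
print(scores)
```

Writing the result to a new column keeps the numeric `Away` values available; assigning back to `scores$Away` would overwrite them instead.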

6 Display enough rows to see examples of all of steps 1-5 above.

EXAMPLES SHOWN PER STEP

7 BONUS - Place the original .csv in a GitHub file and have R read from the link. This will be a very useful skill as you progress in your data science education and career.

SEE QUESTION 1