Introduction: The article I chose for my analysis is not a conventional news article. "Club Soccer Predictions" (https://projects.fivethirtyeight.com/soccer-predictions/) is a listing of predictions and current rankings. I chose this dataset because the World Cup is around the corner and I am a huge soccer fan.
soccer_data<-read.csv('https://raw.githubusercontent.com/Sangeetha-007/R-Practice/master/607/Assignments/Assignment1/spi_global_rankings_intl.csv')
head(soccer_data)
## rank name confed off def spi
## 1 1 Brazil CONMEBOL 3.21 0.23 94.26
## 2 2 Spain UEFA 2.90 0.37 90.58
## 3 3 Germany UEFA 3.32 0.61 90.05
## 4 4 Portugal UEFA 2.89 0.44 89.34
## 5 5 France UEFA 2.81 0.42 89.03
## 6 6 Argentina CONMEBOL 2.68 0.39 88.43
Summary of the International Soccer Power Index data
summary(soccer_data)
## rank name confed off
## Min. : 1.00 Length:220 Length:220 Min. :0.200
## 1st Qu.: 55.75 Class :character Class :character 1st Qu.:0.690
## Median :110.50 Mode :character Mode :character Median :1.070
## Mean :110.50 Mean :1.163
## 3rd Qu.:165.25 3rd Qu.:1.560
## Max. :220.00 Max. :3.320
## def spi
## Min. :0.23 Min. : 0.26
## 1st Qu.:0.91 1st Qu.:17.74
## Median :1.34 Median :37.83
## Mean :1.71 Mean :39.38
## 3rd Qu.:2.17 3rd Qu.:58.85
## Max. :6.27 Max. :94.26
Created a subset with the top 10 international teams
soccer_subset<-head(soccer_data, 10)
soccer_subset
## rank name confed off def spi
## 1 1 Brazil CONMEBOL 3.21 0.23 94.26
## 2 2 Spain UEFA 2.90 0.37 90.58
## 3 3 Germany UEFA 3.32 0.61 90.05
## 4 4 Portugal UEFA 2.89 0.44 89.34
## 5 5 France UEFA 2.81 0.42 89.03
## 6 6 Argentina CONMEBOL 2.68 0.39 88.43
## 7 7 Netherlands UEFA 2.94 0.62 86.89
## 8 8 England UEFA 2.44 0.52 83.67
## 9 9 Belgium UEFA 2.76 0.71 83.53
## 10 10 Uruguay CONMEBOL 2.32 0.50 82.64
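Note that head() returns the first ten rows, which are the top ten teams only because the file is already sorted by rank. A minimal sketch that makes the ordering explicit, using three rows copied from the output above in shuffled order; the helper name top_n_by_rank is hypothetical and not part of the original analysis:

```r
# Three rows copied from the printed output, deliberately out of order
teams <- data.frame(
  rank = c(3, 1, 2),
  name = c("Germany", "Brazil", "Spain"),
  spi  = c(90.05, 94.26, 90.58),
  stringsAsFactors = FALSE
)

# Hypothetical helper: sort by rank first so head() returns the
# best-ranked n teams regardless of the row order in the file
top_n_by_rank <- function(df, n) {
  head(df[order(df$rank), ], n)
}

top2 <- top_n_by_rank(teams, 2)
print(top2)
```

With the real data, the equivalent call would be top_n_by_rank(soccer_data, 10).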
Renamed the columns of the subset
colnames(soccer_subset) <- c("Ranking", "Country", "Association", "Offense", "Defense", "Soccer Power Index")
head(soccer_subset)
## Ranking Country Association Offense Defense Soccer Power Index
## 1 1 Brazil CONMEBOL 3.21 0.23 94.26
## 2 2 Spain UEFA 2.90 0.37 90.58
## 3 3 Germany UEFA 3.32 0.61 90.05
## 4 4 Portugal UEFA 2.89 0.44 89.34
## 5 5 France UEFA 2.81 0.42 89.03
## 6 6 Argentina CONMEBOL 2.68 0.39 88.43
Looked at the summary of the new subset
summary(soccer_subset)
## Ranking Country Association Offense
## Min. : 1.00 Length:10 Length:10 Min. :2.320
## 1st Qu.: 3.25 Class :character Class :character 1st Qu.:2.700
## Median : 5.50 Mode :character Mode :character Median :2.850
## Mean : 5.50 Mean :2.827
## 3rd Qu.: 7.75 3rd Qu.:2.930
## Max. :10.00 Max. :3.320
## Defense Soccer Power Index
## Min. :0.2300 Min. :82.64
## 1st Qu.:0.3975 1st Qu.:84.47
## Median :0.4700 Median :88.73
## Mean :0.4810 Mean :87.84
## 3rd Qu.:0.5875 3rd Qu.:89.87
## Max. :0.7100 Max. :94.26
Created a new csv of the subset so it can be uploaded to github
write.csv(soccer_subset, file="SoccerSubset.csv", row.names=FALSE)
getwd()
## [1] "/Users/Sangeetha"
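Because one of the renamed columns contains spaces, reading the file back with the read.csv() defaults would turn "Soccer Power Index" into "Soccer.Power.Index". A minimal sketch of a round trip that keeps the names, assuming a temporary file instead of the working directory and only two rows from the subset:

```r
# Two rows from the subset, with a column name that contains spaces
subset_demo <- data.frame(
  Country = c("Brazil", "Spain"),
  spi     = c(94.26, 90.58),
  stringsAsFactors = FALSE
)
colnames(subset_demo) <- c("Country", "Soccer Power Index")

# Write to a temporary file so the working directory is left untouched
tmp <- tempfile(fileext = ".csv")
write.csv(subset_demo, file = tmp, row.names = FALSE)

# Default check.names = TRUE rewrites names into syntactic R names
mangled <- read.csv(tmp)
# check.names = FALSE keeps the header exactly as written
preserved <- read.csv(tmp, check.names = FALSE)

print(names(mangled))
print(names(preserved))
```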
Bar graph of Defense per Country
library(ggplot2)  # provides ggplot(), geom_bar(), labs() and ggtitle()
ggplot(soccer_subset, aes(x=Country, y=Defense)) + geom_bar(stat="identity") +
  labs(x="Country", y="Defense") + ggtitle("International Soccer Team Defense Values")
Bar graph of Offense per Country
ggplot(soccer_subset, aes(x=Country, y=Offense)) + geom_bar(stat="identity") +
  labs(x="Country", y="Offense") + ggtitle("International Soccer Team Offense Values")
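By default ggplot2 orders a character x-axis alphabetically, which makes it hard to see at a glance which bar is tallest. One possible variation, sketched here with the top 10 Defense values copied from the subset printed earlier, sorts the countries by value with reorder() before plotting. The data frame name top10 is an assumption for illustration, and the plot is only drawn if ggplot2 is installed:

```r
# Top 10 values copied from the subset printed earlier
top10 <- data.frame(
  Country = c("Brazil", "Spain", "Germany", "Portugal", "France",
              "Argentina", "Netherlands", "England", "Belgium", "Uruguay"),
  Defense = c(0.23, 0.37, 0.61, 0.44, 0.42, 0.39, 0.62, 0.52, 0.71, 0.50),
  stringsAsFactors = FALSE
)

# reorder() (stats package) turns Country into a factor whose levels
# are sorted by increasing Defense, which controls the bar order
top10$Country <- reorder(top10$Country, top10$Defense)
print(levels(top10$Country))

# geom_col() is the shorthand for geom_bar(stat = "identity")
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(top10, aes(x = Country, y = Defense)) +
    geom_col() +
    labs(x = "Country", y = "Defense") +
    ggtitle("International Soccer Team Defense Values")
  print(p)
}
```

The same pattern applies to the Offense chart by swapping the column used in reorder() and aes().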
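Among the top 10, Belgium has the highest defense value (0.71) and Germany has the highest offense value (3.32). Note that in FiveThirtyEight's SPI the defensive rating estimates the goals a team is expected to concede against an average team, so a lower value indicates a stronger defense; read that way, Brazil (0.23) has the best defense of the group and Belgium the weakest. The result is interesting to me because I did not think of Belgium as a strong soccer team at all, and I would have assumed Argentina or Brazil would have the highest values.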
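The readings from the charts can be double-checked numerically. A minimal sketch, assuming the top 10 values copied from the subset printed earlier; the data frame name top10 is illustrative. Since FiveThirtyEight describes the defensive rating as goals expected to be conceded, the minimum Defense value is shown alongside the maximum:

```r
# Top 10 values copied from the subset printed earlier
top10 <- data.frame(
  Country = c("Brazil", "Spain", "Germany", "Portugal", "France",
              "Argentina", "Netherlands", "England", "Belgium", "Uruguay"),
  Offense = c(3.21, 2.90, 3.32, 2.89, 2.81, 2.68, 2.94, 2.44, 2.76, 2.32),
  Defense = c(0.23, 0.37, 0.61, 0.44, 0.42, 0.39, 0.62, 0.52, 0.71, 0.50),
  stringsAsFactors = FALSE
)

# which.max() / which.min() return the row index of the extreme value
highest_offense <- top10$Country[which.max(top10$Offense)]
highest_defense <- top10$Country[which.max(top10$Defense)]
lowest_defense  <- top10$Country[which.min(top10$Defense)]

print(c(highest_offense = highest_offense,
        highest_defense = highest_defense,
        lowest_defense  = lowest_defense))
```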
Conclusion/Findings and Recommendations:
From my analysis, it was interesting to see that Belgium has the highest defense value and Germany the highest offense value. I cannot take this analysis further with the data in the zip file alone, but supporting data from elsewhere could help. For example, data on the number of fans or the population behind each national team would let me check whether either correlates with the offense and defense ratings.