Part A:
Importing Libraries
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.1 ✔ tibble 3.2.1
✔ lubridate 1.9.3 ✔ tidyr 1.3.1
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Warning: package 'odbc' was built under R version 4.4.3
library(DBI)
library(RSQLite)
Warning: package 'RSQLite' was built under R version 4.4.3
Loading data
setwd("C:/Users/lydia/OneDrive/Desktop/Data205")
host_city_locations <- read.csv("host_city_locations.csv")
athlete_events <- read.csv("athlete_events.csv")
Connecting to a SQLite database and creating two tables
olympics.db <- dbConnect(RSQLite::SQLite(),"olympics.db")
dbWriteTable(olympics.db, "host_city_locations", host_city_locations, overwrite = TRUE)
dbWriteTable(olympics.db, "athlete_events",athlete_events, overwrite = TRUE)
Part B:
How many records are in athlete_events?
SELECT COUNT(*) AS "Records"
FROM athlete_events;
How many different athletes competed in Olympics held in London?
SELECT COUNT(DISTINCT Name) AS "Number of Athletes"
FROM athlete_events
WHERE City = "London";
Which Olympics host city is furthest north?
Checking lat data type
class(host_city_locations$lat)
SELECT City AS "City furthest North"
FROM host_city_locations
ORDER BY lat DESC
LIMIT 1;
What is the list of host cities (with no duplicates)?
SELECT DISTINCT City AS "Host Cities"
FROM host_city_locations;
Displaying records 1 - 10
| Athina |
| Paris |
| St. Louis |
| London |
| Stockholm |
| Antwerpen |
| Chamonix |
| Amsterdam |
| Sankt Moritz |
| Los Angeles |
Saving the output of Question 4
host_cities <- dbGetQuery(olympics.db, '
SELECT DISTINCT City
FROM host_city_locations;
')
Converting output of Question 4 query as a CSV file
write.csv(host_cities, file = "host_cities_LydiaBaick.csv", row.names = FALSE)
getwd()
[1] "C:/Users/lydia/OneDrive/Desktop/Data205"
Disconnecting
dbDisconnect(olympics.db)