title: DATA 606 Data Project Proposal
author: —
library(tidyverse)
## -- Attaching core tidyverse packages ------------------------ tidyverse 2.0.0 --
## v dplyr 1.1.0 v readr 2.1.5
## v forcats 1.0.0 v stringr 1.5.0
## v ggplot2 3.4.4 v tibble 3.2.1
## v lubridate 1.9.3 v tidyr 1.3.1
## v purrr 1.0.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
## i Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
# load data
Transit_data <- read.csv("C:\\Users\\Oluwaseyi\\Documents\\Data 606\\Transportation Data\\Monthly_Transportation_Statistics.csv")
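The dotted column names used later in this proposal (for example `Air.Safety...General.Aviation.Fatalities`) come from `read.csv()`, which by default rewrites headers into syntactic R names. The sketch below illustrates that behavior on a small temporary file. The file contents are made up for illustration and are not taken from the Kaggle dataset.

```r
# Write a tiny CSV whose headers contain spaces and hyphens.
# NOTE: these values are invented for illustration only.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Date,Highway Fatalities,Air Safety - General Aviation Fatalities",
  "01/01/2020,100,5"
), csv_path)

# Default behavior: check.names = TRUE turns invalid characters into dots.
default_names <- names(read.csv(csv_path))

# check.names = FALSE keeps the headers exactly as written in the file.
raw_names <- names(read.csv(csv_path, check.names = FALSE))

print(default_names)
print(raw_names)
```

Keeping the default names is convenient for `$` access, as the summary code in this proposal does. The raw names would need backticks but are easier to read in plot labels.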
You should phrase your research question in a way that matches up with the scope of inference your dataset allows for.
Are fatalities in different modes of transportation (air, rail, and road) associated with government spending on, public usage of, and other variables related to the corresponding infrastructure?
What are the cases, and how many are there? The dataset has 924 observations.
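The number of usable cases may be smaller than the number of rows if some rows lack values for the variables of interest. The following is a minimal sketch of how that could be checked. `count_cases` is a hypothetical helper and the toy data frame holds made-up values, not figures from the dataset.

```r
# Hypothetical helper: total rows plus non-missing counts for chosen columns.
count_cases <- function(df, cols) {
  stopifnot(is.data.frame(df), all(cols %in% names(df)))
  list(
    total_cases = nrow(df),
    non_missing = colSums(!is.na(df[, cols, drop = FALSE]))
  )
}

# Toy stand-in for Transit_data (values invented for illustration).
toy <- data.frame(
  Highway.Fatalities = c(NA, 100, 120, 90),
  Rail.Fatalities = c(NA, NA, 7, 5)
)

case_counts <- count_cases(toy, c("Highway.Fatalities", "Rail.Fatalities"))
print(case_counts)
```

With the real data, the same call would be `count_cases(Transit_data, c("Highway.Fatalities", "Rail.Fatalities"))`.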
The dataset was downloaded from Kaggle.
This is an observational study: the data were collected without any intervention in how the values arose.
What is the response variable? Is it quantitative or qualitative? The response variable is fatalities by mode of transport, which is quantitative.
The explanatory variables are spending on each mode of transport and usage of each mode of transport.
Provide summary statistics for each of the variables. Also include appropriate visualizations related to your research question (e.g. scatter plots, boxplots, etc.). This step requires the use of R, hence a code chunk is provided below. Insert more code chunks as needed.
Highway_Fatality <- summary(Transit_data$Highway.Fatalities)
Rail_Fatality <- summary(Transit_data$Rail.Fatalities)
Air_Fatality <- summary(Transit_data$Air.Safety...General.Aviation.Fatalities)
Air_Traffic <- summary(Transit_data$U.S..Airline.Traffic...Total...Seasonally.Adjusted)
Rail_Traffic <- summary(Transit_data$Passenger.Rail.Passengers)
Vehicle_Miles <- summary(Transit_data$Highway.Vehicle.Miles.Traveled...All.Systems)
Transit_spending <- summary(Transit_data$State.and.Local.Government.Construction.Spending...Mass.Transit)
Highway_Spending <- summary(Transit_data$State.and.Local.Government.Construction.Spending...Highway.and.Street)
Air_spending <- summary(Transit_data$State.and.Local.Government.Construction.Spending...Air)
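The summaries above are stored but never displayed, and no visualization is produced yet. The sketch below shows one possible way to collect the statistics into a single table and draw a scatter plot of fatalities against usage. `summarise_columns` is a hypothetical helper, the toy values are invented for illustration, and the choice of a scatter plot is an assumption about what suits the research question.

```r
library(ggplot2)

# Hypothetical helper: one row of summary statistics per selected column.
summarise_columns <- function(df, cols) {
  stopifnot(all(cols %in% names(df)))
  rows <- lapply(cols, function(col) {
    x <- df[[col]]
    data.frame(
      variable = col,
      n_missing = sum(is.na(x)),
      min = min(x, na.rm = TRUE),
      median = median(x, na.rm = TRUE),
      mean = mean(x, na.rm = TRUE),
      max = max(x, na.rm = TRUE)
    )
  })
  do.call(rbind, rows)
}

# Toy stand-in for Transit_data (values invented for illustration).
toy <- data.frame(
  Highway.Fatalities = c(100, 120, NA, 90),
  Highway.Vehicle.Miles.Traveled...All.Systems = c(2.0e11, 2.4e11, 2.2e11, 1.9e11)
)

stats_table <- summarise_columns(toy, names(toy))
print(stats_table)

# Scatter plot of the response against one usage variable.
fatality_plot <- ggplot(
  toy,
  aes(x = Highway.Vehicle.Miles.Traveled...All.Systems, y = Highway.Fatalities)
) +
  geom_point(na.rm = TRUE) +
  labs(x = "Highway vehicle miles traveled", y = "Highway fatalities")
```

Replacing `toy` with `Transit_data` and passing the column names used above would give the same table for the real data. Printing `fatality_plot` renders the figure.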