library(tidyverse)  # dplyr for mutate() and count(), forcats for the fct_*() helpers

# Merge the mixed-case spellings of each category into a single level.
# Any level not listed falls into "N/A" via other_level.
carcrash <- CarCrashBase |>
  mutate(
    injury = fct_collapse(`Injury Severity`,
      no_apparent = c("No Apparent Injury", "NO APPARENT INJURY"),
      possible = c("Possible Injury", "POSSIBLE INJURY"),
      minor = c("Suspected Minor Injury", "SUSPECTED MINOR INJURY"),
      serious = c("Suspected Serious Injury", "SUSPECTED SERIOUS INJURY"),
      fatal = c("Fatal Injury", "FATAL INJURY"),
      other_level = "N/A"),
    # Order injury levels from least to most severe
    injury = fct_relevel(injury, "no_apparent", "possible", "minor", "serious", "fatal"),
    damage = fct_collapse(`Vehicle Damage Extent`,
      no_damage = c("No Damage", "NO DAMAGE"),
      disabling = c("Disabling", "DISABLING"),
      not_at_scene = c("Vehicle Not at Scene"),
      functional = c("Functional", "FUNCTIONAL"),
      superficial = c("Superficial", "SUPERFICIAL"),
      destroyed = c("DESTROYED"),
      other_level = "N/A"),
    # Order damage levels from least to most severe, with not_at_scene last
    damage = fct_relevel(damage, "no_damage", "superficial", "functional", "disabling", "destroyed", "not_at_scene")
  )
carcrash |>
  count(damage)
## # A tibble: 7 × 2
##   damage           n
##   <fct>        <int>
## 1 no_damage     6591
## 2 superficial  52334
## 3 functional   52178
## 4 disabling    75623
## 5 destroyed     7610
## 6 not_at_scene  2214
## 7 N/A           7038
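The recoding above lists each upper- and lower-case spelling by hand. As a hedged alternative (not the method used in this report), the sketch below lower-cases the labels first so that each category has only one spelling to collapse. The vector `severity_raw` is a made-up toy example, not the project's data.

```r
library(forcats)
library(stringr)

# Toy labels (hypothetical) that mimic the mixed-case spellings in the raw data
severity_raw <- c("No Apparent Injury", "NO APPARENT INJURY",
                  "Possible Injury", "FATAL INJURY", "N/A")

# Lower-case everything so each category has exactly one spelling
severity_clean <- str_to_lower(severity_raw)

# Collapse the normalized labels; anything not named goes to "N/A"
injury <- fct_collapse(factor(severity_clean),
  no_apparent = "no apparent injury",
  possible    = "possible injury",
  fatal       = "fatal injury",
  other_level = "N/A"
)

table(injury)
```

The trade-off is that normalizing case hides which spelling a record originally used, so this only makes sense if the two spellings really mean the same thing.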
One article we chose to analyze is "How Every NFL Team's Fans Lean Politically" from FiveThirtyEight, which discusses how NFL fans lean politically according to the teams they root for. The article first touches on the political demonstrations that have sparked controversy in the history of the NFL. Since the NFL fanbase is very bipartisan, especially compared to other major sports fanbases, the league wants to handle such situations carefully so as not to anger its fans. To gather data, the authors asked 2,290 NFL fans for their top three favorite teams and their political identification. By analyzing that data, they found that the most popular teams had the most polarizing responses, and that fans who live outside their team's home area can skew a team's results with their differing political views. The visualization that I found the most compelling was the one that compared the NFL's political market to those of the other major sports leagues, because it showed the NFL to be much less polarized than every other league by a large margin. I found the text very compelling as well because it presented its findings clearly and carefully discussed potential issues with them and why certain results might have come out the way they did. The visualization mentioned above made good use of its title and both axes: the title "The NFL has appeal everywhere" made it clear what we were supposed to be comparing, and the axes clearly described which variables were being measured. The authors also added subtext describing where the data came from. However, they could have added a key explaining what each of the colored lines represented, because that was not as clear.
Two ideas we got from this article that we could translate to our project are giving a very clear overview of what we are measuring and why we chose those variables, and using many different types of graphs.
Another article we chose to analyze is "Colorism in High Fashion," which examines how representation and diversity have changed over time on the covers of Vogue magazine. The authors analyzed over 1,000 covers from 26 international editions, using facial recognition and data analysis to determine the race, gender, and nationality of the models. They found that diversity has improved since 2010, but that Vogue continues to be dominated by white and Western faces, especially in the most influential editions. The visualization that I found the most compelling was the interactive world map showing where Vogue cover models came from, which made the differences in nationality very visible. The text was also very compelling because it clearly explained that, even with more diversity and African American representation in the magazine, there was still a substantial gap between the number of African American models and the number of white models. In fact, there was not much diversity even among African American models, as one model was featured four separate times. Another visualization that stood out was the timeline graph showing changes in racial diversity over time. The title and color scheme made the trends easy to follow, although I felt that more detailed results placed closer to the graph itself would have made it clearer. One idea this article gives us for our own project is to connect the data to a broader cultural meaning instead of just showing numbers.
Context and Background: We want to analyze car crash data to determine which conditions lead to collisions and which lead to more severe crashes. This data set is important because it allows drivers to recognize factors that may create unsafe driving conditions and shows how vigilant they should be in watching for them. This kind of work has been done before, but it remains important because it leads to changes in how people drive, how cars are made, how driving laws are written, and so on. Overall, we are interested in finding the most dangerous conditions that people should take into consideration when driving.
Description of Data: This data comes from Montgomery County, Maryland and details every car collision from November 10th, 2020 to the present. Each collision is recorded by the local police department, and the data is updated weekly. Each record includes the conditions surrounding the collision and details of the collision itself. Conditions such as the weather, light level, speed limit, and vehicle type are recorded, and we would use these as explanatory variables. For response variables, we would examine collision details such as injury, vehicle damage, and collision type.
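As a minimal sketch of how an explanatory variable could be compared against a response variable, the example below computes the share of each injury level within each weather condition. The tibble `crashes`, its column names, and its values are all made up for illustration; they are not the Montgomery County records or the actual column names in the data set.

```r
library(dplyr)

# Toy data (hypothetical values, for illustration only)
crashes <- tibble(
  weather = c("clear", "clear", "clear", "rain", "rain", "snow"),
  injury  = c("no_apparent", "no_apparent", "minor",
              "minor", "serious", "no_apparent")
)

# Count crashes per weather/injury pair, then convert the counts
# to proportions within each weather condition
injury_by_weather <- crashes |>
  count(weather, injury) |>
  group_by(weather) |>
  mutate(prop = n / sum(n)) |>
  ungroup()

injury_by_weather
```

Comparing proportions rather than raw counts matters here because some conditions, such as clear weather, are likely to account for far more crashes than others.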