Choose one of David Robinson’s tidytuesday screencasts, watch the video, and summarise. https://www.youtube.com/channel/UCeiiqmVK07qhY-wvg3IZiZQ
You must follow the instructions below to get credits for this assignment.
The screencast is titled "Tidy Tuesday Screencast: analyzing car fuel efficiency in R".
It was published on October 15, 2019.
Hint: What’s the source of the data; what does the row represent; how many observations?; what are the variables; and what do they mean? There are 41,804 observations, and the variables include cylinders, displacement, drive (AWD, RWD, FWD), engine ID, FE score, make, model, city miles per gallon, etc. All of these variables contribute to how fuel efficient a car ends up being.
Hint: For example, importing data, understanding the data, data exploration, etc. Dave first imported the data and took a look at it to see what he could learn about the cars just by browsing. He then found the key that explains each variable and went over it so he could understand the data better. Next he looked through certain variables and made his own tables so he could see the data he wanted to see. He then asked the question: how do highway and city efficiency relate? He made a scatter plot to compare the two. He then colored the cars that do not use electricity red and the cars that use electricity blue. We then see that the cars that use electricity are far more fuel efficient than the cars that do not. He then separated them into their own scatter plots. After that he put the data into box plots based on car size and efficiency, and then did the same based on drive (AWD, FWD, etc.). He made another box plot based on the number of cylinders a car has and saw that the more cylinders, the lower the efficiency. Then he did the same for engine displacement and saw that the more displacement, the worse the efficiency. Using these tools, he was able to understand the data and start predicting some of the outcomes he wanted to know.
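The box plot step described above can be sketched roughly as follows. This is only a minimal illustration, not Dave's actual code: the data frame `toy_cars`, its column names, and all of its values are made up for the example and do not come from the screencast's dataset.

```r
library(ggplot2)

# Toy data, made up for illustration only (NOT values from the screencast).
toy_cars <- data.frame(
  cylinders = c(4, 4, 4, 6, 6, 6, 8, 8, 8),
  city_mpg  = c(28, 30, 26, 21, 19, 22, 15, 14, 16)
)

# Median city MPG per cylinder count, to check the trend numerically.
median_by_cyl <- aggregate(city_mpg ~ cylinders, data = toy_cars, FUN = median)

# One box per cylinder count; factor() makes ggplot treat cylinders as groups
# instead of as a continuous number.
box <- ggplot(toy_cars, aes(x = factor(cylinders), y = city_mpg)) +
  geom_boxplot() +
  labs(x = "Cylinders", y = "City MPG")

# Run print(box) (or just box at the console) to draw the plot.
```

In this toy data the boxes sit lower as the cylinder count goes up, which is the same pattern the screencast shows for the real cars.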
Yes, Dave used ggplot to make a scatter plot just like we were taught in class. He used the scatter plot to compare highway fuel efficiency to city fuel efficiency. Using the graph he was able to determine that there was a correlation between the two, just as we do in class when we look at scatter plots. He then mapped a variable to color, just as we do in class, to make the data set easier to read. This lets him and us better understand what we are looking at.
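A colored scatter plot like the one described here could look something like the sketch below. It is a hedged example under stated assumptions: the data frame `cars_toy`, the column names `city_mpg`, `highway_mpg`, and `uses_electricity`, and every value in it are hypothetical and were invented for illustration, so this is not the code or data from the screencast.

```r
library(ggplot2)

# Toy data, made up for illustration only (NOT values from the screencast).
cars_toy <- data.frame(
  city_mpg         = c(18, 22, 25, 30, 95, 110, 120),
  highway_mpg      = c(26, 30, 34, 38, 90, 100, 105),
  uses_electricity = c(FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE)
)

# Correlation between city and highway efficiency in the toy data.
city_highway_cor <- cor(cars_toy$city_mpg, cars_toy$highway_mpg)

# Scatter plot of highway vs. city MPG, colored by whether the car uses
# electricity: red for cars that do not, blue for cars that do.
p <- ggplot(cars_toy, aes(x = highway_mpg, y = city_mpg, color = uses_electricity)) +
  geom_point() +
  scale_color_manual(values = c("FALSE" = "red", "TRUE" = "blue")) +
  labs(x = "Highway MPG", y = "City MPG", color = "Uses electricity")

# Run print(p) (or just p at the console) to draw the plot.
```

Mapping the variable to `color` inside `aes()` is what lets ggplot pick a color per group and build the legend automatically.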
The major finding from the analysis is that the more displacement and cylinders a car has, the less fuel efficient it is. We find this out through many different scatter plots and box plots. These all support the conclusion that the bigger a car's engine, the worse gas mileage it gets. At the end we also look at how cars have become more fuel efficient as they get more modern; however, the same rule still applies: the bigger the engine, the worse the mileage. We also see that midsize and large cars have had big improvements in gas mileage since 2010, which comes from the companies making the cars' engines smaller, with fewer cylinders.
The most interesting part for me was when he used the scatter plot and colored it red (cars that do not use electricity) and blue (cars that use electricity). This is because I have always known that cars that use electricity are more fuel efficient, but I never really knew how much more efficient they were. The scatter plot showed me how big a difference they make, and I was amazed to see it. This gave me a much better understanding of cars that use electricity and makes me appreciate them more.