
LATE POLICY

⛔ Late assignments will not be accepted for credit!

⚠ You can still receive the worked answers to use for study.

✔ Partial credit will be given for attempts at working through problems,
so it’s best to always submit HW on time, even if it’s wrong or incomplete!

CODE SHOWCASE

Algebra

PROBLEM
\(103^3\)

CODE

103^3

## [1] 1092727

Trigonometry

PROBLEM
Cosine of my age

CODE

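# Age in days; units are fixed so the conversion to years below is valid
myage <- difftime(Sys.Date(), "1995-09-17", units = "days")
# Convert to years (365.25 days per year on average)
myage <- as.numeric(myage/365.25)
# cos() interprets its argument in radians
cos(myage)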

## [1] -0.2766089

Word Length

PROBLEM
Count the number of letters in the longest town name in Wales:
Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch

CODE

nchar("Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch")

## [1] 58

CAR ANALYSIS

EXAMINING THE DATASET

Data summary
Name	mpg
Number of rows	234
Number of columns	11
_______________________
Column type frequency:
character	6
numeric	5
________________________
Group variables	None

Variable type: character

skim_variable	complete_rate	min	max	n_unique
manufacturer	1	4	10	15
model	1	2	22	38
trans	1	8	10	10
drv	1	1	1	3
fl	1	1	1	5
class	1	3	10	7

Variable type: numeric

skim_variable	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
displ	1	3.47	1.29	1.6	2.4	3.3	4.6	7	▇▆▆▃▁
year	1	2003.50	4.51	1999.0	1999.0	2003.5	2008.0	2008	▇▁▁▁▇
cyl	1	5.89	1.61	4.0	4.0	6.0	8.0	8	▇▁▇▁▇
cty	1	16.86	4.26	9.0	14.0	17.0	19.0	35	▆▇▃▁▁
hwy	1	23.44	5.95	12.0	18.0	24.0	27.0	44	▅▅▇▁▁
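
The headline figures in this summary can be cross-checked with base R. This is a minimal sketch, assuming `mpg` is the data frame shipped with the ggplot2 package. The `skim_variable` column suggests the table itself came from skimr, which is not reproduced here.

```r
library(ggplot2)  # assumed source of the mpg data frame

# Number of rows and columns
dims <- dim(mpg)

# Count columns by broad type, mirroring "Column type frequency"
n_character <- sum(sapply(mpg, is.character))
n_numeric   <- sum(sapply(mpg, is.numeric))  # integer columns count as numeric

dims
n_character
n_numeric
```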

VISUALIZING THE DATA

From the summary statistics above, we know there are 234 records in our dataset, but the scatter plot appears to show far fewer. This is because many points overlap.

There are many ways to adjust the visualization. Simple approaches include making the points partially transparent (alpha), slightly offsetting them (jitter), changing their shapes, or any combination of these.
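
The transparency and jitter adjustments might be sketched as follows. The choice of `displ` and `hwy` as axes is an assumption, since the text does not name the plotted variables, and the `alpha`, `width` and `height` values are illustrative rather than the ones used in the original plots.

```r
library(ggplot2)

# Assumption: engine displacement (displ) against highway mpg (hwy)
base_plot <- ggplot(mpg, aes(x = displ, y = hwy))

p_default <- base_plot + geom_point()                            # overlapping points hide each other
p_alpha   <- base_plot + geom_point(alpha = 0.3)                 # darker areas mean more overlap
p_jitter  <- base_plot + geom_jitter(width = 0.1, height = 0.5)  # small random offsets

# Number of distinct (displ, hwy) positions, to compare with the 234 records
n_distinct_points <- nrow(unique(mpg[, c("displ", "hwy")]))
n_distinct_points
```

If `n_distinct_points` is well below the number of rows, overplotting explains why the default plot looks sparse.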

DEFAULT

TRANSPARENCY

JITTER

SHAPES

Note: It’s generally not recommended to use more than six shapes per graph because, as you can see, the points become very difficult to tell apart. In fact, ggplot2’s default shape scale supports at most six shapes; using more requires a scale_shape_manual() override.
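
A sketch of such an override, assuming shape is mapped to `class` (seven levels in the summary above) and that `displ` and `hwy` are the axes. The particular symbol codes are an arbitrary choice of standard R point symbols.

```r
library(ggplot2)

# Assumption: shape encodes vehicle class
n_classes <- length(unique(mpg$class))

# One hand-picked point symbol (R pch code) per class
shape_codes <- c(15, 16, 17, 18, 3, 4, 8)

p_shapes <- ggplot(mpg, aes(x = displ, y = hwy, shape = class)) +
  geom_point() +
  scale_shape_manual(values = shape_codes)
```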

COMBINATION

QUESTIONS

Designing My Own Study

1. The object of observation: Car Manufacturers

2. The object of analysis: Car Manufacturers who released a new model every year between 1999 & 2008

3. The population: Car Manufacturers

4. List the available variables: city miles per gallon, highway miles per gallon.

5. Response variable: Highway miles per gallon (hwy)

6. What are you hoping to find out? For cars of the same class, do those produced by manufacturers specializing in one or two classes have better fuel efficiency than those made by manufacturers producing a wider variety of classes?
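
One way the comparison in question 6 could be set up is sketched below in base R. The labels "specialist" and "generalist" and the `maker_type` column are hypothetical names introduced for illustration, and the cut-off of two classes is taken from the wording of the question.

```r
library(ggplot2)  # assumed source of the mpg data frame

# Number of distinct classes each manufacturer appears in
classes_per_maker <- tapply(mpg$class, mpg$manufacturer,
                            function(x) length(unique(x)))

# Hypothetical grouping: "specialist" = at most two classes
maker_type <- ifelse(classes_per_maker <= 2, "specialist", "generalist")

# Attach the label to every car, then average hwy within class and group
mpg_labeled <- data.frame(mpg,
                          maker_type = as.vector(maker_type[mpg$manufacturer]))
hwy_by_group <- aggregate(hwy ~ class + maker_type, data = mpg_labeled, FUN = mean)
hwy_by_group
```

This only tabulates group means. Whether any difference is meaningful would need a proper test and attention to the small number of cars in some groups.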

Average Year

What is the average manufacturing year of the car models in the data set? 2003.5
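
The figure can be reproduced with a one-line mean. This is a minimal sketch, assuming `mpg` comes from the ggplot2 package.

```r
library(ggplot2)  # assumed source of the mpg data frame

# Mean of the year column
avg_year <- mean(mpg$year)
avg_year

# Distinct model years present in the data
model_years <- sort(unique(mpg$year))
model_years
```

Because the data contain only the model years 1999 and 2008, the mean does not correspond to a year in which any of these cars was actually made.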

Engine Size & Highway MPG

## `geom_smooth()` using formula 'y ~ x'

The correlation coefficient is -0.7660, which indicates a relatively strong negative correlation, i.e. as engine displacement increases, highway mpg tends to decrease. We can also see this on the graph.

This suggests that larger car engines have worse fuel efficiency on the highway.
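
A minimal sketch of how the coefficient and plot could be obtained, assuming `mpg` comes from ggplot2 and that the coefficient is Pearson's r between `displ` and `hwy`. The `method = "lm"` argument is an assumption, as the text does not say which smoother was used.

```r
library(ggplot2)

# Pearson correlation between engine displacement and highway mpg
r <- cor(mpg$displ, mpg$hwy)
round(r, 4)

# Scatter plot with a fitted straight line (smoother choice is assumed)
p_fit <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)
```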

ACKNOWLEDGEMENTS

Thank you to:
https://ggplot2.tidyverse.org/ for help with graphs
https://shiny.rstudio.com/ for help with the interactive graph
https://rmarkdown.rstudio.com/articles_intro.html for help with markdown/html

GEOG-364 LAB 1

NICK SCHLOTTERBECK

2022-09-11