library(tidyverse)
theme_set(theme_light())

wine_ratings <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-05-28/winemag-data-130k-v2.csv")
wine_ratings
## # A tibble: 129,971 x 14
##       X1 country description designation points price province region_1 region_2
##    <dbl> <chr>   <chr>       <chr>        <dbl> <dbl> <chr>    <chr>    <chr>   
##  1     0 Italy   Aromas inc… Vulkà Bian…     87    NA Sicily … Etna     <NA>    
##  2     1 Portug… This is ri… Avidagos        87    15 Douro    <NA>     <NA>    
##  3     2 US      Tart and s… <NA>            87    14 Oregon   Willame… Willame…
##  4     3 US      Pineapple … Reserve La…     87    13 Michigan Lake Mi… <NA>    
##  5     4 US      Much like … Vintner's …     87    65 Oregon   Willame… Willame…
##  6     5 Spain   Blackberry… Ars In Vit…     87    15 Norther… Navarra  <NA>    
##  7     6 Italy   Here's a b… Belsito         87    16 Sicily … Vittoria <NA>    
##  8     7 France  This dry a… <NA>            87    24 Alsace   Alsace   <NA>    
##  9     8 Germany Savory dri… Shine           87    12 Rheinhe… <NA>     <NA>    
## 10     9 France  This has g… Les Natures     87    27 Alsace   Alsace   <NA>    
## # … with 129,961 more rows, and 5 more variables: taster_name <chr>,
## #   taster_twitter_handle <chr>, title <chr>, variety <chr>, winery <chr>

This is an extension of the TidyTuesday assignment you have already done. Complete the questions below, using the screencast you chose for the TidyTuesday assignment.

Import data

Description of the data and definition of variables

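The data contain 129,971 wine reviews in 14 columns, one review per row. The two numeric variables of interest are points (the rating the reviewer gave the wine) and price (the price of the bottle). The description column holds the tasting notes. The remaining variables record where the wine comes from (country, province, region_1, region_2), what it is (title, variety, winery, designation), and who reviewed it (taster_name, taster_twitter_handle). X1 is a row index.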
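
A quick way to check the variables before plotting is to look at column types and missing values. The sketch below is only an illustration: wine_sample is a hypothetical three-row stand-in built from rows 1, 3 and 4 of the printout above, limited to three columns. On the full data the same calls would be applied to wine_ratings.

```r
# Stand-in built from rows 1, 3 and 4 of the printout above (hypothetical name)
wine_sample <- data.frame(
  country = c("Italy", "US", "US"),
  points  = c(87, 87, 87),
  price   = c(NA, 14, 13)
)

# Column types
str(wine_sample)

# Number of missing values in each column
missing_per_column <- colSums(is.na(wine_sample))
missing_per_column
```

Missing prices matter later, because lm() drops rows with missing values before fitting.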

Visualize data

Hint: One graph of your choice.

# Scatterplot of points against price with a linear fit; the x-axis is on a log10 scale
ggplot(wine_ratings, aes(price, points)) +
  geom_point(alpha = .1) +
  geom_smooth(method = "lm") +
  scale_x_log10()

summary(lm(points ~ log2(price), wine_ratings))
## 
## Call:
## lm(formula = points ~ log2(price), data = wine_ratings)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -14.0559  -1.5136   0.1294   1.7133   9.2408 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 78.981419   0.035765    2208   <2e-16 ***
## log2(price)  1.974162   0.007338     269   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.408 on 120973 degrees of freedom
##   (8996 observations deleted due to missingness)
## Multiple R-squared:  0.3744, Adjusted R-squared:  0.3744 
## F-statistic: 7.239e+04 on 1 and 120973 DF,  p-value: < 2.2e-16
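
Because the predictor is log2(price), the slope can be read as the change in points per doubling of price. The sketch below works through that reading with the coefficients copied from the output above. predict_points is a hypothetical helper introduced for illustration, not part of the original analysis.

```r
# Coefficients copied from the summary() output above
intercept <- 78.981419
slope     <- 1.974162

# Hypothetical helper: fitted points for a given price
predict_points <- function(price) {
  intercept + slope * log2(price)
}

# Three prices, each double the previous one
prices <- c(16, 32, 64)
fitted_points <- predict_points(prices)
fitted_points

# The fitted value rises by `slope` points at each doubling
diff(fitted_points)
```

The same result could be obtained from the fitted model object with predict(), which avoids copying coefficients by hand.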

What is the story behind the graph?

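Each point is one wine review, placed by its price (on a log scale) and its points. More expensive wines tend to receive higher ratings: the fitted line slopes upward, and the regression estimates that each doubling of price is associated with about 1.97 more points (p < 2e-16). Price alone explains about 37% of the variation in points (R-squared = 0.374), so wines at the same price can still differ widely in rating. The 8,996 reviews with missing values are dropped from the model.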

Hide the messages, but display the code and its results on the webpage.
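
One common way to do this in R Markdown is to set chunk options globally in a setup chunk. The sketch below assumes the post is knitted with knitr. echo, message and results are standard knitr chunk options, but setting them globally rather than per chunk is an assumption about what the assignment expects.

```r
# Assumed to sit in the first (setup) chunk of the .Rmd file
knitr::opts_chunk$set(
  echo    = TRUE,     # display the code
  message = FALSE,    # hide messages such as package start-up notes
  results = "markup"  # display printed results (the knitr default)
)
```

The same options can also be given in an individual chunk header when only some chunks should hide their messages.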

Write your name for the author at the top.

Use the correct slug.