AFI 2.4

Author

Michael Ernst

library(tidyverse)

Begin by loading the tidyverse package in the code chunk above and adding your name as the author.

The readr package includes functions for reading tabular data into R. Each code chunk below should read in the specified data file, store it in an appropriately named object, and then print the data. The data files are stored in the data folder.

Importing Data

Let’s start by reading in the AllCountries.csv data. Modify this code by filling in the ______ to do so:

all_countries <- read_csv("data/AllCountries.csv")
all_countries

# A tibble: 217 × 25
   Country      LandArea Population Density   GDP Rural   CO2 PumpPrice Military
   <chr>           <dbl>      <dbl>   <dbl> <dbl> <dbl> <dbl>     <dbl>    <dbl>
 1 Afghanistan    653.       30.6      46.8   665  74.1  35.3      1.28     8.65
 2 Albania         27.4       2.90    106.   4460  44.6  12.8      1.81    NA   
 3 Algeria       2382.       39.2      16.5  5361  30.5  24.6      0.29    NA   
 4 American Sa…     0.2       0.055   275      NA  12.7  NA       NA       NA   
 5 Andorra          0.47      0.079   168.     NA  13.8   9.5      1.67    NA   
 6 Angola        1247.       21.5      17.2  5783  57.5  44.8      0.63    13.8 
 7 Antigua and…     0.44      0.09    204.  13342  75.4  16.6     NA       NA   
 8 Argentina     2737.       41.4      15.1 14715   8.5  16.9      1.46    NA   
 9 Armenia         28.5       2.98    105.   3505  37    13.9      1.25    16.8 
10 Aruba            0.18      0.103   572.     NA  57.9  10.4     NA       NA   
# ℹ 207 more rows
# ℹ 16 more variables: Health <dbl>, ArmedForces <dbl>, Internet <dbl>,
#   Cell <dbl>, HIV <dbl>, Hunger <dbl>, Diabetes <dbl>, BirthRate <dbl>,
#   DeathRate <dbl>, ElderlyPop <dbl>, LifeExpectancy <dbl>, FemaleLabor <dbl>,
#   Unemployment <dbl>, EnergyUse <dbl>, Electricity <dbl>, Developed <dbl>

That was easy, right? Note that including #| message: false at the beginning of the code chunk suppresses unneeded messages and cleans up your output for whoever reads it (me!).

Now try the minn_stp_weather.csv data:

minn_stp_weather <- read_delim("data/minn_stp_weather.csv", skip = 17)
minn_stp_weather

# A tibble: 1,388 × 12
   MonthY MonthS  Year LowTemp HighTemp WarmestMin ColdestHigh AveMin AveMax
    <dbl>  <dbl> <dbl>   <dbl>    <dbl>      <dbl>       <dbl>  <dbl>  <dbl>
 1      1      1  1900     -15       51         36          -4   12.6   30  
 2      2      2  1900     -17       37         20           1   -1.3   18.5
 3      3      3  1900     -10       54         31           8   17.2   33.9
 4      4      4  1900      26       81         60          34   42.5   63  
 5      5      5  1900      33       90         68          50   49.9   74.1
 6      6      6  1900      46       94         69          67   57.4   80.1
 7      7      7  1900      51       95         69          66   60.7   81.2
 8      8      8  1900      58       94         75          74   67     86.3
 9      9      9  1900      36       92         70          54   52.5   70.5
10     10     10  1900      34       79         60          48   48.3   67.2
# ℹ 1,378 more rows
# ℹ 3 more variables: meanTemp <dbl>, TotPrecip <chr>, Max24hrPrecip <chr>

Did you look at the data file before trying?
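One way to look at a file without leaving R is to print its first few raw lines with read_lines() and count how many come before the header row. The sketch below is only an illustration: it builds a small made-up file in a temporary location (the metadata lines, column names, and values are hypothetical, not the contents of minn_stp_weather.csv) and then skips the two metadata lines.

```r
library(readr)

# Hypothetical file: two metadata lines, then a header row and data.
path <- tempfile(fileext = ".csv")
writeLines(c(
  "Station: made-up example",
  "Units: degrees Fahrenheit",
  "Month,Year,LowTemp",
  "1,2000,-5",
  "2,2000,-9"
), path)

# Peek at the raw text to see where the real table starts.
read_lines(path, n_max = 4)

# The header is on line 3, so skip the 2 lines before it.
toy_weather <- read_csv(path, skip = 2, show_col_types = FALSE)
toy_weather
```

The same inspection on the real file tells you what value to pass to skip.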

Now try the white_nonhisp_death_rates_from_1999_to_2013.txt data:

white_nonhisp_death_rates <- read_delim("data/white_nonhisp_death_rates_from_1999_to_2013.txt")
white_nonhisp_death_rates

# A tibble: 150 × 5
     Age  Year Deaths Population  Rate
   <dbl> <dbl>  <dbl>      <dbl> <dbl>
 1    45  1999   8304    3166393  262.
 2    45  2000   8604    3207271  268.
 3    45  2001   8836    3152637  280.
 4    45  2002   9217    3256317  283 
 5    45  2003   9287    3260376  285.
 6    45  2004   9210    3211340  287.
 7    45  2005   9352    3279109  285.
 8    45  2006   9100    3222835  282.
 9    45  2007   8805    3137876  281.
10    45  2008   8751    3074171  285.
# ℹ 140 more rows
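The read_delim() calls above do not set delim. In readr 2.0 and later, read_delim() tries to guess the delimiter when delim is left unset. The sketch below shows this on a made-up tab-delimited file (the values are hypothetical, and the delimiter of the real .txt file is not shown here, so the tab is an assumption) and compares the guess against naming the delimiter explicitly.

```r
library(readr)

# Hypothetical tab-delimited file written to a temporary location.
tab_path <- tempfile(fileext = ".txt")
writeLines(c(
  "Age\tYear\tDeaths",
  "45\t1999\t10",
  "45\t2000\t12"
), tab_path)

# Let readr guess the delimiter ...
guessed <- read_delim(tab_path, show_col_types = FALSE)

# ... or state it explicitly, which is safer when the guess could be wrong.
explicit <- read_delim(tab_path, delim = "\t", show_col_types = FALSE)

guessed
```

If the guess ever splits columns in the wrong place, supplying delim yourself is the straightforward fix.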

Here’s a trickier one. Read in the deaton.txt data:

deaton <- read_table("data/deaton.txt")
deaton

# A tibble: 10 × 4
     age death_rate_1989 death_rate_2013 change
   <dbl>           <dbl>           <dbl>  <dbl>
 1    45            262.            261.   -1.6
 2    46            293.            290.   -3.1
 3    47            306.            324.   17.6
 4    48            337.            343.    5.7
 5    49            359             384.   25.5
 6    50            377.            422.   45.5
 7    51            429             466.   37.1
 8    52            445.            481.   36.4
 9    53            545.            527.  -18.4
10    54            555.            573.   17.4

You might need to search for a function that wasn’t discussed.
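For context, read_table() is meant for files whose columns are separated by one or more spaces, which is where a single-character delimiter tends to fail. A minimal sketch on a made-up file (the column names and values are hypothetical, not taken from deaton.txt):

```r
library(readr)

# Hypothetical file with columns padded by a varying number of spaces.
path <- tempfile(fileext = ".txt")
writeLines(c(
  "age  rate_a  rate_b",
  "45   200.5   210.0",
  "46   220.0   215.5"
), path)

# read_table() treats any run of whitespace as one separator.
toy_rates <- read_table(path, show_col_types = FALSE)
toy_rates
```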

Now read in the pga2004.csv data (the help page for read_csv might be useful):

pga2004 <- read_delim("data/pga2004.csv", col_names = FALSE)

Warning: One or more parsing issues, call `problems()` on your data frame for details,
e.g.:
  dat <- vroom(...)
  problems(dat)

pga2004

# A tibble: 210 × 11
   X1                 X2    X3    X4    X5    X6    X7    X8    X9    X10    X11
   <chr>           <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>  <dbl>
 1 Aaron Baddeley     23  288   53.1  58.2  1.77  50.9   123    27 6.33e5  23440
 2 Adam Scott         24  295.  57.7  65.6  1.76  59.3     7    16 3.72e6 232812
 3 Alex Cejka         34  286.  64.2  63.8  1.80  50.7    54    24 1.31e6  54729
 4 Andre Stolz        34  298.  59    63    1.79  47.7   101    20 8.08e5  40419
 5 Arjun Atwal        31  289.  60.5  62.5  1.77  43.5   146    30 4.86e5  16202
 6 Arron Oberhols…    29  285.  68.8  67    1.78  50.9    52    23 1.36e6  58932
 7 Bart Bryant        42  282.  74.2  68.9  1.78  40.4    80    23 9.62e5  41833
 8 Ben Crane          28  284.  64.4  64.2  1.74  53.8    75    27 1.04e6  38406
 9 Ben Curtis         27  282.  64.3  63.4  1.81  42.2   141    20 5.01e5  25041
10 Bernhard Langer    47  282.  62.6  65.3  1.78  47.7    83    15 9.44e5  62906
# ℹ 200 more rows
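The warning printed when pga2004 was read suggests calling problems() on the data frame. The sketch below shows what that looks like on a made-up file where one value does not match the requested column type. The file contents and the column names name and age are hypothetical, and this is not a diagnosis of what went wrong in pga2004.csv.

```r
library(readr)

# Hypothetical file with no header row; the second column should be numeric,
# but one row holds text instead.
path <- tempfile(fileext = ".csv")
writeLines(c(
  "Player A,23",
  "Player B,unknown",
  "Player C,31"
), path)

# Supply column names and ask for character + double columns.
# This call is expected to warn about parsing issues.
toy <- read_csv(path, col_names = c("name", "age"), col_types = "cd")

# problems() lists the values that could not be parsed;
# the offending value is stored as NA in the data.
issues <- problems(toy)
issues
toy
```

Running problems(pga2004) in the same way would show which rows and columns triggered the warning.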

Lastly, read in the noise.txt data:

noise <- read_delim(
  "data/noise.txt",
  delim = " ",
  col_names = c("V0", "V1", "V2", "V3", "V4", "V5"),
  skip = 1
)
noise

# A tibble: 3,020 × 6
      V0      V1      V2       V3     V4      V5
   <dbl>   <dbl>   <dbl>    <dbl>  <dbl>   <dbl>
 1     1 -0.675   0.831  -1.24     0.207  0.290 
 2     2  0.974  -0.0118 -0.415    0.192 -0.167 
 3     3 -0.745  -1.03    1.79     0.464 -1.88  
 4     4  1.06    1.01   -0.203   -0.550  1.21  
 5     5  0.493  -0.215  -0.192   -1.84   0.0446
 6     6 -1.21   -0.873   0.582   -0.190 -0.953 
 7     7  2.00    0.622  -0.967   -1.71  -0.448 
 8     8 -0.0600  0.920  -0.431    0.350 -0.527 
 9     9 -1.13   -0.0190 -0.430   -1.12  -0.224 
10    10  0.988  -0.322  -0.00720 -1.36  -0.324 
# ℹ 3,010 more rows