Preliminaries

rm(list = ls())
aliens <- read.csv("aliens.csv", header = TRUE, stringsAsFactors = TRUE)
library(skimr)
source('special_functions.R')
my_sample <- make.my.sample(33377932, 30, aliens)

Question 1

2+3
## [1] 5
10-6
## [1] 4
10*5
## [1] 50
20/2
## [1] 10

Question 2

round(24.674, digits = 3)
## [1] 24.674
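
Because 24.674 already has exactly three decimal places, rounding to three digits returns it unchanged. A small sketch of how the digits argument changes the result; the other digit values are chosen here only for illustration and were not part of the question:

```r
x <- 24.674

round(x, digits = 3)  # 24.674 (unchanged: already three decimals)
round(x, digits = 2)  # 24.67
round(x, digits = 1)  # 24.7
round(x, digits = 0)  # 25
```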

Question 3

There are 10,000 individuals represented in this dataframe: head() shows only the first 10 rows, but tail() shows that the rows run up to 10000.

head(aliens, 10)
##    ID age color     island  college income antennae    politics anxiety
## 1   1  33  Blue      Blick Ganymede  27000    Curly Republicant      46
## 2   2  47  Pink      Plume Ganymede 124000 Straight Independone      49
## 3   3  39  Pink      Plume       Io  43000 Straight Democrulite      51
## 4   4  24  Pink      Blick       Io  46000 Straight Republicant      45
## 5   5  53  Pink      Blick       Io  44000 Straight Democrulite      46
## 6   6  36  Blue      Blick   Europa  28000    Curly Republicant      49
## 7   7  58  Pink Nanspucket   Europa  29000    Curly Democrulite      60
## 8   8  25  Pink      Blick       Io  37000 Straight Republicant      51
## 9   9  38  Pink Nanspucket   Europa  35000 Straight Democrulite      52
## 10 10  40  Pink      Plume   Europa  33000 Straight Independone      48
##    depression sociable control memory intelligence time1 time2 time3 food1
## 1          92      108      68     94          119  5.86  4.36  4.11     5
## 2          94      110      72    109          127  5.07  4.35  4.97     8
## 3         119       79      62     83          112  5.66  6.13  6.15     7
## 4          92      117      65     88          115  7.81  8.13  6.12     6
## 5          93      109      56    106          122  5.04  4.55  4.15     8
## 6          98      101      49    103          104  4.81  3.65  5.11    10
## 7         107       89      42     71           86  4.88  4.09  4.35    13
## 8          99      101      55     94          116  4.24  3.80  2.69     7
## 9          83      105      68    102          108  4.69  3.35  3.89     7
## 10        106       93      56    101          104  4.78  3.89  3.04    11
##    sleep food2 reasoning_trials
## 1    6.0     9                1
## 2    7.8    11                1
## 3    4.4     9                1
## 4    6.0     9                1
## 5    4.8     9                1
## 6    5.5     7                1
## 7    4.0     8                1
## 8    6.6     9                1
## 9    3.2     6                1
## 10   5.7     7                1
tail(aliens)
##          ID age color     island  college income antennae    politics anxiety
## 9995   9995  54  Pink Nanspucket Ganymede 176000 Straight Democrulite      48
## 9996   9996  66  Blue      Blick   Europa  52000 Straight Republicant      52
## 9997   9997  33  Pink      Plume Callisto  89000 Straight Republicant      53
## 9998   9998  60  Pink Nanspucket Callisto  23000 Straight Independone      51
## 9999   9999  51  Blue      Blick   Europa  37000 Straight Republicant      39
## 10000 10000  24  Pink Nanspucket       Io  14000 Straight Democrulite      49
##       depression sociable control memory intelligence time1 time2 time3 food1
## 9995          90      115      73     84          115 11.89 11.12  9.91     7
## 9996         110       80      57     91          100  4.79  2.92  2.95     6
## 9997         108       92      74     99          108  3.71  2.88  3.89    11
## 9998         107       89      75     87          102  3.40  3.61  3.76     7
## 9999          92      108      64    106          109  6.58  6.45  5.18     6
## 10000         99      101      69    104          124  5.13  3.32  3.47     8
##       sleep food2 reasoning_trials
## 9995    6.2     8                4
## 9996    5.4     7                4
## 9997    5.5     5                2
## 9998    3.9    10                3
## 9999    5.9     9                2
## 10000   6.8     7                1

Question 4

There are 21 variables represented in this dataframe (the columns from ID through reasoning_trials in the output above).
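
Rows and columns do not have to be counted by eye: nrow(), ncol(), and dim() report them directly. The sketch below uses a small stand-in data frame (the name aliens_head is hypothetical) rebuilt from four columns of the first six rows printed by head(aliens), so its counts are not those of the full dataset.

```r
# Hypothetical stand-in: four columns of the first six rows printed by head(aliens)
aliens_head <- data.frame(
  ID     = 1:6,
  age    = c(33L, 47L, 39L, 24L, 53L, 36L),
  island = factor(c("Blick", "Plume", "Plume", "Blick", "Blick", "Blick")),
  time1  = c(5.86, 5.07, 5.66, 7.81, 5.04, 4.81)
)

nrow(aliens_head)   # number of individuals (rows): 6 in this stand-in
ncol(aliens_head)   # number of variables (columns): 4 in this stand-in
dim(aliens_head)    # both at once: rows first, then columns
names(aliens_head)  # the variable names
```

Run on the full data, nrow(aliens) and ncol(aliens) would give the counts asked for in Questions 3 and 4.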

Question 5

Two variables in this dataframe that are numerical are time1 and income. Two variables that are categorical are college and politics. time1 is continuous because it can take any value within a range, while income is treated here as discrete because it is recorded in whole amounts. Both are at least interval-level variables because their values can be ranked and the differences between participants’ values are meaningful.

class(aliens$age)
## [1] "integer"
class(aliens$time1)
## [1] "numeric"
class(aliens$income)
## [1] "numeric"
class(aliens$college)
## [1] "factor"
class(aliens$politics)
## [1] "factor"
help(summary)
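
Instead of calling class() on one column at a time, sapply() can check every column in a single step. This is a sketch on a hypothetical stand-in data frame (aliens_head) rebuilt from five columns of the first six rows printed by head(aliens); note that the stand-in stores income as plain numbers, as the class() output above reports.

```r
# Hypothetical stand-in: five columns of the first six rows printed by head(aliens)
aliens_head <- data.frame(
  age      = c(33L, 47L, 39L, 24L, 53L, 36L),
  time1    = c(5.86, 5.07, 5.66, 7.81, 5.04, 4.81),
  income   = c(27000, 124000, 43000, 46000, 44000, 28000),
  college  = factor(c("Ganymede", "Ganymede", "Io", "Io", "Io", "Europa")),
  politics = factor(c("Republicant", "Independone", "Democrulite",
                      "Republicant", "Democrulite", "Republicant"))
)

# Class of every column at once
col_classes <- sapply(aliens_head, class)
col_classes

# Split the column names into numerical and categorical variables
# (is.numeric() is TRUE for both "integer" and "numeric" columns)
numeric_cols <- names(aliens_head)[sapply(aliens_head, is.numeric)]
factor_cols  <- names(aliens_head)[sapply(aliens_head, is.factor)]
numeric_cols
factor_cols
```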

Question 6

The help page describes summary(), which produces summaries of objects. Its arguments include the number of digits to use and how many levels of a factor should be shown. I do not understand what “S3 method for class” means.
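
On the “S3 method for class” wording: summary() is a generic function, and R chooses which version of it to run based on the class of the object it is given. A minimal sketch, using the anxiety and politics values from the first six rows printed earlier:

```r
# Values copied from the first six rows of head(aliens)
anxiety  <- c(46, 49, 51, 45, 46, 49)
politics <- factor(c("Republicant", "Independone", "Democrulite",
                     "Republicant", "Democrulite", "Republicant"))

class(anxiety)     # "numeric"
summary(anxiety)   # minimum, quartiles, median, mean, maximum

class(politics)    # "factor"
summary(politics)  # a count for each level instead

# The class-specific versions of summary() available in this session
methods(summary)
```

The same call, summary(), gives two different kinds of output because a different method runs for each class.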

Question 7

Most of the aliens come from the island Blick. The most popular political parties are Republicant and Democrulite. The highest sociability score obtained by an alien is 117. The lowest memory score is 71.

library(skimr)
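
Claims such as “the most common island” or “the highest sociability score” can be checked with a few base R calls. The sketch below runs them on a hypothetical stand-in (aliens_head) built from the first six rows printed by head(aliens), so its answers describe only those six rows, not the full dataset.

```r
# Hypothetical stand-in: four columns of the first six rows printed by head(aliens)
aliens_head <- data.frame(
  island   = factor(c("Blick", "Plume", "Plume", "Blick", "Blick", "Blick")),
  politics = factor(c("Republicant", "Independone", "Democrulite",
                      "Republicant", "Democrulite", "Republicant")),
  sociable = c(108, 110, 79, 117, 109, 101),
  memory   = c(94, 109, 83, 88, 106, 103)
)

# Counts per island, and the name of the most common one
island_counts <- table(aliens_head$island)
top_island <- names(island_counts)[which.max(island_counts)]
top_island

# Political parties ordered from most to least common
sort(table(aliens_head$politics), decreasing = TRUE)

# Highest sociability score and lowest memory score
max(aliens_head$sociable)
min(aliens_head$memory)
```

Swapping aliens_head for aliens, or for a sample, would answer the same questions for that data.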

Question 8

I do not understand what changes were made.

Question 9

The code creates a new variable, food.diff, equal to food1 minus food2 for each alien.

aliens$food.diff <- aliens$food1 - aliens$food2
head(aliens)
##   ID age color island  college income antennae    politics anxiety depression
## 1  1  33  Blue  Blick Ganymede  27000    Curly Republicant      46         92
## 2  2  47  Pink  Plume Ganymede 124000 Straight Independone      49         94
## 3  3  39  Pink  Plume       Io  43000 Straight Democrulite      51        119
## 4  4  24  Pink  Blick       Io  46000 Straight Republicant      45         92
## 5  5  53  Pink  Blick       Io  44000 Straight Democrulite      46         93
## 6  6  36  Blue  Blick   Europa  28000    Curly Republicant      49         98
##   sociable control memory intelligence time1 time2 time3 food1 sleep food2
## 1      108      68     94          119  5.86  4.36  4.11     5   6.0     9
## 2      110      72    109          127  5.07  4.35  4.97     8   7.8    11
## 3       79      62     83          112  5.66  6.13  6.15     7   4.4     9
## 4      117      65     88          115  7.81  8.13  6.12     6   6.0     9
## 5      109      56    106          122  5.04  4.55  4.15     8   4.8     9
## 6      101      49    103          104  4.81  3.65  5.11    10   5.5     7
##   reasoning_trials food.diff
## 1                1        -4
## 2                1        -3
## 3                1        -2
## 4                1        -3
## 5                1        -1
## 6                1         3
help(summary)

Question 10

In this code, I found the difference between the variables time1 and time3.

aliens$time.diff <- aliens$time1 - aliens$time3
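
No output is printed for this new column, so here is a sketch of what the subtraction produces. It uses a hypothetical stand-in (aliens_head) holding the time1 and time3 values from the first six rows printed earlier.

```r
# Hypothetical stand-in: time1 and time3 from the first six rows of head(aliens)
aliens_head <- data.frame(
  time1 = c(5.86, 5.07, 5.66, 7.81, 5.04, 4.81),
  time3 = c(4.11, 4.97, 6.15, 6.12, 4.15, 5.11)
)

# Same pattern as above: one subtraction applied to every row
aliens_head$time.diff <- aliens_head$time1 - aliens_head$time3

round(aliens_head$time.diff, 2)

# A positive value means time3 was smaller than time1 for that alien
n_positive <- sum(aliens_head$time.diff > 0)
n_positive
```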

Question 11

Most aliens in my sample come from the islands Nanspucket and Plume (12 each). The most popular political party is Independone (12 aliens). The highest sociability score is 137. The lowest memory score is 75.

head(my_sample, 30)
##        ID age color     island  college income antennae    politics anxiety
## 88     88  27  Pink      Blick   Europa  43000 Straight Republicant      48
## 249   249  26  Pink Nanspucket       Io 136000    Curly Democrulite      48
## 582   582  29  Pink Nanspucket   Europa  78000 Straight Democrulite      52
## 600   600  38  Blue      Blick Ganymede  84000 Straight Democrulite      47
## 852   852  66  Pink Nanspucket Callisto  39000 Straight Independone      50
## 1017 1017  43  Pink      Blick Ganymede  19000 Straight Republicant      48
## 1139 1139  41  Pink      Blick Ganymede  13000 Straight Independone      48
## 1390 1390  65  Blue      Blick Ganymede  95000    Curly Republicant      45
## 2524 2524  34  Pink Nanspucket Callisto 234000 Straight Independone      54
## 2727 2727  42  Pink Nanspucket Ganymede  20000    Curly Democrulite      47
## 2928 2928  54  Pink      Plume   Europa  28000 Straight Republicant      53
## 2988 2988  56  Pink      Plume   Europa  49000 Straight Independone      41
## 3442 3442  53  Pink Nanspucket   Europa  34000 Straight Republicant      46
## 3499 3499  12  Pink      Blick       Io 101000 Straight Republicant      46
## 3884 3884  35  Pink      Plume Callisto  99000 Straight Independone      53
## 4037 4037  15  Blue      Plume Ganymede  22000 Straight Democrulite      51
## 4097 4097  68  Blue      Plume Ganymede  57000    Curly Republicant      57
## 4469 4469  16  Blue Nanspucket   Europa  55000 Straight Independone      55
## 4728 4728  38  Pink Nanspucket Callisto  80000 Straight Independone      50
## 5397 5397  57  Pink      Plume Callisto 180000 Straight Independone      54
## 5405 5405  58  Pink      Plume Callisto  27000    Curly Independone      48
## 5566 5566  46  Pink Nanspucket Ganymede 121000    Curly Republicant      56
## 5661 5661  68  Blue Nanspucket   Europa  45000 Straight Democrulite      51
## 6295 6295  33  Pink Nanspucket Callisto  42000 Straight Independone      54
## 6391 6391  58  Pink      Plume   Europa  17000 Straight Republicant      51
## 7208 7208  55  Pink      Plume Ganymede 136000 Straight Independone      38
## 7488 7488  42  Pink      Plume Ganymede  39000 Straight Independone      47
## 8019 8019  27  Pink      Plume Ganymede 158000 Straight Democrulite      53
## 8225 8225  32  Pink Nanspucket Callisto  96000 Straight Republicant      47
## 9981 9981  40  Pink      Plume       Io  72000 Straight Republicant      51
##      depression sociable control memory intelligence time1 time2 time3 food1
## 88           97      100      64     96          104  7.48  6.95  5.88     8
## 249          81      122      75     84          116  3.47  3.10  3.38    11
## 582         101       98      55     92          100  6.30  5.95  5.98    10
## 600          90      108      56     89          113  5.54  4.19  6.04     8
## 852         102       97      61     81           96  5.89  4.56  5.23     9
## 1017         98      101      57     79          109  8.26  6.33  6.90    12
## 1139        101       98      67     87          115  5.19  4.77  4.64     9
## 1390         91      111      56     80          109  8.95  9.42  9.20     3
## 2524        117       91      52     91           98  7.19  5.48  7.08    11
## 2727        100      100      56     94          116  6.01  6.01  5.92    10
## 2928        109       92      49     98          101  9.25  9.65  7.35    11
## 2988         87      124      68     94          104  4.77  3.35  4.71    10
## 3442         92      111      69     86          100  4.48  4.11  3.15     9
## 3499         96      103      83     95          123  3.46  3.42  2.61    12
## 3884         93      108      63    107          109  2.97  2.00  1.79    11
## 4037         97      101      51     86          111  6.41  6.32  4.75     6
## 4097        116       75      71     95          120  7.03  5.73  7.30     7
## 4469         99      102      57    117          113  6.02  5.18  5.36     6
## 4728         95      103      63     94          103  5.62  5.68  4.47     7
## 5397        109       96      58     75           92  6.89  6.84  5.37    11
## 5405         99      100      45     84           93  6.36  6.62  5.88    10
## 5566        102       98      61     95          118  6.26  4.74  5.05     8
## 5661         93      107      67     89          101  8.23  6.66  8.61     8
## 6295        108       90      58    105          107  6.55  5.58  5.96     6
## 6391        109       86      61    100          105  5.07  3.30  4.10    13
## 7208         79      137      53    105          121  7.40  7.34  6.48     5
## 7488        103       98      73     89          118  4.97  3.65  4.81    12
## 8019        113       80      54    115          126  5.50  4.46  3.71     9
## 8225        101       99      48    100          102  8.63  7.33  7.16     8
## 9981        105       95      41     76          103  7.00  5.55  6.99     6
##      sleep food2 reasoning_trials
## 88     7.5     8                2
## 249    6.1     6                1
## 582    5.2     6                6
## 600    7.4    11                3
## 852    4.2     3                4
## 1017   6.5     9                5
## 1139   5.9     7                2
## 1390   5.2    11                1
## 2524   6.3    10                3
## 2727   6.1    10                3
## 2928   5.6     8                3
## 2988   5.1     6                1
## 3442   6.1     9                1
## 3499   5.8    12                1
## 3884   6.9     6                1
## 4037   7.3     8                3
## 4097   5.0     6                1
## 4469   6.5     9                2
## 4728   6.6     7                3
## 5397   6.8     5                7
## 5405   5.6     7                9
## 5566   6.0     8                2
## 5661   7.2    12                1
## 6295   4.3     7                1
## 6391   5.8     8                2
## 7208   6.1    11                3
## 7488   6.7     3                6
## 8019   6.1    11                3
## 8225   5.9    10                2
## 9981   6.9     8                3

Question 12

One notable difference I observed is that reasoning_trials varies more in my sample than in the first ten rows of the full dataset, where every value is 1.

library(skimr)

Question 13

Most aliens come from the island Blick. The most popular political party is Republicant. The highest sociability score is 122. The lowest memory score is 76. Nothing in this sample really surprised me because there was not a huge contrast from the size 30 sample.

my_sample_100 <- make.my.sample(33377932, 100, aliens)
head(my_sample_100)
##      ID age color     island  college income antennae    politics anxiety
## 74   74  47  Blue      Blick   Europa  37000 Straight Republicant      45
## 88   88  27  Pink      Blick   Europa  43000 Straight Republicant      48
## 100 100  19  Blue Nanspucket Callisto  71000    Curly Democrulite      44
## 249 249  26  Pink Nanspucket       Io 136000    Curly Democrulite      48
## 339 339  32  Pink      Blick       Io  34000    Curly Democrulite      45
## 424 424  48  Pink      Plume Callisto  29000 Straight Independone      47
##     depression sociable control memory intelligence time1 time2 time3 food1
## 74          87      122      74     80           98  9.53  8.67  8.11    10
## 88          97      100      64     96          104  7.48  6.95  5.88     8
## 100         82      117      66    103          108  3.08  1.53  2.57     7
## 249         81      122      75     84          116  3.47  3.10  3.38    11
## 339         99      102      62    111          126  7.52  5.87  6.19    10
## 424         94      106      61     99          105  7.88  6.73  6.59    10
##     sleep food2 reasoning_trials food.diff time.diff
## 74    5.4     7               11         3      1.42
## 88    7.5     8                2         0      1.60
## 100   6.0     7                1         0      0.51
## 249   6.1     6                1         5      0.09
## 339   6.9     7                6         3      1.33
## 424   6.9    10                2         0      1.29
tail(my_sample_100)
##        ID age color     island  college income antennae    politics anxiety
## 8858 8858  41  Pink Nanspucket Callisto  19000 Straight Democrulite      51
## 8940 8940  62  Blue      Plume Callisto  57000 Straight Independone      42
## 9142 9142  19  Blue Nanspucket       Io  58000 Straight Republicant      51
## 9391 9391  39  Pink      Blick Callisto  59000 Straight Republicant      51
## 9683 9683  51  Blue      Blick   Europa  68000    Curly Republicant      55
## 9981 9981  40  Pink      Plume       Io  72000 Straight Republicant      51
##      depression sociable control memory intelligence time1 time2 time3 food1
## 8858        107       88      48     97          100  3.44  1.63  1.60    12
## 8940         96      107      49     87           96  7.10  6.49  6.01     7
## 9142         88      107      64    102          122  8.35  6.77  7.72     9
## 9391        102       97      78    100          110  4.18  4.31  2.90     8
## 9683         90      111      62     97          104  3.59  2.90  3.78     9
## 9981        105       95      41     76          103  7.00  5.55  6.99     6
##      sleep food2 reasoning_trials food.diff time.diff
## 8858   5.0     7                3         5      1.84
## 8940   5.7    11                3        -4      1.09
## 9142   5.8    10                2        -1      0.63
## 9391   6.5     7                1         1      1.28
## 9683   5.6     8                3         1     -0.19
## 9981   6.9     8                3        -2      0.01
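
make.my.sample() comes from special_functions.R, which is not shown here, so how it works internally is unknown. As a rough sketch of the general idea only, base R can draw a reproducible random sample of rows with set.seed() and sample(). The helper name draw_rows and the toy population below are assumptions made for illustration, and the rows it picks will not match the ones printed above.

```r
# Hypothetical helper, NOT the course's make.my.sample():
# fix the random seed, then pick n distinct row numbers at random
draw_rows <- function(seed, n, data) {
  set.seed(seed)
  picked <- sort(sample(nrow(data), size = n))
  data[picked, , drop = FALSE]
}

# Toy population of 1000 rows, invented for this example
population <- data.frame(
  ID    = 1:1000,
  score = rep(c(90, 100, 110, 120), times = 250)
)

sample_30  <- draw_rows(33377932, 30, population)
sample_100 <- draw_rows(33377932, 100, population)

nrow(sample_30)
nrow(sample_100)

# The same seed and size always return the same rows
identical(sample_30, draw_rows(33377932, 30, population))
```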

About this document

An example of an answer with both words and code

Let’s say my question was, ‘What’s the mean of the numbers 7, 11, 14, and 100? What’s the median? Why is the mean higher than the median?’ You could answer this by writing a little code chunk (don’t worry, I don’t expect you to understand the details of this code yet), and then writing the explanatory part. You insert a code chunk by clicking on the drop down menu above with a letter c, and then R. When you knit the file, the code chunk will be executed, and the results will show up in your .html file right after the code.


mean(c(7,11,14,100))
## [1] 33
median(c(7,11,14, 100))
## [1] 12.5

The mean (33) is much higher than the median (12.5) because the mean is strongly influenced by the one extreme value (100), while the median is not.
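
One way to see the influence of the extreme value is to drop it and recompute both statistics. This is a small illustrative sketch; the variable names are chosen here and are not part of the original example.

```r
values <- c(7, 11, 14, 100)

mean(values)    # 33
median(values)  # 12.5

# Remove the extreme value and compare
without_extreme <- values[values != 100]

mean(without_extreme)    # about 10.67
median(without_extreme)  # 11
```

Dropping 100 moves the mean by more than 22 points, while the median shifts by only 1.5.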


Very important: You can run the code chunk without Knitting the entire file just by clicking the little ‘play’ symbol (triangle pointing to the right) in the upper right of the code chunk itself. It’s a very good idea to run each of your code chunks before Knitting the file, to check that they work the way you intended.

Another example

Let’s say my question was, “What kind of graph should you use to show the relationship between weight and gas mileage in the mtcars data set? Make the graph. Interpret it.” You could answer it like this:


Here we are examining the relationship between two quantitative variables, so we should make a scatterplot. Here is the plot:

data(mtcars)
plot(mtcars$wt, mtcars$mpg)

As weight goes up, gas mileage goes down. The relationship between the two variables appears to be linear.
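
The visual impression can be backed up with a number and a fitted line. This is an optional sketch that goes beyond what the question asks: cor() gives the correlation, and lm() with abline() draws a straight-line fit on the scatterplot.

```r
data(mtcars)

# Same scatterplot, with axis labels added
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")

# Fit a straight line predicting mpg from weight and draw it on the plot
fit <- lm(mpg ~ wt, data = mtcars)
abline(fit)

# A negative correlation matches "as weight goes up, mileage goes down"
wt_mpg_cor <- cor(mtcars$wt, mtcars$mpg)
wt_mpg_cor

# Intercept and slope of the fitted line
coef(fit)
```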


Again, you can run this little bit of code by clicking the ‘play’ symbol, and you’ll see the plot show up right under the code itself.

What to do if your code has a problem that you can’t fix

If there’s a problem with your code, the file will not Knit properly; you’ll get an error message. First, you should try to fix it. But if you can’t, you still don’t want that to prevent you from doing the rest of the homework. In this case, just put ‘eval = FALSE’ in your code chunk, just like this:

mean(c(7,b,14,100))

This code is ‘broken’ because I included the letter b in the list of values that the mean function was supposed to deal with, and this makes no sense. But because I also included ‘eval = FALSE’, this error won’t stop my file from knitting, because the code in this chunk won’t be run. In a case like this, you should explain in your homework that you had an error in your code, but that you couldn’t figure out how to fix it. (The more you explain to your TA about your attempt to do a problem, the more partial credit you are likely to get.)
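
To see what actually goes wrong with the broken line, the sketch below captures the error instead of letting it stop the code. tryCatch() is used here only for illustration and is not part of the homework instructions; the sketch also assumes no object named b exists in the session.

```r
# `b` is not defined anywhere, so R cannot build the vector
result <- tryCatch(
  mean(c(7, b, 14, 100)),
  error = function(e) conditionMessage(e)
)

result  # the text of the error message, which names the missing object

# With the stray letter replaced by a number, the call runs normally
mean(c(7, 11, 14, 100))
```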

Don’t worry!

Your TA is available to help you figure out how to format your homework assignments, and so am I. After the first one or two assignments, it will be very easy.