Preliminaries

aliens <- read.csv("aliens.csv", header = TRUE, stringsAsFactors = TRUE)
source('special_functions.R')
my_sample <- make.my.sample(34370790, 30, aliens)

Question 1

240+350
## [1] 590
372817-5
## [1] 372812
10.6*80
## [1] 848
50/4
## [1] 12.5

Question 2

round(56.781563, digits=3)
## [1] 56.782
round(56.781563, digits=1)
## [1] 56.8
head(aliens)
##   ID age color island  college income antennae    politics anxiety depression
## 1  1  33  Blue  Blick Ganymede  27000    Curly Republicant      46         92
## 2  2  47  Pink  Plume Ganymede 124000 Straight Independone      49         94
## 3  3  39  Pink  Plume       Io  43000 Straight Democrulite      51        119
## 4  4  24  Pink  Blick       Io  46000 Straight Republicant      45         92
## 5  5  53  Pink  Blick       Io  44000 Straight Democrulite      46         93
## 6  6  36  Blue  Blick   Europa  28000    Curly Republicant      49         98
##   sociable control memory intelligence time1 time2 time3 food1 sleep food2
## 1      108      68     94          119  5.86  4.36  4.11     5   6.0     9
## 2      110      72    109          127  5.07  4.35  4.97     8   7.8    11
## 3       79      62     83          112  5.66  6.13  6.15     7   4.4     9
## 4      117      65     88          115  7.81  8.13  6.12     6   6.0     9
## 5      109      56    106          122  5.04  4.55  4.15     8   4.8     9
## 6      101      49    103          104  4.81  3.65  5.11    10   5.5     7
##   reasoning_trials
## 1                1
## 2                1
## 3                1
## 4                1
## 5                1
## 6                1

Question 3

tail(aliens)
##          ID age color     island  college income antennae    politics anxiety
## 9995   9995  54  Pink Nanspucket Ganymede 176000 Straight Democrulite      48
## 9996   9996  66  Blue      Blick   Europa  52000 Straight Republicant      52
## 9997   9997  33  Pink      Plume Callisto  89000 Straight Republicant      53
## 9998   9998  60  Pink Nanspucket Callisto  23000 Straight Independone      51
## 9999   9999  51  Blue      Blick   Europa  37000 Straight Republicant      39
## 10000 10000  24  Pink Nanspucket       Io  14000 Straight Democrulite      49
##       depression sociable control memory intelligence time1 time2 time3 food1
## 9995          90      115      73     84          115 11.89 11.12  9.91     7
## 9996         110       80      57     91          100  4.79  2.92  2.95     6
## 9997         108       92      74     99          108  3.71  2.88  3.89    11
## 9998         107       89      75     87          102  3.40  3.61  3.76     7
## 9999          92      108      64    106          109  6.58  6.45  5.18     6
## 10000         99      101      69    104          124  5.13  3.32  3.47     8
##       sleep food2 reasoning_trials
## 9995    6.2     8                4
## 9996    5.4     7                4
## 9997    5.5     5                2
## 9998    3.9    10                3
## 9999    5.9     9                2
## 10000   6.8     7                1

10000 individuals are represented in this data frame.

Question 4

There are 21 variables in this data frame. I found this by clicking on the aliens data set, which reports the number of variables.
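
The counts of individuals and variables can also be checked in code rather than by clicking on the data set. A minimal sketch, using a small made-up data frame (the name toy and its values are hypothetical stand-ins for aliens):

```r
# Hypothetical stand-in for the aliens data frame (values are made up)
toy <- data.frame(
  ID    = 1:4,
  age   = c(33, 47, 39, 24),
  color = factor(c("Blue", "Pink", "Pink", "Pink"))
)

n_rows <- nrow(toy)  # number of individuals (rows)
n_cols <- ncol(toy)  # number of variables (columns)
dims   <- dim(toy)   # both at once: rows first, then columns

print(n_rows)
print(n_cols)
print(dims)
```

Applied to the real data at this point, dim(aliens) would be expected to agree with the counts reported above.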

Question 5

class(aliens$color)
## [1] "factor"
class(aliens$income)
## [1] "numeric"
class(aliens$time3)
## [1] "numeric"
class(aliens$politics)
## [1] "factor"

The two categorical variables I chose were politics and color. They are categorical because each has a fixed set of groups to which an observation is assigned. The two numerical variables I chose were time3 and income. time3 is best regarded as a continuous variable because it can take infinitely many possible values, and as an interval variable because the differences between its values are measurable and consistent. Income is also best regarded as a continuous variable because it can take on any value within a range, and as a ratio variable because it has a true zero point.
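
The class of every variable can be checked in one step instead of one call per column. A minimal sketch on a made-up data frame (toy is a hypothetical stand-in for aliens; the values are taken from the first three rows printed by head(aliens)):

```r
# Hypothetical stand-in for aliens, using three columns only
toy <- data.frame(
  color  = factor(c("Blue", "Pink", "Pink")),
  income = c(27000, 124000, 43000),
  time3  = c(4.11, 4.97, 6.15)
)

var_classes  <- sapply(toy, class)      # class of each column, by name
is_cat       <- sapply(toy, is.factor)  # TRUE for categorical (factor) columns
color_levels <- levels(toy$color)       # the fixed set of groups for color

print(var_classes)
print(names(is_cat)[is_cat])
print(color_levels)
```

Note that class() only separates factors from numbers; whether a numeric variable is interval or ratio is a judgment about what it measures, not something R reports.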

Question 6

summary(aliens)
##        ID             age         color             island         college    
##  Min.   :    1   Min.   :10.00   Blue:3064   Blick     :3504   Callisto:2472  
##  1st Qu.: 2501   1st Qu.:26.00   Pink:6936   Nanspucket:3032   Europa  :2533  
##  Median : 5000   Median :40.00               Plume     :3464   Ganymede:2491  
##  Mean   : 5000   Mean   :40.21                                 Io      :2504  
##  3rd Qu.: 7500   3rd Qu.:55.00                                                
##  Max.   :10000   Max.   :70.00                                                
##      income           antennae           politics       anxiety     depression 
##  Min.   :  5000   Curly   :2155   Democrulite:3218   Min.   :29   Min.   : 65  
##  1st Qu.: 34000   Straight:7845   Independone:3452   1st Qu.:47   1st Qu.: 93  
##  Median : 55000                   Republicant:3330   Median :50   Median :100  
##  Mean   : 69708                                      Mean   :50   Mean   :100  
##  3rd Qu.: 90000                                      3rd Qu.:53   3rd Qu.:107  
##  Max.   :559000                                      Max.   :68   Max.   :140  
##     sociable         control          memory        intelligence  
##  Min.   : 35.00   Min.   :21.00   Min.   : 52.00   Min.   : 78.0  
##  1st Qu.: 94.00   1st Qu.:53.00   1st Qu.: 85.00   1st Qu.:101.0  
##  Median :100.00   Median :60.00   Median : 92.00   Median :109.0  
##  Mean   : 99.99   Mean   :60.07   Mean   : 92.06   Mean   :108.5  
##  3rd Qu.:106.00   3rd Qu.:67.00   3rd Qu.: 99.00   3rd Qu.:116.0  
##  Max.   :167.00   Max.   :96.00   Max.   :129.00   Max.   :135.0  
##      time1            time2            time3            food1       
##  Min.   : 1.740   Min.   : 0.570   Min.   : 0.280   Min.   : 1.000  
##  1st Qu.: 5.130   1st Qu.: 4.290   1st Qu.: 4.298   1st Qu.: 7.000  
##  Median : 6.170   Median : 5.490   Median : 5.490   Median : 9.000  
##  Mean   : 6.615   Mean   : 5.867   Mean   : 5.867   Mean   : 8.665  
##  3rd Qu.: 7.620   3rd Qu.: 6.990   3rd Qu.: 6.970   3rd Qu.:10.000  
##  Max.   :24.890   Max.   :23.420   Max.   :25.300   Max.   :17.000  
##      sleep           food2        reasoning_trials
##  Min.   :2.500   Min.   : 0.000   Min.   : 1.000  
##  1st Qu.:5.400   1st Qu.: 7.000   1st Qu.: 1.000  
##  Median :6.000   Median : 9.000   Median : 2.000  
##  Mean   :6.012   Mean   : 8.674   Mean   : 2.958  
##  3rd Qu.:6.700   3rd Qu.:10.000   3rd Qu.: 4.000  
##  Max.   :9.500   Max.   :16.000   Max.   :27.000

For each numeric variable, the output gives the minimum, first quartile, median, mean, third quartile, and maximum. For each categorical variable (factor), it gives the number of aliens in each group.
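
The two kinds of summary can be seen side by side on a pair of short vectors. A minimal sketch, using the income and color values of the six aliens printed by head(aliens) (the vector names are chosen here for illustration):

```r
# Income and color of the first six aliens, copied from the head() output
income <- c(27000, 124000, 43000, 46000, 44000, 28000)
color  <- factor(c("Blue", "Pink", "Pink", "Pink", "Pink", "Blue"))

num_summary <- summary(income)  # min, quartiles, median, mean, max
fac_summary <- summary(color)   # count of observations in each level

print(num_summary)
print(fac_summary)

# The quartiles alone can be requested with quantile()
print(quantile(income, probs = c(0.25, 0.5, 0.75)))
```

With only six values these numbers describe the first six rows, not the whole data set.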

Question 7

  1. More aliens come from the island “Blick” (3504) than from any other island.
  2. The most popular political party is Independone (3452 aliens, versus 3330 Republicant and 3218 Democrulite).
  3. The highest sociability score obtained by any alien is 167.
  4. The lowest memory score obtained by any alien is 52.
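
Answers like these can be read off the summary, or computed directly. A minimal sketch using the island, sociable, and memory values of the six aliens printed by head(aliens), so the results describe those six rows only:

```r
# Values copied from the head(aliens) output (first six rows only)
island   <- factor(c("Blick", "Plume", "Plume", "Blick", "Blick", "Blick"))
sociable <- c(108, 110, 79, 117, 109, 101)
memory   <- c(94, 109, 83, 88, 106, 103)

counts      <- table(island)             # number of aliens per island
most_common <- names(which.max(counts))  # island with the largest count

highest_sociable <- max(sociable)
lowest_memory    <- min(memory)

print(counts)
print(most_common)
print(highest_sociable)
print(lowest_memory)
```

The same calls on the full columns, for example table(aliens$politics), would give the counts for the whole data set.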

Question 8

aliens$food.diff <- aliens$food1 - aliens$food2
head(aliens$food1)
## [1]  5  8  7  6  8 10
head(aliens$food2)
## [1]  9 11  9  9  9  7
summary(aliens$food1)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.000   7.000   9.000   8.665  10.000  17.000
summary(aliens$food2)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   7.000   9.000   8.674  10.000  16.000

This code creates a new variable, food.diff, as the difference between food1 and food2. It then shows the first six values and the summary statistics of food1 and food2.
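
The new variable itself is never displayed above. A minimal sketch of inspecting it, using the six food1 and food2 values printed by head() (food_diff is a local name chosen here for illustration):

```r
# First six values of food1 and food2, copied from the head() output
food1 <- c(5, 8, 7, 6, 8, 10)
food2 <- c(9, 11, 9, 9, 9, 7)

# Element-wise difference, one value per alien
food_diff <- food1 - food2

print(food_diff)
print(summary(food_diff))
```

A negative value means the alien scored higher on food2 than on food1.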

Question 9

aliens$food.diff <- aliens$isolated
head(aliens$isolated)
## NULL
summary(aliens$isolated)
## Length  Class   Mode 
##      0   NULL   NULL

The variable I chose was isolated, intended as the opposite of sociable, which is already one of the variables in the data set. It would give each alien's isolation score and allow us to compare it with the sociability score. As written, the code above returns NULL because no isolated column has been defined in aliens.
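
A minimal sketch of how an isolated variable could be created before it is inspected. The reverse-scoring formula below (maximum plus minimum, minus the score) is an assumption made for illustration, not a definition given in the assignment, and the sociable values are the six printed by head(aliens):

```r
# Sociability scores of the first six aliens, copied from the head() output
sociable <- c(108, 110, 79, 117, 109, 101)

# ASSUMPTION: isolation is defined here by reverse-scoring sociable,
# so the most sociable alien gets the lowest isolation score
isolated <- max(sociable) + min(sociable) - sociable

print(isolated)
print(summary(isolated))

# By construction the two variables are perfectly negatively correlated
print(cor(sociable, isolated))
```

Under this assumed definition isolated carries no information beyond sociable; a separately measured isolation score would be needed for a meaningful comparison.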

Question 10

summary(my_sample)
##        ID            age         color           island       college  
##  Min.   : 131   Min.   :13.00   Blue: 5   Blick     :10   Callisto: 6  
##  1st Qu.:3421   1st Qu.:34.25   Pink:25   Nanspucket: 6   Europa  :13  
##  Median :6246   Median :49.00             Plume     :14   Ganymede: 7  
##  Mean   :5629   Mean   :46.27                             Io      : 4  
##  3rd Qu.:8130   3rd Qu.:58.75                                          
##  Max.   :9815   Max.   :66.00                                          
##      income           antennae         politics     anxiety     
##  Min.   : 14000   Curly   : 4   Democrulite: 7   Min.   :38.00  
##  1st Qu.: 42250   Straight:26   Independone:11   1st Qu.:46.25  
##  Median : 63500                 Republicant:12   Median :49.00  
##  Mean   : 72367                                  Mean   :50.23  
##  3rd Qu.:106000                                  3rd Qu.:53.75  
##  Max.   :162000                                  Max.   :67.00  
##    depression       sociable         control          memory      
##  Min.   : 75.0   Min.   : 81.00   Min.   :27.00   Min.   : 74.00  
##  1st Qu.: 98.0   1st Qu.: 91.25   1st Qu.:54.50   1st Qu.: 83.25  
##  Median :102.0   Median : 97.50   Median :60.00   Median : 88.00  
##  Mean   :102.2   Mean   : 97.57   Mean   :60.07   Mean   : 90.30  
##  3rd Qu.:108.8   3rd Qu.:101.75   3rd Qu.:66.00   3rd Qu.: 96.00  
##  Max.   :120.0   Max.   :125.00   Max.   :74.00   Max.   :119.00  
##   intelligence       time1            time2           time3       
##  Min.   : 91.0   Min.   : 3.720   Min.   :2.120   Min.   : 2.940  
##  1st Qu.: 96.5   1st Qu.: 5.400   1st Qu.:4.603   1st Qu.: 4.695  
##  Median :103.0   Median : 6.765   Median :5.720   Median : 5.520  
##  Mean   :105.7   Mean   : 6.842   Mean   :5.950   Mean   : 6.146  
##  3rd Qu.:114.0   3rd Qu.: 8.127   3rd Qu.:7.548   3rd Qu.: 7.617  
##  Max.   :131.0   Max.   :11.430   Max.   :9.620   Max.   :10.890  
##      food1            sleep           food2        reasoning_trials
##  Min.   : 5.000   Min.   :4.300   Min.   : 7.000   Min.   : 1.000  
##  1st Qu.: 7.250   1st Qu.:5.600   1st Qu.: 8.000   1st Qu.: 1.000  
##  Median : 9.000   Median :6.200   Median : 9.000   Median : 2.500  
##  Mean   : 9.033   Mean   :6.317   Mean   : 9.333   Mean   : 3.533  
##  3rd Qu.:10.000   3rd Qu.:6.975   3rd Qu.:10.000   3rd Qu.: 5.000  
##  Max.   :13.000   Max.   :9.400   Max.   :13.000   Max.   :11.000

  1. In my sample, more aliens come from the island “Plume” (14) than from any other island.
  2. The most popular political party in my sample is Republicant.
  3. The highest sociability score obtained by any alien is 125.
  4. The lowest memory score obtained by any alien is 74.

Question 11

my_sample_2 <- make.my.sample(34370791, 30, aliens)
summary(my_sample_2)
##        ID            age         color           island       college 
##  Min.   :  64   Min.   :15.00   Blue:11   Blick     :13   Callisto:8  
##  1st Qu.:1954   1st Qu.:27.25   Pink:19   Nanspucket: 9   Europa  :8  
##  Median :4186   Median :41.00             Plume     : 8   Ganymede:8  
##  Mean   :4411   Mean   :42.37                             Io      :6  
##  3rd Qu.:6677   3rd Qu.:58.75                                         
##  Max.   :9844   Max.   :69.00                                         
##      income           antennae         politics     anxiety     
##  Min.   : 15000   Curly   : 6   Democrulite:13   Min.   :40.00  
##  1st Qu.: 28500   Straight:24   Independone: 5   1st Qu.:45.25  
##  Median : 49500                 Republicant:12   Median :49.00  
##  Mean   : 79267                                  Mean   :49.43  
##  3rd Qu.:104750                                  3rd Qu.:51.75  
##  Max.   :266000                                  Max.   :60.00  
##    depression       sociable        control         memory      
##  Min.   : 77.0   Min.   : 67.0   Min.   :42.0   Min.   : 74.00  
##  1st Qu.: 95.0   1st Qu.: 94.0   1st Qu.:55.5   1st Qu.: 92.25  
##  Median :102.0   Median : 98.0   Median :62.0   Median : 95.50  
##  Mean   :100.3   Mean   : 98.2   Mean   :61.7   Mean   : 95.83  
##  3rd Qu.:106.0   3rd Qu.:104.8   3rd Qu.:67.0   3rd Qu.:103.75  
##  Max.   :122.0   Max.   :130.0   Max.   :88.0   Max.   :112.00  
##   intelligence       time1            time2            time3       
##  Min.   : 94.0   Min.   : 3.130   Min.   : 2.760   Min.   : 1.170  
##  1st Qu.:100.2   1st Qu.: 5.070   1st Qu.: 4.465   1st Qu.: 4.082  
##  Median :110.0   Median : 5.990   Median : 5.215   Median : 5.055  
##  Mean   :110.3   Mean   : 6.377   Mean   : 5.827   Mean   : 5.619  
##  3rd Qu.:116.8   3rd Qu.: 7.815   3rd Qu.: 7.325   3rd Qu.: 7.140  
##  Max.   :128.0   Max.   :10.300   Max.   :10.310   Max.   :10.420  
##      food1           sleep           food2      reasoning_trials
##  Min.   : 3.00   Min.   :4.000   Min.   : 4.0   Min.   : 1.0    
##  1st Qu.: 6.25   1st Qu.:5.400   1st Qu.: 7.0   1st Qu.: 1.0    
##  Median : 8.00   Median :6.000   Median : 8.0   Median : 2.0    
##  Mean   : 7.70   Mean   :5.937   Mean   : 8.1   Mean   : 2.9    
##  3rd Qu.: 9.00   3rd Qu.:6.525   3rd Qu.:10.0   3rd Qu.: 4.0    
##  Max.   :11.00   Max.   :7.400   Max.   :12.0   Max.   :10.0

  1. In this sample, more aliens come from the island “Blick” (13) than from any other island.
  2. The most popular political party is Democrulite.
  3. The highest sociability score obtained by any alien is 130.
  4. The lowest memory score obtained by any alien is 74.

Question 12

The first and second samples had values that were fairly close to each other but differed noticeably from the population. The first sample captures the truth about the population better than the second because more of its summary values are closer to the population values, and the second sample's values were generally lower than the first's. The two samples were fairly consistent with each other, but not with the population. I am not surprised by the discrepancy between the population and the two samples because the aliens data set has 10000 observations of 21 variables, while each sample has only 30 observations of the same 21 variables. The discrepancy between the first and second samples can be explained by the addition of 1 to my student ID number in Question 11, which produced a different random sample.
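
The effect of changing the ID number can be illustrated with base R. This is a sketch only: the internals of make.my.sample are not shown in this report, so the helper draw_sample_mean below is a hypothetical stand-in that assumes the ID is used as a random seed, and population is made-up data rather than the aliens scores.

```r
# Made-up population of 10000 scores (not the aliens data)
set.seed(1)
population <- rnorm(10000, mean = 100, sd = 10)

# Hypothetical stand-in for make.my.sample: seed the generator,
# draw n values without replacement, and return their mean
draw_sample_mean <- function(seed, n, x) {
  set.seed(seed)
  mean(sample(x, n))
}

m1       <- draw_sample_mean(34370790, 30, population)
m2       <- draw_sample_mean(34370791, 30, population)
m1_again <- draw_sample_mean(34370790, 30, population)

print(mean(population))  # population mean
print(m1)                # mean of the first sample of 30
print(m2)                # mean of the second sample of 30
print(identical(m1, m1_again))  # same seed gives the same sample
```

Each sample of 30 gives a slightly different estimate of the population mean, while reusing a seed reproduces the same sample exactly.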