In this lab, we will use some built-in R functions to perform basic dataframe checks. We’ll use the penguins dataset.
“The goal of palmerpenguins is to provide a great dataset for data exploration & visualization, as an alternative to iris.”
Source: https://allisonhorst.github.io/palmerpenguins/
library(palmerpenguins)
##
## Attaching package: 'palmerpenguins'
## The following objects are masked from 'package:datasets':
##
## penguins, penguins_raw
#you can find out more about this dataset:
?palmerpenguins
#to see the datasets contained in the package
data(package = 'palmerpenguins')
We will use the penguins dataset, which is the simplified version of the raw dataset (penguins_raw).
Some common functions you can use to explore your dataframe:
#1. Dimensions of the dataset - gives the number of rows and columns in the dataframe (rows always come first, then columns)
dim(penguins)
## [1] 344 8
nrow(penguins) #gives the number of rows in penguins
## [1] 344
ncol(penguins) #gives the number of columns in penguins
## [1] 8
#2. Structure of the dataset
str(penguins)
## 'data.frame': 344 obs. of 8 variables:
## $ species : Factor w/ 3 levels "Adelie","Chinstrap",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ island : Factor w/ 3 levels "Biscoe","Dream",..: 3 3 3 3 3 3 3 3 3 3 ...
## $ bill_length_mm : num 39.1 39.5 40.3 NA 36.7 39.3 38.9 39.2 34.1 42 ...
## $ bill_depth_mm : num 18.7 17.4 18 NA 19.3 20.6 17.8 19.6 18.1 20.2 ...
## $ flipper_length_mm: int 181 186 195 NA 193 190 181 195 193 190 ...
## $ body_mass_g : int 3750 3800 3250 NA 3450 3650 3625 4675 3475 4250 ...
## $ sex : Factor w/ 2 levels "female","male": 2 1 1 NA 1 2 1 2 NA NA ...
## $ year : int 2007 2007 2007 2007 2007 2007 2007 2007 2007 2007 ...
#3. Viewing partial dataset
#to see the first six rows of the dataset
head(penguins)
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## 1 Adelie Torgersen 39.1 18.7 181 3750
## 2 Adelie Torgersen 39.5 17.4 186 3800
## 3 Adelie Torgersen 40.3 18.0 195 3250
## 4 Adelie Torgersen NA NA NA NA
## 5 Adelie Torgersen 36.7 19.3 193 3450
## 6 Adelie Torgersen 39.3 20.6 190 3650
## sex year
## 1 male 2007
## 2 female 2007
## 3 female 2007
## 4 <NA> 2007
## 5 female 2007
## 6 male 2007
#to see the last six rows of the dataset
tail(penguins)
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## 339 Chinstrap Dream 45.7 17.0 195 3650
## 340 Chinstrap Dream 55.8 19.8 207 4000
## 341 Chinstrap Dream 43.5 18.1 202 3400
## 342 Chinstrap Dream 49.6 18.2 193 3775
## 343 Chinstrap Dream 50.8 19.0 210 4100
## 344 Chinstrap Dream 50.2 18.7 198 3775
## sex year
## 339 female 2009
## 340 male 2009
## 341 female 2009
## 342 male 2009
## 343 male 2009
## 344 female 2009
#4. Get the names of the variables in your dataset - the column names
colnames(penguins)
## [1] "species" "island" "bill_length_mm"
## [4] "bill_depth_mm" "flipper_length_mm" "body_mass_g"
## [7] "sex" "year"
#row names work the same way; here they are just the row numbers stored as text
rownames(penguins)
## [1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12"
## [13] "13" "14" "15" "16" "17" "18" "19" "20" "21" "22" "23" "24"
## [25] "25" "26" "27" "28" "29" "30" "31" "32" "33" "34" "35" "36"
## [37] "37" "38" "39" "40" "41" "42" "43" "44" "45" "46" "47" "48"
## [49] "49" "50" "51" "52" "53" "54" "55" "56" "57" "58" "59" "60"
## [61] "61" "62" "63" "64" "65" "66" "67" "68" "69" "70" "71" "72"
## [73] "73" "74" "75" "76" "77" "78" "79" "80" "81" "82" "83" "84"
## [85] "85" "86" "87" "88" "89" "90" "91" "92" "93" "94" "95" "96"
## [97] "97" "98" "99" "100" "101" "102" "103" "104" "105" "106" "107" "108"
## [109] "109" "110" "111" "112" "113" "114" "115" "116" "117" "118" "119" "120"
## [121] "121" "122" "123" "124" "125" "126" "127" "128" "129" "130" "131" "132"
## [133] "133" "134" "135" "136" "137" "138" "139" "140" "141" "142" "143" "144"
## [145] "145" "146" "147" "148" "149" "150" "151" "152" "153" "154" "155" "156"
## [157] "157" "158" "159" "160" "161" "162" "163" "164" "165" "166" "167" "168"
## [169] "169" "170" "171" "172" "173" "174" "175" "176" "177" "178" "179" "180"
## [181] "181" "182" "183" "184" "185" "186" "187" "188" "189" "190" "191" "192"
## [193] "193" "194" "195" "196" "197" "198" "199" "200" "201" "202" "203" "204"
## [205] "205" "206" "207" "208" "209" "210" "211" "212" "213" "214" "215" "216"
## [217] "217" "218" "219" "220" "221" "222" "223" "224" "225" "226" "227" "228"
## [229] "229" "230" "231" "232" "233" "234" "235" "236" "237" "238" "239" "240"
## [241] "241" "242" "243" "244" "245" "246" "247" "248" "249" "250" "251" "252"
## [253] "253" "254" "255" "256" "257" "258" "259" "260" "261" "262" "263" "264"
## [265] "265" "266" "267" "268" "269" "270" "271" "272" "273" "274" "275" "276"
## [277] "277" "278" "279" "280" "281" "282" "283" "284" "285" "286" "287" "288"
## [289] "289" "290" "291" "292" "293" "294" "295" "296" "297" "298" "299" "300"
## [301] "301" "302" "303" "304" "305" "306" "307" "308" "309" "310" "311" "312"
## [313] "313" "314" "315" "316" "317" "318" "319" "320" "321" "322" "323" "324"
## [325] "325" "326" "327" "328" "329" "330" "331" "332" "333" "334" "335" "336"
## [337] "337" "338" "339" "340" "341" "342" "343" "344"
You can also change the column names using `colnames(dataframe) <- c("var1", "var2", …)`, as long as the length of the vector of new names matches the number of columns in the current dataframe.
#see the original column names
colnames(penguins)
## [1] "species" "island" "bill_length_mm"
## [4] "bill_depth_mm" "flipper_length_mm" "body_mass_g"
## [7] "sex" "year"
#change some of the column names
colnames(penguins) <- c("sp",
"island",
"bill_length",
"bill_depth",
"flipper_length",
"body_mass",
"sex",
"year")
#now check the new column names
colnames(penguins)
## [1] "sp" "island" "bill_length" "bill_depth"
## [5] "flipper_length" "body_mass" "sex" "year"
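If you only want to rename one column, you do not have to retype every name. The sketch below is one possible approach, shown on a small made-up dataframe called toy (a hypothetical name, not part of palmerpenguins) so that it does not touch the penguins data used in this lab.

```r
# A small, made-up dataframe used only for illustration
toy <- data.frame(species = c("Adelie", "Gentoo"),
                  bill_length_mm = c(39.1, 46.1))

# Rename a single column by matching its current name;
# all other column names are left untouched
colnames(toy)[colnames(toy) == "species"] <- "sp"

# Check the new column names
colnames(toy)
```

Matching on the current name, rather than on the column's position, means the code keeps working if the column order changes.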
#find out what the unique species of penguins are
unique(penguins$sp)
## [1] Adelie Gentoo Chinstrap
## Levels: Adelie Chinstrap Gentoo
#how many unique species are there?
length(unique(penguins$sp))
## [1] 3
#what are the unique species?
unique(penguins$sp)
## [1] Adelie Gentoo Chinstrap
## Levels: Adelie Chinstrap Gentoo
Useful functions for simple mathematical operations
#The function data() reloads a dataset from its package, replacing the modified copy in your workspace.
#For example, it will reset the variable names changed in the previous code chunk.
data(penguins)
#Check that the names have been changed back
colnames(penguins)
## [1] "species" "island" "bill_length_mm"
## [4] "bill_depth_mm" "flipper_length_mm" "body_mass_g"
## [7] "sex" "year"
#the summary function is also useful to summarize all columns
summary(penguins)
## species island bill_length_mm bill_depth_mm
## Adelie :152 Biscoe :168 Min. :32.10 Min. :13.10
## Chinstrap: 68 Dream :124 1st Qu.:39.23 1st Qu.:15.60
## Gentoo :124 Torgersen: 52 Median :44.45 Median :17.30
## Mean :43.92 Mean :17.15
## 3rd Qu.:48.50 3rd Qu.:18.70
## Max. :59.60 Max. :21.50
## NA's :2 NA's :2
## flipper_length_mm body_mass_g sex year
## Min. :172.0 Min. :2700 female:165 Min. :2007
## 1st Qu.:190.0 1st Qu.:3550 male :168 1st Qu.:2007
## Median :197.0 Median :4050 NA's : 11 Median :2008
## Mean :200.9 Mean :4202 Mean :2008
## 3rd Qu.:213.0 3rd Qu.:4750 3rd Qu.:2009
## Max. :231.0 Max. :6300 Max. :2009
## NA's :2 NA's :2
#to summarize an individual categorical variable - how many of each species are there?
table(penguins$species)
##
## Adelie Chinstrap Gentoo
## 152 68 124
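Counts are often easier to compare as proportions. As a minimal sketch, the example below uses base R's prop.table() on a made-up factor (the values are invented for illustration and are not the penguins data).

```r
# Made-up species labels, used only for illustration
species <- factor(c("Adelie", "Adelie", "Gentoo", "Chinstrap"))

# Number of observations in each level
counts <- table(species)

# The same counts expressed as fractions that sum to 1
proportions <- prop.table(counts)

counts
proportions
```

The same idea should carry over to the lab data by passing table(penguins$species) to prop.table().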
#you can use the $ to pull out columns by their column names
penguins$bill_length_mm
## [1] 39.1 39.5 40.3 NA 36.7 39.3 38.9 39.2 34.1 42.0 37.8 37.8 41.1 38.6 34.6
## [16] 36.6 38.7 42.5 34.4 46.0 37.8 37.7 35.9 38.2 38.8 35.3 40.6 40.5 37.9 40.5
## [31] 39.5 37.2 39.5 40.9 36.4 39.2 38.8 42.2 37.6 39.8 36.5 40.8 36.0 44.1 37.0
## [46] 39.6 41.1 37.5 36.0 42.3 39.6 40.1 35.0 42.0 34.5 41.4 39.0 40.6 36.5 37.6
## [61] 35.7 41.3 37.6 41.1 36.4 41.6 35.5 41.1 35.9 41.8 33.5 39.7 39.6 45.8 35.5
## [76] 42.8 40.9 37.2 36.2 42.1 34.6 42.9 36.7 35.1 37.3 41.3 36.3 36.9 38.3 38.9
## [91] 35.7 41.1 34.0 39.6 36.2 40.8 38.1 40.3 33.1 43.2 35.0 41.0 37.7 37.8 37.9
## [106] 39.7 38.6 38.2 38.1 43.2 38.1 45.6 39.7 42.2 39.6 42.7 38.6 37.3 35.7 41.1
## [121] 36.2 37.7 40.2 41.4 35.2 40.6 38.8 41.5 39.0 44.1 38.5 43.1 36.8 37.5 38.1
## [136] 41.1 35.6 40.2 37.0 39.7 40.2 40.6 32.1 40.7 37.3 39.0 39.2 36.6 36.0 37.8
## [151] 36.0 41.5 46.1 50.0 48.7 50.0 47.6 46.5 45.4 46.7 43.3 46.8 40.9 49.0 45.5
## [166] 48.4 45.8 49.3 42.0 49.2 46.2 48.7 50.2 45.1 46.5 46.3 42.9 46.1 44.5 47.8
## [181] 48.2 50.0 47.3 42.8 45.1 59.6 49.1 48.4 42.6 44.4 44.0 48.7 42.7 49.6 45.3
## [196] 49.6 50.5 43.6 45.5 50.5 44.9 45.2 46.6 48.5 45.1 50.1 46.5 45.0 43.8 45.5
## [211] 43.2 50.4 45.3 46.2 45.7 54.3 45.8 49.8 46.2 49.5 43.5 50.7 47.7 46.4 48.2
## [226] 46.5 46.4 48.6 47.5 51.1 45.2 45.2 49.1 52.5 47.4 50.0 44.9 50.8 43.4 51.3
## [241] 47.5 52.1 47.5 52.2 45.5 49.5 44.5 50.8 49.4 46.9 48.4 51.1 48.5 55.9 47.2
## [256] 49.1 47.3 46.8 41.7 53.4 43.3 48.1 50.5 49.8 43.5 51.5 46.2 55.1 44.5 48.8
## [271] 47.2 NA 46.8 50.4 45.2 49.9 46.5 50.0 51.3 45.4 52.7 45.2 46.1 51.3 46.0
## [286] 51.3 46.6 51.7 47.0 52.0 45.9 50.5 50.3 58.0 46.4 49.2 42.4 48.5 43.2 50.6
## [301] 46.7 52.0 50.5 49.5 46.4 52.8 40.9 54.2 42.5 51.0 49.7 47.5 47.6 52.0 46.9
## [316] 53.5 49.0 46.2 50.9 45.5 50.9 50.8 50.1 49.0 51.5 49.8 48.1 51.4 45.7 50.7
## [331] 42.5 52.2 45.2 49.3 50.2 45.6 51.9 46.8 45.7 55.8 43.5 49.6 50.8 50.2
#because there are some NAs in the data, we will remove every row that contains one, for simplicity
penguins<-na.omit(penguins)
#range of bill length values
range(penguins$bill_length_mm)
## [1] 32.1 59.6
#get mean of bill length values
mean(penguins$bill_length_mm)
## [1] 43.99279
#standard deviation of bill length values
sd(penguins$bill_length_mm)
## [1] 5.468668
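Note that na.omit() drops a whole row if any of its columns is missing, which is why the dataset shrinks from 344 to 333 rows even though bill length itself has only 2 NAs. An alternative, sketched below on a made-up vector, is to keep the data as it is and ask each function to ignore missing values with the na.rm argument.

```r
# Made-up bill lengths with one missing value, for illustration only
bill_length <- c(39.1, 39.5, NA, 36.7)

# Without na.rm the result is NA, because one value is unknown
mean(bill_length)

# With na.rm = TRUE the missing value is skipped
mean(bill_length, na.rm = TRUE)

# Count how many values are missing
sum(is.na(bill_length))
```

range() and sd() accept the same na.rm argument.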
R has a number of built-in functions that help you look at the distribution of numeric data and perform basic mathematical operations. You can check this link to find out about some of the useful built-in R functions: https://www.statmethods.net/management/functions.html
Try these functions using any variable from the penguins data.
sqrt(penguins$bill_length_mm)
## [1] 6.252999 6.284903 6.348228 6.058052 6.268971 6.236986 6.260990 6.410928
## [9] 6.212890 5.882176 6.049793 6.220932 6.519202 5.865151 6.782330 6.148170
## [17] 6.140033 5.991661 6.180615 6.228965 5.941380 6.371813 6.363961 6.156298
## [25] 6.363961 6.284903 6.099180 6.284903 6.395311 6.033241 6.260990 6.228965
## [33] 6.496153 6.131884 6.308724 6.041523 6.387488 6.000000 6.640783 6.082763
## [41] 6.292853 6.410928 6.000000 6.503845 6.292853 6.332456 5.916080 6.480741
## [49] 5.873670 6.434283 6.244998 6.371813 6.041523 6.131884 5.974948 6.426508
## [57] 6.131884 6.410928 6.033241 6.449806 5.958188 6.410928 5.991661 6.465292
## [65] 5.787918 6.300794 6.292853 6.767570 5.958188 6.542171 6.395311 6.099180
## [73] 6.016644 6.488451 5.882176 6.549809 6.058052 5.924525 6.107373 6.426508
## [81] 6.024948 6.074537 6.188699 6.236986 5.974948 6.410928 5.830952 6.292853
## [89] 6.016644 6.387488 6.172520 6.348228 5.753260 6.572671 5.916080 6.403124
## [97] 6.140033 6.148170 6.156298 6.300794 6.212890 6.180615 6.172520 6.572671
## [105] 6.172520 6.752777 6.300794 6.496153 6.292853 6.534524 6.212890 6.107373
## [113] 5.974948 6.410928 6.016644 6.140033 6.340347 6.434283 5.932959 6.371813
## [121] 6.228965 6.442049 6.244998 6.640783 6.204837 6.565059 6.066300 6.123724
## [129] 6.172520 6.410928 5.966574 6.340347 6.082763 6.300794 6.340347 6.371813
## [137] 5.665686 6.379655 6.107373 6.244998 6.260990 6.049793 6.000000 6.148170
## [145] 6.000000 6.442049 6.789698 7.071068 6.978539 7.071068 6.899275 6.819091
## [153] 6.737952 6.833740 6.580274 6.841053 6.395311 7.000000 6.745369 6.957011
## [161] 6.767570 7.021396 6.480741 7.014271 6.797058 6.978539 7.085196 6.715653
## [169] 6.819091 6.804410 6.549809 6.789698 6.913754 6.942622 7.071068 6.877500
## [177] 6.542171 6.715653 7.720104 7.007139 6.957011 6.526868 6.663332 6.633250
## [185] 6.978539 6.534524 7.042727 6.730527 7.042727 7.106335 6.603030 6.745369
## [193] 7.106335 6.700746 6.723095 6.826419 6.964194 6.715653 7.078135 6.819091
## [201] 6.708204 6.618157 6.745369 6.572671 7.099296 6.730527 6.797058 6.760178
## [209] 7.368853 6.767570 7.056912 7.035624 6.595453 7.120393 6.906519 6.811755
## [217] 6.942622 6.819091 6.811755 6.971370 6.892024 7.148426 6.723095 6.723095
## [225] 7.007139 7.245688 6.884766 7.071068 6.700746 7.127412 6.587868 7.162402
## [233] 6.892024 7.218033 6.892024 7.224957 6.745369 7.035624 6.670832 7.127412
## [241] 7.028513 6.848357 6.957011 7.148426 6.964194 7.476630 6.870226 7.007139
## [249] 6.841053 6.457554 7.307530 6.580274 6.935416 7.106335 7.056912 6.595453
## [257] 7.176350 6.797058 7.422937 6.985700 6.870226 6.841053 7.099296 6.723095
## [265] 7.063993 6.819091 7.071068 7.162402 6.737952 7.259477 6.723095 6.789698
## [273] 7.162402 6.782330 7.162402 6.826419 7.190271 6.855655 7.211103 6.774954
## [281] 7.106335 7.092249 7.615773 6.811755 7.014271 6.511528 6.964194 6.572671
## [289] 7.113368 6.833740 7.211103 7.106335 7.035624 6.811755 7.266361 6.395311
## [297] 7.362065 6.519202 7.141428 7.049823 6.892024 6.899275 7.211103 6.848357
## [305] 7.314369 7.000000 6.797058 7.134424 6.745369 7.134424 7.127412 7.078135
## [313] 7.000000 7.176350 7.056912 6.935416 7.169379 6.760178 7.120393 6.519202
## [321] 7.224957 6.723095 7.021396 7.085196 6.752777 7.204165 6.841053 6.760178
## [329] 7.469940 6.595453 7.042727 7.127412 7.085196
log(mean(penguins$bill_length_mm))
## [1] 3.784026
exp(median(penguins$bill_depth_mm))
## [1] 32605776
sin(median(penguins$body_mass_g))
## [1] -0.4680382
cos(mean(penguins$bill_length_mm))
## [1] 0.9999449
#there are many more that you can use - most common math functions are already available in base R
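Notice that sqrt() returned one value per penguin, while mean() and median() returned a single number. The sketch below illustrates that difference on a short made-up vector x (a hypothetical name), where the results are easy to check by hand.

```r
# Made-up values, for illustration only
x <- c(4, 9, 16)

# Element-wise functions return one result per input value
sqrt(x)
log(x)

# Functions can be nested: round each logarithm to 2 decimal places
round(log(x), 2)

# Summary functions collapse the whole vector into a single value
mean(x)
```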
What is subsetting?
In R, you often need to remove or pull out the part of a dataset that matches certain criteria.
For the most part, you will be working with dataframes in R. Any .csv file that you read in will be stored as a dataframe.
There are other ways to store, organize and work with data too (such as lists, matrices, arrays etc.). The basics of navigating these different kinds of data remain the same.
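To see that a .csv file really does arrive as a dataframe, here is a minimal sketch that writes a tiny made-up table to a temporary file and reads it back with read.csv(). The file contents and the names csv_path and from_csv are assumptions for illustration only.

```r
# Write a small, made-up table to a temporary .csv file
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(ID = 1:3, Sex = c("male", "female", "female")),
          csv_path, row.names = FALSE)

# Read the file back in
from_csv <- read.csv(csv_path)

# The result is a dataframe, so the checks from earlier in the lab apply
class(from_csv)
str(from_csv)
```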
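#creating a dataframe
#the values are drawn at random (no seed is set), so your numbers will differ from the output shown below
#rnorm() recycles the mean vector: the Age means are 11:18 followed by 11 and 12, and only the first 10 of the Weight means 21:60 are used
example.data <- data.frame("ID" = seq(1,10,1),
"Age" = rnorm(10, mean = 11:18, sd = 0.5),
"Sex" = sample(c("M", "F"), 10, replace = TRUE),
"Weight" = rnorm(10, mean = 21:60, sd = 4.5))
#check the new dataframe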
head(example.data)
## ID Age Sex Weight
## 1 1 10.85355 F 23.28899
## 2 2 11.72379 M 13.25657
## 3 3 12.15807 M 32.68245
## 4 4 12.82198 F 23.71811
## 5 5 14.30573 M 23.58110
## 6 6 15.78180 M 30.40448
str(example.data)
## 'data.frame': 10 obs. of 4 variables:
## $ ID : num 1 2 3 4 5 6 7 8 9 10
## $ Age : num 10.9 11.7 12.2 12.8 14.3 ...
## $ Sex : chr "F" "M" "M" "F" ...
## $ Weight: num 23.3 13.3 32.7 23.7 23.6 ...
dim(example.data)
## [1] 10 4
head(penguins)
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## 1 Adelie Torgersen 39.1 18.7 181 3750
## 2 Adelie Torgersen 39.5 17.4 186 3800
## 3 Adelie Torgersen 40.3 18.0 195 3250
## 5 Adelie Torgersen 36.7 19.3 193 3450
## 6 Adelie Torgersen 39.3 20.6 190 3650
## 7 Adelie Torgersen 38.9 17.8 181 3625
## sex year
## 1 male 2007
## 2 female 2007
## 3 female 2007
## 5 female 2007
## 6 male 2007
## 7 female 2007
str(penguins)
## 'data.frame': 333 obs. of 8 variables:
## $ species : Factor w/ 3 levels "Adelie","Chinstrap",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ island : Factor w/ 3 levels "Biscoe","Dream",..: 3 3 3 3 3 3 3 3 3 3 ...
## $ bill_length_mm : num 39.1 39.5 40.3 36.7 39.3 38.9 39.2 41.1 38.6 34.6 ...
## $ bill_depth_mm : num 18.7 17.4 18 19.3 20.6 17.8 19.6 17.6 21.2 21.1 ...
## $ flipper_length_mm: int 181 186 195 193 190 181 195 182 191 198 ...
## $ body_mass_g : int 3750 3800 3250 3450 3650 3625 4675 3200 3800 4400 ...
## $ sex : Factor w/ 2 levels "female","male": 2 1 1 1 2 1 2 1 2 2 ...
## $ year : int 2007 2007 2007 2007 2007 2007 2007 2007 2007 2007 ...
## - attr(*, "na.action")= 'omit' Named int [1:11] 4 9 10 11 12 48 179 219 257 269 ...
## ..- attr(*, "names")= chr [1:11] "4" "9" "10" "11" ...
dim(penguins)
## [1] 333 8
#pulling out the second row of this dataset
example.data[2,] #[rownumber, columnnumber]
## ID Age Sex Weight
## 2 2 11.72379 M 13.25657
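#leaving the column position empty after the "," means we want to extract all the columns.
#giving both positions extracts a single value: row 2, column 5 (flipper_length_mm)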
penguins[2,5]
## [1] 186
#pulling out the second column of this dataset
example.data[,2] #[rownumber, columnnumber]
## [1] 10.85355 11.72379 12.15807 12.82198 14.30573 15.78180 16.78907 17.74803
## [9] 10.69106 12.75860
#here we use "," to indicate that we want to extract all the rows.
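One detail worth knowing is that pulling out a single column with [ , ] returns a plain vector rather than a dataframe. The sketch below, on a made-up dataframe called toy (a hypothetical name), shows this along with the drop = FALSE argument and selecting columns by name.

```r
# A small, made-up dataframe for illustration
toy <- data.frame(ID = 1:3, Age = c(11, 14, 17), Sex = c("F", "M", "M"))

toy[2, ]                  # second row, all columns (still a dataframe)
toy[, 2]                  # second column, all rows (a plain numeric vector)
toy[, 2, drop = FALSE]    # second column, kept as a one-column dataframe
toy[1:2, c("ID", "Sex")]  # rows 1 to 2, columns chosen by name
```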
You can subset using the [] notation by placing a condition inside the [].
dataframe[dataframe$column1 == "somecriteria", ] --> this will pull out only the rows where column1 is equal to somecriteria, with all columns kept.
dataframe[ , c("column1", "column2")] --> this will pull out only the columns named column1 and column2, with all rows kept.
Pay attention to the position of the comma: what goes before it selects rows, and what goes after it selects columns.
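As a small sketch of how the comma position works, the example below filters rows, selects columns, and then does both at once. The dataframe toy and its values are made up for illustration.

```r
# A small, made-up dataframe for illustration
toy <- data.frame(ID = 1:4,
                  Sex = c("F", "M", "M", "F"),
                  Age = c(11, 16, 14, 17))

# Condition BEFORE the comma: keep the rows where Sex is "M", with all columns
males <- toy[toy$Sex == "M", ]

# Names AFTER the comma: keep all rows, but only the listed columns
id_age <- toy[, c("ID", "Age")]

# Both at once: rows where Age > 15, restricted to the ID and Age columns
older <- toy[toy$Age > 15, c("ID", "Age")]

males
id_age
older
```

A condition such as toy$Sex == "M" produces one TRUE or FALSE per row, so it belongs in the row position, before the comma.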
#subset example.data to the rows where Age is > 15
example.data[example.data$Age > 15,]
## ID Age Sex Weight
## 6 6 15.78180 M 30.40448
## 7 7 16.78907 M 19.35017
## 8 8 17.74803 M 34.00619
#every flipper length in the data is greater than 10 mm, so this condition keeps all 333 rows
penguins[penguins$flipper_length_mm > 10,]
## species island bill_length_mm bill_depth_mm flipper_length_mm
## 1 Adelie Torgersen 39.1 18.7 181
## 2 Adelie Torgersen 39.5 17.4 186
## 3 Adelie Torgersen 40.3 18.0 195
## 5 Adelie Torgersen 36.7 19.3 193
## 6 Adelie Torgersen 39.3 20.6 190
## 7 Adelie Torgersen 38.9 17.8 181
## 8 Adelie Torgersen 39.2 19.6 195
## 13 Adelie Torgersen 41.1 17.6 182
## 14 Adelie Torgersen 38.6 21.2 191
## 15 Adelie Torgersen 34.6 21.1 198
## 16 Adelie Torgersen 36.6 17.8 185
## 17 Adelie Torgersen 38.7 19.0 195
## 18 Adelie Torgersen 42.5 20.7 197
## 19 Adelie Torgersen 34.4 18.4 184
## 20 Adelie Torgersen 46.0 21.5 194
## 21 Adelie Biscoe 37.8 18.3 174
## 22 Adelie Biscoe 37.7 18.7 180
## 23 Adelie Biscoe 35.9 19.2 189
## 24 Adelie Biscoe 38.2 18.1 185
## 25 Adelie Biscoe 38.8 17.2 180
## 26 Adelie Biscoe 35.3 18.9 187
## 27 Adelie Biscoe 40.6 18.6 183
## 28 Adelie Biscoe 40.5 17.9 187
## 29 Adelie Biscoe 37.9 18.6 172
## 30 Adelie Biscoe 40.5 18.9 180
## 31 Adelie Dream 39.5 16.7 178
## 32 Adelie Dream 37.2 18.1 178
## 33 Adelie Dream 39.5 17.8 188
## 34 Adelie Dream 40.9 18.9 184
## 35 Adelie Dream 36.4 17.0 195
## 36 Adelie Dream 39.2 21.1 196
## 37 Adelie Dream 38.8 20.0 190
## 38 Adelie Dream 42.2 18.5 180
## 39 Adelie Dream 37.6 19.3 181
## 40 Adelie Dream 39.8 19.1 184
## 41 Adelie Dream 36.5 18.0 182
## 42 Adelie Dream 40.8 18.4 195
## 43 Adelie Dream 36.0 18.5 186
## 44 Adelie Dream 44.1 19.7 196
## 45 Adelie Dream 37.0 16.9 185
## 46 Adelie Dream 39.6 18.8 190
## 47 Adelie Dream 41.1 19.0 182
## 49 Adelie Dream 36.0 17.9 190
## 50 Adelie Dream 42.3 21.2 191
## 51 Adelie Biscoe 39.6 17.7 186
## 52 Adelie Biscoe 40.1 18.9 188
## 53 Adelie Biscoe 35.0 17.9 190
## 54 Adelie Biscoe 42.0 19.5 200
## 55 Adelie Biscoe 34.5 18.1 187
## 56 Adelie Biscoe 41.4 18.6 191
## 57 Adelie Biscoe 39.0 17.5 186
## 58 Adelie Biscoe 40.6 18.8 193
## 59 Adelie Biscoe 36.5 16.6 181
## 60 Adelie Biscoe 37.6 19.1 194
## 61 Adelie Biscoe 35.7 16.9 185
## 62 Adelie Biscoe 41.3 21.1 195
## 63 Adelie Biscoe 37.6 17.0 185
## 64 Adelie Biscoe 41.1 18.2 192
## 65 Adelie Biscoe 36.4 17.1 184
## 66 Adelie Biscoe 41.6 18.0 192
## 67 Adelie Biscoe 35.5 16.2 195
## 68 Adelie Biscoe 41.1 19.1 188
## 69 Adelie Torgersen 35.9 16.6 190
## 70 Adelie Torgersen 41.8 19.4 198
## 71 Adelie Torgersen 33.5 19.0 190
## 72 Adelie Torgersen 39.7 18.4 190
## 73 Adelie Torgersen 39.6 17.2 196
## 74 Adelie Torgersen 45.8 18.9 197
## 75 Adelie Torgersen 35.5 17.5 190
## 76 Adelie Torgersen 42.8 18.5 195
## 77 Adelie Torgersen 40.9 16.8 191
## 78 Adelie Torgersen 37.2 19.4 184
## 79 Adelie Torgersen 36.2 16.1 187
## 80 Adelie Torgersen 42.1 19.1 195
## 81 Adelie Torgersen 34.6 17.2 189
## 82 Adelie Torgersen 42.9 17.6 196
## 83 Adelie Torgersen 36.7 18.8 187
## 84 Adelie Torgersen 35.1 19.4 193
## 85 Adelie Dream 37.3 17.8 191
## 86 Adelie Dream 41.3 20.3 194
## 87 Adelie Dream 36.3 19.5 190
## 88 Adelie Dream 36.9 18.6 189
## 89 Adelie Dream 38.3 19.2 189
## 90 Adelie Dream 38.9 18.8 190
## 91 Adelie Dream 35.7 18.0 202
## 92 Adelie Dream 41.1 18.1 205
## 93 Adelie Dream 34.0 17.1 185
## 94 Adelie Dream 39.6 18.1 186
## 95 Adelie Dream 36.2 17.3 187
## 96 Adelie Dream 40.8 18.9 208
## 97 Adelie Dream 38.1 18.6 190
## 98 Adelie Dream 40.3 18.5 196
## 99 Adelie Dream 33.1 16.1 178
## 100 Adelie Dream 43.2 18.5 192
## 101 Adelie Biscoe 35.0 17.9 192
## 102 Adelie Biscoe 41.0 20.0 203
## 103 Adelie Biscoe 37.7 16.0 183
## 104 Adelie Biscoe 37.8 20.0 190
## 105 Adelie Biscoe 37.9 18.6 193
## 106 Adelie Biscoe 39.7 18.9 184
## 107 Adelie Biscoe 38.6 17.2 199
## 108 Adelie Biscoe 38.2 20.0 190
## 109 Adelie Biscoe 38.1 17.0 181
## 110 Adelie Biscoe 43.2 19.0 197
## 111 Adelie Biscoe 38.1 16.5 198
## 112 Adelie Biscoe 45.6 20.3 191
## 113 Adelie Biscoe 39.7 17.7 193
## 114 Adelie Biscoe 42.2 19.5 197
## 115 Adelie Biscoe 39.6 20.7 191
## 116 Adelie Biscoe 42.7 18.3 196
## 117 Adelie Torgersen 38.6 17.0 188
## 118 Adelie Torgersen 37.3 20.5 199
## 119 Adelie Torgersen 35.7 17.0 189
## 120 Adelie Torgersen 41.1 18.6 189
## 121 Adelie Torgersen 36.2 17.2 187
## 122 Adelie Torgersen 37.7 19.8 198
## 123 Adelie Torgersen 40.2 17.0 176
## 124 Adelie Torgersen 41.4 18.5 202
## 125 Adelie Torgersen 35.2 15.9 186
## 126 Adelie Torgersen 40.6 19.0 199
## 127 Adelie Torgersen 38.8 17.6 191
## 128 Adelie Torgersen 41.5 18.3 195
## 129 Adelie Torgersen 39.0 17.1 191
## 130 Adelie Torgersen 44.1 18.0 210
## 131 Adelie Torgersen 38.5 17.9 190
## 132 Adelie Torgersen 43.1 19.2 197
## 133 Adelie Dream 36.8 18.5 193
## 134 Adelie Dream 37.5 18.5 199
## 135 Adelie Dream 38.1 17.6 187
## 136 Adelie Dream 41.1 17.5 190
## 137 Adelie Dream 35.6 17.5 191
## 138 Adelie Dream 40.2 20.1 200
## 139 Adelie Dream 37.0 16.5 185
## 140 Adelie Dream 39.7 17.9 193
## 141 Adelie Dream 40.2 17.1 193
## 142 Adelie Dream 40.6 17.2 187
## 143 Adelie Dream 32.1 15.5 188
## 144 Adelie Dream 40.7 17.0 190
## 145 Adelie Dream 37.3 16.8 192
## 146 Adelie Dream 39.0 18.7 185
## 147 Adelie Dream 39.2 18.6 190
## 148 Adelie Dream 36.6 18.4 184
## 149 Adelie Dream 36.0 17.8 195
## 150 Adelie Dream 37.8 18.1 193
## 151 Adelie Dream 36.0 17.1 187
## 152 Adelie Dream 41.5 18.5 201
## 153 Gentoo Biscoe 46.1 13.2 211
## 154 Gentoo Biscoe 50.0 16.3 230
## 155 Gentoo Biscoe 48.7 14.1 210
## 156 Gentoo Biscoe 50.0 15.2 218
## 157 Gentoo Biscoe 47.6 14.5 215
## 158 Gentoo Biscoe 46.5 13.5 210
## 159 Gentoo Biscoe 45.4 14.6 211
## 160 Gentoo Biscoe 46.7 15.3 219
## 161 Gentoo Biscoe 43.3 13.4 209
## 162 Gentoo Biscoe 46.8 15.4 215
## 163 Gentoo Biscoe 40.9 13.7 214
## 164 Gentoo Biscoe 49.0 16.1 216
## 165 Gentoo Biscoe 45.5 13.7 214
## 166 Gentoo Biscoe 48.4 14.6 213
## 167 Gentoo Biscoe 45.8 14.6 210
## 168 Gentoo Biscoe 49.3 15.7 217
## 169 Gentoo Biscoe 42.0 13.5 210
## 170 Gentoo Biscoe 49.2 15.2 221
## 171 Gentoo Biscoe 46.2 14.5 209
## 172 Gentoo Biscoe 48.7 15.1 222
## 173 Gentoo Biscoe 50.2 14.3 218
## 174 Gentoo Biscoe 45.1 14.5 215
## 175 Gentoo Biscoe 46.5 14.5 213
## 176 Gentoo Biscoe 46.3 15.8 215
## 177 Gentoo Biscoe 42.9 13.1 215
## 178 Gentoo Biscoe 46.1 15.1 215
## 180 Gentoo Biscoe 47.8 15.0 215
## 181 Gentoo Biscoe 48.2 14.3 210
## 182 Gentoo Biscoe 50.0 15.3 220
## 183 Gentoo Biscoe 47.3 15.3 222
## 184 Gentoo Biscoe 42.8 14.2 209
## 185 Gentoo Biscoe 45.1 14.5 207
## 186 Gentoo Biscoe 59.6 17.0 230
## 187 Gentoo Biscoe 49.1 14.8 220
## 188 Gentoo Biscoe 48.4 16.3 220
## 189 Gentoo Biscoe 42.6 13.7 213
## 190 Gentoo Biscoe 44.4 17.3 219
## 191 Gentoo Biscoe 44.0 13.6 208
## 192 Gentoo Biscoe 48.7 15.7 208
## 193 Gentoo Biscoe 42.7 13.7 208
## 194 Gentoo Biscoe 49.6 16.0 225
## 195 Gentoo Biscoe 45.3 13.7 210
## 196 Gentoo Biscoe 49.6 15.0 216
## 197 Gentoo Biscoe 50.5 15.9 222
## 198 Gentoo Biscoe 43.6 13.9 217
## 199 Gentoo Biscoe 45.5 13.9 210
## 200 Gentoo Biscoe 50.5 15.9 225
## 201 Gentoo Biscoe 44.9 13.3 213
## 202 Gentoo Biscoe 45.2 15.8 215
## 203 Gentoo Biscoe 46.6 14.2 210
## 204 Gentoo Biscoe 48.5 14.1 220
## 205 Gentoo Biscoe 45.1 14.4 210
## 206 Gentoo Biscoe 50.1 15.0 225
## 207 Gentoo Biscoe 46.5 14.4 217
## 208 Gentoo Biscoe 45.0 15.4 220
## 209 Gentoo Biscoe 43.8 13.9 208
## 210 Gentoo Biscoe 45.5 15.0 220
## 211 Gentoo Biscoe 43.2 14.5 208
## 212 Gentoo Biscoe 50.4 15.3 224
## 213 Gentoo Biscoe 45.3 13.8 208
## 214 Gentoo Biscoe 46.2 14.9 221
## 215 Gentoo Biscoe 45.7 13.9 214
## 216 Gentoo Biscoe 54.3 15.7 231
## 217 Gentoo Biscoe 45.8 14.2 219
## 218 Gentoo Biscoe 49.8 16.8 230
## 220 Gentoo Biscoe 49.5 16.2 229
## 221 Gentoo Biscoe 43.5 14.2 220
## 222 Gentoo Biscoe 50.7 15.0 223
## 223 Gentoo Biscoe 47.7 15.0 216
## 224 Gentoo Biscoe 46.4 15.6 221
## 225 Gentoo Biscoe 48.2 15.6 221
## 226 Gentoo Biscoe 46.5 14.8 217
## 227 Gentoo Biscoe 46.4 15.0 216
## 228 Gentoo Biscoe 48.6 16.0 230
## 229 Gentoo Biscoe 47.5 14.2 209
## 230 Gentoo Biscoe 51.1 16.3 220
## 231 Gentoo Biscoe 45.2 13.8 215
## 232 Gentoo Biscoe 45.2 16.4 223
## 233 Gentoo Biscoe 49.1 14.5 212
## 234 Gentoo Biscoe 52.5 15.6 221
## 235 Gentoo Biscoe 47.4 14.6 212
## 236 Gentoo Biscoe 50.0 15.9 224
## 237 Gentoo Biscoe 44.9 13.8 212
## 238 Gentoo Biscoe 50.8 17.3 228
## 239 Gentoo Biscoe 43.4 14.4 218
## 240 Gentoo Biscoe 51.3 14.2 218
## 241 Gentoo Biscoe 47.5 14.0 212
## 242 Gentoo Biscoe 52.1 17.0 230
## 243 Gentoo Biscoe 47.5 15.0 218
## 244 Gentoo Biscoe 52.2 17.1 228
## 245 Gentoo Biscoe 45.5 14.5 212
## 246 Gentoo Biscoe 49.5 16.1 224
## 247 Gentoo Biscoe 44.5 14.7 214
## 248 Gentoo Biscoe 50.8 15.7 226
## 249 Gentoo Biscoe 49.4 15.8 216
## 250 Gentoo Biscoe 46.9 14.6 222
## 251 Gentoo Biscoe 48.4 14.4 203
## 252 Gentoo Biscoe 51.1 16.5 225
## 253 Gentoo Biscoe 48.5 15.0 219
## 254 Gentoo Biscoe 55.9 17.0 228
## 255 Gentoo Biscoe 47.2 15.5 215
## 256 Gentoo Biscoe 49.1 15.0 228
## 258 Gentoo Biscoe 46.8 16.1 215
## 259 Gentoo Biscoe 41.7 14.7 210
## 260 Gentoo Biscoe 53.4 15.8 219
## 261 Gentoo Biscoe 43.3 14.0 208
## 262 Gentoo Biscoe 48.1 15.1 209
## 263 Gentoo Biscoe 50.5 15.2 216
## 264 Gentoo Biscoe 49.8 15.9 229
## 265 Gentoo Biscoe 43.5 15.2 213
## 266 Gentoo Biscoe 51.5 16.3 230
## 267 Gentoo Biscoe 46.2 14.1 217
## 268 Gentoo Biscoe 55.1 16.0 230
## 270 Gentoo Biscoe 48.8 16.2 222
## 271 Gentoo Biscoe 47.2 13.7 214
## 273 Gentoo Biscoe 46.8 14.3 215
## 274 Gentoo Biscoe 50.4 15.7 222
## 275 Gentoo Biscoe 45.2 14.8 212
## 276 Gentoo Biscoe 49.9 16.1 213
## 277 Chinstrap Dream 46.5 17.9 192
## 278 Chinstrap Dream 50.0 19.5 196
## 279 Chinstrap Dream 51.3 19.2 193
## 280 Chinstrap Dream 45.4 18.7 188
## 281 Chinstrap Dream 52.7 19.8 197
## 282 Chinstrap Dream 45.2 17.8 198
## 283 Chinstrap Dream 46.1 18.2 178
## 284 Chinstrap Dream 51.3 18.2 197
## 285 Chinstrap Dream 46.0 18.9 195
## 286 Chinstrap Dream 51.3 19.9 198
## 287 Chinstrap Dream 46.6 17.8 193
## 288 Chinstrap Dream 51.7 20.3 194
## 289 Chinstrap Dream 47.0 17.3 185
## 290 Chinstrap Dream 52.0 18.1 201
## 291 Chinstrap Dream 45.9 17.1 190
## 292 Chinstrap Dream 50.5 19.6 201
## 293 Chinstrap Dream 50.3 20.0 197
## 294 Chinstrap Dream 58.0 17.8 181
## 295 Chinstrap Dream 46.4 18.6 190
## 296 Chinstrap Dream 49.2 18.2 195
## 297 Chinstrap Dream 42.4 17.3 181
## 298 Chinstrap Dream 48.5 17.5 191
## 299 Chinstrap Dream 43.2 16.6 187
## 300 Chinstrap Dream 50.6 19.4 193
## 301 Chinstrap Dream 46.7 17.9 195
## 302 Chinstrap Dream 52.0 19.0 197
## 303 Chinstrap Dream 50.5 18.4 200
## 304 Chinstrap Dream 49.5 19.0 200
## 305 Chinstrap Dream 46.4 17.8 191
## 306 Chinstrap Dream 52.8 20.0 205
## 307 Chinstrap Dream 40.9 16.6 187
## 308 Chinstrap Dream 54.2 20.8 201
## 309 Chinstrap Dream 42.5 16.7 187
## 310 Chinstrap Dream 51.0 18.8 203
## 311 Chinstrap Dream 49.7 18.6 195
## 312 Chinstrap Dream 47.5 16.8 199
## 313 Chinstrap Dream 47.6 18.3 195
## 314 Chinstrap Dream 52.0 20.7 210
## 315 Chinstrap Dream 46.9 16.6 192
## 316 Chinstrap Dream 53.5 19.9 205
## 317 Chinstrap Dream 49.0 19.5 210
## 318 Chinstrap Dream 46.2 17.5 187
## 319 Chinstrap Dream 50.9 19.1 196
## 320 Chinstrap Dream 45.5 17.0 196
## 321 Chinstrap Dream 50.9 17.9 196
## 322 Chinstrap Dream 50.8 18.5 201
## 323 Chinstrap Dream 50.1 17.9 190
## 324 Chinstrap Dream 49.0 19.6 212
## 325 Chinstrap Dream 51.5 18.7 187
## 326 Chinstrap Dream 49.8 17.3 198
## 327 Chinstrap Dream 48.1 16.4 199
## 328 Chinstrap Dream 51.4 19.0 201
## 329 Chinstrap Dream 45.7 17.3 193
## 330 Chinstrap Dream 50.7 19.7 203
## 331 Chinstrap Dream 42.5 17.3 187
## 332 Chinstrap Dream 52.2 18.8 197
## 333 Chinstrap Dream 45.2 16.6 191
## 334 Chinstrap Dream 49.3 19.9 203
## 335 Chinstrap Dream 50.2 18.8 202
## 336 Chinstrap Dream 45.6 19.4 194
## 337 Chinstrap Dream 51.9 19.5 206
## 338 Chinstrap Dream 46.8 16.5 189
## 339 Chinstrap Dream 45.7 17.0 195
## 340 Chinstrap Dream 55.8 19.8 207
## 341 Chinstrap Dream 43.5 18.1 202
## 342 Chinstrap Dream 49.6 18.2 193
## 343 Chinstrap Dream 50.8 19.0 210
## 344 Chinstrap Dream 50.2 18.7 198
## body_mass_g sex year
## 1 3750 male 2007
## 2 3800 female 2007
## 3 3250 female 2007
## 5 3450 female 2007
## 6 3650 male 2007
## 7 3625 female 2007
## 8 4675 male 2007
## 13 3200 female 2007
## 14 3800 male 2007
## 15 4400 male 2007
## 16 3700 female 2007
## 17 3450 female 2007
## 18 4500 male 2007
## 19 3325 female 2007
## 20 4200 male 2007
## 21 3400 female 2007
## 22 3600 male 2007
## 23 3800 female 2007
## 24 3950 male 2007
## 25 3800 male 2007
## 26 3800 female 2007
## 27 3550 male 2007
## 28 3200 female 2007
## 29 3150 female 2007
## 30 3950 male 2007
## 31 3250 female 2007
## 32 3900 male 2007
## 33 3300 female 2007
## 34 3900 male 2007
## 35 3325 female 2007
## 36 4150 male 2007
## 37 3950 male 2007
## 38 3550 female 2007
## 39 3300 female 2007
## 40 4650 male 2007
## 41 3150 female 2007
## 42 3900 male 2007
## 43 3100 female 2007
## 44 4400 male 2007
## 45 3000 female 2007
## 46 4600 male 2007
## 47 3425 male 2007
## 49 3450 female 2007
## 50 4150 male 2007
## 51 3500 female 2008
## 52 4300 male 2008
## 53 3450 female 2008
## 54 4050 male 2008
## 55 2900 female 2008
## 56 3700 male 2008
## 57 3550 female 2008
## 58 3800 male 2008
## 59 2850 female 2008
## 60 3750 male 2008
## 61 3150 female 2008
## 62 4400 male 2008
## 63 3600 female 2008
## 64 4050 male 2008
## 65 2850 female 2008
## 66 3950 male 2008
## 67 3350 female 2008
## 68 4100 male 2008
## 69 3050 female 2008
## 70 4450 male 2008
## 71 3600 female 2008
## 72 3900 male 2008
## 73 3550 female 2008
## 74 4150 male 2008
## 75 3700 female 2008
## 76 4250 male 2008
## 77 3700 female 2008
## 78 3900 male 2008
## 79 3550 female 2008
## 80 4000 male 2008
## 81 3200 female 2008
## 82 4700 male 2008
## 83 3800 female 2008
## 84 4200 male 2008
## 85 3350 female 2008
## 86 3550 male 2008
## 87 3800 male 2008
## 88 3500 female 2008
## 89 3950 male 2008
## 90 3600 female 2008
## 91 3550 female 2008
## 92 4300 male 2008
## 93 3400 female 2008
## 94 4450 male 2008
## 95 3300 female 2008
## 96 4300 male 2008
## 97 3700 female 2008
## 98 4350 male 2008
## 99 2900 female 2008
## 100 4100 male 2008
## 101 3725 female 2009
## 102 4725 male 2009
## 103 3075 female 2009
## 104 4250 male 2009
## 105 2925 female 2009
## 106 3550 male 2009
## 107 3750 female 2009
## 108 3900 male 2009
## 109 3175 female 2009
## 110 4775 male 2009
## 111 3825 female 2009
## 112 4600 male 2009
## 113 3200 female 2009
## 114 4275 male 2009
## 115 3900 female 2009
## 116 4075 male 2009
## 117 2900 female 2009
## 118 3775 male 2009
## 119 3350 female 2009
## 120 3325 male 2009
## 121 3150 female 2009
## 122 3500 male 2009
## 123 3450 female 2009
## 124 3875 male 2009
## 125 3050 female 2009
## 126 4000 male 2009
## 127 3275 female 2009
## 128 4300 male 2009
## 129 3050 female 2009
## 130 4000 male 2009
## 131 3325 female 2009
## 132 3500 male 2009
## 133 3500 female 2009
## 134 4475 male 2009
## 135 3425 female 2009
## 136 3900 male 2009
## 137 3175 female 2009
## 138 3975 male 2009
## 139 3400 female 2009
## 140 4250 male 2009
## 141 3400 female 2009
## 142 3475 male 2009
## 143 3050 female 2009
## 144 3725 male 2009
## 145 3000 female 2009
## 146 3650 male 2009
## 147 4250 male 2009
## 148 3475 female 2009
## 149 3450 female 2009
## 150 3750 male 2009
## 151 3700 female 2009
## 152 4000 male 2009
## 153 4500 female 2007
## 154 5700 male 2007
## 155 4450 female 2007
## 156 5700 male 2007
## 157 5400 male 2007
## 158 4550 female 2007
## 159 4800 female 2007
## 160 5200 male 2007
## 161 4400 female 2007
## 162 5150 male 2007
## 163 4650 female 2007
## 164 5550 male 2007
## 165 4650 female 2007
## 166 5850 male 2007
## 167 4200 female 2007
## 168 5850 male 2007
## 169 4150 female 2007
## 170 6300 male 2007
## 171 4800 female 2007
## 172 5350 male 2007
## 173 5700 male 2007
## 174 5000 female 2007
## 175 4400 female 2007
## 176 5050 male 2007
## 177 5000 female 2007
## 178 5100 male 2007
## 180 5650 male 2007
## 181 4600 female 2007
## 182 5550 male 2007
## 183 5250 male 2007
## 184 4700 female 2007
## 185 5050 female 2007
## 186 6050 male 2007
## 187 5150 female 2008
## 188 5400 male 2008
## 189 4950 female 2008
## 190 5250 male 2008
## 191 4350 female 2008
## 192 5350 male 2008
## 193 3950 female 2008
## 194 5700 male 2008
## 195 4300 female 2008
## 196 4750 male 2008
## 197 5550 male 2008
## 198 4900 female 2008
## 199 4200 female 2008
## 200 5400 male 2008
## 201 5100 female 2008
## 202 5300 male 2008
## 203 4850 female 2008
## 204 5300 male 2008
## 205 4400 female 2008
## 206 5000 male 2008
## 207 4900 female 2008
## 208 5050 male 2008
## 209 4300 female 2008
## 210 5000 male 2008
## 211 4450 female 2008
## 212 5550 male 2008
## 213 4200 female 2008
## 214 5300 male 2008
## 215 4400 female 2008
## 216 5650 male 2008
## 217 4700 female 2008
## 218 5700 male 2008
## 220 5800 male 2008
## 221 4700 female 2008
## 222 5550 male 2008
## 223 4750 female 2008
## 224 5000 male 2008
## 225 5100 male 2008
## 226 5200 female 2008
## 227 4700 female 2008
## 228 5800 male 2008
## 229 4600 female 2008
## 230 6000 male 2008
## 231 4750 female 2008
## 232 5950 male 2008
## 233 4625 female 2009
## 234 5450 male 2009
## 235 4725 female 2009
## 236 5350 male 2009
## 237 4750 female 2009
## 238 5600 male 2009
## 239 4600 female 2009
## 240 5300 male 2009
## 241 4875 female 2009
## 242 5550 male 2009
## 243 4950 female 2009
## 244 5400 male 2009
## 245 4750 female 2009
## 246 5650 male 2009
## 247 4850 female 2009
## 248 5200 male 2009
## 249 4925 male 2009
## 250 4875 female 2009
## 251 4625 female 2009
## 252 5250 male 2009
## 253 4850 female 2009
## 254 5600 male 2009
## 255 4975 female 2009
## 256 5500 male 2009
## 258 5500 male 2009
## 259 4700 female 2009
## 260 5500 male 2009
## 261 4575 female 2009
## 262 5500 male 2009
## 263 5000 female 2009
## 264 5950 male 2009
## 265 4650 female 2009
## 266 5500 male 2009
## 267 4375 female 2009
## 268 5850 male 2009
## 270 6000 male 2009
## 271 4925 female 2009
## 273 4850 female 2009
## 274 5750 male 2009
## 275 5200 female 2009
## 276 5400 male 2009
## 277 3500 female 2007
## 278 3900 male 2007
## 279 3650 male 2007
## 280 3525 female 2007
## 281 3725 male 2007
## 282 3950 female 2007
## 283 3250 female 2007
## 284 3750 male 2007
## 285 4150 female 2007
## 286 3700 male 2007
## 287 3800 female 2007
## 288 3775 male 2007
## 289 3700 female 2007
## 290 4050 male 2007
## 291 3575 female 2007
## 292 4050 male 2007
## 293 3300 male 2007
## 294 3700 female 2007
## 295 3450 female 2007
## 296 4400 male 2007
## 297 3600 female 2007
## 298 3400 male 2007
## 299 2900 female 2007
## 300 3800 male 2007
## 301 3300 female 2007
## 302 4150 male 2007
## 303 3400 female 2008
## 304 3800 male 2008
## 305 3700 female 2008
## 306 4550 male 2008
## 307 3200 female 2008
## 308 4300 male 2008
## 309 3350 female 2008
## 310 4100 male 2008
## 311 3600 male 2008
## 312 3900 female 2008
## 313 3850 female 2008
## 314 4800 male 2008
## 315 2700 female 2008
## 316 4500 male 2008
## 317 3950 male 2008
## 318 3650 female 2008
## 319 3550 male 2008
## 320 3500 female 2008
## 321 3675 female 2009
## 322 4450 male 2009
## 323 3400 female 2009
## 324 4300 male 2009
## 325 3250 male 2009
## 326 3675 female 2009
## 327 3325 female 2009
## 328 3950 male 2009
## 329 3600 female 2009
## 330 4050 male 2009
## 331 3350 female 2009
## 332 3450 male 2009
## 333 3250 female 2009
## 334 4050 male 2009
## 335 3800 male 2009
## 336 3525 female 2009
## 337 3950 male 2009
## 338 3650 female 2009
## 339 3650 female 2009
## 340 4000 male 2009
## 341 3400 female 2009
## 342 3775 male 2009
## 343 4100 male 2009
## 344 3775 female 2009
#subset example.data to the rows where Sex is F
example.data[example.data$Sex == "F",]
## ID Age Sex Weight
## 1 1 10.85355 F 23.28899
## 4 4 12.82198 F 23.71811
## 10 10 12.75860 F 23.53283
#subset example.data to the rows where Sex is F and Age is < 14
example.data[example.data$Sex == "F" & example.data$Age <14,]
## ID Age Sex Weight
## 1 1 10.85355 F 23.28899
## 4 4 12.82198 F 23.71811
## 10 10 12.75860 F 23.53283
#we can use logical operators to combine conditions
#& - AND operator [both conditions must be true]
#| - OR operator [at least one condition must be true]
#! - NOT operator [reverses TRUE and FALSE]
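As a minimal sketch of the three operators in action, the block below filters a small data frame. The `toy` data frame and its values are made up for illustration and are not the lab's `example.data`.

```r
# Hypothetical toy data, invented for illustration only
toy <- data.frame(
  ID     = 1:5,
  Sex    = c("F", "M", "F", "M", "M"),
  Weight = c(22, 31, 25, 24, 35)
)

# & : both conditions must be TRUE
toy[toy$Sex == "M" & toy$Weight > 30, ]   # keeps IDs 2 and 5

# | : at least one condition must be TRUE
toy[toy$Sex == "F" | toy$Weight > 30, ]   # keeps IDs 1, 2, 3 and 5

# ! : reverses TRUE and FALSE
toy[!(toy$Sex == "F"), ]                  # keeps IDs 2, 4 and 5
```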
#subset example.data to the rows where Sex is M and Weight is above the mean F weight
example.data[example.data$Sex == "M" & example.data$Weight > mean(example.data$Weight[example.data$Sex == "F"]),]
## ID Age Sex Weight
## 3 3 12.15807 M 32.68245
## 5 5 14.30573 M 23.58110
## 6 6 15.78180 M 30.40448
## 8 8 17.74803 M 34.00619
## 9 9 10.69106 M 33.52551
Let’s unpack this long line of code.
It often helps to work out what each part of your code is doing by highlighting that part and running it on its own.
#starting from the end of the code, this evaluates whether or not each value in the Sex column is female (F)
example.data$Sex == "F"
## [1] TRUE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE TRUE
#this vector of TRUEs and FALSEs can also be used to indicate positions - the function which() returns the indices of the elements that are TRUE
which(example.data$Sex == "F")
## [1] 1 4 10
#Alternatively, you can use this vector of TRUEs and FALSEs directly to subset the data - which is what we are doing here
example.data$Weight[example.data$Sex == "F"]
## [1] 23.28899 23.71811 23.53283
#this line pulls out weight values for all females in the dataset
#next we calculate the mean of those values (female only weights)
mean(example.data$Weight[example.data$Sex == "F"])
## [1] 23.51331
#Next, pull out weights where weight is > avg female weight
example.data$Weight > mean(example.data$Weight[example.data$Sex == "F"])
## [1] FALSE FALSE TRUE TRUE TRUE TRUE FALSE TRUE TRUE TRUE
#again, R will give a vector of TRUEs and FALSEs here - so now we have the positions of all weights in example.data where weight > avg female weight (note: this will include female weights too, if any female weights are above the mean). You can check this by:
#1. printing the actual weights and manually checking
example.data$Weight[example.data$Sex == "F"]
## [1] 23.28899 23.71811 23.53283
#2. OR asking R to make the comparison on the female weights only
example.data$Weight[example.data$Sex == "F"] > mean(example.data$Weight[example.data$Sex == "F"])
## [1] FALSE TRUE TRUE
#so two of the three individuals with Sex == "F" have a weight above the mean F weight
#Our final goal is to find individuals that are M and have weight above the avg female weight. We can do this by combining the two criteria using AND operator [&]
example.data$Sex == "M" & example.data$Weight > mean(example.data$Weight[example.data$Sex == "F"])
## [1] FALSE FALSE TRUE FALSE TRUE TRUE FALSE TRUE TRUE FALSE
#now we have a vector with info on where both these conditions are met and where they are not met
#Finally, you subset the dataset using the positional info from the previous line of code
example.data[example.data$Sex == "M" & example.data$Weight > mean(example.data$Weight[example.data$Sex == "F"]),]
## ID Age Sex Weight
## 3 3 12.15807 M 32.68245
## 5 5 14.30573 M 23.58110
## 6 6 15.78180 M 30.40448
## 8 8 17.74803 M 34.00619
## 9 9 10.69106 M 33.52551
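The same unpacking can be written directly into the code by storing each intermediate step under a descriptive name. The block below is a minimal sketch of that idea. The `toy` data frame and the variable names are made up for illustration and are not part of the lab's data.

```r
# Hypothetical toy data, invented for illustration only
toy <- data.frame(
  ID     = 1:6,
  Sex    = c("F", "M", "F", "M", "M", "F"),
  Weight = c(22, 31, 25, 23, 35, 24)
)

is_female     <- toy$Sex == "F"                # logical vector, one value per row
mean_f_weight <- mean(toy$Weight[is_female])   # mean weight of females only
is_heavy_male <- toy$Sex == "M" & toy$Weight > mean_f_weight

which(is_heavy_male)   # positions where both conditions are TRUE
toy[is_heavy_male, ]   # the same rows, selected with the logical vector
```

Naming the steps makes the final subsetting line short, and each piece can be printed and checked on its own.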
table(penguins$species)
##
## Adelie Chinstrap Gentoo
## 146 68 119
#species = Adelie
range(penguins$body_mass_g[penguins$species == "Adelie"])
## [1] 2850 4775
mean(penguins$body_mass_g[penguins$species == "Adelie"])
## [1] 3706.164
median(penguins$body_mass_g[penguins$species == "Adelie"])
## [1] 3700
sd(penguins$body_mass_g[penguins$species == "Adelie"])
## [1] 458.6201
#species = Gentoo
range(penguins$body_mass_g[penguins$species == "Gentoo"])
## [1] 3950 6300
mean(penguins$body_mass_g[penguins$species == "Gentoo"])
## [1] 5092.437
median(penguins$body_mass_g[penguins$species == "Gentoo"])
## [1] 5050
sd(penguins$body_mass_g[penguins$species == "Gentoo"])
## [1] 501.4762
#species = Chinstrap
range(penguins$body_mass_g[penguins$species == "Chinstrap"])
## [1] 2700 4800
mean(penguins$body_mass_g[penguins$species == "Chinstrap"])
## [1] 3733.088
median(penguins$body_mass_g[penguins$species == "Chinstrap"])
## [1] 3700
sd(penguins$body_mass_g[penguins$species == "Chinstrap"])
## [1] 384.3351
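#standard error of the mean = sd/sqrt(n), using each species' sd and its sample size from table()
#species = Adelie
458.6201/sqrt(146)
## [1] 37.95567
#species = Chinstrap
384.3351/sqrt(68)
## [1] 46.60748
#species = Gentoo
501.4762/sqrt(119)
## [1] 45.97025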
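The three divisions above are standard errors of the mean (standard deviation divided by the square root of the sample size). Typing the numbers in by hand is easy to get wrong, so here is a minimal sketch of computing the same quantity per group. `std_error` is a hypothetical helper name (base R has no function by that name), and the `toy` values are made up.

```r
# Standard error of the mean: sd / sqrt(n), ignoring missing values
# (std_error is a hypothetical helper, not a base R function)
std_error <- function(x) {
  x <- x[!is.na(x)]
  sd(x) / sqrt(length(x))
}

# Made-up values, for illustration only
toy <- data.frame(
  group = c("a", "a", "a", "a", "b", "b", "b", "b"),
  mass  = c(2, 4, 4, 6, 10, 10, 14, NA)
)

# One standard error per group
tapply(toy$mass, toy$group, std_error)

# With the penguins data, the equivalent call would be:
# tapply(penguins$body_mass_g, penguins$species, std_error)
```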
#subset penguins to the Gentoo rows (inside subset() you can refer to columns by name)
subset(penguins, species == "Gentoo")
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## 153 Gentoo Biscoe 46.1 13.2 211 4500
## 154 Gentoo Biscoe 50.0 16.3 230 5700
## 155 Gentoo Biscoe 48.7 14.1 210 4450
## 156 Gentoo Biscoe 50.0 15.2 218 5700
## 157 Gentoo Biscoe 47.6 14.5 215 5400
## 158 Gentoo Biscoe 46.5 13.5 210 4550
## 159 Gentoo Biscoe 45.4 14.6 211 4800
## 160 Gentoo Biscoe 46.7 15.3 219 5200
## 161 Gentoo Biscoe 43.3 13.4 209 4400
## 162 Gentoo Biscoe 46.8 15.4 215 5150
## 163 Gentoo Biscoe 40.9 13.7 214 4650
## 164 Gentoo Biscoe 49.0 16.1 216 5550
## 165 Gentoo Biscoe 45.5 13.7 214 4650
## 166 Gentoo Biscoe 48.4 14.6 213 5850
## 167 Gentoo Biscoe 45.8 14.6 210 4200
## 168 Gentoo Biscoe 49.3 15.7 217 5850
## 169 Gentoo Biscoe 42.0 13.5 210 4150
## 170 Gentoo Biscoe 49.2 15.2 221 6300
## 171 Gentoo Biscoe 46.2 14.5 209 4800
## 172 Gentoo Biscoe 48.7 15.1 222 5350
## 173 Gentoo Biscoe 50.2 14.3 218 5700
## 174 Gentoo Biscoe 45.1 14.5 215 5000
## 175 Gentoo Biscoe 46.5 14.5 213 4400
## 176 Gentoo Biscoe 46.3 15.8 215 5050
## 177 Gentoo Biscoe 42.9 13.1 215 5000
## 178 Gentoo Biscoe 46.1 15.1 215 5100
## 180 Gentoo Biscoe 47.8 15.0 215 5650
## 181 Gentoo Biscoe 48.2 14.3 210 4600
## 182 Gentoo Biscoe 50.0 15.3 220 5550
## 183 Gentoo Biscoe 47.3 15.3 222 5250
## 184 Gentoo Biscoe 42.8 14.2 209 4700
## 185 Gentoo Biscoe 45.1 14.5 207 5050
## 186 Gentoo Biscoe 59.6 17.0 230 6050
## 187 Gentoo Biscoe 49.1 14.8 220 5150
## 188 Gentoo Biscoe 48.4 16.3 220 5400
## 189 Gentoo Biscoe 42.6 13.7 213 4950
## 190 Gentoo Biscoe 44.4 17.3 219 5250
## 191 Gentoo Biscoe 44.0 13.6 208 4350
## 192 Gentoo Biscoe 48.7 15.7 208 5350
## 193 Gentoo Biscoe 42.7 13.7 208 3950
## 194 Gentoo Biscoe 49.6 16.0 225 5700
## 195 Gentoo Biscoe 45.3 13.7 210 4300
## 196 Gentoo Biscoe 49.6 15.0 216 4750
## 197 Gentoo Biscoe 50.5 15.9 222 5550
## 198 Gentoo Biscoe 43.6 13.9 217 4900
## 199 Gentoo Biscoe 45.5 13.9 210 4200
## 200 Gentoo Biscoe 50.5 15.9 225 5400
## 201 Gentoo Biscoe 44.9 13.3 213 5100
## 202 Gentoo Biscoe 45.2 15.8 215 5300
## 203 Gentoo Biscoe 46.6 14.2 210 4850
## 204 Gentoo Biscoe 48.5 14.1 220 5300
## 205 Gentoo Biscoe 45.1 14.4 210 4400
## 206 Gentoo Biscoe 50.1 15.0 225 5000
## 207 Gentoo Biscoe 46.5 14.4 217 4900
## 208 Gentoo Biscoe 45.0 15.4 220 5050
## 209 Gentoo Biscoe 43.8 13.9 208 4300
## 210 Gentoo Biscoe 45.5 15.0 220 5000
## 211 Gentoo Biscoe 43.2 14.5 208 4450
## 212 Gentoo Biscoe 50.4 15.3 224 5550
## 213 Gentoo Biscoe 45.3 13.8 208 4200
## 214 Gentoo Biscoe 46.2 14.9 221 5300
## 215 Gentoo Biscoe 45.7 13.9 214 4400
## 216 Gentoo Biscoe 54.3 15.7 231 5650
## 217 Gentoo Biscoe 45.8 14.2 219 4700
## 218 Gentoo Biscoe 49.8 16.8 230 5700
## 220 Gentoo Biscoe 49.5 16.2 229 5800
## 221 Gentoo Biscoe 43.5 14.2 220 4700
## 222 Gentoo Biscoe 50.7 15.0 223 5550
## 223 Gentoo Biscoe 47.7 15.0 216 4750
## 224 Gentoo Biscoe 46.4 15.6 221 5000
## 225 Gentoo Biscoe 48.2 15.6 221 5100
## 226 Gentoo Biscoe 46.5 14.8 217 5200
## 227 Gentoo Biscoe 46.4 15.0 216 4700
## 228 Gentoo Biscoe 48.6 16.0 230 5800
## 229 Gentoo Biscoe 47.5 14.2 209 4600
## 230 Gentoo Biscoe 51.1 16.3 220 6000
## 231 Gentoo Biscoe 45.2 13.8 215 4750
## 232 Gentoo Biscoe 45.2 16.4 223 5950
## 233 Gentoo Biscoe 49.1 14.5 212 4625
## 234 Gentoo Biscoe 52.5 15.6 221 5450
## 235 Gentoo Biscoe 47.4 14.6 212 4725
## 236 Gentoo Biscoe 50.0 15.9 224 5350
## 237 Gentoo Biscoe 44.9 13.8 212 4750
## 238 Gentoo Biscoe 50.8 17.3 228 5600
## 239 Gentoo Biscoe 43.4 14.4 218 4600
## 240 Gentoo Biscoe 51.3 14.2 218 5300
## 241 Gentoo Biscoe 47.5 14.0 212 4875
## 242 Gentoo Biscoe 52.1 17.0 230 5550
## 243 Gentoo Biscoe 47.5 15.0 218 4950
## 244 Gentoo Biscoe 52.2 17.1 228 5400
## 245 Gentoo Biscoe 45.5 14.5 212 4750
## 246 Gentoo Biscoe 49.5 16.1 224 5650
## 247 Gentoo Biscoe 44.5 14.7 214 4850
## 248 Gentoo Biscoe 50.8 15.7 226 5200
## 249 Gentoo Biscoe 49.4 15.8 216 4925
## 250 Gentoo Biscoe 46.9 14.6 222 4875
## 251 Gentoo Biscoe 48.4 14.4 203 4625
## 252 Gentoo Biscoe 51.1 16.5 225 5250
## 253 Gentoo Biscoe 48.5 15.0 219 4850
## 254 Gentoo Biscoe 55.9 17.0 228 5600
## 255 Gentoo Biscoe 47.2 15.5 215 4975
## 256 Gentoo Biscoe 49.1 15.0 228 5500
## 258 Gentoo Biscoe 46.8 16.1 215 5500
## 259 Gentoo Biscoe 41.7 14.7 210 4700
## 260 Gentoo Biscoe 53.4 15.8 219 5500
## 261 Gentoo Biscoe 43.3 14.0 208 4575
## 262 Gentoo Biscoe 48.1 15.1 209 5500
## 263 Gentoo Biscoe 50.5 15.2 216 5000
## 264 Gentoo Biscoe 49.8 15.9 229 5950
## 265 Gentoo Biscoe 43.5 15.2 213 4650
## 266 Gentoo Biscoe 51.5 16.3 230 5500
## 267 Gentoo Biscoe 46.2 14.1 217 4375
## 268 Gentoo Biscoe 55.1 16.0 230 5850
## 270 Gentoo Biscoe 48.8 16.2 222 6000
## 271 Gentoo Biscoe 47.2 13.7 214 4925
## 273 Gentoo Biscoe 46.8 14.3 215 4850
## 274 Gentoo Biscoe 50.4 15.7 222 5750
## 275 Gentoo Biscoe 45.2 14.8 212 5200
## 276 Gentoo Biscoe 49.9 16.1 213 5400
## sex year
## 153 female 2007
## 154 male 2007
## 155 female 2007
## 156 male 2007
## 157 male 2007
## 158 female 2007
## 159 female 2007
## 160 male 2007
## 161 female 2007
## 162 male 2007
## 163 female 2007
## 164 male 2007
## 165 female 2007
## 166 male 2007
## 167 female 2007
## 168 male 2007
## 169 female 2007
## 170 male 2007
## 171 female 2007
## 172 male 2007
## 173 male 2007
## 174 female 2007
## 175 female 2007
## 176 male 2007
## 177 female 2007
## 178 male 2007
## 180 male 2007
## 181 female 2007
## 182 male 2007
## 183 male 2007
## 184 female 2007
## 185 female 2007
## 186 male 2007
## 187 female 2008
## 188 male 2008
## 189 female 2008
## 190 male 2008
## 191 female 2008
## 192 male 2008
## 193 female 2008
## 194 male 2008
## 195 female 2008
## 196 male 2008
## 197 male 2008
## 198 female 2008
## 199 female 2008
## 200 male 2008
## 201 female 2008
## 202 male 2008
## 203 female 2008
## 204 male 2008
## 205 female 2008
## 206 male 2008
## 207 female 2008
## 208 male 2008
## 209 female 2008
## 210 male 2008
## 211 female 2008
## 212 male 2008
## 213 female 2008
## 214 male 2008
## 215 female 2008
## 216 male 2008
## 217 female 2008
## 218 male 2008
## 220 male 2008
## 221 female 2008
## 222 male 2008
## 223 female 2008
## 224 male 2008
## 225 male 2008
## 226 female 2008
## 227 female 2008
## 228 male 2008
## 229 female 2008
## 230 male 2008
## 231 female 2008
## 232 male 2008
## 233 female 2009
## 234 male 2009
## 235 female 2009
## 236 male 2009
## 237 female 2009
## 238 male 2009
## 239 female 2009
## 240 male 2009
## 241 female 2009
## 242 male 2009
## 243 female 2009
## 244 male 2009
## 245 female 2009
## 246 male 2009
## 247 female 2009
## 248 male 2009
## 249 male 2009
## 250 female 2009
## 251 female 2009
## 252 male 2009
## 253 female 2009
## 254 male 2009
## 255 female 2009
## 256 male 2009
## 258 male 2009
## 259 female 2009
## 260 male 2009
## 261 female 2009
## 262 male 2009
## 263 female 2009
## 264 male 2009
## 265 female 2009
## 266 male 2009
## 267 female 2009
## 268 male 2009
## 270 male 2009
## 271 female 2009
## 273 female 2009
## 274 male 2009
## 275 female 2009
## 276 male 2009
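#Gentoo penguins with flippers longer than 225 mm
#both row conditions go in the second argument, joined with &; a third argument would be read as `select` (columns to keep)
subset(penguins, flipper_length_mm > 225 & species == "Gentoo")
#returns 15 rows (every penguin with flippers over 225 mm in this dataset is a Gentoo)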
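subset() takes the row condition as its second argument and treats a third positional argument as `select`, which chooses columns rather than rows. The block below is a minimal sketch of the difference, using a made-up `toy` data frame whose column names are only illustrative.

```r
# Made-up values, for illustration only
toy <- data.frame(
  species = c("Gentoo", "Adelie", "Gentoo", "Chinstrap"),
  flipper = c(230, 190, 215, 228),
  mass    = c(5700, 3700, 5000, 3900)
)

# Row filter only: both conditions joined with & in the second argument
subset(toy, flipper > 225 & species == "Gentoo")

# Row filter plus column selection: select names the columns to keep
subset(toy, flipper > 225, select = c(species, mass))
```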
which.max(penguins$bill_length_mm)
## [1] 179
penguins[which.max(penguins$bill_length_mm),]
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## 186 Gentoo Biscoe 59.6 17 230 6050
## sex year
## 186 male 2007
Written answer: The species with the largest bill length is Gentoo (59.6 mm, row 186).
penguins[which.min(penguins$bill_length_mm),]
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## 143 Adelie Dream 32.1 15.5 188 3050
## sex year
## 143 female 2009
Written answer: The species with the smallest bill length is Adelie (32.1 mm, row 143).
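Note that which.max() reported 179 above, while the printed row is labelled 186. A likely explanation, assuming rows with missing values were dropped earlier in the lab (for example with na.omit()), is that which.max() returns a position in the current data frame, whereas the label printed on the left is the original row name. Here is a minimal sketch of that behaviour on made-up values:

```r
# Made-up values, for illustration only
toy <- data.frame(bill = c(40, NA, 59, 32))
toy_clean <- na.omit(toy)          # drops row 2 but keeps the original row names

rownames(toy_clean)                # "1" "3" "4"
pos <- which.max(toy_clean$bill)   # position within the cleaned data
pos
toy_clean[pos, , drop = FALSE]     # the row at that position; its row name is "3"
```

Indexing with the position, as in `penguins[which.max(penguins$bill_length_mm),]`, therefore picks the right row even though the two numbers differ.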