Assignment 1 – Loading Data into a Data Frame

Author

Shawn Ganz

Introduction

For this assignment I chose a CSV file from NYC OpenData, “2018 Central Park Squirrel Census - Squirrel Data”, provided by The Squirrel Census.

Approach

Since the assignment is to transform this dataframe, I want to drop most of its columns. Below is a list of all 31 columns:

 [1] "X"                                         
 [2] "Y"                                         
 [3] "Unique.Squirrel.ID"                        
 [4] "Hectare"                                   
 [5] "Shift"                                     
 [6] "Date"                                      
 [7] "Hectare.Squirrel.Number"                   
 [8] "Age"                                       
 [9] "Primary.Fur.Color"                         
[10] "Highlight.Fur.Color"                       
[11] "Combination.of.Primary.and.Highlight.Color"
[12] "Color.notes"                               
[13] "Location"                                  
[14] "Above.Ground.Sighter.Measurement"          
[15] "Specific.Location"                         
[16] "Running"                                   
[17] "Chasing"                                   
[18] "Climbing"                                  
[19] "Eating"                                    
[20] "Foraging"                                  
[21] "Other.Activities"                          
[22] "Kuks"                                      
[23] "Quaas"                                     
[24] "Moans"                                     
[25] "Tail.flags"                                
[26] "Tail.twitches"                             
[27] "Approaches"                                
[28] "Indifferent"                               
[29] "Runs.from"                                 
[30] "Other.Interactions"                        
[31] "Lat.Long"                                  
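The listing above has the shape of what `names()` prints for a data frame loaded with `read.csv()`. As a minimal sketch of that loading step, assuming the raw CSV headers contain spaces (which would explain the dots in the names above), the example below reads a small inline stand-in for the file. The row values and the file name in the final comment are made up for illustration.

```r
# Toy stand-in for the downloaded CSV: four of the 31 columns, with the
# space-separated headers the raw file is assumed to use. Values are made up.
csv_text <- "X,Y,Unique Squirrel ID,Primary Fur Color
-73.9561,40.7941,A-01,Gray
-73.9570,40.7949,B-02,Cinnamon"

# read.csv() defaults to check.names = TRUE, which passes the headers through
# make.names() and replaces spaces with dots.
squirrels <- read.csv(text = csv_text)

print(names(squirrels))

# For the real file the call would look like this (hypothetical file name):
# squirrels <- read.csv("squirrel_census_2018.csv")
```

Passing `check.names = FALSE` would keep the original headers, at the cost of needing backticks around any name that contains a space.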

I want to create a dataframe with only these columns:

 [1] "Unique.Squirrel.ID"                        
 [2] "Primary.Fur.Color"                         
 [3] "Highlight.Fur.Color"                       
 [4] "Combination.of.Primary.and.Highlight.Color"
 [5] "Running"                                   
 [6] "Chasing"                                   
 [7] "Climbing"                                  
 [8] "Eating"                                    
 [9] "Foraging"                                  
[10] "X"                                         
[11] "Y"                                         
[12] "Lat.Long"                                  
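One way to get from the full column list to this subset is to index the data frame with a character vector of names. The sketch below assumes base R only and uses a one-row toy data frame with made-up values; `squirrels` and `squirrels_small` are illustrative names, not taken from the assignment code.

```r
# One-row toy data frame: the 12 columns to keep plus two that get dropped
# (Hectare and Shift). All values are made up.
squirrels <- data.frame(
  X = -73.9561, Y = 40.7941,
  Unique.Squirrel.ID = "A-01", Hectare = "37F", Shift = "PM",
  Primary.Fur.Color = "Gray", Highlight.Fur.Color = "White",
  Combination.of.Primary.and.Highlight.Color = "Gray+White",
  Running = FALSE, Chasing = FALSE, Climbing = TRUE,
  Eating = FALSE, Foraging = TRUE,
  Lat.Long = "POINT (-73.9561 40.7941)"
)

# Columns to keep, in the order they should appear in the result.
keep <- c(
  "Unique.Squirrel.ID",
  "Primary.Fur.Color",
  "Highlight.Fur.Color",
  "Combination.of.Primary.and.Highlight.Color",
  "Running", "Chasing", "Climbing", "Eating", "Foraging",
  "X", "Y", "Lat.Long"
)

# Fail early on a misspelled name, then select and reorder in one step.
stopifnot(all(keep %in% names(squirrels)))
squirrels_small <- squirrels[, keep]

print(names(squirrels_small))
```

Selecting by name rather than by position keeps the code valid if the source file ever gains or reorders columns.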

Afterwards I want to create the following:

  • A binary “Active” squirrel column derived from the “Running,” “Chasing,” “Climbing,” “Eating,” and “Foraging” columns.

  • A version of the “Above Ground Sighter Measurement” column that holds numeric (INT/FLOAT) values only. This column is not in the list above, so it also needs to be kept.
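Both derived columns can be sketched in a few lines of base R. The example below rests on two assumptions that are not stated in the text: a squirrel counts as “Active” when at least one of the five activity flags is true, and any non-numeric entry in the measurement column (represented here by "FALSE" and an empty string) should become `NA`. The toy values are made up.

```r
# Three-row toy data frame with made-up values.
squirrels <- data.frame(
  Running  = c(TRUE,  FALSE, FALSE),
  Chasing  = c(FALSE, FALSE, FALSE),
  Climbing = c(FALSE, FALSE, TRUE),
  Eating   = c(FALSE, FALSE, FALSE),
  Foraging = c(FALSE, FALSE, TRUE),
  Above.Ground.Sighter.Measurement = c("FALSE", "", "10"),
  stringsAsFactors = FALSE
)

activity_cols <- c("Running", "Chasing", "Climbing", "Eating", "Foraging")

# Assumed rule: Active = 1 if any of the five flags is TRUE, otherwise 0.
# as.logical() also handles flags that were read in as "true"/"false" text.
is_active <- Reduce(`|`, lapply(squirrels[activity_cols], as.logical))
squirrels$Active <- as.integer(is_active)

# Non-numeric entries become NA. suppressWarnings() silences the
# "NAs introduced by coercion" warning that as.numeric() would raise.
squirrels$Above.Ground.Sighter.Measurement <- suppressWarnings(
  as.numeric(squirrels$Above.Ground.Sighter.Measurement)
)

print(squirrels[, c("Active", "Above.Ground.Sighter.Measurement")])
```

Whether a non-numeric entry should become `NA` or `0` is a modeling decision; `NA` keeps "no height recorded" distinct from a measured height of zero.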

The motivation for using this dataset is simple: I chose the first interesting, popular dataset I found on NYC OpenData. This encourages an exploratory approach, which can be useful when learning new skills.

Code-base

TODO: Code-base

Video

TODO: Video