print("hello world !")
## [1] "hello world !"
sessionInfo()
## R version 4.2.1 (2022-06-23)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur ... 10.16
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## loaded via a namespace (and not attached):
## [1] digest_0.6.33 R6_2.5.1 jsonlite_1.8.7 evaluate_0.23
## [5] cachem_1.0.8 rlang_1.1.2 cli_3.6.1 rstudioapi_0.14
## [9] jquerylib_0.1.4 bslib_0.5.0 rmarkdown_2.21 tools_4.2.1
## [13] xfun_0.39 yaml_2.3.7 fastmap_1.1.1 compiler_4.2.1
## [17] htmltools_0.5.5 knitr_1.42 sass_0.4.6
Tools -> Global Options -> Appearance
Session -> Set Working Directory -> Choose Directory
RStudio IDE
R Markdown
Data Import
ggplot2 (not required)
getOption("defaultPackages")
## [1] "datasets" "utils" "grDevices" "graphics" "stats" "methods"
# install.packages("swirl")
library("swirl")
##
## | Hi! Type swirl() when you are ready to begin.
swirl()
info()
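The install-then-load pattern above can be made safe to re-run. This is a minimal sketch, assuming you want to install a package only when it is missing; the variable `pkg` is an illustrative name, and "stats" is used only because it ships with R, so the example runs without downloading anything.

```r
# Illustrative package name; swap in "swirl" or any other package you need
pkg <- "stats"

# requireNamespace() returns FALSE instead of stopping when a package is missing
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)   # reached only when the package is not installed
}

# character.only = TRUE tells library() that pkg holds the name as a string
library(pkg, character.only = TRUE)

# search() lists everything currently attached
pkg_attached <- paste0("package:", pkg) %in% search()
pkg_attached
```

Checking first avoids reinstalling the package every time the script runs.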
# GETTING HELP when you know the function name
help("mean")
help('mean')
?mean
# GETTING HELP when you DO NOT know the function name (or its package is not loaded yet)
help.search("mean")
??mean
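A few other base R helpers can narrow a search when you only half remember a name. The sketch below uses `apropos()`, `find()`, and `args()`; the pattern "^mean" is just an example.

```r
# apropos() lists visible objects whose names match a pattern
hits <- apropos("^mean")
hits

# find() reports where an object lives
where <- find("mean")
where

# args() shows a function's arguments without opening the help page
args(mean)
```

Unlike `??mean`, `apropos()` only looks at what is currently attached, so it will not find functions in packages you have not loaded.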
Google is your best friend. It will often send you to public forums such as Stack Overflow (a platform that collects coding questions and answers), which hold a large body of R questions.
Posting on Stack Overflow or other online forums requires you to follow some best practices: state your OS and R version, and share your data, code, the error message, and the things you have tried that did not work.
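The items in that checklist can all be collected from within R. This is one possible sketch using base R only; the object `small` and the deliberately failing call `log("a")` are made up for illustration.

```r
# Hypothetical toy data: a small slice is easier for others to work with
small <- head(mtcars[, c("mpg", "cyl")], 3)

# dput() prints R code that recreates the object, ready to paste into a post
dput(small)

# Capture the exact error message instead of paraphrasing it
err <- tryCatch(log("a"), error = function(e) conditionMessage(e))
err

# R version and operating system to include in the post
r_version <- R.version.string
os_name <- Sys.info()[["sysname"]]
r_version
os_name
```

Pasting the `dput()` output lets readers rebuild your data without needing your files.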
Other websites such as InterviewBit, HackerRank, and LeetCode usually contain data science questions too, but rarely for R.
There are also courses, such as those on Udemy, that train students in SQL for data science jobs.
# Using R as calculator
1+1
## [1] 2
(2+3)*2 # parentheses
## [1] 10
exp(1) # e raised to the power 1
## [1] 2.718282
2/3 # division
## [1] 0.6666667
options(digits = 3)
2/3 # only the printed value is rounded; the full value is still calculated (default digits is 7)
## [1] 0.667
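To see that `options(digits = 3)` changes only what is printed, compare the stored value with a truly rounded one. A small sketch; the names `old`, `x`, and `rounded` are illustrative.

```r
# options() returns the previous settings, so they can be restored later
old <- options(digits = 3)

x <- 2/3
print(x)                  # shown with 3 significant digits
x == 0.667                # FALSE: the stored value still has full precision

# round() changes the value itself, not just how it is displayed
rounded <- round(x, 3)
all.equal(rounded, 0.667) # TRUE

# Put the digits setting back to what it was before this block
options(old)
```

Use `options(digits = ...)` for display and `round()` only when the rounded number itself is what you need.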
2*3 # multiplication
## [1] 6
2+3 # addition
## [1] 5
2-3 # subtraction
## [1] -1
# Order of operations (PEMDAS / BODMAS)
48 / (2 * 12)
## [1] 2
48 / 2 * 12
## [1] 288
2*2*2
## [1] 8
2**3 # power; ** is accepted, but ^ is the standard R operator
## [1] 8
sqrt(4) # square root
## [1] 2
pi
## [1] 3.14
# CODE SO THAT IT IS EASY TO FOLLOW
(3 + (5 * (2 ^ 2))) # hard to read
## [1] 23
3 + 5 * 2 ^ 2 # clear, if you remember the BODMAS rules
## [1] 23
3 + 5 * (2 ^ 2) # if you forget some rules, this might help
## [1] 23
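Beyond the school rules, R has a few precedence cases that surprise newcomers. The sketch below shows three of them; the variable names are illustrative only.

```r
# ^ binds tighter than unary minus, so -2^2 means -(2^2)
neg_square <- -2^2
neg_square            # -4
paren_square <- (-2)^2
paren_square          # 4

# ^ is right-associative: 2^3^2 means 2^(3^2)
tower <- 2^3^2
tower                 # 512

# The colon operator binds tighter than *, so 1:3 * 2 means (1:3) * 2
doubled <- 1:3 * 2
doubled               # 2 4 6
longer <- 1:(3 * 2)
longer                # 1 2 3 4 5 6
```

When in doubt, add parentheses; `?Syntax` lists the full precedence table.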
# Assignment Operator
x <- 1.25
y = x # Works, but best practice is to use "<-" rather than "=" for assignment
y <- x
# x <- z + 1 # Fails: z does not exist, so its value cannot be used
# Wrap the failing assignment in tryCatch() to show the error without stopping the script
tryCatch({
  x <- z + 1
}, error = function(e) {
  message("Error: ", e$message)
})
## Error: object 'z' not found
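An alternative to catching the error is to check whether the object exists before using it. This is a sketch only: `safe_add_one()` is a hypothetical helper written for this example, not a base R function.

```r
# Hypothetical helper: add 1 to a named object, or return NA if it is missing
safe_add_one <- function(name) {
  if (!exists(name)) {
    message("Object '", name, "' not found; returning NA")
    return(NA_real_)
  }
  get(name) + 1
}

x <- 1.25
safe_add_one("x")                 # 2.25
safe_add_one("no_such_object_z")  # NA, with a message
```

`exists()` takes the object's name as a string, which is why the helper is called with quoted names.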
data()
mtcars
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4 21.0 6 160.0 110 3.90 2.62 16.5 0 1 4 4
## Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.88 17.0 0 1 4 4
## Datsun 710 22.8 4 108.0 93 3.85 2.32 18.6 1 1 4 1
## Hornet 4 Drive 21.4 6 258.0 110 3.08 3.21 19.4 1 0 3 1
## Hornet Sportabout 18.7 8 360.0 175 3.15 3.44 17.0 0 0 3 2
## Valiant 18.1 6 225.0 105 2.76 3.46 20.2 1 0 3 1
## Duster 360 14.3 8 360.0 245 3.21 3.57 15.8 0 0 3 4
## Merc 240D 24.4 4 146.7 62 3.69 3.19 20.0 1 0 4 2
## Merc 230 22.8 4 140.8 95 3.92 3.15 22.9 1 0 4 2
## Merc 280 19.2 6 167.6 123 3.92 3.44 18.3 1 0 4 4
## Merc 280C 17.8 6 167.6 123 3.92 3.44 18.9 1 0 4 4
## Merc 450SE 16.4 8 275.8 180 3.07 4.07 17.4 0 0 3 3
## Merc 450SL 17.3 8 275.8 180 3.07 3.73 17.6 0 0 3 3
## Merc 450SLC 15.2 8 275.8 180 3.07 3.78 18.0 0 0 3 3
## Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.25 18.0 0 0 3 4
## Lincoln Continental 10.4 8 460.0 215 3.00 5.42 17.8 0 0 3 4
## Chrysler Imperial 14.7 8 440.0 230 3.23 5.34 17.4 0 0 3 4
## Fiat 128 32.4 4 78.7 66 4.08 2.20 19.5 1 1 4 1
## Honda Civic 30.4 4 75.7 52 4.93 1.61 18.5 1 1 4 2
## Toyota Corolla 33.9 4 71.1 65 4.22 1.83 19.9 1 1 4 1
## Toyota Corona 21.5 4 120.1 97 3.70 2.46 20.0 1 0 3 1
## Dodge Challenger 15.5 8 318.0 150 2.76 3.52 16.9 0 0 3 2
## AMC Javelin 15.2 8 304.0 150 3.15 3.44 17.3 0 0 3 2
## Camaro Z28 13.3 8 350.0 245 3.73 3.84 15.4 0 0 3 4
## Pontiac Firebird 19.2 8 400.0 175 3.08 3.85 17.1 0 0 3 2
## Fiat X1-9 27.3 4 79.0 66 4.08 1.94 18.9 1 1 4 1
## Porsche 914-2 26.0 4 120.3 91 4.43 2.14 16.7 0 1 5 2
## Lotus Europa 30.4 4 95.1 113 3.77 1.51 16.9 1 1 5 2
## Ford Pantera L 15.8 8 351.0 264 4.22 3.17 14.5 0 1 5 4
## Ferrari Dino 19.7 6 145.0 175 3.62 2.77 15.5 0 1 5 6
## Maserati Bora 15.0 8 301.0 335 3.54 3.57 14.6 0 1 5 8
## Volvo 142E 21.4 4 121.0 109 4.11 2.78 18.6 1 1 4 2
df <- as.data.frame(mtcars)
setwd("/Users/arvindsharma/Dropbox/WCAS/Data Analysis/Data Analysis - Spring II 2024/Data Analysis - Spring II 2024 (shared files)/W1/Week_1-2/titanic/")
train <- read.csv("train.csv")
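A file can also be read by its full path, which avoids changing the working directory. The sketch below writes a tiny stand-in file to a temporary folder so it runs anywhere; the file name `toy_train.csv` and its columns are made up and are not the real Titanic data.

```r
# Build a path from its parts; file.path() inserts the separators
data_dir <- tempdir()
csv_path <- file.path(data_dir, "toy_train.csv")

# Write a small made-up file so there is something to read
write.csv(data.frame(id = 1:3, survived = c(0, 1, 1)),
          csv_path, row.names = FALSE)

# Check that the file is where you think it is before reading it
file.exists(csv_path)

# Read by full path; getwd() is unaffected
toy <- read.csv(csv_path)
str(toy)
```

Building paths with `file.path()` keeps a script portable across operating systems, since the long path lives in one variable.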
In R, an object's class, reported by class(), determines how generic functions such as print() and summary() treat the object.
In contrast, an object's type, reported by typeof(), is the internal storage mode R uses to hold the object.
An object can have several classes, and objects of different classes can have the same type: a data frame and a plain list are both stored as type "list".
class(df) # the class attribute, used to pick methods for generic functions
## [1] "data.frame"
typeof(df) # the internal storage type
## [1] "list"
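Comparing `class()` and `typeof()` across a few common objects makes the difference concrete. A small sketch; the list `objs` and its element names are illustrative.

```r
# A handful of everyday objects
objs <- list(
  data_frame   = mtcars,
  plain_list   = list(a = 1),
  whole_number = 1L,
  decimal      = 1.5,
  text         = "a",
  a_factor     = factor("a")
)

# First class and storage type of each object, side by side
overview <- data.frame(
  class  = sapply(objs, function(o) class(o)[1]),
  typeof = sapply(objs, typeof)
)
overview
```

Note that a factor has class "factor" but type "integer", and a decimal number has class "numeric" but type "double".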
head(df)
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4 21.0 6 160 110 3.90 2.62 16.5 0 1 4 4
## Mazda RX4 Wag 21.0 6 160 110 3.90 2.88 17.0 0 1 4 4
## Datsun 710 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
## Hornet 4 Drive 21.4 6 258 110 3.08 3.21 19.4 1 0 3 1
## Hornet Sportabout 18.7 8 360 175 3.15 3.44 17.0 0 0 3 2
## Valiant 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1
tail(df)
## mpg cyl disp hp drat wt qsec vs am gear carb
## Porsche 914-2 26.0 4 120.3 91 4.43 2.14 16.7 0 1 5 2
## Lotus Europa 30.4 4 95.1 113 3.77 1.51 16.9 1 1 5 2
## Ford Pantera L 15.8 8 351.0 264 4.22 3.17 14.5 0 1 5 4
## Ferrari Dino 19.7 6 145.0 175 3.62 2.77 15.5 0 1 5 6
## Maserati Bora 15.0 8 301.0 335 3.54 3.57 14.6 0 1 5 8
## Volvo 142E 21.4 4 121.0 109 4.11 2.78 18.6 1 1 4 2
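Besides `head()` and `tail()`, a few base R functions give a quick overview of a data frame. The sketch uses `mtcars` directly so it is self-contained.

```r
# Number of rows and columns
dims <- dim(mtcars)
dims

# Column names
names(mtcars)

# Type and first few values of every column
str(mtcars)

# Minimum, quartiles, mean, and maximum of one column
summary(mtcars$mpg)

# n controls how many rows head() and tail() return
first_three <- head(mtcars, n = 3)
first_three
```

`str()` is often the most useful first look, since it shows the size, the column types, and sample values in one call.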
# install.packages("visdat")
??visdat
library("visdat")
vis_dat(df)
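If visdat is not installed, a rough text version of the same overview (column classes and missing values) can be built in base R. This sketch is not what `vis_dat()` does internally; the object `toy` and the inserted `NA` are made up for illustration.

```r
# Small copy of the data with one missing value added on purpose
toy <- head(mtcars, 5)
toy$mpg[2] <- NA

# Class of each column
col_class <- sapply(toy, class)

# Number of missing values in each column
n_missing <- colSums(is.na(toy))

# One row per column: its class and how many values are missing
data.frame(class = col_class, n_missing = n_missing)
```

The plot from `vis_dat()` is easier to scan for wide data, but this table is handy in a console-only session.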