Install R, then RStudio: https://posit.co/download/rstudio-desktop/
Read the help files on the official R website, starting with the R Manuals: https://cran.r-project.org/manuals.html
The help() function and the ? help operator in R provide access to the documentation pages for R functions, data sets, and other objects, both for packages in the standard R distribution and for contributed packages. https://www.r-project.org/help.html
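As a small illustration of the two forms described above, the sketch below looks up the page for mean() and then lists functions by name pattern. The variable name read_functions is only illustrative, and apropos() is an extra base R helper not mentioned in the text.

```r
# Open the documentation page for a function; the two forms are equivalent.
help("mean")
?mean

# List the functions on the search path whose names start with "read.".
read_functions <- apropos("^read\\.")
read_functions
```

In RStudio the help page opens in the Help pane; in a plain R console it is shown as text.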
Good coding style is like using correct punctuation. You can manage without it, but it sure makes things easier to read. As with styles of punctuation, there are many possible variations. The following guide describes the style that I use (in this book and elsewhere). It is based on Google’s R style guide, with a few tweaks. You don’t have to use my style, but you really should use a consistent style.
http://adv-r.had.co.nz/Style.html
File names should be meaningful and informative.
Variable and function names should be lowercase. Use an underscore (_) to separate words.
Place spaces around all infix operators (=, +, -, <-, etc.).
Comment your code. Each line of a comment should begin with the comment symbol and a single space: # . Comments should explain the why, not the what.
Use commented lines of - and = to break up your file into easily readable chunks.
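A minimal sketch of these rules applied together, assuming the built-in mtcars data set; the names car_data, km_per_litre, and mean_km_per_litre are made up for illustration.

```r
# Load data ---------------------------

# mtcars ships with R, so nothing needs to be downloaded.
car_data <- mtcars

# Summarise data ----------------------

# Convert to km per litre because the intended readers use metric units.
km_per_litre <- car_data$mpg * 0.425144

mean_km_per_litre <- mean(km_per_litre)
mean_km_per_litre
```

Note that the comments say why a step is taken (metric readers), not what the line of code does.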
Data Type: In R, “data type” refers to the classification of variables or values, such as numeric, integer, character, etc., which determines how data is stored in memory and what operations can be performed on it.
Data Structure: A “data structure” refers to the way data is organized and stored in R objects, such as vectors, lists, matrices, and data frames.
The primary data types in R are:
Numeric: Used for numeric (real) values. Examples: 1.5, 3, 0.25
Integer: Used for integer values; the L suffix marks a literal as an integer. Examples: 1L, 2L, 100L
Logical: Used for Boolean values. Examples: TRUE, FALSE
Character: Used for strings of text. Examples: "Hello", "R programming"
Factor: A class for categorical variables, stored as integer codes with labels. Example: factor(c("low", "medium", "high"))
Date: A class for dates. Example: as.Date("2023-01-01")
POSIXct and POSIXlt: Classes for date-time values. POSIXct stores seconds since the Unix epoch; POSIXlt stores the date-time components. Example: as.POSIXct("2023-01-01 12:00:00")
Raw: Used for storing raw bytes. Example: as.raw(c(0x01, 0x02, 0x03))
Complex: Used for complex numbers with real and imaginary parts. Example: 1 + 2i
These data types help R manage different kinds of information and determine how operations such as arithmetic, comparisons, and transformations are performed on them. Understanding these types is crucial for effective data manipulation, analysis, and programming in R.
R data types specify the kind of data that can be stored in a variable.
Choosing the right data type matters for efficient memory use and precise computation.
Each R data type has its own properties and associated operations.
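One way to check these types at the console is with typeof() and class(). The sketch below is illustrative only; the variable names size and new_year are made up for the example.

```r
typeof(1.5)        # "double": R's storage type for numeric values
typeof(3)          # "double": a plain 3 is not stored as an integer
typeof(3L)         # "integer"
typeof(TRUE)       # "logical"
typeof("Hello")    # "character"
typeof(1 + 2i)     # "complex"
typeof(as.raw(1))  # "raw"

# Factors and dates are classes built on top of the basic types.
size <- factor(c("low", "medium", "high"))
class(size)        # "factor"
typeof(size)       # "integer"

new_year <- as.Date("2023-01-01")
class(new_year)    # "Date"
typeof(new_year)   # "double"
unclass(new_year)  # 19358, the number of days since 1970-01-01
```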
In R, data structures are fundamental components used to store and organize data efficiently. Each data structure has specific characteristics that determine how data is stored in memory and what operations can be performed on it.
Understanding these structures is crucial for effectively manipulating and analyzing data in R.
Vectors:
Atomic Vectors: These are one-dimensional arrays that hold elements of a single data type, such as numeric, character, or logical.
Example: c(1, 2, 3, 4) creates a numeric vector.
Lists: Lists are also one-dimensional but can hold elements of different types or structures.
Example: list(1, "a", TRUE) creates a list with numeric, character, and logical elements.
Matrices:
Matrices are two-dimensional arrays where all elements are of the same data type (numeric, character, etc.).
Example: matrix(1:6, nrow=2, ncol=3) creates a 2x3 matrix.
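The difference between the three structures above shows up when mixed values are stored. This is a small sketch with illustrative variable names (numbers, mixed, items, m).

```r
# An atomic vector holds one type, so mixed input is coerced.
numbers <- c(1, 2, 3, 4)
mixed <- c(1, "a", TRUE)
class(numbers)  # "numeric"
class(mixed)    # "character": every element became a string

# A list keeps each element's own type.
items <- list(1, "a", TRUE)
sapply(items, class)  # "numeric" "character" "logical"

# A matrix is filled column by column by default.
m <- matrix(1:6, nrow = 2, ncol = 3)
dim(m)   # 2 3
m[2, 3]  # 6: the value in row 2, column 3
```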
Arrays:
Arrays generalize matrices to multiple dimensions, where all elements are of the same data type.
Example: array(1:24, dim=c(2, 3, 4)) creates a 2x3x4 array.
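Indexing an array works like indexing a matrix, with one index per dimension. A brief sketch using the same example (the name a is illustrative):

```r
a <- array(1:24, dim = c(2, 3, 4))

dim(a)      # 2 3 4
a[1, 1, 1]  # 1: the first element
a[2, 3, 4]  # 24: the last element
a[, , 2]    # the second 2x3 slice, holding the values 7 to 12
```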
Data Frames:
Data frames are two-dimensional structures similar to tables in databases or spreadsheets.
Columns can be of different data types (numeric, character, etc.).
Example: data.frame(id=c(1, 2, 3), name=c("Alice", "Bob", "Charlie")) creates a data frame with columns “id” and “name”.
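A short sketch of common ways to inspect and subset this data frame. The names people and later_ids are illustrative, and the column classes in the comment assume R 4.0 or later, where strings are not converted to factors by default.

```r
people <- data.frame(id = c(1, 2, 3), name = c("Alice", "Bob", "Charlie"))

# Each column keeps its own type.
sapply(people, class)  # id is "numeric", name is "character"

# Select one column as a vector.
people$name

# Keep only the rows that satisfy a condition.
later_ids <- people[people$id > 1, ]
later_ids
```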
Factors:
Factors are used to represent categorical data in R.
They are stored as integers with corresponding levels (categories).
Example: factor(c("low", "high", "medium", "low"), levels=c("low", "medium", "high")) creates a factor with levels “low”, “medium”, and “high”.
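The integer storage mentioned above can be seen directly. In this sketch the name size is illustrative.

```r
size <- factor(
  c("low", "high", "medium", "low"),
  levels = c("low", "medium", "high")
)

levels(size)      # "low" "medium" "high"
as.integer(size)  # 1 3 2 1: each value's position in the levels
table(size)       # counts per level: low 2, medium 1, high 1
```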
Lists:
Lists in R are versatile data structures that can contain elements of different types and lengths.
Example: list(1, "a", c(1, 2, 3)) creates a list with numeric, character, and numeric vector elements.
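When working with lists, the distinction between [[ ]] and [ ] matters: the first extracts an element, the second returns a shorter list. A small sketch (the name record is illustrative):

```r
record <- list(1, "a", c(1, 2, 3))

length(record)   # 3
record[[3]]      # the numeric vector 1 2 3
record[3]        # a list of length one that contains that vector
record[[3]][2]   # 2: the second value inside the third element
```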
Data Tables (from the data.table package):
Data tables are enhanced data frames optimized for large datasets and efficient operations.
Example: data.table(id=c(1, 2, 3), name=c("Alice", "Bob", "Charlie")) creates a data table with columns “id” and “name”.
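A hedged sketch of the same example, assuming the data.table package is installed (it is not part of base R). The block is skipped if the package is missing, and the name people is illustrative.

```r
# Assumption: data.table was installed with install.packages("data.table").
if (requireNamespace("data.table", quietly = TRUE)) {
  library(data.table)

  people <- data.table(id = c(1, 2, 3), name = c("Alice", "Bob", "Charlie"))

  # A data table is also a data frame, so data frame functions still work.
  print(class(people))  # "data.table" "data.frame"

  # Rows can be filtered inside [ ] without repeating the table name.
  print(people[id > 1])
}
```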
Each structure offers different capabilities and efficiencies depending on the nature of the data and the tasks being performed. By leveraging the appropriate data structure, you can optimize your workflow and enhance your ability to work with data in R effectively.
https://swcarpentry.github.io/r-novice-inflammation/13-supp-data-structures.html
Let me begin by introducing basic math operations.
https://www.codecademy.com/resources/docs/r/operators
2 + 2 # addition
## [1] 4
5 - 4 # subtraction
## [1] 1
2 * 3 # multiplication
## [1] 6
8 / 3 # division
## [1] 2.666667
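Beyond the four operations shown above, R has a few more arithmetic operators that come up early. The sketch below lists them with the values they return.

```r
2 ^ 3        # exponentiation: 8
8 %/% 3      # integer division: 2
8 %% 3       # remainder (modulo): 2

# Multiplication binds tighter than addition; parentheses override this.
2 + 3 * 4    # 14
(2 + 3) * 4  # 20
```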
# Plot data ---------------------------
hist(mtcars$mpg)
sessionInfo()
## R version 4.5.1 (2025-06-13)
## Platform: aarch64-apple-darwin20
## Running under: macOS Sequoia 15.5
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.1
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## time zone: America/New_York
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## loaded via a namespace (and not attached):
## [1] digest_0.6.37 R6_2.6.1 fastmap_1.2.0 xfun_0.52
## [5] cachem_1.1.0 knitr_1.50 htmltools_0.5.8.1 rmarkdown_2.29
## [9] lifecycle_1.0.4 cli_3.6.5 sass_0.4.10 jquerylib_0.1.4
## [13] compiler_4.5.1 rstudioapi_0.17.1 tools_4.5.1 evaluate_1.0.4
## [17] bslib_0.9.0 yaml_2.3.10 rlang_1.1.6 jsonlite_2.0.0
4.1 Comments
numeric - (10.5, 55, 787)
integer - (1L, 55L, 100L, where the letter “L” declares this as an integer)
character (a.k.a. string) - (“k”, “R is exciting”, “FALSE”, “11.5”)
logical (a.k.a. boolean) - (TRUE or FALSE)
complex - (9 + 3i, where “i” is the imaginary part)
factor - represents categorical data where the possible values (levels) are known and finite.
https://www.programiz.com/r/data-types
https://www.geeksforgeeks.org/r-data-types/