R is a programming language and environment for statistical computing, started as a project by Ross Ihaka and Robert Gentleman in the 1990s. An R script is simply a text file containing a set of commands and comments. It can be saved and re-executed, and its commands can be modified and edited before being interpreted by the computer (https://www.oreilly.com/library/view/hands-on-programming-with/9781449359089/ch01.html).
R has both advantages and disadvantages. Its advantages include being open source, data wrangling, a wide array of packages, quality plotting and graphing, platform independence, machine learning operations, and continuous growth. Its disadvantages include a weak origin, data handling, basic security, a complicated language, and lesser speed.
For further details on the advantages and disadvantages of R, refer to https://www.javatpoint.com/r-advantages-and-disadvantages.
The first activity assigned to us was to familiarize ourselves with a simple example of an R script. Before the activity began, we were asked to write our Sex, Weight in lbs, and Height in inches on the board. Each student participated, and the information collected served as our class data, which we used in the next activities.
The first step was to load the rcompanion package.
code:
library(rcompanion)
To distinguish variables easily, I usually use capital letters for column names.
Input =(" AGE WEIGHT
1 4.4
3 5.3
5 7.2
2 5.2
11 8.5
9 7.3
3 6.0
9 10.4
12 10.2
3 6.1 ")
Description:
To run the code in R script, highlight or select the block. Click the Run button in the Script Editor panel toolbar to run the selected block. After clicking the Run button, three things may happen: 1) the code is transferred to the command console, 2) the code is executed, and 3) the cursor moves to the next line in the script (http://mercury.webster.edu/aleshunas/R_learning_infrastructure/R%20scripts.html).
code:
egdata <-read.table(textConnection(Input), header=TRUE)
egdata
Figure 1. Selected code transferred to console after clicking Run button
attach(egdata)
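As an aside, a minimal sketch of the same import without textConnection() and without attach(): read.table() also accepts the data through its text argument, and $ or with() reach the columns explicitly. The names egdata2, mean_weight, and age_weight_cor are assumptions for illustration; the ten rows are the ones entered above.

```r
# Same ten rows as above, passed straight to read.table() via `text`.
Input <- "AGE WEIGHT
1 4.4
3 5.3
5 7.2
2 5.2
11 8.5
9 7.3
3 6.0
9 10.4
12 10.2
3 6.1"

# egdata2 is an illustrative name, not one used in the original script.
egdata2 <- read.table(text = Input, header = TRUE)

# Reach the columns explicitly instead of relying on attach().
mean_weight <- mean(egdata2$WEIGHT)
age_weight_cor <- with(egdata2, cor(AGE, WEIGHT))

str(egdata2)   # shows the number of rows and the type of each column
```

Explicit access avoids the name clashes that attach() can cause when two data frames share column names.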
Description
We were instructed to calculate the mean of WEIGHT, the standard deviation of WEIGHT, and the correlation of AGE with WEIGHT using the code below:
mean(WEIGHT)
sd(WEIGHT)
cor(AGE,WEIGHT)
Figure 2. Mean, Standard deviation of WEIGHT, and correlation of AGE to WEIGHT
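To make clear what these three functions compute, here is a hedged sketch that recomputes the sample mean, the sample standard deviation (with the n - 1 denominator), and the Pearson correlation from their textbook formulas and compares them with the built-in results. The variable names ending in _manual are assumptions for illustration.

```r
# The ten AGE and WEIGHT values from the class example above.
AGE <- c(1, 3, 5, 2, 11, 9, 3, 9, 12, 3)
WEIGHT <- c(4.4, 5.3, 7.2, 5.2, 8.5, 7.3, 6.0, 10.4, 10.2, 6.1)
n <- length(WEIGHT)

# Sample mean: sum of the values divided by their count.
mean_manual <- sum(WEIGHT) / n

# Sample standard deviation: square root of the sum of squared
# deviations divided by n - 1.
sd_manual <- sqrt(sum((WEIGHT - mean_manual)^2) / (n - 1))

# Pearson correlation: sample covariance divided by the product
# of the two sample standard deviations.
cov_manual <- sum((AGE - mean(AGE)) * (WEIGHT - mean_manual)) / (n - 1)
cor_manual <- cov_manual / (sd(AGE) * sd_manual)

# Each comparison should report TRUE.
all.equal(mean_manual, mean(WEIGHT))
all.equal(sd_manual, sd(WEIGHT))
all.equal(cor_manual, cor(AGE, WEIGHT))
```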
code:
plot(AGE,WEIGHT)
Figure 3. Simple plot of AGE and WEIGHT
Description: Advantages and disadvantages of qplot and ggplot2
There are different ways to plot data. Two commonly used functions for easily creating and combining different types of plots are qplot() (quick plot) and ggplot(), both from the ggplot2 package.
qplot is designed primarily for interactive use: it makes a number of assumptions that speed most cases, but when designing multilayered plots with different data sources it can get in the way (https://ggplot2.tidyverse.org/reference/translate_qplot_ggplot.html). qplot is therefore somewhat less flexible than ggplot(), whereas ggplot2 allows the user to add, remove, or alter components in a plot at a high level of abstraction (https://en.wikipedia.org/wiki/Ggplot2).
qplot is similar to R's plot() function, while more complex plotting capacity is available via ggplot(), which exposes the user to more explicit elements of the grammar (https://en.wikipedia.org/wiki/Ggplot2).
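The difference described above can be sketched with the same scatter plot written both ways. This is only an illustration under stated assumptions: the object names p_quick and p_full are hypothetical, the data frame is rebuilt here so the block runs on its own, and recent ggplot2 releases may warn that qplot() is deprecated.

```r
library(ggplot2)   # assumes the ggplot2 package is installed

# The ten AGE and WEIGHT values from the class example.
egdata <- data.frame(
  AGE = c(1, 3, 5, 2, 11, 9, 3, 9, 12, 3),
  WEIGHT = c(4.4, 5.3, 7.2, 5.2, 8.5, 7.3, 6.0, 10.4, 10.2, 6.1)
)

# qplot(): a single call with plot()-like arguments; the geom is guessed.
# Recent ggplot2 versions may emit a deprecation warning here.
p_quick <- qplot(AGE, WEIGHT, data = egdata)

# ggplot(): data, aesthetic mapping, and layer are stated separately,
# so further layers can be added or swapped later with `+`.
p_full <- ggplot(egdata, aes(x = AGE, y = WEIGHT)) +
  geom_point()

# Printing either object draws the plot:
# print(p_quick)
# print(p_full)
```

Both calls return a ggplot object, which is why layers such as titles or smoothers can be added to either one afterwards.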
Insight
Both qplot and ggplot are useful functions, though each has advantages over the other. In my understanding, qplot is like the basic function for creating plots in RStudio, which can be easier and faster because its code is simple. ggplot is the more modern way of creating plots with many components. However, ggplot can be more complicated, especially the code needed to build the desired plot. In terms of how the results are presented (e.g., graphs), ggplot2 has a better layout, with a darker (gray) background that helps readers understand the result.
Description
In this activity, we were instructed to use ggplot2 to create a simple plot of the AGE and WEIGHT data.
code
library(ggplot2)
Description
ggplot2 was used in this activity to create a simple plot. The code below builds and displays the plot of the AGE and WEIGHT data.
egplot <- ggplot(data=egdata, aes(x=AGE,y=WEIGHT))+ geom_point(shape=18,color="red")
egplot
Figure 4. Simple plot of AGE and WEIGHT using ggplot2
Description
The code below adds a fitted line, a title, and axis labels to the simple ggplot2 plot. A plot with complete details helps readers understand the result.
egplot <- egplot + ggtitle("Plot of WEIGHT on AGE") +
  stat_smooth(method="lm", se=TRUE) +
  scale_x_continuous(name="AGE") +
  scale_y_continuous(name="WEIGHT (lbs)")
egplot
Figure 5. Simple plot using ggplot2 with line, title, and labels
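The line drawn by stat_smooth(method = "lm") is a least-squares regression of WEIGHT on AGE, and with se = TRUE the shaded band is a confidence interval around it (95% by default). As a hedged sketch, the same fit can be inspected numerically with lm() and predict(). The names fit, slope_check, new_ages, and band are assumptions for illustration, as are the three ages chosen for prediction.

```r
# The ten AGE and WEIGHT values from the class example.
egdata <- data.frame(
  AGE = c(1, 3, 5, 2, 11, 9, 3, 9, 12, 3),
  WEIGHT = c(4.4, 5.3, 7.2, 5.2, 8.5, 7.3, 6.0, 10.4, 10.2, 6.1)
)

# Least-squares fit of WEIGHT on AGE, the model behind the smoothed line.
fit <- lm(WEIGHT ~ AGE, data = egdata)
coef(fit)   # intercept and slope of the fitted line

# For simple regression the slope equals r * sd(y) / sd(x).
slope_check <- with(egdata, cor(AGE, WEIGHT) * sd(WEIGHT) / sd(AGE))

# Fitted values with a 95% confidence interval at a few illustrative ages.
new_ages <- data.frame(AGE = c(2, 6, 10))
band <- predict(fit, newdata = new_ages, interval = "confidence", level = 0.95)
band   # columns: fit, lwr, upr
```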
Reproducible research is growing rapidly, and it is very important to present the analysis of results neatly and clearly so that other researchers can better understand what was done. R Markdown allows us to create neat records of our analysis by compiling the data and analysis together like a notebook. R Markdown uses Markdown syntax, a very simple 'markup' language for creating documents with headers, images, links, etc. The document can also be converted to other file types, such as .html or .pdf, to display functions and components.
source: https://ourcodingclub.github.io/2016/11/24/rmarkdown-1.html
The YAML header at the top of an .Rmd script sets the output format. For example, a header containing output: pdf_document creates a PDF document.
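Since the header itself is not reproduced here, the following is a minimal sketch of what such a YAML header could look like, written from R into a temporary .Rmd file. It assumes the intended output format is pdf_document (rmarkdown's PDF format); the title and the temporary file are hypothetical.

```r
# Lines of a minimal YAML header; the title is a hypothetical placeholder.
yaml_header <- c(
  "---",
  'title: "Intro to R and R Markdown"',
  "output: pdf_document",   # assumed format; html_document would give HTML
  "---"
)

# Write the header to a temporary .Rmd file and print it back.
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(yaml_header, rmd_file)
cat(readLines(rmd_file), sep = "\n")
```

The header must be the first thing in the file and is delimited by the two lines of three dashes.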
Description
In the code chunk for this step, I use include=FALSE so that the chunk is evaluated but neither the code nor its output is displayed. With the echo=TRUE option, the code is shown along with its results/output. If echo is set to FALSE, the code is not shown in the final document (though any results/output are still displayed).
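As a hedged illustration of these options, the comments below list the chunk headers as they would appear in the .Rmd file, and the call to opts_chunk$set() shows how the same option can be made the default for all later chunks. The chunk label setup is the conventional one and is an assumption here.

```r
library(knitr)   # assumes the knitr package is installed

# Per-chunk options are written in the chunk header inside the .Rmd:
#   {r setup, include=FALSE}   chunk runs; code and output both hidden
#   {r, echo=FALSE}            chunk runs; output shown, code hidden
#   {r, echo=TRUE}             chunk runs; code and output both shown

# The same option can be set once as the default for every later chunk,
# which is what a typical setup chunk does.
opts_chunk$set(echo = TRUE)

# Query the current default to confirm the setting.
opts_chunk$get("echo")
```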
Description
This is an R Markdown document. By clicking the Knit button, a document will be generated that includes both the content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
Note
I first clean up the R environment to ensure that the correct dataset is used. The function setwd() tells R to use the directory that contains the input files; this is also the location where output files are saved.
rm(list=ls())
setwd("C:/Users/April Mae Tabonda/Documents/MS Marine Science/Biostat/PLP/RMDs/PLP_4 Intro to R and R Markdown/")
I then checked that the working directory was set correctly. The output is shown below.
## [1] "C:/Users/April Mae Tabonda/Documents/MS Marine Science/Biostat/PLP/RMDs/PLP_4 Intro to R and R Markdown"
## character(0)
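A slightly more defensive version of this step can be sketched as follows: confirm that the folder exists before calling setwd(), then confirm the change with getwd(). This is only a sketch under stated assumptions; tempdir() stands in for the project folder so the block runs anywhere, and project_dir, old_dir, and current_dir are hypothetical names.

```r
# Hypothetical stand-in for the project folder; replace with the real path.
project_dir <- tempdir()

# Only change directory if the folder actually exists.
if (dir.exists(project_dir)) {
  old_dir <- setwd(project_dir)   # setwd() returns the previous directory
} else {
  stop("Directory not found: ", project_dir)
}

# Confirm where R is now working and list the files found there.
current_dir <- getwd()
current_dir
list.files()

# Restore the previous working directory when finished.
setwd(old_dir)
```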
Once the working directory was recognized, I read in and attached the class data for this activity using the code below:
htwt.data<-read.csv(file="htwt.csv",
header=TRUE, sep=",")
attach(htwt.data)
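Because htwt.csv itself is not included here, the sketch below uses a few made-up rows to show the round trip and a quick structure check after import. Everything in it is an assumption for illustration: the values, the file name htwt_demo.csv, and the object names demo and htwt_demo. Only the column names SEX, WEIGHT, and HEIGHT follow the class data.

```r
# Hypothetical rows standing in for the class data (values are made up).
demo <- data.frame(
  SEX = c("F", "M", "F", "M"),
  WEIGHT = c(110, 150, 125, 160),   # lbs
  HEIGHT = c(62, 69, 64, 70)        # inches
)

# Write the rows to a temporary CSV file, then read them back.
csv_file <- file.path(tempdir(), "htwt_demo.csv")
write.csv(demo, csv_file, row.names = FALSE)

# header = TRUE and sep = "," are already the defaults of read.csv().
htwt_demo <- read.csv(csv_file)

# Check the structure before analysis, using $ rather than attach().
str(htwt_demo)
mean(htwt_demo$HEIGHT)
```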
The psych package can be used for summary statistics by group.
Load the psych package by running the code below.
library(psych)
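One way the psych package provides grouped summaries is through describeBy(). The sketch below is a hedged example on made-up rows, not the class data; the object names demo, by_sex, and height_means are hypothetical. A base R tapply() call is included as a cross-check of the group means.

```r
library(psych)   # assumes the psych package is installed

# Hypothetical rows standing in for the class data (values are made up).
demo <- data.frame(
  SEX = c("F", "F", "F", "M", "M", "M"),
  WEIGHT = c(110, 120, 125, 150, 160, 170),   # lbs
  HEIGHT = c(62, 64, 63, 69, 70, 71)          # inches
)

# Descriptive statistics of the numeric columns, one table per SEX group.
by_sex <- describeBy(demo[c("HEIGHT", "WEIGHT")], group = demo$SEX)
by_sex

# Base R cross-check of the group means for HEIGHT.
height_means <- tapply(demo$HEIGHT, demo$SEX, mean)
height_means
```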
In this activity, we used a boxplot to compare HEIGHT between males and females in our class data. We ran the code below.
boxplot(HEIGHT ~ SEX, data=htwt.data,
main="HEIGHT by SEX",
ylab="HEIGHT(in)",
xlab="SEX")
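The numbers behind each box can also be read directly. As a hedged sketch on made-up rows (not the class data), calling boxplot() with plot = FALSE returns the five summary values per group instead of drawing them. The object names demo and bp are hypothetical.

```r
# Hypothetical rows standing in for the class data (values are made up).
demo <- data.frame(
  SEX = rep(c("F", "M"), each = 5),
  HEIGHT = c(60, 62, 63, 64, 66, 67, 69, 70, 71, 73)   # inches
)

# plot = FALSE returns the statistics without opening a graphics device.
bp <- boxplot(HEIGHT ~ SEX, data = demo, plot = FALSE)

bp$names   # group labels, in the order of the columns of bp$stats
bp$stats   # rows: lower whisker, lower hinge, median, upper hinge, upper whisker
bp$n       # number of observations in each group
```

The third row of bp$stats holds the medians, which are the thick lines drawn inside the boxes.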