Intro to R

R is a programming language and environment for statistical computing and graphics, started by Ross Ihaka and Robert Gentleman in the 1990s. An R script is simply a text file containing a set of commands and comments that the computer interprets. It can be saved and re-executed, and its commands can be modified and edited (https://www.oreilly.com/library/view/hands-on-programming-with/9781449359089/ch01.html).
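As a minimal sketch of what such a script can look like (the file name and the values here are made up for illustration and are not part of the class activity):

```r
# example_script.R (hypothetical file name)
# Lines starting with '#' are comments; R ignores them when running the script.

x <- c(2, 4, 6)   # a command: store three numbers in a vector called x
total <- sum(x)   # a command: add up the values in x
print(total)      # show the result in the console
```

Saving these lines in a file lets the same commands be re-run or edited later.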

R has both advantages and disadvantages. Its advantages include being open source, data wrangling, a wide array of packages, quality plotting and graphing, platform independence, machine learning operations, and continuous growth. Some of its disadvantages are weak origin, data handling, basic security, a complicated language, and lesser speed.

For further details of the advantages and disadvantages of R kindly refer to https://www.javatpoint.com/r-advantages-and-disadvantages.

Simple example of R script activity

The first activity assigned to us was to become familiar with a simple example of an R script. Before the activity began, we were asked to write our sex, weight in lbs, and height in inches on the board. Each student participated, and the information collected served as our class data, which we used in our next activities.

The first step was to load the package

code:

library(rcompanion)

To make variables easy to distinguish, I usually use capital letters for column names.

Input = ("
AGE WEIGHT
1 4.4
3 5.3
5 7.2
2 5.2
11 8.5
9 7.3
3 6.0
9 10.4
12 10.2
3 6.1
")

Description:

To run code in an R script, highlight or select the block, then click the Run button in the Script Editor panel toolbar. After clicking Run, three things happen: 1) the code is transferred to the command console, 2) the code is executed, and 3) the cursor moves to the next line in the script (http://mercury.webster.edu/aleshunas/R_learning_infrastructure/R%20scripts.html).

code:

egdata <- read.table(textConnection(Input), header = TRUE)

egdata

Figure 1. Selected code transferred to console after clicking Run button
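Before going further, it can help to confirm that the data were read in as expected. The sketch below repeats the same data and uses the text argument of read.table(), which, as far as I understand, is an equivalent shortcut for textConnection(). The checks with names(), nrow(), and str() are my own addition and were not part of the class instructions.

```r
# Same AGE/WEIGHT values as in the activity, kept in one text string
Input <- "
AGE WEIGHT
1 4.4
3 5.3
5 7.2
2 5.2
11 8.5
9 7.3
3 6.0
9 10.4
12 10.2
3 6.1
"

# text = ... reads directly from the string; header = TRUE uses the first row as column names
egdata <- read.table(text = Input, header = TRUE)

names(egdata)  # column names
nrow(egdata)   # number of rows (observations)
str(egdata)    # type of each column
```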

Attach the data frame using the code below, so that its columns (AGE and WEIGHT) can be referred to directly by name

attach(egdata)
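attach() is convenient, but attached columns can be confused with other objects of the same name. As a sketch of two alternatives that do not need attach(), assuming the same data stored in a data frame built directly with data.frame():

```r
# Same values as the class example, entered directly as a data frame
egdata <- data.frame(
  AGE    = c(1, 3, 5, 2, 11, 9, 3, 9, 12, 3),
  WEIGHT = c(4.4, 5.3, 7.2, 5.2, 8.5, 7.3, 6.0, 10.4, 10.2, 6.1)
)

# Option 1: refer to a column with the $ operator
mean_dollar <- mean(egdata$WEIGHT)

# Option 2: evaluate an expression inside the data frame with with()
mean_with <- with(egdata, mean(WEIGHT))

mean_dollar
mean_with
```

Both options give the same mean as mean(WEIGHT) after attach(egdata).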

Description

We were instructed to calculate the mean of WEIGHT, the standard deviation of WEIGHT, and the correlation between AGE and WEIGHT using the code below:

mean(WEIGHT)

sd(WEIGHT)

cor(AGE,WEIGHT)

Figure 2. Mean, Standard deviation of WEIGHT, and correlation of AGE to WEIGHT
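To see what these three functions compute, here is a sketch that repeats the calculations by hand with the usual textbook formulas (sample standard deviation with n - 1, and Pearson correlation) and compares them with the built-in results. The names ending in _by_hand are my own and only for illustration.

```r
# Same values as the class example
AGE    <- c(1, 3, 5, 2, 11, 9, 3, 9, 12, 3)
WEIGHT <- c(4.4, 5.3, 7.2, 5.2, 8.5, 7.3, 6.0, 10.4, 10.2, 6.1)

n <- length(WEIGHT)

# Mean: sum of the values divided by how many there are
mean_by_hand <- sum(WEIGHT) / n

# Sample standard deviation: square root of the sum of squared deviations over (n - 1)
sd_by_hand <- sqrt(sum((WEIGHT - mean_by_hand)^2) / (n - 1))

# Pearson correlation: sum of cross-products of the deviations, divided by
# the square root of the product of the two sums of squared deviations
dev_age    <- AGE - mean(AGE)
dev_weight <- WEIGHT - mean_by_hand
cor_by_hand <- sum(dev_age * dev_weight) /
  sqrt(sum(dev_age^2) * sum(dev_weight^2))

# Each comparison should report TRUE
all.equal(mean_by_hand, mean(WEIGHT))
all.equal(sd_by_hand, sd(WEIGHT))
all.equal(cor_by_hand, cor(AGE, WEIGHT))
```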

Simple plot of AGE and WEIGHT

code:

plot(AGE,WEIGHT)

Figure 3. Simple plot of AGE and WEIGHT
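The base plot() function also accepts arguments for a title, axis labels, and point style. The sketch below is my own variation on the class code; the title and label text are assumptions chosen for illustration.

```r
# Same values as the class example
AGE    <- c(1, 3, 5, 2, 11, 9, 3, 9, 12, 3)
WEIGHT <- c(4.4, 5.3, 7.2, 5.2, 8.5, 7.3, 6.0, 10.4, 10.2, 6.1)

plot(AGE, WEIGHT,
     main = "Plot of WEIGHT on AGE",  # plot title
     xlab = "AGE",                    # x-axis label
     ylab = "WEIGHT (lbs)",           # y-axis label
     pch  = 18,                       # point symbol: filled diamond
     col  = "red")                    # point color
```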

Plot using the ggplot2 package: load the package

Description: Advantages and disadvantages of qplot and ggplot

There are different ways to plot data. Two commonly used functions for easily creating and combining different types of plots are qplot() (quick plot) and ggplot(), both of which belong to the ggplot2 package.

qplot is designed primarily for interactive use: it makes a number of assumptions that speed up most cases, but when designing multilayered plots with different data sources it can get in the way (https://ggplot2.tidyverse.org/reference/translate_qplot_ggplot.html). qplot is therefore somewhat less flexible than ggplot(). In contrast, ggplot2 allows the user to add, remove, or alter components in a plot at a high level of abstraction (https://en.wikipedia.org/wiki/Ggplot2).

qplot() is similar to base R's plot() function, while more complex plotting capacity is available via ggplot(), which exposes the user to more explicit elements of the grammar (https://en.wikipedia.org/wiki/Ggplot2).
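To make the comparison concrete, here is a sketch of the same scatter plot written both ways, assuming the ggplot2 package is installed. As far as I know, recent ggplot2 releases mark qplot() as deprecated, so the first call may print a warning. The object names p_quick and p_full are my own.

```r
library(ggplot2)

# Same values as the class example
egdata <- data.frame(
  AGE    = c(1, 3, 5, 2, 11, 9, 3, 9, 12, 3),
  WEIGHT = c(4.4, 5.3, 7.2, 5.2, 8.5, 7.3, 6.0, 10.4, 10.2, 6.1)
)

# Quick form: qplot() guesses that two numeric variables mean a scatter plot
p_quick <- qplot(AGE, WEIGHT, data = egdata)

# Explicit form: the data, the aesthetic mapping, and the point layer are each spelled out
p_full <- ggplot(data = egdata, aes(x = AGE, y = WEIGHT)) +
  geom_point()

p_quick
p_full
```

The explicit form is longer, but each piece (data, mapping, layer) can be changed or extended on its own.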

Insight

Both qplot() and ggplot() are useful functions. However, one may have an advantage over the other. In my understanding, qplot() is the quick, basic way to create plots, which can be easier and faster because the code is simple. ggplot() is the fuller approach, building a plot from many components. However, ggplot() can be more complicated, especially the code needed to produce the desired plot.

Description

In this activity, we were instructed to use ggplot2 to create a simple plot of the AGE and WEIGHT data.

code

library(ggplot2)

Simple plot from ggplot2

Description

In this activity, ggplot2 was used to create and display a simple plot of the AGE and WEIGHT data using the code below.

egplot <- ggplot(data = egdata, aes(x = AGE, y = WEIGHT)) + geom_point(shape = 18, color = "red")

egplot

Figure 4. Simple plot of AGE and WEIGHT using ggplot2

Description

The code below adds a fitted line, a title, and axis labels to the simple ggplot2 plot. A plot with complete details helps readers understand the result easily.

egplot <- egplot +
  ggtitle("Plot of WEIGHT on AGE") +
  stat_smooth(method = "lm", se = TRUE) +
  scale_x_continuous(name = "AGE") +
  scale_y_continuous(name = "WEIGHT (lbs)")

egplot

Figure 5. Simple plot using ggplot2 with line, title, and labels
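The line drawn by stat_smooth(method = "lm") is a least-squares straight line of WEIGHT on AGE. As a sketch of how to get the intercept and slope of that line as numbers, the base R function lm() can be fitted to the same data. The object name fit is my own, and this step was not part of the class instructions.

```r
# Same values as the class example
egdata <- data.frame(
  AGE    = c(1, 3, 5, 2, 11, 9, 3, 9, 12, 3),
  WEIGHT = c(4.4, 5.3, 7.2, 5.2, 8.5, 7.3, 6.0, 10.4, 10.2, 6.1)
)

# Fit WEIGHT as a linear function of AGE
fit <- lm(WEIGHT ~ AGE, data = egdata)

coef(fit)     # intercept and slope of the fitted line
summary(fit)  # fuller output, including standard errors and R-squared
```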

Quit example

Description

After running the functions in the R script, save the script as an .R file, and use the code below to quit RStudio.

code

q()