1 Introduction to RStudio

RStudio is an integrated development environment (IDE) for R. An IDE provides a graphical user interface (GUI) in which you can write your code, see the results, and inspect the variables created while you program.

  • RStudio is available as both open-source and commercial software.
  • RStudio is available in both Desktop and Server versions.
  • RStudio is available for various platforms, including Windows, Linux, and macOS.

Download RStudio

After downloading RStudio, install it and run the program.

You will then see a window like this:

[Figure: the RStudio window]

In this layout the lower left pane is the R console, which can be used just like the standard R console. The upper left pane takes the place of a text editor but is far more powerful. The upper right pane holds information about the workspace, command history, files in the current folder and variable information. The lower right pane displays plots, package information and help files.

There are a number of ways to send and execute commands from the editor to the console. To send one line, place the cursor on the desired line and press Ctrl+Enter (Command+Enter on Mac). To send a selection, highlight it and press Ctrl+Enter. To run an entire file of code, press Ctrl+Shift+S.

When typing code, such as an object name or function name, pressing Tab will autocomplete it. If more than one object or function matches the letters typed so far, a dialog pops up listing the matching options, as shown in Figure 2.12.

Pressing Ctrl+1 moves the cursor to the text editor area and Ctrl+2 moves it to the console. To move to the previous tab in the text editor, press Ctrl+Alt+Left on Windows, Ctrl+PageUp on Linux and Ctrl+Option+Left on Mac. To move to the next tab, press Ctrl+Alt+Right on Windows, Ctrl+PageDown on Linux and Ctrl+Option+Right on Mac. On some Windows machines these shortcuts can cause the screen to rotate, so Ctrl+F11 and Ctrl+F12 also move between tabs, though only in the desktop client.

For an almost-complete list of shortcuts, click Help → Keyboard Shortcuts or press Alt+Shift+K on Windows and Linux and Option+Shift+K on Mac. A more complete list is available in the RStudio IDE under the Tools menu: Tools → Keyboard Shortcuts Help.

2 R Packages

R packages are collections of functions that can be used for any number of purposes. Packages are extremely useful because the code for their functions has already been written, so we don’t need to write it ourselves. The base installation of R already includes a number of useful functions for data analysis and plotting, but because R is an open-source programming language, many other packages have been created by other users for a variety of purposes. Packages can easily be downloaded in RStudio from CRAN or GitHub and are stored in a package library on your system. Every package includes documentation for each of the functions it contains.
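To see where packages are stored and which ones are already installed, something like the following can be run in the console. This is only a small sketch using base R functions; the variable name installed is chosen here for illustration.

```r
# Folders (libraries) in which R looks for installed packages
lib_folders <- .libPaths()
lib_folders

# Names of all packages currently installed
installed <- rownames(installed.packages())

# The stats package ships with R itself, so this should be TRUE
has_stats <- "stats" %in% installed
has_stats

# Open the documentation index of a package (interactive sessions only)
if (interactive()) help(package = "stats")
```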

2.1 Installing a package

If you want to use the moments package (for skewness and kurtosis), you first have to install it:

#install.packages("moments")

Installing package into ‘C:/Users/User/AppData/Local/R/win-library/4.3’
trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.3/moments_0.14.1.zip’
Content type ‘application/zip’ length 56091 bytes (54 KB)
downloaded 54 KB

package ‘moments’ successfully unpacked and MD5 sums checked

# Load the moments package
#library(moments)

Now we can calculate skewness, kurtosis and their coefficients.
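As a rough illustration of what these measures are, the sketch below computes skewness and kurtosis from central moments in base R and, only if moments happens to be installed, compares them with moments::skewness and moments::kurtosis. The data vector x is made up for the example, and the formulas assume the moment-based definitions (dividing by n, kurtosis not reduced by 3).

```r
# Illustrative data (not from the text)
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Central moments computed with base R, dividing by n
n  <- length(x)
m2 <- sum((x - mean(x))^2) / n
m3 <- sum((x - mean(x))^3) / n
m4 <- sum((x - mean(x))^4) / n

skew_base <- m3 / m2^(3/2)   # 0.65625
kurt_base <- m4 / m2^2       # 2.78125

# If moments is installed, its functions should agree with the values above
if (requireNamespace("moments", quietly = TRUE)) {
  print(all.equal(moments::skewness(x), skew_base))
  print(all.equal(moments::kurtosis(x), kurt_base))
}
```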

Nowadays you can also install packages directly from GitHub repositories, which is especially useful for getting the development versions of packages. This can be done using devtools.

#install.packages("devtools") 
#library(devtools)
#install_github("gastonstat/plspm")

In the example above, we first install devtools and load it with library. We then use install_github to install the plspm package from the GitHub repository gastonstat/plspm.

2.2 Uninstalling a package

In the rare instance when a package needs to be uninstalled, it is easiest to click the white X inside a gray circle on the right of the package descriptions in RStudio’s Packages pane. Alternatively, this can be done with remove.packages.

#remove.packages("moments")

2.3 Unloading a package

Sometimes a package needs to be unloaded. This is simple enough, either by clearing the checkbox in RStudio’s Packages pane or by using the detach function. The function takes the package name preceded by package:, all in quotes.

#detach("package:moments")
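The effect of loading and unloading can be checked with search(), which lists what is currently attached. The sketch below uses splines, a package that ships with R, purely so that it runs without installing anything; the variable names are illustrative.

```r
library(splines)                                   # attach a package that ships with R
attached_before <- "package:splines" %in% search()
attached_before                                    # TRUE: the package is on the search path

detach("package:splines")                          # unload it again
attached_after <- "package:splines" %in% search()
attached_after                                     # FALSE: it is no longer attached
```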

3 Mathematical Operations

Simple Operations

Adding, subtracting, multiplying, dividing, and using exponents in R is a very simple process. In fact, it uses the same conventions as many other programs, including Microsoft Excel, to perform these operations. Unsurprisingly, the four arithmetic operations are done by using +, -, *, and /. Exponential operations are done using the caret symbol (^) or two asterisks (**).

2+2
## [1] 4
4-2
## [1] 2
3*5
## [1] 15
3**2
## [1] 9
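A few more lines, given here only as a quick sketch, show that the two exponent forms agree and how the operators combine when several appear in one expression. The variable names are illustrative.

```r
same_power <- 2^3 == 2**3   # TRUE: both forms raise 2 to the power 3
same_power

a <- 2 + 3 * 4              # 14: multiplication is done before addition
b <- (2 + 3) * 4            # 20: parentheses change the order
d <- -2^2                   # -4: the exponent is applied before the minus sign
c(a, b, d)
## [1] 14 20 -4
```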

To do the calculations, type the formula into the R console and hit Enter. Alternatively, write the command in the text editor and hit Ctrl+Enter to execute it. Be careful about your order of operations and, if needed, your parentheses, as R follows the standard order of operations. R also has particular conventions about parentheses. Whereas the mathematical formula 2(1 + 2) is completely valid, R does not know how to read it and will return an error. To enter this in a way that R understands, you have to write 2*(1+2)

2*(1+2)
## [1] 6

which returns the answer 6. In addition, every parenthesis that you open in an equation needs a closing parenthesis. If an open parenthesis does not have a matching close parenthesis, R will not return your answer, but will show an empty line with a + sign at the left. To exit out of the line, hit Esc and reenter your formula.

Variables

Variables are the crux of most of what we will be using in R. A variable stores data, either numeric or character values, under a name. The simplest variable holds a single saved value. In general, to save a variable with a specific value in R, you type variable name = value and then hit Ctrl+Enter to save the variable. From that point onward the variable will be associated with that value, until you either exit R or overwrite the value. For example, if you wanted to assign the value 51 (inches) to a variable named height, you would enter

height=51

and then hit Ctrl+Enter. From that point on, height will be associated with the value 51, and can be acted on by mathematical operators. For example, say you wanted to find the value of height in feet. You would divide the height of 51 inches by 12. To do this in R, you would type

height/12
## [1] 4.25

and hit Ctrl+Enter to get the height in feet, 4.25. If you wanted to keep that value around, it could be stored in a new variable by using that exact formula, for example,

heightfeet=height/12

At this point the variable heightfeet will be associated with the value 51/12 = 4.25. It is important to note that for variable names in R, you can use any combination of letters, numbers, and select special characters such as periods or underscores, provided the name starts with a letter (or a period that is not followed by a digit). Spaces and other special characters such as the exclamation point or commas are not allowed in variable names. If a variable name breaks a naming rule (by containing a disallowed character, for example), R will return an error such as “Error: unexpected input” or “Error: unexpected symbol.” One other important aspect of variable names is that they are case sensitive, meaning that capitalization matters. To R, the variable named height (which we defined to be equal to 51) and the variable named Height (something we never created, so it does not exist) are two different variables. If you enter Height into the console, R will not return the value 51 but will tell you that the object “Height” is not found.
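The naming rules and case sensitivity described above can be checked directly. The following is a small sketch: the candidate names are made up for illustration, and the check relies on base R's make.names, which leaves a name unchanged only if it is already syntactically valid.

```r
height <- 51   # <- is the usual assignment operator; height = 51 does the same here

exists("height")   # TRUE: this is the variable defined above
exists("Height")   # FALSE, assuming nothing named Height has been created

# A name is syntactically valid if make.names() leaves it unchanged
candidates <- c("height_ft", "my height", "height!")
valid <- candidates == make.names(candidates)
valid
## [1]  TRUE FALSE FALSE
```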

Character variables are entered in a similar way in R, with one key distinction. Character values must be entered using quotes, otherwise R will return an error. The entry process is again variable name = “value”, followed by Ctrl+Enter to save the variable. The variable is then associated with that value until exiting R or until it is overwritten. For example,

name = "Bijay Pradhan"

and the variable name will be associated with the value “Bijay Pradhan” from that point onward. Note that if you do not put the value of your variable in quotes, R will return an error. All of the naming conventions for character variable names and numeric variable names are identical.
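Once stored, a character variable can be inspected with a few base R functions. The lines below are a brief sketch; the last step shows that arithmetic on a character value fails, with the error caught by tryCatch so that the example keeps running.

```r
name <- "Bijay Pradhan"

class(name)    # "character"
nchar(name)    # 13: number of characters, including the space
toupper(name)  # "BIJAY PRADHAN"

# Arithmetic is not defined for character values; the error is caught here
result <- tryCatch(name + 1, error = function(e) "error")
result         # "error"
```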

3.1 Vectors

Variables can be more than just single values. Commonly in our datasets, variables will have one value for every participant in our study. These multiple values are stored in vectors in R, which can in turn be stored in data frames. In R, vectors are defined using the notation vector name = c(value 1, value 2, …, last value). So, for example, if you recorded heights for five people and stored them in a vector named height, it would be

height = c(75, 74, 67, 83, 75)
height
## [1] 75 74 67 83 75

and then hit Ctrl+Enter. The variable height will now be associated with those five values until exiting R or overwriting. Vectors can be acted on using arithmetic operations in the same way as single-value variables. For example, to find the height in feet of these five individuals, you would divide the whole height vector by 12, using

height/12
## [1] 6.250000 6.166667 5.583333 6.916667 6.250000

This would return 6.25, 6.17, 5.58, 6.92, and 6.25, the heights of these five people in feet. These values can be stored in a vector of their own if desired. In addition, two vectors of the same length can be added together, with the first elements of both vectors being added together, the second elements added together, etc.
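Both ideas, storing the converted values and adding two vectors of the same length, might look like this. The shoes vector is hypothetical and is included only to have a second vector to add.

```r
height <- c(75, 74, 67, 83, 75)      # heights in inches, from the text
shoes  <- c(1, 1.5, 1, 2, 0.5)       # hypothetical shoe-sole thickness in inches

height_feet <- height / 12           # store the converted values in a new vector
round(height_feet, 2)
## [1] 6.25 6.17 5.58 6.92 6.25

total <- height + shoes              # element-by-element addition
total
## [1] 76.0 75.5 68.0 85.0 75.5
```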

Saving vectors of character values is identical to saving numeric ones, with the exception of entering the values within quotes. It is important to note that you cannot mix numeric and character values in the same vector. If you attempt to, R will automatically change all numeric values to characters. For example, if we wanted to save the names of five people in a vector called name, we would use the code

name = c("Bijay", "Hari", "Shyam", "Binod", "Krishna")
name
## [1] "Bijay"   "Hari"    "Shyam"   "Binod"   "Krishna"
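The automatic conversion mentioned above is easy to see with a small, made-up vector that mixes numbers and text:

```r
mixed <- c(75, "Hari", 67)   # two numbers and one character value
mixed
## [1] "75"   "Hari" "67"
class(mixed)
## [1] "character"

# Converting back gives NA (with a warning) where the text is not a number
back <- suppressWarnings(as.numeric(mixed))
back
## [1] 75 NA 67
```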

3.2 Data Frames

The most common way that we will receive our data in R is through data frames. In a data frame, each row gives you all the variable values for one observation in your study, while each column gives you all the values for one variable. To create a data frame from vectors that we have already created in R, we use data frame name = data.frame(vector 1 name, …, last vector name) and hit Ctrl+Enter. Naming conventions are identical for data frames and variables. It is important to note that in creating data frames, we need to ensure that the ordering of each vector is the same. In other words, the first values in vector 1 and vector 2 both belong to observation 1, the second values to observation 2, and so on. So, for example, to create a data frame named data1 with the names and heights from the previous section, we would use

data1=data.frame(name, height)
data1
##      name height
## 1   Bijay     75
## 2    Hari     74
## 3   Shyam     67
## 4   Binod     83
## 5 Krishna     75

Printing this out in R, we can see that Bijay is 75 inches tall, Hari is 74 inches tall, etc. Data frames are able to have both character vectors and numeric vectors in the same data frame.
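Once a data frame exists, its columns and rows can be pulled out again. The following sketch rebuilds data1 so that it runs on its own and then shows a few common base R operations; the derived column heightfeet is added only for illustration.

```r
name   <- c("Bijay", "Hari", "Shyam", "Binod", "Krishna")
height <- c(75, 74, 67, 83, 75)
data1  <- data.frame(name, height)

data1$height                            # one column, returned as a vector
data1[2, ]                              # second row: Hari's record
data1$heightfeet <- data1$height / 12   # add a derived column
str(data1)                              # structure: 5 observations of 3 variables

avg_height <- mean(data1$height)
avg_height
## [1] 74.8
```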

4 Practice some mathematics

#Basic Arithmetic 
45+67
## [1] 112
789-234
## [1] 555
456/23
## [1] 19.82609
234*897
## [1] 209898
234^2
## [1] 54756
sqrt(35)
## [1] 5.91608
seq(1, 100)
##   [1]   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
##  [19]  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36
##  [37]  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54
##  [55]  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72
##  [73]  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
##  [91]  91  92  93  94  95  96  97  98  99 100
marks <- c(75, 80, 85, 90, 96, 97, 98, 99, 100, 98)  # marks of 10 students
marks
##  [1]  75  80  85  90  96  97  98  99 100  98
min(marks)
## [1] 75
max(marks)
## [1] 100
mean(marks)
## [1] 91.8
sd(marks)
## [1] 8.891944
median(marks)
## [1] 96.5
mode(marks)  # storage mode of the object, not the statistical mode
## [1] "numeric"
summary(marks)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   75.00   86.25   96.50   91.80   98.00  100.00
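Note that mode() reports how the object is stored ("numeric"), not the most frequent value. Base R has no built-in function for the statistical mode, so one possible sketch is to count the values with table(); the names counts and stat_mode are illustrative.

```r
marks <- c(75, 80, 85, 90, 96, 97, 98, 99, 100, 98)

counts <- table(marks)                      # frequency of each distinct mark
# Keep the value(s) with the highest frequency; ties would all be returned
stat_mode <- as.numeric(names(counts)[counts == max(counts)])
stat_mode
## [1] 98
```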

5 Practice Problem

Consider the following attributes of a list of Nepalese movies.

  1. What code would you use to create a vector named Movie with the values Dreams, Pasupati Prasad, Chhka panja, Jatral, and Kandetar?

  2. What code would you use to create a vector named Year, giving the years in which the movies in Problem 1 were made, with the values 2015, 2016, 2014, 2017, and 2018?

  3. What code would you use to create a vector named RunTime, giving the run times in minutes of the movies in Problem 1, with the values 119, 177, 102, 129, and 103?

  4. What code would you use to find the run times of the movies in hours and save them in a vector called RunTimeHours?

  5. What code would you use to create a data frame named MovieInfo containing the vectors created in Problem 1, Problem 2, and Problem 3?