Illya Mowerman, Ph.D.
University of Bridgeport
An object:
Packages are collections of functions created by the R community and made available to everyone.
Installing packages is easy:
install.packages('dplyr' , repos='http://cran.us.r-project.org')
The downloaded binary packages are in
/var/folders/8c/w4htphd93r7cq7tcyp65xr780000gn/T//Rtmply3yn0/downloaded_packages
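A package only needs to be installed once per machine, so a script does not have to download it on every run. Here is a minimal sketch of installing only when the package is missing. The helper name `install_if_missing` is hypothetical (it is not part of R or any package), and the mirror URL is the one used above.

```r
# Hypothetical helper: install a package only when it is not already available.
# Returns FALSE (invisibly) when nothing had to be installed, TRUE otherwise.
install_if_missing <- function(pkg, repos = "http://cran.us.r-project.org") {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))
  }
  install.packages(pkg, repos = repos)
  invisible(TRUE)
}

# "stats" ships with R, so this call downloads nothing.
# Replace "stats" with "dplyr" to get the behaviour described above.
install_if_missing("stats")
```

Installing makes a package available on disk; it still has to be loaded with library() in each session.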
To use the functions of a package, you must first load it:
library(dplyr)
To get help on a specific function, simply put a ? in front of the function name, and the help facility will display the documentation. Note that it is extremely useful to scroll down to the examples.
?sum
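A few related base R calls can complement ?. The sketch below uses only functions that ship with R, and sum is just the running example from above.

```r
# Show the argument list of a function without opening the help page.
args(sum)

# help() is the long form of ?; both display the same documentation.
help("sum")

# Run the examples from the bottom of the help page directly in the console.
example("sum")
```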
There are many ways to get data into R:
The easiest way to import data is to use the Import Dataset functionality within RStudio. You will find it under the Environment tab.
Import the following files using the Import Dataset button
The files can be found on the course website: https://bridgeport.instructure.com/courses/1511792/files/folder/Data/Class%202
library(readxl)
Grocery <- read_excel("~/Dropbox/Bridgeport/ITKM 560 - Fall 2017/Data/Class 2/Grocery.xlsx")
print(Grocery)
# A tibble: 15 x 4
   Item                               Tops `Wal-Mart` Wegmans
   <chr>                             <dbl>      <dbl>   <dbl>
 1 Bananas (1 lb.)                    0.49       0.48    0.49
 2 Campbell's soup (10.75 oz)         0.60       0.54    0.77
 3 Chicken breasts (3 lbs.)          10.47       8.61    8.07
 4 Colgate toothpaste (6.2 oz)        1.99       2.40    1.97
 5 Large eggs (1 dozen)               1.59       0.88    0.79
 6 Heinz ketchup (36 oz)              2.59       1.78    2.59
 7 Jell-O (cherry, 3 oz.)             0.67       0.42    0.65
 8 Jif peanut butter (18 oz.)         2.29       1.78    2.09
 9 Milk (fat free, 1/2 gal.)          1.34       1.24    1.34
10 Oscar Meyer hot dogs (1 lb.)       3.29       1.50    3.39
11 Ragu past sauce (1 lb., 10 oz.)    2.09       1.50    1.25
12 Ritz crackers (1 lb.)              3.29       2.00    3.39
13 Tide detergent (liquid, 100 oz.)   6.79       5.24    5.99
14 Tropicana orange juice (1/2 gal.)  2.50       2.50    2.50
15 Twizzlers (strawberry, 1 lb.)      1.19       1.27    1.69
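The Import Dataset button is only one of the many ways to get data into R. As a sketch of another, the base R function read.csv() reads a comma-separated text file. To keep the example self-contained it first writes a tiny CSV to a temporary file. The file and the object name `grocery_csv` are made up for illustration, and the three rows are copied from the Grocery output.

```r
# Write a small CSV to a temporary file so the example runs anywhere.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Item,Tops,Wal-Mart,Wegmans",
  "Bananas (1 lb.),0.49,0.48,0.49",
  "Large eggs (1 dozen),1.59,0.88,0.79",
  "Heinz ketchup (36 oz),2.59,1.78,2.59"
), csv_path)

# check.names = FALSE keeps the column name "Wal-Mart" as written;
# by default read.csv() would change it to "Wal.Mart".
grocery_csv <- read.csv(csv_path, check.names = FALSE, stringsAsFactors = FALSE)

# str() shows the type of each column that was read.
str(grocery_csv)
```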
Depending on the type of object and the syntax you employ, you would:
summary(Grocery)
     Item                Tops           Wal-Mart        Wegmans
 Length:15          Min.   : 0.490   Min.   :0.420   Min.   :0.490
 Class :character   1st Qu.: 1.265   1st Qu.:1.060   1st Qu.:1.020
 Mode  :character   Median : 2.090   Median :1.500   Median :1.970
                    Mean   : 2.745   Mean   :2.143   Mean   :2.465
                    3rd Qu.: 2.940   3rd Qu.:2.200   3rd Qu.:2.990
                    Max.   :10.470   Max.   :8.610   Max.   :8.070
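Notice that summary() reports different things for the character column Item than for the numeric price columns. The sketch below shows the same function applied to three kinds of object. The vectors are made up for illustration and are not part of the course data.

```r
prices <- c(0.49, 0.60, 10.47)                   # numeric vector
items  <- c("Bananas", "Soup", "Chicken")        # character vector
store  <- factor(c("Tops", "Wegmans", "Tops"))   # factor (categorical)

summary(prices)  # minimum, quartiles, mean, and maximum
summary(items)   # length, class, and mode only
summary(store)   # number of observations in each level
```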
Using the Grocery data, create a new variable that is the variable Tops divided by 2
Grocery$new_var <- Grocery$Tops/2
summary(Grocery$Tops)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  0.490   1.265   2.090   2.745   2.940  10.470
summary(Grocery$new_var)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.2450  0.6325  1.0450  1.3730  1.4700  5.2350
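The $ assignment is not the only base R way to add a variable. As a sketch, the example below uses [[ ]] and transform() on a small stand-in data frame. The name `grocery_small` and the new column names are made up for illustration, and the three rows are copied from the Grocery output.

```r
# Small stand-in for the Grocery data.
grocery_small <- data.frame(
  Item = c("Bananas (1 lb.)", "Large eggs (1 dozen)", "Heinz ketchup (36 oz)"),
  Tops = c(0.49, 1.59, 2.59)
)

# Same effect as grocery_small$half_tops <- grocery_small$Tops / 2
grocery_small[["half_tops"]] <- grocery_small[["Tops"]] / 2

# transform() returns a modified copy, so the result must be assigned back.
grocery_small <- transform(grocery_small, double_tops = Tops * 2)

print(grocery_small)
```

The [[ ]] form is handy when the column name is stored in a variable, which $ cannot handle.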
Using the Grocery data, create a new variable that is the variable Wegmans divided by 2
library(dplyr)
Grocery <- Grocery %>%
  mutate(new_var_dplyr = Wegmans/2)
summary(Grocery$Wegmans)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  0.490   1.020   1.970   2.465   2.990   8.070
summary(Grocery$new_var_dplyr)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  0.245   0.510   0.985   1.232   1.495   4.035
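One advantage of mutate() is that several variables can be created in a single call, and later ones can use earlier ones. Here is a minimal sketch, assuming dplyr is installed. The tibble `grocery_small` and the new column names are made up for illustration, and the three rows are copied from the Grocery output.

```r
library(dplyr)

# Small stand-in for the Grocery data.
grocery_small <- tibble(
  Item    = c("Bananas (1 lb.)", "Large eggs (1 dozen)", "Heinz ketchup (36 oz)"),
  Tops    = c(0.49, 1.59, 2.59),
  Wegmans = c(0.49, 0.79, 2.59)
)

grocery_small <- grocery_small %>%
  mutate(
    half_wegmans = Wegmans / 2,     # same calculation as above
    price_gap    = Tops - Wegmans,  # how much more Tops charges
    tops_pricier = price_gap > 0    # uses the column created one line earlier
  )

print(grocery_small)
```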