Agenda
- Introduction
- What is R?
- Capabilities of R
- R Packages
- How we used R in our project
- R Visualisations
- A Bit About Shiny
- A Bit About Slidify
- Questions
Dhaneshwar Lal Batheja, Mohit Prem Dialani
Masters of Information Technology
About Ourselves
About our Project
"R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS" - The R Project Organisation
"R is a data analysis software; R is a programming language; R is an environment for statistical analysis; R is open-source; R is a Community" - www.inside-r.org
"While R was initially a statistical computing language, in 2012 you could call it a complete analytical environment" - R for Business Analytics by Ajay Ohri
"R is extremely powerful" - Revolution Analytics
For Installation instructions:
Please Visit: http://theRblabber.wordpress.com/2013/05/21/introduction/
The R Language is widely used among statisticians and data miners for developing statistical software and analysing data.
The capabilities of R are extended through user-created packages, which provide specialised statistical techniques, graphical devices, import/export capabilities, reporting tools, etc.
Feature Type | Feature Name |
---|---|
Analytics | Basic Mathematics; Basic Statistics; Probability Distribution; Big Data Analytics; Machine Learning; Statistical Modelling |
Graphics and Visualisation | Static Graphics; Dynamic Graphics |
Programming Language Features | Input/Output; Object-Oriented Programming; Distributed Computing; Included R Packages |
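As a small illustration of the analytics features listed above, the following sketch uses only base R and the built-in `women` data set. The vector `x` is made-up example data, not project data.

```r
# Basic statistics on a small made-up numeric vector
x <- c(2, 4, 4, 5, 7, 8)
x_mean <- mean(x)   # 5
x_sd <- sd(x)       # sample standard deviation

# Probability distributions: cumulative probability of a standard normal
p <- pnorm(1.96)    # about 0.975

# Statistical modelling: simple linear regression on the built-in 'women' data
fit <- lm(weight ~ height, data = women)
coef(fit)                                          # intercept and slope
predict(fit, newdata = data.frame(height = 66))   # predicted weight at 66 in
```

The same pattern (fit a model object, then query it with `coef()` or `predict()`) carries over to most modelling functions in R.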
R has a wide variety of data types including scalars, vectors (numerical, character, logical), matrices, data frames, and lists.
Data Types | Description |
---|---|
Vectors | Numerical, Character or Logical Values. |
Matrices | All columns in a matrix must have the same mode (numeric, character, etc.) and the same length. |
Data Frames | A data frame is more general than a matrix, in that different columns can have different modes (numeric, character, factor, etc.). |
Arrays | Arrays are similar to matrices but can have more than two dimensions. |
Lists | An ordered collection of objects (components). |
a <- c(1, 2, 5.3, 6, -2, 4) # numeric vector
a
## [1] 1.0 2.0 5.3 6.0 -2.0 4.0
b <- c("one", "two", "three") # character vector
b
## [1] "one" "two" "three"
y <- matrix(1:8, nrow = 2, ncol = 4) # generates 2 x 4 numeric matrix, filled column by column
y
##      [,1] [,2] [,3] [,4]
## [1,]    1    3    5    7
## [2,]    2    4    6    8
z <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE) # generates 2 x 3 numeric matrix, filled row by row
z
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
d <- c(1, 2, 3, 4) # Assigning numerical vector to d
e <- c("red", "white", "red", NA) # Assigning character vector to e
f <- c(TRUE, TRUE, TRUE, FALSE) # Assigning logical vector to f
mydata <- data.frame(d, e, f) # Creating Data Frame from vectors
names(mydata) <- c("ID", "Color", "Passed") # Assigning column names
mydata
##   ID Color Passed
## 1  1   red   TRUE
## 2  2 white   TRUE
## 3  3   red   TRUE
## 4  4  <NA>  FALSE
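The examples above cover vectors, matrices and data frames. Arrays and lists, the remaining two types from the table, can be sketched in the same way. All names and values here are illustrative only.

```r
# Array: like a matrix, but with more than two dimensions (here 2 x 3 x 4)
arr <- array(1:24, dim = c(2, 3, 4))
dim(arr)       # 2 3 4
arr[2, 3, 4]   # 24, the last element (arrays are filled column-major)

# List: an ordered collection of components of different types
w <- list(name = "Fred", mynumbers = c(1, 2, 5.3), mymatrix = matrix(1:8, nrow = 2))
w$name                 # access a component by name
w[["mynumbers"]][3]    # 5.3, third element of the numeric component
length(w)              # 3 components
```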
Almost everything in R is done through functions. Some of the built-in functions are:
Category | Function |
---|---|
Numerical | abs(x); sqrt(x); trunc(x); log(x); etc. |
Character | substr(x, start=n1, stop=n2); paste(..., sep=""); toupper(x); etc. |
Statistical | mean(x); median(x); dnorm(x); rnorm(n, mean=0, sd=1); etc. |
Data Frame | fix(x); dim(x); rbind(df1, df2); data.frame(df1, df2); aggregate(df, by=list(), FUN=sum); etc. |
Other Functions | setwd('path'); getwd(); install.packages("package_name"); update.packages(); ?anything; ??anything (searches the installed help pages); etc. |
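A few of these built-in functions in action. This is a minimal sketch; the `claims` data frame and its state codes are made up for illustration and are not the project data.

```r
# Numerical functions
abs(-3.5)      # 3.5
sqrt(16)       # 4
trunc(5.99)    # 5

# Character functions
substr("statistics", start = 1, stop = 4)   # "stat"
paste("R", "language", sep = " ")           # "R language"
toupper("shiny")                            # "SHINY"

# Statistical functions
scores <- c(10, 20, 30, 100)
mean(scores)     # 40
median(scores)   # 25
set.seed(42)                          # make the random draws reproducible
draws <- rnorm(5, mean = 0, sd = 1)   # five draws from a standard normal

# Data frame functions: sum a numeric column per group
claims <- data.frame(state = c("QLD", "NSW", "QLD", "NSW"), count = c(1, 2, 3, 4))
totals <- aggregate(claims["count"], by = list(state = claims$state), FUN = sum)
totals   # one row per state: NSW 6, QLD 4
```

Note that `aggregate()` expects the summary function in the upper-case argument `FUN`.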
More than 5000 packages are available today.
There is often more than one package for a given type of analytical task.
There is a high probability that the algorithm you are looking for is already in the repository.
You can build your own algorithm into a package.
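One possible starting point for packaging your own code is `package.skeleton()` from the `utils` package that ships with R. The sketch below is only an assumption about how one might begin: the function `my_mean` and the package name `mystats` are hypothetical, and the skeleton is written to a temporary directory.

```r
# A hypothetical "algorithm" we want to ship in a package
my_mean <- function(x) sum(x) / length(x)

# Create the package skeleton in a temporary directory
pkg_dir <- tempfile("pkgdemo")
dir.create(pkg_dir)
utils::package.skeleton(name = "mystats", list = "my_mean",
                        environment = environment(), path = pkg_dir)

# The skeleton contains DESCRIPTION, NAMESPACE, and R/ and man/ directories
list.files(file.path(pkg_dir, "mystats"))
```

The generated files still need to be edited (description, documentation) before the package can be built and installed.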
Some of the most used packages:
Package | Package Function |
---|---|
sqldf | Selecting from Data Frames using SQL |
forecast | For easy forecasting of time series |
plyr | Data aggregation |
RMySQL, ROracle, RSQLite | Database connection packages |
Rattle | A Simple GUI to perform analytical tasks |
ggplot2 | Data visualization |
JGR | Java GUI for R |
randomForest | Random forest predictive models |
googleVis | Visualisation of Data through Google API |
Shiny | Building Web Applications |
Slidify | Create Interactive HTML5 Slideshows |
To learn how to install a package
Please Visit: http://theRblabber.wordpress.com/2013/05/21/getting-started/
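In short, a package is installed once with `install.packages()` and then loaded in each session with `library()`. The helper `load_or_install` below is a hypothetical convenience function, not part of R. It is shown with the `stats` package, which ships with R, so nothing is downloaded.

```r
# Hypothetical helper: install a package only if it is missing, then load it
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)   # downloads from CRAN; needs an internet connection
  }
  library(pkg, character.only = TRUE)
}

load_or_install("stats")       # already installed, so it is only loaded
# load_or_install("ggplot2")   # would install ggplot2 first if it is missing
```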
We chose R over other analytical software because:
R is an Open Source Project.
Good integration with programming languages.
Graphics and Data Visualisation.
New and Upcoming.
Frequent Package Releases and Upgrades.
googleVis is an R package that provides an interface between R and the Google Chart Tools.
The functions of the package allow users to visualise data with the Google Chart Tools without uploading their data to Google.
The output of googleVis functions is HTML code that contains the data and references to JavaScript functions hosted by Google.
The concept: create wrapper functions in R which generate HTML files with references to Google's Chart Tools API.
Run demo(googleVis) to see examples of all charts and read the vignette for more details.
A simple Table with Data.
require(googleVis) ##Load Library and Data Sets
## Loading required package: googleVis
## Welcome to googleVis version 0.4.3
##
## Please read the Google API Terms of Use before you use the package:
## https://developers.google.com/terms/
##
## Type ?googleVis to access the overall documentation and
## vignette('googleVis') for the package vignette. You can execute a demo of
## the package via: demo(googleVis)
##
## More information is available on the googleVis project web-site:
## http://code.google.com/p/google-motion-charts-with-r/
##
## Contact: <rvisualisation@gmail.com>
##
## To suppress the this message use:
## suppressPackageStartupMessages(library(googleVis))
table1 <- gvisTable(Population, options = list(width = 1000, height = 250)) ## Assign gvisTable function to table1
print(table1, tag = "chart") ## Plot table1 (only for slidify)
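To see that a googleVis chart is plain HTML and JavaScript, the object can be inspected or written to a file. This is a sketch that assumes googleVis is installed and uses the built-in `women` data set. The variable names are illustrative.

```r
if (requireNamespace("googleVis", quietly = TRUE)) {
  tbl <- googleVis::gvisTable(women)

  # The chart component holds the generated HTML/JavaScript as character strings
  chart_html <- paste(unlist(tbl$html$chart), collapse = "\n")
  nchar(chart_html)   # size of the generated chart code in characters

  # Write the complete page to a file that can be opened in a browser
  out_file <- tempfile(fileext = ".html")
  print(tbl, file = out_file)
}
```

Calling `plot(tbl)` instead opens the chart directly in the default browser.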
A Simple Column Chart - Suncorp Case Study
print(tableRC, "chart")
aa <- read.csv("Minmaxriskperstate.csv", header = TRUE, sep = ",")
print(RiskChart, "chart")
ScatterWomen <- gvisScatterChart(women, options = list(pointSize = 4, vAxis = "{title:'weight (lbs)'}",
    hAxis = "{title:'height (in)'}", width = 500, height = 430))
M5 <- gvisMotionChart(Fruits, "Fruit", timevar = "Year", options = list(height = 350))
treeRegions <- gvisTreeMap(Regions, idvar = "Region", parentvar = "Parent",
    sizevar = "Val", colorvar = "Fac", options = list(showScale = TRUE, width = 600,
        height = 350))
Org <- gvisOrgChart(Regions, idvar = "Region", parentvar = "Parent", tipvar = "TipVal",
    options = list(allowCollapse = TRUE, allowHTML = TRUE))
GeoMap <- gvisGeoChart(am, locationvar = "STATE", colorvar = "TOTALCOUNT", options = list(region = "AU",
    displayMode = "regions", resolution = "provinces", height = 300, width = 1000,
    gvis.editor = "Edit Me!"))
PieC <- gvisPieChart(ac, labelvar = "STATE", numvar = "TOTAL", options = list(width = 500,
    height = 300))
table2 <- gvisTable(ac, options = list(width = 500, height = 300))
M1 <- gvisMerge(PieC, table2, horizontal = TRUE)
Easily build your reports into a dynamic web application.
Customize your reports.
Let users choose input parameters using sliders, drop-downs etc.
No HTML or JavaScript necessary. Only a little R knowledge is required to turn your analysis into interactive applications.
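A minimal sketch of such an application, assuming a current version of the shiny package is installed. The input name `n` and output name `hist` are illustrative. A slider controls how many random values are drawn, and the histogram updates automatically.

```r
# Server logic: redraw the histogram whenever the slider value changes
server <- function(input, output) {
  output$hist <- shiny::renderPlot({
    hist(rnorm(input$n), main = "Random normal sample")
  })
}

if (requireNamespace("shiny", quietly = TRUE)) {
  # User interface: one slider and one plot, written entirely in R
  ui <- shiny::fluidPage(
    shiny::sliderInput("n", "Number of observations:", min = 10, max = 500, value = 100),
    shiny::plotOutput("hist")
  )
  app <- shiny::shinyApp(ui = ui, server = server)
  # shiny::runApp(app)   # uncomment to launch the app in a browser
}
```

Other input widgets, such as `selectInput()` for drop-downs, are added to the user interface in the same way.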
Create Dynamic Powerful Slideshows.
Automatically include your dynamic charts and maps without using another tool.
Publish directly on the web.
HTML Coding
Still in development, but already a very powerful tool.
This presentation was built entirely in R.
Ms. Richi Nayak (Project Supervisor, Motivator)
Suncorp Board (Industry Partners)
Eric Tang (Big Data Lab Expert)
Ajay Ohri (Author of R for Business Analytics)
Ramnath Vaidyanathan (Creator of Package Slidify)
Rest of the Team: Lin Chen, Sejung (Group Members)
Thank you for attending our presentation