Lynna Jirpongopas
| Time | Topics |
|---|---|
| 9:00 - 9:30 | Intro, Tools Overview & Syntax |
| 9:40 - 9:55 | Loading a .csv File Demo |
| 10:15 - 10:30 | Defining Data Types |
| 10:50 - 11:10 | Simple Statistics |
| 11:30 - 11:50 | Aggregate Functions |
| 12:15 - 12:30 | Wrap up |
R : RStudio, RGui
Packages : ggplot2, reshape
Python : Jupyter Notebook (IPython), Mac Terminal
Libraries : pandas, NumPy
SQL : PostgreSQL, SQLite, Toad for Oracle, and many more…
#This is R
print("Hello World")
#This is Python (Python 3; Python 2 also accepts print "Hello World")
print("Hello World")
R and Python use the same arithmetic symbols for addition, subtraction, multiplication, and division
+ - * /
Note: in Python 2, the output number type matches the input type (8 / 3 gives 2). In Python 3, / always returns a decimal (float).
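A small sketch of the note above, assuming Python 3 (under Python 2, `7 / 2` on two integers would give `3`). The variable names and numbers are illustrative only.

```python
# Division and number types (assumes Python 3).
int_sum = 7 + 2        # int + int stays an int
float_sum = 7 + 2.0    # mixing int and float gives a float
true_div = 7 / 2       # "/" always returns a float in Python 3
floor_div = 7 // 2     # "//" on two ints returns an int (rounded down)

for value in (int_sum, float_sum, true_div, floor_div):
    print(type(value).__name__, value)
```

In Python 2, writing one operand as a decimal (for example `7 / 2.0`) is the usual way to get a decimal answer.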
x to the power of y
#This is R
x^y
#This is Python
pow(x, y)
x ** y
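To check that the two Python spellings agree, here is a minimal sketch. The values of `base` and `exponent` are made up for illustration.

```python
# Two equivalent ways to raise a number to a power in Python.
base, exponent = 2, 10               # hypothetical values

via_operator = base ** exponent      # operator form
via_function = pow(base, exponent)   # built-in function form
square_root = 9 ** 0.5               # fractional exponents work too

print(via_operator, via_function, square_root)
```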
Find a partner. Take turns writing in R & Python.
1) Print Hello World
2) 6 to the power of 4
3) Make x=3, m=8, b=6, and y=mx+b.
What is y?
4) What is 8 divided by 3?
The answer should be in decimal form.
Know the path to your data
#define path to file
dataFile <- "C:/Users/lynnaj/Documents/girldevelopit/fun_data/vdeOilData.csv"
#load data
vdeOilDF <- read.csv(dataFile)
#view the first 6 rows of the data (head() shows 6 by default)
head(vdeOilDF)
#https://stat.ethz.ch/R-manual/R-devel/library/utils/html/read.table.html
#You can also double click on the dataframe in Environment tab
import pandas as pd
datafile = 'C:\\Users\\lynnaj\\Documents\\girldevelopit\\fun_data\\vdeOilData.csv'
vdeOilDF = pd.read_csv(datafile)
vdeOilDF.head()
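The real vdeOilData.csv is not reproduced here, so the sketch below reads a tiny made-up CSV from memory to show what `read_csv` and `head` return. The rows are hypothetical, and the column names are borrowed from the R type-conversion code on a later slide.

```python
import io
import pandas as pd

# Hypothetical stand-in for vdeOilData.csv; values are invented for illustration.
sample_csv = io.StringIO(
    "Date,vdeClose,pricePerBarrel\n"
    "2015-01-05,100.5,50.1\n"
    "2015-01-12,98.2,48.7\n"
    "2015-01-19,99.0,47.3\n"
)

sampleDF = pd.read_csv(sample_csv)  # accepts a file path or a file-like object
print(sampleDF.head(2))             # head() shows 5 rows by default; pass n to change it
print(sampleDF.shape)               # (number of rows, number of columns)
```

With a real file, pass the path string instead of `sample_csv`. Forward slashes also work in Windows paths, which avoids doubling the backslashes.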
Create an empty table:
CREATE TABLE VDEandOil (vde_prices double precision, week_number double precision, month varchar, year double precision, oil_prices double precision);
Stuff the data into the table:
COPY VDEandOil FROM '/path/to/VDEandOil.txt' DELIMITER ',' CSV;
http://stackoverflow.com/questions/2987433/how-to-import-csv-file-data-into-a-postgresql-table
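`COPY` is specific to PostgreSQL. For trying the same idea without a database server, here is a rough sketch using SQLite through Python's built-in `sqlite3` module. The two data rows are hypothetical, and the column types are SQLite equivalents chosen for this example.

```python
import csv
import io
import sqlite3

# Hypothetical rows standing in for VDEandOil.txt (same column order as the CREATE TABLE above).
sample_rows = io.StringIO(
    "100.5,1,January,2015,50.1\n"
    "98.2,2,January,2015,48.7\n"
)

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE VDEandOil ("
    "vde_prices REAL, week_number REAL, month TEXT, year REAL, oil_prices REAL)"
)

# Insert every parsed CSV row with a parameterized statement.
conn.executemany(
    "INSERT INTO VDEandOil VALUES (?, ?, ?, ?, ?)",
    csv.reader(sample_rows),
)
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM VDEandOil").fetchone()[0]
first_row = conn.execute("SELECT * FROM VDEandOil ORDER BY week_number").fetchone()
print(row_count, first_row)
```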
1) Read vdeOilData.csv or sensorsData.csv into RStudio
2) Read the same file into your IPython Notebook
3) Import the file into your PostgreSQL DB
vdeOilDF$Date <- as.Date(vdeOilDF$Date, format="%Y-%m-%d")
vdeOilDF$weekNumber <- as.factor(vdeOilDF$weekNumber)
vdeOilDF$vdeClose <- as.numeric(vdeOilDF$vdeClose)
vdeOilDF$pricePerBarrel <- as.numeric(vdeOilDF$pricePerBarrel)
vdeOilDF$year <- as.numeric(vdeOilDF$year)
vdeOilDF$month <- as.factor(vdeOilDF$month)
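A rough pandas counterpart to the R conversions above, offered as a sketch. The rows are hypothetical, and `category` is only an approximate analogue of an R factor.

```python
import pandas as pd

# Hypothetical rows; column names follow the R code above.
sampleDF = pd.DataFrame({
    "Date": ["2015-01-05", "2015-01-12"],
    "weekNumber": [1, 2],
    "vdeClose": ["100.5", "98.2"],
    "month": ["January", "January"],
})

sampleDF["Date"] = pd.to_datetime(sampleDF["Date"], format="%Y-%m-%d")  # like as.Date
sampleDF["weekNumber"] = sampleDF["weekNumber"].astype("category")      # like as.factor
sampleDF["vdeClose"] = sampleDF["vdeClose"].astype(float)               # like as.numeric
sampleDF["month"] = sampleDF["month"].astype("category")

print(sampleDF.dtypes)  # one data type per column
```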
summary(vdeOilDF)
mean(vdeOilDF$pricePerBarrel)
max(vdeOilDF$pricePerBarrel)
min(vdeOilDF$pricePerBarrel)
sd(vdeOilDF$pricePerBarrel)
summary(cars)
     speed           dist
 Min.   : 4.0   Min.   :  2.00
 1st Qu.:12.0   1st Qu.: 26.00
 Median :15.0   Median : 36.00
 Mean   :15.4   Mean   : 42.98
 3rd Qu.:19.0   3rd Qu.: 56.00
 Max.   :25.0   Max.   :120.00
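The same simple statistics can be sketched in Python with only the standard library. The `prices` list is hypothetical and is not taken from vdeOilData.csv.

```python
import statistics

# Hypothetical prices, invented for illustration.
prices = [50.1, 48.7, 47.3, 52.0, 49.4]

price_mean = statistics.mean(prices)
price_median = statistics.median(prices)
price_max = max(prices)
price_min = min(prices)
price_sd = statistics.stdev(prices)  # sample standard deviation (n - 1), like R's sd()

print(price_mean, price_median, price_max, price_min, price_sd)
```

On a pandas DataFrame, `describe()` gives a comparable overview to R's `summary()` in one call.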