UTSA Department of Management Science & Statistics
March 5, 2016
#Step 1: Flip coin ten times
num=sample(0:1,10,replace=TRUE)
num
#Step 2: Total number of heads
total=sum(num)
total
#Step 3: Compute proportion
prop=mean(num)
prop
props=c()
#Step 0: The number of students
n=25
for (i in 1:n){
#Step 1: Each student flips coin ten times
num=sample(0:1,10,replace=TRUE)
#Step 2: Total number of heads per student
total=sum(num)
#Step 3: Compute proportion
prop=mean(num)
#Step 4: Combine all students' results
props=append(props,prop)
}
mean(props)
[1] 0.464
hist(props)
Does your histogram look normal?
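The same simulation can be written without an explicit loop. This is only a sketch under the same setup (25 students, ten flips each); the helper name `flip_props` and the call to `set.seed` are assumptions added here for illustration and repeatability.

```r
# Hypothetical helper: one proportion of heads per student,
# where each student flips a fair coin `flips` times.
flip_props <- function(n, flips = 10) {
  replicate(n, mean(sample(0:1, flips, replace = TRUE)))
}

set.seed(1)              # assumption: fixed seed so the run can be repeated
props <- flip_props(25)  # 25 students
length(props)            # one value per student
mean(props)              # near 0.5, exact value depends on the seed
```

Changing the argument to `flip_props(100)` or `flip_props(2000)` gives the larger classes used next.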
props=c()
#Step 0: The number of students
n=100
for (i in 1:n){
#Step 1: Each student flips coin ten times
num=sample(0:1,10,replace=TRUE)
#Step 2: Total number of heads per student
total=sum(num)
#Step 3: Compute proportion
prop=mean(num)
#Step 4: Combine all students' results
props=append(props,prop)
}
mean(props)
[1] 0.497
hist(props)
Does your histogram look normal?
props=c()
#Step 0: The number of students
n=2000
for (i in 1:n){
#Step 1: Each student flips coin ten times
num=sample(0:1,10,replace=TRUE)
#Step 2: Total number of heads per student
total=sum(num)
#Step 3: Compute proportion
prop=mean(num)
#Step 4: Combine all students' results
props=append(props,prop)
}
mean(props)
[1] 0.5016
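#Center the bins on the possible proportions 0, 0.1, ..., 1
hist(props,prob=TRUE,breaks=seq(-0.05,1.05,by=0.1))
#Overlay a normal curve with the sample mean and standard deviation
curve(dnorm(x,mean=mean(props),sd=sd(props)),col="blue",add=TRUE)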
What about this one?
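One way to judge the fit is to compare the simulation with theory. For ten flips of a fair coin, the proportion of heads has mean 0.5 and standard deviation sqrt(0.5*0.5/10), about 0.158. The sketch below redoes the simulation with `rbinom` under those assumptions; the names `sim_props` and `theory_sd` and the seed are illustrative and not part of the workshop code.

```r
set.seed(2016)   # assumption: any seed works, fixed here for repeatability
flips <- 10
p <- 0.5         # fair coin

# rbinom counts the heads in `flips` tosses; dividing gives a proportion
sim_props <- rbinom(2000, size = flips, prob = p) / flips

# Theoretical standard deviation of a sample proportion
theory_sd <- sqrt(p * (1 - p) / flips)
theory_sd                                      # about 0.158

# The simulated mean and sd should land near 0.5 and theory_sd
c(mean = mean(sim_props), sd = sd(sim_props))
```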
We collected your data in the morning session. Let's explore it!
data=read.csv("http://pastebin.com/raw/MFYcN8Rk")
fix(data)
Typing the object name “data” will output the data to the R console.
The “fix” function opens the data in a spreadsheet-like editor, much like Excel, and saves any changes you make; use View(data) to look at the data without editing it.
summary(data)
hist(data$shoe)
hist(data$shoe,col="red",xlab="Shoe Size (in cm)",
main="Histogram of Shoe Size")
hist(data$height)
hist(data$height,col="green",xlab="Height (in cm)",
main="Histogram of Height")
boxplot(data$height~data$gender,main="Boxplot",
col=rainbow(2),xlab="Gender",ylab="Height")
boxplot(data$shoe~data$gender,main="Boxplot",
col=rainbow(2),xlab="Gender",ylab="Shoe Size")
#Fit the regression model (stored as "fit" so it does not mask the lm function)
fit=lm(height~shoe,data=data)
#plot the relationship
plot(x=data$shoe,y=data$height,xlab="Shoe Size",
ylab="Height",main="Height vs. Shoe Size")
#add the regression line
abline(fit,col="red")
Question: Is the number of explosions in a movie correlated with its profit?
| | Movie.Title | Number.of.Explosions | Profit |
|---|---|---|---|
| 1 | The Island | 16 | 162.95 |
| 2 | Bad Boys | 18 | 141.41 |
| 3 | The Rock | 22 | 335.06 |
| 4 | Bad Boys 2 | 31 | 273.34 |
| 5 | Armageddon | 121 | 553.71 |
| 6 | Pearl Harbor | 162 | 449.22 |
| 7 | Transformers | 128 | 709.71 |
| 8 | Transformers: Revenge of the Fallen | 211 | 836.30 |
| 9 | Transformers: Dark of the Moon | 283 | 1119.13 |
| | Estimate | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | 154.3877 | 60.7288 | 2.54 | 0.0385 |
| Number.of.Explosions | 3.2171 | 0.4248 | 7.57 | 0.0001 |
Our analysis shows that, for Michael Bay movies, each added explosion is associated with about 3.22 million dollars more profit.
The effect is statistically significant (p = 0.0001).
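The fit can be reproduced from the movie table above. This is a sketch that assumes the nine rows are typed in by hand; the object names `movies` and `fit` are chosen here for illustration.

```r
# Data copied from the movie table
movies <- data.frame(
  Movie.Title = c("The Island", "Bad Boys", "The Rock", "Bad Boys 2",
                  "Armageddon", "Pearl Harbor", "Transformers",
                  "Transformers: Revenge of the Fallen",
                  "Transformers: Dark of the Moon"),
  Number.of.Explosions = c(16, 18, 22, 31, 121, 162, 128, 211, 283),
  Profit = c(162.95, 141.41, 335.06, 273.34, 553.71,
             449.22, 709.71, 836.30, 1119.13)
)

# Simple linear regression of profit on the number of explosions
fit <- lm(Profit ~ Number.of.Explosions, data = movies)
summary(fit)$coefficients   # compare with the regression table

# Sample correlation between the two variables
cor(movies$Number.of.Explosions, movies$Profit)

# Scatterplot with the fitted line
plot(movies$Number.of.Explosions, movies$Profit,
     xlab = "Number of Explosions", ylab = "Profit")
abline(fit, col = "red")
```

With only nine movies from one director, the slope describes an association in this small sample and not a causal effect.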