Page 228, Question 1

Consider a model for the long-term dining behavior of the students at College USA. It is found that 25% of the students who eat at the college’s Grease Dining Hall return to eat there again, whereas those who eat at Sweet Dining Hall have a 93% return rate. These are the only two dining halls on campus, and assume that all students eat at one of these halls. Formulate a model to solve for the long-term percentage of students eating at each hall.

Page 228, Answer 1

We have a system with two states: a student either eats in Grease Dining Hall or in Sweet Dining Hall. To simplify the notation we'll call the two halls Grease and Sweet, respectively.

First, we build the probability tree diagram (figure not reproduced here).

Then we build the transition diagram (figure not reproduced here).

Finally, we build the transition matrix (figure not reproduced here).
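The transition probabilities behind these figures come straight from the problem statement and can be summarized in a single transition matrix; here rows index the current hall and columns the next hall, both in the order (Grease, Sweet):

\[ P = \begin{pmatrix} 0.25 & 0.75 \\ 0.07 & 0.93 \end{pmatrix} \]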

Model formulation

Let’s define the following variables:

\(s_n\): the proportion of students eating in the Sweet dining hall at the end of period \(n\)
\(g_n\): the proportion of students eating in the Grease dining hall at the end of period \(n\)

Using the discrete modeling ideas, we can construct the following probabilistic model:

\[ g_{n+1} = 0.25g_n + 0.07s_n \] \[ s_{n+1} = 0.75g_n + 0.93s_n \]

# start with everyone eating at Grease (g = 1, s = 0) at period 0
dining <- data.frame(n=0, g=1, s=0)
n <- 15
# iterate the difference equations for n periods
for(i in 1:n){
    dining <- rbind(dining, c(i, tail(dining$g,1)*0.25 + tail(dining$s,1)*0.07,
                                 tail(dining$g,1)*0.75 + tail(dining$s,1)*0.93))
}

dining
##     n          g         s
## 1   0 1.00000000 0.0000000
## 2   1 0.25000000 0.7500000
## 3   2 0.11500000 0.8850000
## 4   3 0.09070000 0.9093000
## 5   4 0.08632600 0.9136740
## 6   5 0.08553868 0.9144613
## 7   6 0.08539696 0.9146030
## 8   7 0.08537145 0.9146285
## 9   8 0.08536686 0.9146331
## 10  9 0.08536604 0.9146340
## 11 10 0.08536589 0.9146341
## 12 11 0.08536586 0.9146341
## 13 12 0.08536585 0.9146341
## 14 13 0.08536585 0.9146341
## 15 14 0.08536585 0.9146341
## 16 15 0.08536585 0.9146341


Thus, the system reaches its steady state after approximately 6 steps: in the long run about 8.5% of the students eat at Grease and about 91.5% eat at Sweet.
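This steady state can also be checked analytically from the difference equations: setting \(g_{n+1} = g_n = g\) and \(s_n = 1 - g\) gives

\[ g = 0.25g + 0.07(1-g) \;\Rightarrow\; 0.82g = 0.07 \;\Rightarrow\; g = \frac{0.07}{0.82} \approx 0.0854, \qquad s = 1 - g \approx 0.9146, \]

which agrees with the values the simulation converges to.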

Page 232, Question 1

Consider a stereo with a CD player, FM-AM radio tuner, speakers (dual), and power amplifier (PA) components, as displayed with the reliabilities shown in Figure 6.11. Determine the system’s reliability. What assumptions are required in your model?

Page 232, Answer 1

The PA is a single stand-alone component. The CD player and the FM-AM tuner form a parallel subsystem, and Speaker 1 and Speaker 2 form another parallel subsystem.

These three subsystems (PA, CD/FM-AM, and the speaker pair) are then connected in series. The model assumes that the components fail independently of one another.
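Under the independence assumption, the standard combination rules used below are

\[ R_{series} = \prod_i R_i, \qquad R_{parallel} = 1 - \prod_i (1 - R_i), \]

and for two components in parallel the second rule reduces to \(R_1 + R_2 - R_1R_2\), which is the form used in the code.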

# component reliabilities (Figure 6.11)
PA <- .95
CD <- .98
FM_AM <- .97
Speaker_1 <- .99
Speaker_2 <- .99

# parallel subsystems: R = R1 + R2 - R1*R2
CD_FM_AM <- (CD + FM_AM) - (CD*FM_AM)
Speaker_1_2 <- (Speaker_1 + Speaker_2) - (Speaker_1*Speaker_2)

# series combination of the three subsystems
Stereo <- PA * CD_FM_AM * Speaker_1_2

# Hence the reliability of the stereo components is:

Stereo
## [1] 0.9493351
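As a quick sanity check (a small sketch that simply reuses the component reliabilities defined above), the same number is obtained from the complement form of the parallel rule:

# same calculation via 1 - (1-R1)(1-R2) for each parallel subsystem
Stereo_alt <- PA * (1 - (1 - CD)*(1 - FM_AM)) * (1 - (1 - Speaker_1)*(1 - Speaker_2))
Stereo_alt  # identical to Stereo above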

Page 240, Question 1-2

Use the basic linear model y = ax + b to fit the following data sets. Provide the model, the values of SSE, SSR, SST, and \(R^2\), and a residual plot.

  1. For table 2.7 predict weight as a function of height
  2. For table 2.7, predict weight as a function of the cube of the height

Page 240, Answer 1-2

m1721hw <- data.frame(height=60:80, weight = c(132, 136, 141, 145, 150, 155, 160, 165, 170,
175, 180, 185, 190, 195, 201, 206, 212, 218,
223, 229, 234))

The linear regression model is defined by:

\(y_i = ax_i + b \quad \text{for } i = 1, 2, \ldots, m \text{ data points}\)

where \[a = \frac {m \Sigma x_iy_i - \Sigma x_i \Sigma y_i} {m\Sigma x_i^2 - (\Sigma x_i)^2}\] and \[b= \frac {\Sigma x_i^2 \Sigma y_i - \Sigma x_iy_i\Sigma x_i} { m\Sigma x_i^2 -(\Sigma x_i)^2} \]

Hence, let’s calculate the slope and intercept a and b respectively

m<- length(m1721hw$height)
a <- (m*sum(m1721hw$height*m1721hw$weight) - sum(m1721hw$height)*sum(m1721hw$weight))/
  (m*sum(m1721hw$height^2) - sum(m1721hw$height)^2)
a
## [1] 5.136364
b <- (sum(m1721hw$height^2)*sum(m1721hw$weight) - sum(m1721hw$height*m1721hw$weight)*sum(m1721hw$height))/
  (m*sum(m1721hw$height^2) - sum(m1721hw$height)^2)

b
## [1] -178.4978

Hence, using the least-squares method we get a = 5.136364 and b = -178.4978. Therefore, our model \(y = ax + b\) is \(y = 5.136364x - 178.4978\).

Let's quickly check our result using R's built-in lm function:

fitmodel1 <- lm(weight~height, data=m1721hw)
fitmodel1
## 
## Call:
## lm(formula = weight ~ height, data = m1721hw)
## 
## Coefficients:
## (Intercept)       height  
##    -178.498        5.136
# Hence the slope a 
fitmodel1$coefficients[2]
##   height 
## 5.136364
#The intercept b is 

fitmodel1$coefficients[1]
## (Intercept) 
##   -178.4978

From the above results, we see that the lm and least-squares calculations agree on both the slope and the intercept.

Now we will proceed to find, using the definitions below:

  1. The error sum of squares SSE
  2. The total corrected sum of squares SST
  3. The regression sum of squares SSR
  4. The coefficient of determination \(R^2\)
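These quantities are computed with the usual definitions (consistent with the code below):

\[ SSE = \Sigma (y_i - (ax_i + b))^2, \qquad SST = \Sigma (y_i - \bar y)^2, \qquad SSR = SST - SSE, \qquad R^2 = 1 - \frac{SSE}{SST} \]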

SSE <- sum((m1721hw$weight-(a*m1721hw$height + b))^2)
SST <- sum((m1721hw$weight - mean(m1721hw$weight))^2)
SSR <- SST - SSE
R2  <- 1-(SSE/SST)
R2
## [1] 0.9987888

\(R^2\) is 0.9987888, which means that about 99.9% of the total variation in the y values is accounted for by the linear relationship with x. Since \(R^2\) is close to 1, the regression line fits the actual data very well.

Now we will show the above results graphically

require(graphics)

plot(fitmodel1)

## 4 plots on 1 page;
## allow room for printing model formula in outer margin:
par(mfrow = c(2, 2), oma = c(0, 0, 2, 0))
plot(fitmodel1)

plot(fitmodel1, id.n = NULL)                 # no id's

plot(fitmodel1, id.n = 5, labels.id = NULL)  # 5 id numbers

## Was default in R <= 2.1.x:
## Cook's distances instead of Residual-Leverage plot
plot(fitmodel1, which = 1:4)

## Fit a smooth curve, where applicable:
plot(fitmodel1, panel = panel.smooth)

## Gives a smoother curve
plot(fitmodel1, panel = function(x, y) panel.smooth(x, y, span = 1))
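In addition to the plot.lm diagnostics above, a plain residual-versus-height plot addresses the "residual plot" part of the question directly; this is a minimal base-graphics sketch that reuses the fitted model:

# residuals of the straight-line fit against the predictor
plot(m1721hw$height, residuals(fitmodel1),
     xlab = "height", ylab = "residual",
     main = "Residuals: weight ~ height")
abline(h = 0, lty = 2)  # zero-residual reference line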

For table 2.7, predict weight as a function of the cube of the height

Model formulation:

We will use a refinement of the proportionality model, fitting weight against the cube of the height: \[y = ax^3 + b\]

m1721hw$heightcube <- m1721hw$height^3

a <- (m*sum(m1721hw$heightcube*m1721hw$weight) - sum(m1721hw$heightcube)*sum(m1721hw$weight))/
(m*sum(m1721hw$heightcube^2) - sum(m1721hw$heightcube)^2)
a
## [1] 0.0003467044
b <- (sum(m1721hw$heightcube^2)*sum(m1721hw$weight) - sum(m1721hw$heightcube*m1721hw$weight)*sum(m1721hw$heightcube))/
(m*sum(m1721hw$heightcube^2) - sum(m1721hw$heightcube)^2)

b
## [1] 59.4584

Hence, using the least-squares method we get a = 0.0003467044 and b = 59.4584. Therefore, our model \(y = ax^3 + b\) is \(y = 0.0003467044x^3 + 59.4584\).

Let's quickly check our result using R's built-in lm function:

fitmodel2 <- lm(weight~heightcube, data=m1721hw)
fitmodel2
## 
## Call:
## lm(formula = weight ~ heightcube, data = m1721hw)
## 
## Coefficients:
## (Intercept)   heightcube  
##   5.946e+01    3.467e-04
# Hence the slope a 
fitmodel2$coefficients[2]
##   heightcube 
## 0.0003467044
#The intercept b is 

fitmodel2$coefficients[1]
## (Intercept) 
##     59.4584

From the above results, we see that the lm and least-squares calculations agree on both the slope and the intercept.

Now we will proceed to find, using the same definitions as before:

  1. The error sum of squares SSE
  2. The total corrected sum of squares SST
  3. The regression sum of squares SSR
  4. The coefficient of determination \(R^2\)

SSE <- sum((m1721hw$weight-(a*m1721hw$heightcube + b))^2)
SST <- sum((m1721hw$weight - mean(m1721hw$weight))^2)
SSR <- SST - SSE
R2  <- 1-(SSE/SST)
R2
## [1] 0.9980401

\(R^2\) is 0.9980401, again very close to 1, so the cubic model also accounts for nearly all of the variation in weight.

Now we will show the above results graphically

require(graphics)

plot(fitmodel2)

## 4 plots on 1 page;
## allow room for printing model formula in outer margin:
par(mfrow = c(2, 2), oma = c(0, 0, 2, 0))
plot(fitmodel2)

plot(fitmodel2, id.n = NULL)                 # no id's

plot(fitmodel2, id.n = 5, labels.id = NULL)  # 5 id numbers

## Was default in R <= 2.1.x:
## Cook's distances instead of Residual-Leverage plot
plot(fitmodel2, which = 1:4)

## Fit a smooth curve, where applicable:
plot(fitmodel2, panel = panel.smooth)

## Gives a smoother curve
plot(fitmodel2, panel = function(x, y) panel.smooth(x, y, span = 1))
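Before drawing a conclusion it helps to put the two fits side by side; this is a small sketch that simply reuses the fitmodel1 and fitmodel2 objects from above:

# compare R^2 and residual standard error of the two fitted models
comparison <- data.frame(
  model     = c("weight ~ height", "weight ~ height^3"),
  r.squared = c(summary(fitmodel1)$r.squared, summary(fitmodel2)$r.squared),
  sigma     = c(summary(fitmodel1)$sigma, summary(fitmodel2)$sigma)
)
comparison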

Conclusion:
Both models fit the data very well, with \(R^2 \approx 0.998\) in each case. On \(R^2\) alone the two fits are essentially comparable (the straight-line fit is marginally higher), so the case for preferring \(y = ax^3 + b\) over \(y_i = ax_i + b\) rests on the residual plots and on the physical argument that weight scales with volume, i.e., roughly with the cube of height.