For the scenarios presented in problem 9-17, identify a problem worth studying and list the variables that affect the behavior you have identified. Which variables would be neglected completely? Which might be considered as constants initially? Can you identify any submodels you would want to study in detail? Identify any data you would want collected.
A company with a fleet of trucks faces increasing maintenance costs as the age and mileage of the trucks increase.
Yes, this problem is worth studying because it illustrates the classical optimization problem of minimizing or maximizing an outcome subject to constraints. Here we want to maximize profit by minimizing maintenance cost given the age of the trucks.
The variables that affect this behavior are: lease expense, licenses, taxes, insurance, number of trucks, number of mechanics, type of fuel, maintenance and repair, labor, number of breakdowns, wait time for repairs, loss of revenue and delay penalties, driver retention and attrition, and the number of customer reviews (negative and positive) of the service.
Unless there are plans to relocate to a different state with different regulations, the following variables can be neglected completely: lease expense, licenses and permits, taxes, insurance, number of trucks, number of mechanics, and type of fuel.
Assuming that our mechanics are full-time employees, the labor cost can be considered constant. However, the cost of the parts and materials associated with that labor is not constant. One-time cosmetic fixes, such as a small paint job or seat cleaning, can also be considered constants.
The submodel that I want to study in more detail is as follows: \[Truck \ ownership\ cost = truck \ depreciation + truck\ Return\ on\ Investment\ (ROI)\] As the truck depreciation is constant, the main focus will be on the truck's Return on Investment (ROI). \[Truck \ Return \ on \ Investment\ (ROI) = (gain \ from \ the \ truck - cost\ of\ investment) / cost\ of\ investment\] Hence the detailed subsystem can be written as follows: \[Cost\ of\ investment = fuel\ cost + maintenance\ cost + breakdown\ cost + wait\ cost\]
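As a quick illustration of how the ROI submodel would be evaluated, here is a minimal sketch in R. All dollar amounts are hypothetical values chosen only to show the arithmetic; they do not come from any data set.

```r
# Hypothetical yearly figures for one truck (illustrative values only)
fuel_cost        <- 30000
maintenance_cost <- 12000
breakdown_cost   <- 5000
wait_cost        <- 3000
gain_from_truck  <- 80000

# Cost of investment as defined in the submodel
cost_of_investment <- fuel_cost + maintenance_cost + breakdown_cost + wait_cost

# ROI = (gain - cost of investment) / cost of investment
truck_roi <- (gain_from_truck - cost_of_investment) / cost_of_investment
truck_roi
```

With real data, the same calculation could be repeated per truck and per year to see how ROI changes as maintenance and breakdown costs grow with age.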
The data I would want collected are the maintenance costs and the types of maintenance, specifically which trucks and truck parts break down the most; the wait time needed to fix and maintain the trucks; and customer reviews. In other words, I would collect any data that directly or indirectly affect revenue. With these data, I would be well informed about the best time to replace trucks that are performing very poorly and hurting the company's bottom line.
In problems 7-12, determine whether the data set supports the stated proportionality model.
\[ y \propto x^3 \quad \text{(Equation 2.1)} \]
df <- data.frame(y=c(0,1,2,6,14,24,37,58,82,114), x=c(1,2,3,4,5,6,7,8,9,10))
df
## y x
## 1 0 1
## 2 1 2
## 3 2 3
## 4 6 4
## 5 14 5
## 6 24 6
## 7 37 7
## 8 58 8
## 9 82 9
## 10 114 10
First we have to determine whether or not y and \(x^3\) are proportional, i.e., whether or not there is a positive constant k satisfying y = k\(x^3\). If they are not, we don’t have to proceed.
For this purpose, we compute the ratio \(y^{1/3}/x\), because \(y^{1/3}\) and x are proportional if and only if y and \(x^3\) are proportional.
We therefore work with the transformed pairs \((x, y^{1/3})\) and the ratio \(y^{1/3}/x\), which should be roughly constant if the proportionality holds.
Hence, our new data is:
df$y_cube_root<- (df$y)^(1/3)
#df$y_cube_root
df$y_cube_root_over_x <- (df$y)^(1/3) / df$x
#df$y_cube_root_over_x
df
## y x y_cube_root y_cube_root_over_x
## 1 0 1 0.000000 0.0000000
## 2 1 2 1.000000 0.5000000
## 3 2 3 1.259921 0.4199737
## 4 6 4 1.817121 0.4542801
## 5 14 5 2.410142 0.4820285
## 6 24 6 2.884499 0.4807499
## 7 37 7 3.332222 0.4760317
## 8 58 8 3.870877 0.4838596
## 9 82 9 4.344481 0.4827202
## 10 114 10 4.848808 0.4848808
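One way to judge how constant the ratio column is would be to look at its spread relative to its mean. The sketch below is an added diagnostic, not part of the textbook method; the coefficient of variation and the names `cv_all` and `cv_trimmed` are choices made here for illustration.

```r
# Same data as above, rebuilt so the snippet is self-contained
x <- 1:10
y <- c(0, 1, 2, 6, 14, 24, 37, 58, 82, 114)
ratio <- y^(1/3) / x

# Spread of the ratio relative to its mean (coefficient of variation)
cv_all <- sd(ratio) / mean(ratio)

# Same statistic without the first point, where y = 0 forces the ratio to 0
cv_trimmed <- sd(ratio[-1]) / mean(ratio[-1])

c(all_points = cv_all, without_first = cv_trimmed)
```

A small value suggests the ratio is nearly constant; the first data point, with y = 0, inflates the spread noticeably.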
We compute the mean of \(y^{1/3}/x\):
mean(df$y_cube_root_over_x)
## [1] 0.4264524
The mean value above is 0.4264524, which we truncate to 0.42. Therefore:
\[ \begin{aligned} mean(y^{1/3}/x) =&0.42\\ y =&(0.42)^3 x ^3\\ y =&0.074 x^3 \end{aligned} \]
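The constant depends on how the mean ratio is rounded before cubing. The following sketch compares the constant obtained from the two-decimal value 0.42 with the one obtained from the unrounded mean; the variable names are illustrative.

```r
x <- 1:10
y <- c(0, 1, 2, 6, 14, 24, 37, 58, 82, 114)
mean_ratio <- mean(y^(1/3) / x)

k_truncated <- 0.42^3        # constant from the two-decimal mean
k_full      <- mean_ratio^3  # constant from the unrounded mean

round(c(truncated = k_truncated, full = k_full), 4)
```

Because the ratio is cubed, small differences in the mean are amplified in the constant.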
Now let’s populate the data for \(y=0.074 x^3\):
df$y_c_xcube <- (.074)*((df$x)^3)
#data.frame(ypred = df$y_c_xcube)
# Given initial data
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.2.3
qplot(x,y, data=df, xlab = "x", ylab = "y")
# Modeled data
qplot(x,y_c_xcube, data=df, xlab= "x", ylab = "y= .074 x^cube " )
Although the graph does not pass exactly through the origin, the data supports the proportionality model: the relative error \((y_a - y_p)/y_a\) = 0.337 is taken to be small, and the proportionality constant is small, 0.074.
Hence the data can be modeled by a function that passes through the origin, as is the case for \(y = 0.074 x^3\).
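The relative error can also be inspected point by point. This is a minimal sketch, assuming \(y_a\) is the observed value and \(y_p\) the value predicted by \(y = 0.074 x^3\); it is not claimed to reproduce the single figure quoted above, whose exact computation is not shown.

```r
x <- 1:10
y <- c(0, 1, 2, 6, 14, 24, 37, 58, 82, 114)
y_pred <- 0.074 * x^3

# Relative error (y_a - y_p) / y_a; skip the first point because y_a = 0 there
keep <- y != 0
rel_error <- (y[keep] - y_pred[keep]) / y[keep]

data.frame(x = x[keep], y = y[keep], y_pred = y_pred[keep],
           rel_error = round(rel_error, 3))
```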
Lumber Cutters - Lumber cutters wish to use readily available measurements to estimate the number of board feet of lumber in a tree. Assume they measure the diameter of the tree in inches at waist height. Develop a model that predicts board feet as a function of diameter in inches.
Use the following data for your test:
lumber_df <- data.frame(x=c(17, 19, 20, 23, 25, 28, 32, 38, 39, 41),
y=c(19, 25, 32, 57, 71, 113, 123, 252, 259, 294))
lumber_df
## x y
## 1 17 19
## 2 19 25
## 3 20 32
## 4 23 57
## 5 25 71
## 6 28 113
## 7 32 123
## 8 38 252
## 9 39 259
## 10 41 294
The variable x is the diameter of a ponderosa pine in inches, and y is the number of board feet divided by 10.
Consider two separate assumptions, allowing each to lead to a model. Completely analyze each model. \[\ \] i. Assume that all trees are right-circular cylinders, and are approximately the same height. \[\ \] ii. Assume that all trees are right-circular cylinders and that the height of the tree is proportional to the diameter. \[\ \]
Which model appears to be better? Why? Justify your conclusions. \[\ \]
This is an example of geometric similarity in which f(x) = \(\pi r^2 h\). \[\ \]
The assumption here is that h is constant, as all trees have the same height.
Hence our function f(x) = \(\pi r^2 h\) depends only on \(r^2\). Therefore, \[ y \propto x^2 \]
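The step from the cylinder volume to the proportionality can be written out explicitly. The derivation below is a sketch that assumes board feet are proportional to the cylinder volume V and that the radius is half the measured diameter x; the symbols V, \(h_0\), and k are introduced here for illustration.

```latex
\[
\begin{aligned}
V &= \pi r^2 h, \qquad r = \frac{x}{2}, \qquad h = h_0 \ \text{(constant)} \\
V &= \pi \left(\frac{x}{2}\right)^2 h_0 = \frac{\pi h_0}{4}\, x^2 \\
y &\propto V \quad \Longrightarrow \quad y = k x^2
\end{aligned}
\]
```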
First we have to determine whether or not y and \(x^2\) are proportional, i.e., whether or not there is a positive constant k satisfying y = k\(x^2\). If they are not, we don’t have to proceed.
For this purpose, we compute the ratio \(y^{1/2}/x\), because \(y^{1/2}\) and x are proportional if and only if y and \(x^2\) are proportional.
We therefore work with the transformed pairs \((x, y^{1/2})\) and the ratio \(y^{1/2}/x\), which should be roughly constant if the proportionality holds.
# Square root of y
lumber_df$y_sqrt <- (lumber_df$y)^(1/2)
# Ratio sqrt(y) / x, roughly constant if y is proportional to x^2
lumber_df$y_sqrt_over_x <- (lumber_df$y)^(1/2) / lumber_df$x
lumber_df
## x y y_sqrt y_sqrt_over_x
## 1 17 19 4.358899 0.2564058
## 2 19 25 5.000000 0.2631579
## 3 20 32 5.656854 0.2828427
## 4 23 57 7.549834 0.3282537
## 5 25 71 8.426150 0.3370460
## 6 28 113 10.630146 0.3796481
## 7 32 123 11.090537 0.3465793
## 8 38 252 15.874508 0.4177502
## 9 39 259 16.093477 0.4126533
## 10 41 294 17.146428 0.4182056
We compute the mean of \(y^{1/2}/x\):
mean(lumber_df$y_sqrt_over_x)
## [1] 0.3442542
The mean value above is: 0.34
Therefore:
\[ \begin{aligned} mean(y^{1/2}/x) =&0.34 \\ y =&(0.34)^2 x ^2 \\ y =&0.11 x^2 \\ \end{aligned} \]
Now let’s populate the data for \(y=0.11 x^2\):
lumber_df$y_c_xsqr <- (.11)*((lumber_df$x)^2)
lumber_df
## x y y_sqrt y_sqrt_over_x y_c_xsqr
## 1 17 19 4.358899 0.2564058 31.79
## 2 19 25 5.000000 0.2631579 39.71
## 3 20 32 5.656854 0.2828427 44.00
## 4 23 57 7.549834 0.3282537 58.19
## 5 25 71 8.426150 0.3370460 68.75
## 6 28 113 10.630146 0.3796481 86.24
## 7 32 123 11.090537 0.3465793 112.64
## 8 38 252 15.874508 0.4177502 158.84
## 9 39 259 16.093477 0.4126533 167.31
## 10 41 294 17.146428 0.4182056 184.91
# Given initial data
library(ggplot2)
qplot(x,y, data=lumber_df, xlab = "x", ylab = "y")
# Modeled data
qplot(x,y_c_xsqr, data=lumber_df, xlab= "x", ylab = "y= .11 x^sqr " )
library(reshape2)
library(ggplot2)
final1 <- data.frame( x= lumber_df$x, y = lumber_df$y, ypredict = lumber_df$y_c_xsqr)
plot(final1)
DF1 <- melt(final1, id= 'x')
ggplot(data = DF1, aes(x = x, y = value, color = variable)) +
geom_point()
The assumption here is that h is not constant but proportional to the diameter.
Hence our function f(x) = \(\pi r^2 h\) depends on both \(r^2\) and \(h\). Therefore, \[ y \propto x^3 \]
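The cubic proportionality follows from substituting a height that scales with the diameter into the cylinder volume. This sketch assumes board feet are proportional to the volume V and that the radius is half the measured diameter x; the constants c and k are introduced here for illustration.

```latex
\[
\begin{aligned}
V &= \pi r^2 h, \qquad r = \frac{x}{2}, \qquad h = c\,x \\
V &= \pi \left(\frac{x}{2}\right)^2 (c\,x) = \frac{\pi c}{4}\, x^3 \\
y &\propto V \quad \Longrightarrow \quad y = k x^3
\end{aligned}
\]
```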
First we have to determine whether or not y and \(x^3\) are proportional, i.e., whether or not there is a positive constant k satisfying y = k\(x^3\). If they are not, we don’t have to proceed.
For this purpose, we compute the ratio \(y^{1/3}/x\), because \(y^{1/3}\) and x are proportional if and only if y and \(x^3\) are proportional.
We therefore work with the transformed pairs \((x, y^{1/3})\) and the ratio \(y^{1/3}/x\), which should be roughly constant if the proportionality holds.
Hence, our new data is:
# Cube root of y
lumber_df$y_cube_root <- (lumber_df$y)^(1/3)
# Ratio y^(1/3) / x, roughly constant if y is proportional to x^3
lumber_df$y_cube_root_over_x <- (lumber_df$y)^(1/3) / lumber_df$x
lumber_df
## x y y_sqrt y_sqrt_over_x y_c_xsqr y_cube_root y_cube_root_over_x
## 1 17 19 4.358899 0.2564058 31.79 2.668402 0.1569648
## 2 19 25 5.000000 0.2631579 39.71 2.924018 0.1538957
## 3 20 32 5.656854 0.2828427 44.00 3.174802 0.1587401
## 4 23 57 7.549834 0.3282537 58.19 3.848501 0.1673261
## 5 25 71 8.426150 0.3370460 68.75 4.140818 0.1656327
## 6 28 113 10.630146 0.3796481 86.24 4.834588 0.1726639
## 7 32 123 11.090537 0.3465793 112.64 4.973190 0.1554122
## 8 38 252 15.874508 0.4177502 158.84 6.316360 0.1662200
## 9 39 259 16.093477 0.4126533 167.31 6.374311 0.1634439
## 10 41 294 17.146428 0.4182056 184.91 6.649400 0.1621805
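The two ratio columns in the table can be compared directly: under the correct model the ratio should stay close to constant across the range of diameters. The sketch below uses the coefficient of variation as a rough measure of constancy; this statistic and the helper `cv` are additions for illustration, not part of the original analysis.

```r
x <- c(17, 19, 20, 23, 25, 28, 32, 38, 39, 41)
y <- c(19, 25, 32, 57, 71, 113, 123, 252, 259, 294)

ratio_sq   <- sqrt(y) / x     # should be constant if y is proportional to x^2
ratio_cube <- y^(1/3) / x     # should be constant if y is proportional to x^3

# Hypothetical helper: spread relative to the mean
cv <- function(v) sd(v) / mean(v)

c(square_model = cv(ratio_sq), cube_model = cv(ratio_cube))
```

A smaller value indicates a ratio that drifts less with x, which is the behavior expected under a valid proportionality.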
We compute the mean of \(y^{1/3}/x\):
mean(lumber_df$y_cube_root_over_x)
## [1] 0.162248
The mean value above is 0.162248, which rounds to 0.16.
Therefore:
\[ \begin{aligned} mean(y^{1/3}/x) =&0.16 \\ y =&(0.16)^3 x ^3 \\ y =&0.004 x^3 \\ \end{aligned} \]
Now let’s populate the data for \(y=0.004 x^3\):
#data.frame(ypred = lumber_df$y_c_xcube)
lumber_df$y_c_xcube <- (.004)*((lumber_df$x)^3)
# Given initial data
library(ggplot2)
qplot(x,y, data=lumber_df, xlab = "x", ylab = "y")
# Modeled data
qplot(x,y_c_xcube, data=lumber_df, xlab= "x", ylab = "y= .004 x^cube " )
library(reshape2)
library(ggplot2)
final2 <- data.frame( x= lumber_df$x, y = lumber_df$y, ypredict = lumber_df$y_c_xcube)
plot(final2)
DF2 <- melt(final2, id= 'x')
ggplot(data = DF2, aes(x = x, y = value, color = variable)) +
geom_point()
library(reshape2)
library(ggplot2)
library(grid)
library(gridExtra)
## Warning: package 'gridExtra' was built under R version 3.2.3
p1<- ggplot(data = DF1, aes(x = x, y = value, color = variable)) +
geom_point() + labs(title = "Same height Trees, y = k.x^2")
p2<- ggplot(data = DF2, aes(x = x, y = value, color = variable)) +
geom_point() + labs(title = "Height Proportional to Diameter Trees, y =k.x^3", size = .4)
grid.arrange(p1, p2, nrow=2, ncol=1)
From the plot above, the model in which height is proportional to diameter appears to be much better than the one in which all trees have the same height. In other words, the geometric-similarity model based on the volume function, in which both the diameter and the height vary, fits better than the surface model in which only the diameter varies. Therefore, a detailed analysis is always recommended before assuming that certain variables can be treated as constants.
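The visual comparison can be backed by a simple numeric summary. The sketch below computes the sum of squared errors for both fitted models, using the constants 0.11 and 0.004 derived earlier; the sum of squared errors is an added criterion chosen here for illustration.

```r
x <- c(17, 19, 20, 23, 25, 28, 32, 38, 39, 41)
y <- c(19, 25, 32, 57, 71, 113, 123, 252, 259, 294)

pred_sq   <- 0.11  * x^2   # model i:  all trees the same height
pred_cube <- 0.004 * x^3   # model ii: height proportional to diameter

# Sum of squared differences between observed and predicted values
sse <- c(square_model = sum((y - pred_sq)^2),
         cube_model   = sum((y - pred_cube)^2))
sse
```

The model with the smaller sum of squared errors tracks the observed board feet more closely over the measured diameters.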