Question 12.1 - Describe a situation or problem from your
job, everyday life, current events, etc., for which a design of
experiments approach would be appropriate.
In the past, I’ve used examples related to biome and environmental
preservation in my homework prompts. I can continue building on that
theme by applying a Design of Experiments (DOE) approach. With DOE, we
can test multiple factors that influence habitat biodiversity at the
same time. Biodiversity levels are affected by various environmental
conditions such as temperature, precipitation, and soil pH. Instead of
testing these factors one by one, a DOE approach allows us to examine
their combined effects simultaneously.
For example, imagine we’re
conducting a restoration project and want to identify which conditions
best support the regrowth of native plant species. We could use a
factorial design to compare different temperature ranges, precipitation
levels, and soil pH values. By systematically varying the factors
together across test plots, rather than adjusting one at a time, we
could estimate each factor's effect as well as the interactions between
them, and identify which combination of environmental conditions most
effectively promotes plant repopulation and overall ecosystem recovery.
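To make the idea concrete, here is a minimal sketch of how the test plots for such a study could be enumerated. The factor names and the two levels chosen for each factor are hypothetical placeholders, not values from a real restoration project, and `full_factorial` is just an illustrative helper.

```python
from itertools import product

# Hypothetical low/high levels for each factor (assumed for illustration only).
factors = {
    "temperature_c": (15, 25),
    "precipitation_mm": (500, 1000),
    "soil_ph": (5.5, 7.0),
}


def full_factorial(levels):
    """Return one dict per run, covering every combination of factor levels."""
    names = list(levels)
    return [dict(zip(names, combo)) for combo in product(*levels.values())]


plots = full_factorial(factors)

# Three two-level factors give 2 * 2 * 2 = 8 test plots.
for number, plot in enumerate(plots, start=1):
    print(number, plot)
```

Because every combination appears exactly once, each factor's effect can be compared across all settings of the other two, which is what allows interactions to be estimated.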
Question 12.2 - To determine the value of 10 different yes/no features to the market value of a house (large yard, solar roof, etc.), a real estate agent plans to survey 50 potential buyers, showing a fictitious house with different combinations of features. To reduce the survey size, the agent wants to show just 16 fictitious houses. Use R’s FrF2 function (in the FrF2 package) to find a fractional factorial design for this experiment: what set of features should each of the 16 fictitious houses have? Note: the output of FrF2 is “1” (include) or “-1” (don’t include) for each feature.
Methodology - We use the FrF2 function to generate a fractional factorial design that assigns the 10 yes/no features to 16 fictitious houses. We specify the number of runs (houses) and the number of factors (features), along with a name for each feature. The output is a table with one row per house that indicates whether each feature is included (1) or not included (-1).
library(FrF2)
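# Fix the random seed so the randomized run order is reproducible
set.seed(336)
# 16 runs (houses) and 10 two-level factors (features);
# the levels default to -1 (don't include) and 1 (include)
sample_houses = FrF2(nruns = 16, nfactors = 10,
                     factor.names = c('Five Plus Acres', 'Fenced in Backyard',
                                      'Double Front Door', 'Concrete Driveway',
                                      'Two Plus Car Garage', 'In-Unit Laundry',
                                      'Two Plus Bathrooms', 'Trash Service',
                                      'Panoramic Windows', 'Finished Basement'))
# Print the design: one row per fictitious house
sample_houses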
Discussion of Results - As expected, we get 16 different simulated houses, each specifying which of the 10 features are included. A real estate agent could present these houses to potential homebuyers and ask for their opinions in order to isolate the most desirable features. This information would allow the agency to market specific features strategically in the future to maximize exposure and interest in the market.
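As a rough illustration of what a 16-run design for 10 two-level factors looks like under the hood, the sketch below builds one by hand. It starts from a full factorial in four base factors and defines the other six as products of pairs of base columns. The generator choice here is an assumption made for illustration and is not necessarily the one FrF2 selects. With 10 factors in 16 runs the design is resolution III, so main effects are aliased with two-factor interactions.

```python
from itertools import product

BASE = ["A", "B", "C", "D"]  # four base factors give 2**4 = 16 runs
# Illustrative generators (assumed): each extra factor is a product of base columns.
GENERATORS = {"E": "AB", "F": "AC", "G": "BC", "H": "AD", "J": "BD", "K": "CD"}


def fractional_design():
    """Build a 16-run, 10-factor design with levels -1 and 1."""
    runs = []
    for levels in product((-1, 1), repeat=len(BASE)):
        run = dict(zip(BASE, levels))
        for name, word in GENERATORS.items():
            value = 1
            for letter in word:
                value *= run[letter]
            run[name] = value
        runs.append(run)
    return runs


def is_orthogonal(runs):
    """Check that every column is balanced and every pair of columns is uncorrelated."""
    names = list(runs[0])
    balanced = all(sum(r[n] for r in runs) == 0 for n in names)
    uncorrelated = all(
        sum(r[a] * r[b] for r in runs) == 0
        for i, a in enumerate(names)
        for b in names[i + 1:]
    )
    return balanced and uncorrelated


design = fractional_design()
for house in design:
    print(list(house.values()))
print("orthogonal:", is_orthogonal(design))
```

Each feature appears in exactly 8 of the 16 houses, and no two feature columns are correlated, which is what lets the main effect of each feature be estimated from so few houses.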
Question 13.1 - For each of the following distributions,
give an example of data that you would expect to follow this
distribution (besides the examples already discussed in
class).
a. Binomial - If a professional baseball player is
hitting off a tee, the number of home runs in 100 swings would follow
this distribution, assuming each swing is independent and has the same
probability of being a home run.
b. Geometric - If a professional baseball player is
hitting off a tee, the distribution would reflect the number of home
runs the player hits before the first swing that does not clear the
fence.
c. Poisson - If a professional baseball player is hitting
off a pitching machine from 12pm to 2pm that delivers pitches at random
intervals averaging one every 10 seconds, the distribution would reflect
how many pitches the batter sees from 1:15pm to 1:30pm (about 90 on
average).
d. Exponential - If a professional baseball player is
hitting off a pitching machine from 12pm to 2pm and gets a pitch
every 10 seconds on average, the distribution would reflect the amount
of time between consecutive pitches the batter receives from 1:15pm to
1:30pm.
e. Weibull - If a professional baseball player is taking
batting practice from 12pm to 1pm and gets a pitch every 10 seconds on
average, the distribution would reflect the amount of time it takes the
wooden bat to break, assuming the bat becomes more likely to break with
each additional contact (an increasing failure rate, i.e. a shape
parameter greater than 1).
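The five examples above can be sketched as small simulations using only Python's standard `random` module. The home-run probability and the Weibull scale and shape values are assumptions chosen for illustration, and the function names are hypothetical.

```python
import random

rng = random.Random(13)   # fixed seed so the sketch is repeatable

P_HOME_RUN = 0.3          # assumed chance that one swing off the tee is a home run
PITCH_RATE = 1 / 10       # pitches per second (one every 10 seconds on average)
WINDOW = 15 * 60          # 1:15pm to 1:30pm, in seconds


def binomial_home_runs(swings=100):
    """Binomial: home runs in a fixed number of independent swings."""
    return sum(rng.random() < P_HOME_RUN for _ in range(swings))


def geometric_streak():
    """Geometric: home runs hit before the first swing that is not one."""
    streak = 0
    while rng.random() < P_HOME_RUN:
        streak += 1
    return streak


def poisson_pitch_count():
    """Poisson: pitches seen in the window, built from exponential gaps."""
    count, clock = 0, rng.expovariate(PITCH_RATE)
    while clock <= WINDOW:
        count += 1
        clock += rng.expovariate(PITCH_RATE)
    return count


def exponential_gap():
    """Exponential: seconds between two consecutive pitches."""
    return rng.expovariate(PITCH_RATE)


def weibull_bat_life(scale=45.0, shape=1.5):
    """Weibull: minutes until the bat breaks; shape > 1 means the
    failure rate increases with use (scale and shape are assumed values)."""
    return rng.weibullvariate(scale, shape)


def mean(draw, n=2000):
    """Average of n draws from one of the samplers above."""
    return sum(draw() for _ in range(n)) / n


if __name__ == "__main__":
    for sampler in (binomial_home_runs, geometric_streak, poisson_pitch_count,
                    exponential_gap, weibull_bat_life):
        print(sampler.__name__, round(mean(sampler), 2))
```

The Poisson and exponential samplers share the same pitch process: counting exponential gaps inside a fixed window is what produces the Poisson count, which is why the two distributions are paired in the examples.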
Question 13.2 - In this problem, you can simulate a
simplified airport security system at a busy airport. Passengers arrive
according to a Poisson distribution with λ1 = 5 per minute (i.e., mean
interarrival rate μ1 = 0.2 minutes) to the ID/boarding-pass check queue,
where there are several servers who each have exponential service time
with mean rate μ2 = 0.75 minutes. [Hint: model them as one block that
has more than one resource.] After that, the passengers are assigned to
the shortest of the several personal-check queues, where they go through
the personal scanner (time is uniformly distributed between 0.5 minutes
and 1 minute).
Use the Arena software (PC users) or Python with SimPy
(PC or Mac users) to build a simulation of the system, and then vary the
number of ID/boarding-pass checkers and personal-check queues to
determine how many are needed to keep average wait times below 15
minutes. [If you’re using SimPy, or if you have access to a non-student
version of Arena, you can use λ1 = 50 to simulate a busier
airport.]
Methodology - To conduct this simulation, I set up the airport system described above. For each configuration, I ran 100 iterations of a 12-hour day at the airport. I started the simulation with only 3 scanners and worked my way up until the average waiting time was less than or equal to 15 minutes.
Discussion of Results - With three scanners, average wait times were around 38 minutes. Adding a fourth scanner brought the average down to around 20 minutes, a reduction of about 18 minutes. Adding a fifth scanner brought it down to around 8 minutes, a further reduction of about 12 minutes. These results make sense because we would expect the reduction in wait time to diminish with every additional scanner introduced to the system. Once the fifth scanner was added, the system kept average wait times under the 15-minute target.
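For readers who want to experiment with a system like this, the following is a minimal sketch of the two-stage queue using only Python's standard library. It is not the model that produced the results above: the function name `simulate_day`, its parameters, and the choice to count only time spent waiting in the two queues are all assumptions made for illustration. A SimPy or Arena model would be structured differently.

```python
import heapq
import random
from collections import deque


def simulate_day(n_checkers, n_scanners, arrival_rate=5.0, mean_check=0.75,
                 scan_low=0.5, scan_high=1.0, minutes=720.0, seed=0):
    """Return the average time (minutes) a passenger spends waiting in queues."""
    rng = random.Random(seed)

    # Poisson arrivals: exponential gaps with mean 1 / arrival_rate.
    arrivals, clock = [], rng.expovariate(arrival_rate)
    while clock <= minutes:
        arrivals.append(clock)
        clock += rng.expovariate(arrival_rate)
    if not arrivals:
        return 0.0

    # Stage 1: one FIFO queue served by n_checkers identical ID checkers.
    # The heap stores the time at which each checker next becomes free.
    free_at = [0.0] * n_checkers
    checked = []  # (time the passenger leaves the ID check, wait in that queue)
    for arrive in arrivals:
        start = max(arrive, heapq.heappop(free_at))
        done = start + rng.expovariate(1.0 / mean_check)
        heapq.heappush(free_at, done)
        checked.append((done, start - arrive))

    # Stage 2: each scanner has its own queue; passengers join the shortest.
    checked.sort()
    lines = [deque() for _ in range(n_scanners)]  # departure times per scanner
    total_wait = 0.0
    for ready, wait_check in checked:
        for line in lines:
            while line and line[0] <= ready:   # drop passengers already gone
                line.popleft()
        line = min(lines, key=len)
        start = line[-1] if line else ready    # wait behind the last in line
        line.append(start + rng.uniform(scan_low, scan_high))
        total_wait += wait_check + (start - ready)

    return total_wait / len(arrivals)


if __name__ == "__main__":
    # Assumed configuration: 5 checkers, varying the number of scanners.
    for scanners in (3, 4, 5):
        runs = [simulate_day(5, scanners, seed=s) for s in range(10)]
        print(scanners, "scanners:", round(sum(runs) / len(runs), 1), "min")
```

Averaging several seeded replications, as in the loop at the bottom, mirrors the idea of running many simulated days per configuration before comparing wait times.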