Question 12.1 - Describe a situation or problem from your job, everyday life, current events, etc., for which a design of experiments approach would be appropriate.

In the past, I’ve used examples related to biome and environmental preservation in my homework prompts. I can continue building on that theme by applying a Design of Experiments (DOE) approach. With DOE, we can test multiple factors that influence habitat biodiversity at the same time. Biodiversity levels are affected by various environmental conditions such as temperature, precipitation, and soil pH. Instead of testing these factors one by one, a DOE approach allows us to examine their combined effects simultaneously.
For example, imagine we’re conducting a restoration project and want to identify which conditions best support the regrowth of native plant species. We could set up a factorial experiment in which each test plot receives a different combination of temperature range, precipitation level, and soil pH. Because the factors are varied together rather than one at a time, we can estimate each factor’s individual effect as well as the interactions between factors, and identify which combination of conditions most effectively promotes plant repopulation and overall ecosystem recovery.
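As a rough illustration of testing the three factors in combination, the sketch below enumerates a two-level full factorial design in Python. The low/high values for each factor are made-up placeholders, not measurements from any real restoration project.

```python
from itertools import product

# Hypothetical low/high levels for each factor (illustrative values only).
factors = {
    "temperature_C": (15, 25),
    "precipitation_mm_per_month": (50, 150),
    "soil_pH": (5.5, 7.0),
}

# Full factorial: every combination of low/high across all three factors.
names = list(factors)
design = [dict(zip(names, combo)) for combo in product(*factors.values())]

# Each entry describes the conditions applied to one test plot.
for run, plot in enumerate(design, start=1):
    print(run, plot)
```

With three factors at two levels each, this gives 2^3 = 8 test plots, and every level of every factor appears in exactly half of them.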

Question 12.2 - To determine the value of 10 different yes/no features to the market value of a house (large yard, solar roof, etc.), a real estate agent plans to survey 50 potential buyers, showing a fictitious house with different combinations of features. To reduce the survey size, the agent wants to show just 16 fictitious houses. Use R’s FrF2 function (in the FrF2 package) to find a fractional factorial design for this experiment: what set of features should each of the 16 fictitious houses have? Note: the output of FrF2 is “1” (include) or “-1” (don’t include) for each feature.

Methodology - We want to use the FrF2 function to generate a fractional factorial design that assigns the 10 yes/no features to 16 fictitious houses. We specify the number of runs (houses) and the number of factors (features), and the output is a design table showing, for each simulated house, whether each feature is included (1) or not (-1).

library(FrF2)
set.seed(336)

sample_houses <- FrF2(nruns = 16, nfactors = 10,
                      factor.names = c('Five Plus Acres', 'Fenced in Backyard',
                                       'Double Front Door', 'Concrete Driveway',
                                       'Two Plus Car Garage', 'In-Unit Laundry',
                                       'Two Plus Bathrooms', 'Trash Service',
                                       'Panoramic Windows', 'Finished Basement'))

sample_houses

Discussion of Results - As expected, the design contains 16 different simulated houses, each with a specified set of included features. A real estate agent could present these houses to potential homebuyers and ask for their opinions in order to isolate the most desirable features. The agency could then strategically market those features in the future to maximize exposure and interest in the market.
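To show what a 16-run design for 10 two-level factors looks like under the hood, here is a minimal Python sketch that builds one by hand. It starts from a full factorial in four base features and derives the other six columns as products of base columns. The particular generators below are an assumption chosen for illustration; they are not necessarily the ones FrF2 selects, and FrF2 also randomizes the run order.

```python
from itertools import product

features = [
    "Five Plus Acres", "Fenced in Backyard", "Double Front Door",
    "Concrete Driveway", "Two Plus Car Garage", "In-Unit Laundry",
    "Two Plus Bathrooms", "Trash Service", "Panoramic Windows",
    "Finished Basement",
]

# The first four features form a full 2^4 factorial (16 runs).
# Each remaining feature is the product of two base columns.
# These generators are illustrative assumptions, not FrF2's choice.
generators = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

design = []
for base in product((-1, 1), repeat=4):
    row = list(base)
    for i, j in generators:
        row.append(base[i] * base[j])
    design.append(row)

# List the features included (+1) in each fictitious house.
for house, row in enumerate(design, start=1):
    included = [name for name, level in zip(features, row) if level == 1]
    print(house, included)
```

In this construction every feature appears in exactly 8 of the 16 houses, and every pair of feature columns is orthogonal, which is what lets the main effect of each feature be estimated from so few houses.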

Question 13.1 - For each of the following distributions, give an example of data that you would expect to follow this distribution (besides the examples already discussed in class).

a. Binomial - If a professional baseball player is hitting off a tee, the number of home runs in 100 swings would follow this distribution, assuming each swing has the same chance of being a home run.

b. Geometric - If a professional baseball player is hitting off a tee, the distribution would reflect the number of home runs the player hits before the first ball that fails to clear the fence.

c. Poisson - If a professional baseball player is hitting off of a pitching machine from 12pm to 2pm and gets a pitch every 10 seconds on average, the distribution would reflect how many pitches the batter sees from 1:15pm to 1:30pm.

d. Exponential - If a professional baseball player is hitting off of a pitching machine from 12pm to 2pm and gets a pitch every 10 seconds on average, the distribution would reflect the amount of time between consecutive pitches from 1:15pm to 1:30pm.

e. Weibull - If a professional baseball player is taking batting practice from 12pm to 1pm and gets a pitch every 10 seconds on average, this distribution would describe the amount of time it takes the wooden bat to break, assuming the bat becomes more likely to break with each additional contact (an increasing failure rate, i.e., shape parameter k > 1).
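Parts c and d describe the same process from two angles: if the gaps between pitches are exponential with a 10-second mean, the number of pitches in a 15-minute window is Poisson with mean 900 / 10 = 90. The Python sketch below checks this by simulation. The helper name `pitches_in_window`, the seed, and the number of repetitions are all illustrative choices.

```python
import random

def pitches_in_window(rng, mean_gap_s=10.0, window_s=900.0):
    """Count pitches in one window when gaps between pitches are exponential."""
    t, count = 0.0, 0
    while True:
        t += rng.expovariate(1.0 / mean_gap_s)  # time of the next pitch
        if t > window_s:
            return count
        count += 1

rng = random.Random(0)  # fixed seed so the run is repeatable
counts = [pitches_in_window(rng) for _ in range(2000)]

mean_count = sum(counts) / len(counts)
var_count = sum((c - mean_count) ** 2 for c in counts) / (len(counts) - 1)

# For a Poisson count, the mean and the variance should both be near 90.
print(mean_count, var_count)
```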

Question 13.2 - In this problem, you can simulate a simplified airport security system at a busy airport. Passengers arrive according to a Poisson distribution with λ1 = 5 per minute (i.e., mean interarrival rate μ1 = 0.2 minutes) to the ID/boarding-pass check queue, where there are several servers who each have exponential service time with mean rate μ2 = 0.75 minutes. [Hint: model them as one block that has more than one resource.] After that, the passengers are assigned to the shortest of the several personal-check queues, where they go through the personal scanner (time is uniformly distributed between 0.5 minutes and 1 minute).

Use the Arena software (PC users) or Python with SimPy (PC or Mac users) to build a simulation of the system, and then vary the number of ID/boarding-pass checkers and personal-check queues to determine how many are needed to keep average wait times below 15 minutes. [If you’re using SimPy, or if you have access to a non-student version of Arena, you can use λ1 = 50 to simulate a busier airport.]

Methodology - I set up my airport system as shown in the Final System figure below. For each configuration, I ran 100 replications of a 12-hour day at the airport. I started the simulation with only 3 scanners and added scanners one at a time until the average wait time was at or below 15 minutes.
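For readers without Arena or SimPy, the two-stage system in the problem statement can be approximated with a short, standard-library-only Python sketch. This is not the model used for the results in this write-up. It assumes a single first-come-first-served queue for the ID check, assigns each passenger to the scanner with the fewest people at it (ties go to the lowest-numbered scanner), and runs one replication of a 12-hour day. The function name `simulate` and its parameters are hypothetical.

```python
import heapq
import random
from collections import deque

def simulate(n_checkers, n_scanners, arrival_rate=5.0, mean_check=0.75,
             scan_low=0.5, scan_high=1.0, minutes=720.0, seed=0):
    """Return the average total queueing time (minutes) per passenger."""
    rng = random.Random(seed)

    # Stage 1: one FIFO queue feeding n_checkers ID/boarding-pass checkers.
    free_at = [0.0] * n_checkers      # heap of times each checker is free
    heapq.heapify(free_at)
    t = 0.0
    stage1 = []                       # (time leaving ID check, wait in queue)
    while True:
        t += rng.expovariate(arrival_rate)        # Poisson arrivals
        if t > minutes:
            break
        start = max(t, heapq.heappop(free_at))    # next available checker
        done = start + rng.expovariate(1.0 / mean_check)
        heapq.heappush(free_at, done)
        stage1.append((done, start - t))

    # Stage 2: each scanner has its own queue; join the shortest one.
    stage1.sort()                     # order passengers by scanner arrival
    queues = [deque() for _ in range(n_scanners)]  # pending departure times
    total_wait = 0.0
    for arrive, wait1 in stage1:
        for q in queues:              # drop passengers who have already left
            while q and q[0] <= arrive:
                q.popleft()
        q = min(queues, key=len)
        start = q[-1] if q else arrive
        q.append(start + rng.uniform(scan_low, scan_high))
        total_wait += wait1 + (start - arrive)
    return total_wait / len(stage1)

# Compare a few configurations (single replication each, fixed seed).
for n in (4, 5, 6):
    print(n, round(simulate(n_checkers=n, n_scanners=n), 2))
```

Averaging over many seeds, as was done with the 100 replications described above, would give a more stable estimate than any single run.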

[Figure: Final System]
[Figure: 3 Scanners]
[Figure: 4 Scanners]
[Figure: 5 Scanners]

Discussion of Results - With three scanners, average wait times were around 38 minutes. When we added a fourth scanner, the average wait fell to around 20 minutes, a reduction of about 18 minutes. Adding a fifth scanner brought the average wait down to around 8 minutes, a further reduction of about 12 minutes. These results make sense because we would expect the reduction from each additional scanner to diminish. With five scanners in the system, we successfully kept average wait times under 15 minutes.