Background and Limitation

For our noodle experimental research study, we have a manually collected dataset of 26 observations with 31 variables. These 26 observations, or trials, were conducted over a two-month period. Because of the COVID-19 pandemic and the resulting grocery shortage, it took significantly longer than planned to acquire the needed ingredients, as flour had become an emergency household staple. There was a shortage of flour in the experimenter’s vicinity, and online orders were significantly back-ordered. The original proposal planned for a variety of flour types, including high-protein bread flour, white wheat flour, high mountain flour, cake flour, and blended wheat flour. In the end, the experiment was reduced to 2 types of flour with different protein content.

EDA

## # A tibble: 6 x 31
##   Trial MakeDate            Temperature_C Humidity ProteinContent
##   <dbl> <dttm>                      <dbl>    <dbl>          <dbl>
## 1     1 2020-04-18 00:00:00            20     0.3           0.104
## 2     2 2020-04-18 00:00:00            20     0.3           0.104
## 3     3 2020-04-18 00:00:00            20     0.3           0.104
## 4     4 2020-04-20 00:00:00            19     0.31          0.104
## 5     5 2020-04-20 00:00:00            19     0.28          0.104
## 6     6 2020-04-21 00:00:00            22     0.3           0.104
## # … with 26 more variables: ActiveTime_Hour <dbl>, InactiveTime_Hour <dbl>,
## #   Lamian <dbl>, DoughHydration <dbl>, SaltPercent <dbl>,
## #   pH_PreAutolyse <dbl>, pH_PostAutolyse <dbl>, FlourWeight <dbl>, Oil <dbl>,
## #   DoughColor_YScale <dbl>, PreKneadMins <dbl>, NumAutolyseHours <dbl>,
## #   Yeast_Glutathione <dbl>, Yratio <dbl>, Kansui <dbl>, Kratio <dbl>,
## #   BakingSoda_SodiumBicarbonate <dbl>, BSRatio <dbl>,
## #   BakedBS_SodiumCarbonate <dbl>, BakedBSRatio <dbl>, Extensibility <dbl>,
## #   KneadMins <dbl>, CookingTime <dbl>, Score <dbl>, Area <chr>, Notes <chr>
##      Trial          MakeDate                   Temperature_C      Humidity     
##  Min.   : 1.00   Min.   :2020-04-18 00:00:00   Min.   :19.00   Min.   :0.1800  
##  1st Qu.: 7.25   1st Qu.:2020-04-22 18:00:00   1st Qu.:20.50   1st Qu.:0.2000  
##  Median :13.50   Median :2020-05-04 00:00:00   Median :23.00   Median :0.2150  
##  Mean   :13.46   Mean   :2020-05-06 12:00:00   Mean   :24.12   Mean   :0.2315  
##  3rd Qu.:19.75   3rd Qu.:2020-05-17 12:00:00   3rd Qu.:28.00   3rd Qu.:0.2475  
##  Max.   :26.00   Max.   :2020-06-06 00:00:00   Max.   :30.00   Max.   :0.3100  
##  ProteinContent   ActiveTime_Hour  InactiveTime_Hour     Lamian      
##  Min.   :0.1040   Min.   :0.1667   Min.   : 0.500    Min.   :0.0000  
##  1st Qu.:0.1040   1st Qu.:0.3333   1st Qu.: 1.000    1st Qu.:0.0000  
##  Median :0.1040   Median :0.3333   Median : 2.000    Median :1.0000  
##  Mean   :0.1061   Mean   :0.4006   Mean   : 3.269    Mean   :0.6154  
##  3rd Qu.:0.1100   3rd Qu.:0.5000   3rd Qu.: 2.750    3rd Qu.:1.0000  
##  Max.   :0.1100   Max.   :0.8333   Max.   :24.000    Max.   :1.0000  
##  DoughHydration    SaltPercent      pH_PreAutolyse  pH_PostAutolyse
##  Min.   :0.5000   Min.   :0.00000   Min.   :5.500   Min.   :5.000  
##  1st Qu.:0.6500   1st Qu.:0.01000   1st Qu.:6.000   1st Qu.:6.400  
##  Median :0.6500   Median :0.01000   Median :6.100   Median :6.750  
##  Mean   :0.6308   Mean   :0.01423   Mean   :6.158   Mean   :6.662  
##  3rd Qu.:0.6500   3rd Qu.:0.01000   3rd Qu.:6.300   3rd Qu.:7.000  
##  Max.   :0.6500   Max.   :0.10000   Max.   :6.700   Max.   :7.800  
##   FlourWeight         Oil         DoughColor_YScale  PreKneadMins   
##  Min.   :100.0   Min.   :0.0000   Min.   :1.000     Min.   : 5.000  
##  1st Qu.:100.0   1st Qu.:1.0000   1st Qu.:2.000     1st Qu.: 5.000  
##  Median :100.0   Median :1.0000   Median :3.000     Median :10.000  
##  Mean   :107.7   Mean   :0.7692   Mean   :2.692     Mean   : 9.423  
##  3rd Qu.:100.0   3rd Qu.:1.0000   3rd Qu.:3.000     3rd Qu.:10.000  
##  Max.   :200.0   Max.   :1.0000   Max.   :5.000     Max.   :20.000  
##  NumAutolyseHours Yeast_Glutathione     Yratio           Kansui      
##  Min.   : 0.500   Min.   :0.0000    Min.   :0.0000   Min.   :0.0000  
##  1st Qu.: 1.000   1st Qu.:0.0000    1st Qu.:0.0000   1st Qu.:0.0000  
##  Median : 2.000   Median :0.0000    Median :0.0000   Median :0.0000  
##  Mean   : 3.269   Mean   :0.2692    Mean   :0.0250   Mean   :0.3846  
##  3rd Qu.: 2.750   3rd Qu.:0.7500    3rd Qu.:0.0375   3rd Qu.:1.0000  
##  Max.   :24.000   Max.   :1.0000    Max.   :0.1000   Max.   :1.0000  
##      Kratio        BakingSoda_SodiumBicarbonate    BSRatio        
##  Min.   :0.00000   Min.   :0.0000               Min.   :0.000000  
##  1st Qu.:0.00000   1st Qu.:0.0000               1st Qu.:0.000000  
##  Median :0.00000   Median :0.0000               Median :0.000000  
##  Mean   :0.02038   Mean   :0.1154               Mean   :0.004615  
##  3rd Qu.:0.02000   3rd Qu.:0.0000               3rd Qu.:0.000000  
##  Max.   :0.10000   Max.   :1.0000               Max.   :0.050000  
##  BakedBS_SodiumCarbonate  BakedBSRatio      Extensibility      KneadMins    
##  Min.   :0.00000         Min.   :0.000000   Min.   :0.1000   Min.   :10.00  
##  1st Qu.:0.00000         1st Qu.:0.000000   1st Qu.:0.3000   1st Qu.:20.00  
##  Median :0.00000         Median :0.000000   Median :0.4000   Median :20.00  
##  Mean   :0.07692         Mean   :0.003846   Mean   :0.4269   Mean   :24.04  
##  3rd Qu.:0.00000         3rd Qu.:0.000000   3rd Qu.:0.5750   3rd Qu.:30.00  
##  Max.   :1.00000         Max.   :0.050000   Max.   :0.8000   Max.   :50.00  
##   CookingTime        Score           Area              Notes          
##  Min.   :0.000   Min.   :0.000   Length:26          Length:26         
##  1st Qu.:2.000   1st Qu.:1.000   Class :character   Class :character  
##  Median :3.000   Median :2.500   Mode  :character   Mode  :character  
##  Mean   :2.346   Mean   :2.615                                        
##  3rd Qu.:3.000   3rd Qu.:4.750                                        
##  Max.   :5.000   Max.   :6.000
## Rows: 26
## Columns: 31
## $ Trial                        <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1…
## $ MakeDate                     <dttm> 2020-04-18, 2020-04-18, 2020-04-18, 202…
## $ Temperature_C                <dbl> 20, 20, 20, 19, 19, 22, 23, 28, 28, 26, …
## $ Humidity                     <dbl> 0.30, 0.30, 0.30, 0.31, 0.28, 0.30, 0.21…
## $ ProteinContent               <dbl> 0.104, 0.104, 0.104, 0.104, 0.104, 0.104…
## $ ActiveTime_Hour              <dbl> 0.4166667, 0.5833333, 0.8333333, 0.33333…
## $ InactiveTime_Hour            <dbl> 1.0, 12.0, 24.0, 1.0, 1.0, 2.0, 6.0, 2.0…
## $ Lamian                       <fct> 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1…
## $ DoughHydration               <dbl> 0.50, 0.50, 0.50, 0.60, 0.65, 0.65, 0.65…
## $ SaltPercent                  <dbl> 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01…
## $ pH_PreAutolyse               <dbl> 6.7, 6.7, 6.7, 6.7, 6.7, 5.5, 5.5, 6.0, …
## $ pH_PostAutolyse              <dbl> 7.0, 7.2, 7.3, 7.0, 7.0, 6.7, 7.0, 5.3, …
## $ FlourWeight                  <dbl> 100, 100, 100, 100, 100, 100, 100, 100, …
## $ Oil                          <fct> 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1…
## $ DoughColor_YScale            <fct> 1, 3, 4, 2, 3, 2, 1, 2, 3, 2, 2, 2, 3, 4…
## $ PreKneadMins                 <dbl> 5, 5, 5, 5, 5, 15, 15, 10, 10, 15, 10, 1…
## $ NumAutolyseHours             <dbl> 1.0, 12.0, 24.0, 1.0, 1.0, 2.0, 6.0, 2.0…
## $ Yeast_Glutathione            <fct> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ Yratio                       <dbl> 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00…
## $ Kansui                       <fct> 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1…
## $ Kratio                       <dbl> 0.00, 0.10, 0.10, 0.05, 0.05, 0.00, 0.00…
## $ BakingSoda_SodiumBicarbonate <fct> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ BSRatio                      <dbl> 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00…
## $ BakedBS_SodiumCarbonate      <fct> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ BakedBSRatio                 <dbl> 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00…
## $ Extensibility                <fct> 0.2, 0.4, 0.5, 0.5, 0.5, 0.3, 0.2, 0.4, …
## $ KneadMins                    <dbl> 25, 35, 50, 20, 40, 35, 50, 20, 20, 25, …
## $ CookingTime                  <dbl> 3, 3, 3, 2, 2, 5, 5, 4, 4, 3, 3, 0, 3, 3…
## $ Score                        <dbl> 3, 4, 4, 3, 3, 2, 1, 2, 2, 3, 2, 0, 1, 1…
## $ Area                         <fct> Control, Autolyse, Autolyse, Hydration, …
## $ Notes                        <chr> "This is the control noodle, it is not p…

In total, I am tracking 31 variables. The variable of interest is the Noodle Tasting Score, on a scale from 1 to 7, where 5 is the optimal score: the noodle is neither too bland and soft nor so bitey and slick that it fails to create the ideal mouthfeel or pick up broth flavor. It is a composite score based on the overall noodle-making experience, taste tests, mouthfeel, flavor-carrying ability, and broth absorption.

Visualization

Univariate

Skewness

## [1] 0.139698

Kurtosis

## [1] 1.69196

The variable of interest is the Noodle Tasting Score (“Score”). Its distribution in the experiment appears roughly symmetric, with a skewness of 0.14 and a kurtosis of 1.69. I read the kurtosis value as indicating a rather peaked distribution. This reading assumes the value is excess kurtosis; on the Pearson scale, where a normal distribution scores 3, a value of 1.69 would instead indicate a flatter-than-normal shape. A concentration of scores is understandable, since the goal of the experiment was to understand the noodle-making process and make good, delicious noodles. A series of mediocre noodles that eventually improve through trials is therefore expected, hence the peakedness suggested by the kurtosis value.
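The skewness and kurtosis figures can be reproduced from sample moments. The sketch below is a minimal base-R version; `sample_moments` is a hypothetical helper (not the function used to produce the output above), and the vector `x` is made-up illustrative data rather than the recorded scores. It returns both the Pearson kurtosis (normal = 3) and the excess kurtosis (normal = 0), because the interpretation of a value such as 1.69 depends on which convention the package in use reports.

```r
# Moment-based skewness and kurtosis in base R.
# `sample_moments` is a hypothetical helper; `x` is illustrative data only.
sample_moments <- function(x) {
  x <- x[!is.na(x)]
  n <- length(x)
  d <- x - mean(x)
  m2 <- sum(d^2) / n            # second central moment
  m3 <- sum(d^3) / n            # third central moment
  m4 <- sum(d^4) / n            # fourth central moment
  skew <- m3 / m2^1.5
  kurt <- m4 / m2^2             # Pearson kurtosis: normal distribution gives 3
  c(skewness = skew, kurtosis = kurt, excess_kurtosis = kurt - 3)
}

x <- c(0, 1, 1, 2, 2, 3, 4, 5, 5, 6)
res <- sample_moments(x)
print(round(res, 3))
```

Checking which of the two kurtosis values a given package reports is worthwhile before describing a distribution as peaked or flat.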

Bivariate

This graph shows the relationship between the number of autolyse hours and the resulting noodle tasting score. Autolyse is a process in which water and flour are blended and then left to rest for an extended period to allow gluten formation and relaxation.

The original hypothesis was that the longer the flour-water blend is allowed to relax, the easier it would be to form a stretchable dough with great extensibility and elasticity. Based on this hypothesis, I experimented with long autolyse times of up to 24 hours to understand how that would affect the texture. As the graph shows, longer autolyse times produced mediocre noodles. Surprisingly, shorter autolyse times produced the ideal noodles.

As we can see from the above graph, longer kneading time does not necessarily lead to better-tasting noodles. I experimented with long kneading times because dough with high hydration tends to be sticky, and extended kneading can alleviate the stickiness and encourage gluten bond formation. Stronger gluten bonds tend to result in the better chew that is common in Chinese-style Lamian. I wanted to find out whether this notion holds. As it turns out, longer kneading time resulted in noodles that were too chewy, and in some trials inedible. In hindsight, I understood that kneading stretches the gluten bonds in all directions rather than aligning them linearly, which is what noodles are known for. Noodles are extended along one axis, whereas bread is extended in all directions. Therefore, longer kneading time is not ideal for noodle making.

Multivariate

Here we have a multivariate plot with a third variable, Inactive Yeast Addition. We can see that with the addition of inactive yeast, which supplies glutathione, the noodles tasted better, and the kneading time was also the shortest among all trials.

Regression

Data Preparation

##  Temperature_C       Humidity        ProteinContent      ActiveTime_Hour   
##  Min.   :-5.115   Min.   :-0.05154   Min.   :-0.002077   Min.   :-0.23397  
##  1st Qu.:-3.615   1st Qu.:-0.03154   1st Qu.:-0.002077   1st Qu.:-0.06731  
##  Median :-1.115   Median :-0.01654   Median :-0.002077   Median :-0.06731  
##  Mean   : 0.000   Mean   : 0.00000   Mean   : 0.000000   Mean   : 0.00000  
##  3rd Qu.: 3.885   3rd Qu.: 0.01596   3rd Qu.: 0.003923   3rd Qu.: 0.09936  
##  Max.   : 5.885   Max.   : 0.07846   Max.   : 0.003923   Max.   : 0.43269  
##  InactiveTime_Hour DoughHydration      SaltPercent        pH_PreAutolyse    
##  Min.   :-2.7692   Min.   :-0.13077   Min.   :-0.014231   Min.   :-0.65769  
##  1st Qu.:-2.2692   1st Qu.: 0.01923   1st Qu.:-0.004231   1st Qu.:-0.15769  
##  Median :-1.2692   Median : 0.01923   Median :-0.004231   Median :-0.05769  
##  Mean   : 0.0000   Mean   : 0.00000   Mean   : 0.000000   Mean   : 0.00000  
##  3rd Qu.:-0.5192   3rd Qu.: 0.01923   3rd Qu.:-0.004231   3rd Qu.: 0.14231  
##  Max.   :20.7308   Max.   : 0.01923   Max.   : 0.085769   Max.   : 0.54231  
##  pH_PostAutolyse     FlourWeight      PreKneadMins     NumAutolyseHours 
##  Min.   :-1.66154   Min.   :-7.692   Min.   :-4.4231   Min.   :-2.7692  
##  1st Qu.:-0.26154   1st Qu.:-7.692   1st Qu.:-4.4231   1st Qu.:-2.2692  
##  Median : 0.08846   Median :-7.692   Median : 0.5769   Median :-1.2692  
##  Mean   : 0.00000   Mean   : 0.000   Mean   : 0.0000   Mean   : 0.0000  
##  3rd Qu.: 0.33846   3rd Qu.:-7.692   3rd Qu.: 0.5769   3rd Qu.:-0.5192  
##  Max.   : 1.13846   Max.   :92.308   Max.   :10.5769   Max.   :20.7308  
##      Yratio            Kratio              BSRatio           BakedBSRatio      
##  Min.   :-0.0250   Min.   :-0.0203846   Min.   :-0.004615   Min.   :-0.003846  
##  1st Qu.:-0.0250   1st Qu.:-0.0203846   1st Qu.:-0.004615   1st Qu.:-0.003846  
##  Median :-0.0250   Median :-0.0203846   Median :-0.004615   Median :-0.003846  
##  Mean   : 0.0000   Mean   : 0.0000000   Mean   : 0.000000   Mean   : 0.000000  
##  3rd Qu.: 0.0125   3rd Qu.:-0.0003846   3rd Qu.:-0.004615   3rd Qu.:-0.003846  
##  Max.   : 0.0750   Max.   : 0.0796154   Max.   : 0.045385   Max.   : 0.046154  
##    KneadMins        CookingTime          Score        
##  Min.   :-14.038   Min.   :-2.3462   Min.   :-2.6154  
##  1st Qu.: -4.038   1st Qu.:-0.3462   1st Qu.:-1.6154  
##  Median : -4.038   Median : 0.6538   Median :-0.1154  
##  Mean   :  0.000   Mean   : 0.0000   Mean   : 0.0000  
##  3rd Qu.:  5.962   3rd Qu.: 0.6538   3rd Qu.: 2.1346  
##  Max.   : 25.962   Max.   : 2.6538   Max.   : 3.3846

We mean-center the numeric variables here so that each has a mean of 0 (the spread of each variable is left unchanged). This will help us in developing a regression model.
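The centered summary above is consistent with subtracting each column mean without rescaling (for example, the minimum temperature of 19 minus the mean of 24.12 gives roughly -5.12). A minimal sketch of that step with base R's `scale()` follows; the data frame `toy` is a hypothetical stand-in with a few illustrative values, not the actual `noodle` data, and this may not be the exact call used to produce the output.

```r
# Hypothetical stand-in for a few numeric noodle columns (illustrative values).
toy <- data.frame(
  Temperature_C = c(20, 19, 22, 28),
  Humidity      = c(0.30, 0.31, 0.30, 0.20),
  KneadMins     = c(25, 20, 35, 20)
)

# center = TRUE subtracts each column mean; scale = FALSE leaves the spread as is.
centered <- as.data.frame(scale(toy, center = TRUE, scale = FALSE))

# Column means after centering, and the change in standard deviation.
print(round(colMeans(centered), 10))
print(sapply(centered, sd) - sapply(toy, sd))
```

Setting `scale = TRUE` instead would also divide by the standard deviation, which changes the units of the regression coefficients.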

MLR Model

Model 1

Variable: Temperature, Humidity, ProteinContent

mlr_model <- lm(Score~Temperature_C+Humidity+ProteinContent,data = noodle)
summary(mlr_model)
## 
## Call:
## lm(formula = Score ~ Temperature_C + Humidity + ProteinContent, 
##     data = noodle)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.9781 -0.8420 -0.2988  0.8851  3.7157 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)  
## (Intercept)      42.0962    16.1143   2.612   0.0159 *
## Temperature_C    -0.2108     0.1249  -1.688   0.1055  
## Humidity        -12.3465    11.6120  -1.063   0.2992  
## ProteinContent -297.3191   138.4903  -2.147   0.0431 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.873 on 22 degrees of freedom
## Multiple R-squared:  0.273,  Adjusted R-squared:  0.1739 
## F-statistic: 2.754 on 3 and 22 DF,  p-value: 0.06676

This model uses three predictors: room temperature, humidity, and protein content. I chose these three because they are environmental variables that I could not control through experimentation. However, these variables do not appear to correlate strongly with the noodle tasting score: protein content is the only predictor significant at the 0.05 level (p = 0.043), and the model as a whole explains little of the variation (adjusted R-squared of 0.17, overall F-test p = 0.067).

Model 2

Variable: DoughHydration, NumAutolyseHours, Yeast_Glutathione, ProteinContent

mlr_model2 <- lm(Score~DoughHydration+NumAutolyseHours+Yeast_Glutathione+ProteinContent,data = noodle)
summary(mlr_model2)
## 
## Call:
## lm(formula = Score ~ DoughHydration + NumAutolyseHours + Yeast_Glutathione + 
##     ProteinContent, data = noodle)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.6721 -0.5967  0.2381  0.4226  1.5065 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         2.962e+01  6.253e+00   4.737 0.000112 ***
## DoughHydration     -1.348e+01  4.662e+00  -2.892 0.008727 ** 
## NumAutolyseHours    3.982e-03  4.513e-02   0.088 0.930533    
## Yeast_Glutathione1  3.930e+00  3.886e-01  10.113 1.59e-09 ***
## ProteinContent     -1.845e+02  6.036e+01  -3.057 0.005991 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.8278 on 21 degrees of freedom
## Multiple R-squared:  0.8644, Adjusted R-squared:  0.8386 
## F-statistic: 33.47 on 4 and 21 DF,  p-value: 7.785e-09

This time, I added three variables (DoughHydration, NumAutolyseHours, and Yeast_Glutathione) and kept ProteinContent from the previous model. I believe these four variables to be important predictors of noodle quality. The summary output largely supports this hypothesis: among the chosen variables, Yeast_Glutathione is the strongest predictor (p = 1.59e-09), and both DoughHydration and ProteinContent are significant as well. NumAutolyseHours is not significant (p = 0.93). The adjusted R-squared is 0.84.
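To make the coefficient table easier to read, the sketch below recomputes the t values and two-sided p-values for three Model 2 terms from the printed estimates and standard errors, using 21 residual degrees of freedom. The inputs are the rounded figures from the summary output, so the results should agree with the table only up to rounding.

```r
# Estimates and standard errors copied (rounded) from the Model 2 summary output.
est <- c(DoughHydration = -13.48, Yeast_Glutathione1 = 3.930, ProteinContent = -184.5)
se  <- c(DoughHydration = 4.662,  Yeast_Glutathione1 = 0.3886, ProteinContent = 60.36)
df_resid <- 21  # residual degrees of freedom reported for Model 2

# t value = estimate / standard error; p-value is two-sided under a t distribution.
t_val <- est / se
p_val <- 2 * pt(-abs(t_val), df = df_resid)

print(signif(cbind(t_val, p_val), 3))
```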

Model Evaluation

Between Model 1 and Model 2, Model 2 clearly performs better: its adjusted R-squared is 0.84, compared to 0.17 for Model 1. From a practical perspective, this result suggests that environmental variables do not play a significant role in making an ideal noodle product, meaning that the result is not tied to a particular location or ambient condition. The Model 2 variables show that inactive yeast (glutathione) is a significant factor in making good noodles.
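The adjusted R-squared values used in this comparison follow from the multiple R-squared, the number of observations (n = 26), and the number of predictors in each model. The sketch below applies the standard formula to the rounded R-squared values printed in the two summaries; `adj_r2` is a hypothetical helper name introduced only for illustration.

```r
# Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1),
# where n is the number of observations and p the number of predictors.
adj_r2 <- function(r2, n, p) 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Multiple R-squared values copied (rounded) from the two model summaries.
m1 <- adj_r2(r2 = 0.273,  n = 26, p = 3)   # Model 1: three predictors
m2 <- adj_r2(r2 = 0.8644, n = 26, p = 4)   # Model 2: four predictors

print(round(c(model1 = m1, model2 = m2), 3))
```

Because the adjustment penalizes extra predictors, Model 2's advantage is not simply an artifact of having one more term than Model 1.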

Factor Analysis

Data Description

dim(noodle)
## [1] 26 31
str(noodle)
## tibble [26 × 31] (S3: tbl_df/tbl/data.frame)
##  $ Trial                       : num [1:26] 1 2 3 4 5 6 7 8 9 10 ...
##  $ MakeDate                    : POSIXct[1:26], format: "2020-04-18" "2020-04-18" ...
##  $ Temperature_C               : num [1:26] 20 20 20 19 19 22 23 28 28 26 ...
##  $ Humidity                    : num [1:26] 0.3 0.3 0.3 0.31 0.28 0.3 0.21 0.2 0.2 0.22 ...
##  $ ProteinContent              : num [1:26] 0.104 0.104 0.104 0.104 0.104 0.104 0.104 0.104 0.104 0.104 ...
##  $ ActiveTime_Hour             : num [1:26] 0.417 0.583 0.833 0.333 0.667 ...
##  $ InactiveTime_Hour           : num [1:26] 1 12 24 1 1 2 6 2 6 1 ...
##  $ Lamian                      : Factor w/ 2 levels "0","1": 1 2 2 2 2 1 1 2 2 2 ...
##  $ DoughHydration              : num [1:26] 0.5 0.5 0.5 0.6 0.65 0.65 0.65 0.65 0.65 0.65 ...
##  $ SaltPercent                 : num [1:26] 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.05 ...
##  $ pH_PreAutolyse              : num [1:26] 6.7 6.7 6.7 6.7 6.7 5.5 5.5 6 6 6.3 ...
##  $ pH_PostAutolyse             : num [1:26] 7 7.2 7.3 7 7 6.7 7 5.3 5 6.5 ...
##  $ FlourWeight                 : num [1:26] 100 100 100 100 100 100 100 100 100 100 ...
##  $ Oil                         : Factor w/ 2 levels "0","1": 2 2 2 2 2 1 1 2 2 2 ...
##  $ DoughColor_YScale           : Factor w/ 5 levels "1","2","3","4",..: 1 3 4 2 3 2 1 2 3 2 ...
##  $ PreKneadMins                : num [1:26] 5 5 5 5 5 15 15 10 10 15 ...
##  $ NumAutolyseHours            : num [1:26] 1 12 24 1 1 2 6 2 6 1 ...
##  $ Yeast_Glutathione           : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Yratio                      : num [1:26] 0 0 0 0 0 0 0 0 0 0 ...
##  $ Kansui                      : Factor w/ 2 levels "0","1": 1 2 2 2 2 1 1 1 1 1 ...
##  $ Kratio                      : num [1:26] 0 0.1 0.1 0.05 0.05 0 0 0 0 0 ...
##  $ BakingSoda_SodiumBicarbonate: Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
##  $ BSRatio                     : num [1:26] 0 0 0 0 0 0 0 0 0 0 ...
##  $ BakedBS_SodiumCarbonate     : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
##  $ BakedBSRatio                : num [1:26] 0 0 0 0 0 0 0 0 0 0 ...
##  $ Extensibility               : Factor w/ 8 levels "0.1","0.2","0.3",..: 2 4 5 5 5 3 2 4 5 4 ...
##  $ KneadMins                   : num [1:26] 25 35 50 20 40 35 50 20 20 25 ...
##  $ CookingTime                 : num [1:26] 3 3 3 2 2 5 5 4 4 3 ...
##  $ Score                       : num [1:26] 3 4 4 3 3 2 1 2 2 3 ...
##  $ Area                        : Factor w/ 12 levels "Autolyse","BakedBS",..: 4 1 1 7 7 9 9 6 6 10 ...
##  $ Notes                       : chr [1:26] "This is the control noodle, it is not pulled, instead it is rolled out using rolling pin and cut to noodle stri"| __truncated__ "12hr Autolyse trial, noodle extended quite a bit, easier to pull, but still break apart after a few pull" "24hr extended autolyse trial, dough feel very slimy, like playdough, but has great extensibility" "60% dough hydration test to see if extensibility is improved" ...
noodle_X <- select(noodle, all_of(num_col))
head(noodle_X)
## # A tibble: 6 x 19
##   Temperature_C Humidity ProteinContent ActiveTime_Hour InactiveTime_Ho…
##           <dbl>    <dbl>          <dbl>           <dbl>            <dbl>
## 1            20     0.3           0.104           0.417                1
## 2            20     0.3           0.104           0.583               12
## 3            20     0.3           0.104           0.833               24
## 4            19     0.31          0.104           0.333                1
## 5            19     0.28          0.104           0.667                1
## 6            22     0.3           0.104           0.583                2
## # … with 14 more variables: DoughHydration <dbl>, SaltPercent <dbl>,
## #   pH_PreAutolyse <dbl>, pH_PostAutolyse <dbl>, FlourWeight <dbl>,
## #   PreKneadMins <dbl>, NumAutolyseHours <dbl>, Yratio <dbl>, Kratio <dbl>,
## #   BSRatio <dbl>, BakedBSRatio <dbl>, KneadMins <dbl>, CookingTime <dbl>,
## #   Score <dbl>

Correlation Matrix

Initial Corrplot

noodlematrix <- cor(noodle_X)
corrplot(noodlematrix,order = 'hclust', type='upper',tl.srt = 45, tl.cex = 0.7)

Observations: pH_PreAutolyse and Humidity are positively correlated (r = 0.57); Humidity and Kansui ratio (Kratio) are positively correlated (r = 0.60); PreKneadMins and baked baking soda ratio (BakedBSRatio) are positively correlated (r = 0.69); NumAutolyseHours and Kansui ratio (Kratio) are positively correlated (r = 0.56).
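Rather than reading pairs off the plot, strongly correlated pairs can be listed programmatically. The sketch below is one way to do this in base R; `high_cor_pairs` is a hypothetical helper, and `toy` is an illustrative data frame, not the actual `noodle_X`. It is also a convenient way to spot columns that duplicate each other: in the coefficient table in this section, ActiveTime_Hour and KneadMins have r = 1, as do InactiveTime_Hour and NumAutolyseHours, which is a likely source of the NaN warning from `rcorr`.

```r
# List variable pairs whose absolute Pearson correlation meets a cutoff.
# `high_cor_pairs` is a hypothetical helper written for this illustration.
high_cor_pairs <- function(df, cutoff = 0.5) {
  cm <- cor(df)
  # Keep only the upper triangle so each pair is reported once.
  idx <- which(upper.tri(cm) & abs(cm) >= cutoff, arr.ind = TRUE)
  out <- data.frame(
    var1 = rownames(cm)[idx[, "row"]],
    var2 = colnames(cm)[idx[, "col"]],
    r    = cm[idx],
    stringsAsFactors = FALSE
  )
  out[order(-abs(out$r)), ]
}

# Illustrative data: ActiveTime_Hour is KneadMins expressed in hours.
toy <- data.frame(
  KneadMins       = c(25, 35, 50, 20, 40),
  ActiveTime_Hour = c(25, 35, 50, 20, 40) / 60,
  Humidity        = c(0.30, 0.30, 0.30, 0.31, 0.28)
)

pairs_found <- high_cor_pairs(toy, cutoff = 0.99)
print(pairs_found)
```

Dropping one column from each perfectly correlated pair before running the correlation or factor analysis would avoid carrying redundant information.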

Updated Corrplot

res2 <- rcorr(as.matrix(noodle_X), type='pearson')
## Warning in sqrt(1 - h * h): NaNs produced
# Correlation Coefficients
res2$r
##                   Temperature_C    Humidity ProteinContent ActiveTime_Hour
## Temperature_C        1.00000000 -0.57821523     0.24554691     -0.22218711
## Humidity            -0.57821523  1.00000000    -0.36745539      0.53449371
## ProteinContent       0.24554691 -0.36745539     1.00000000     -0.16198650
## ActiveTime_Hour     -0.22218711  0.53449371    -0.16198650      1.00000000
## InactiveTime_Hour   -0.19834903  0.33398031    -0.11535156      0.61698838
## DoughHydration       0.46608599 -0.68526786     0.29034675     -0.41057886
## SaltPercent          0.13835585 -0.05337261    -0.20404010     -0.02733468
## pH_PreAutolyse      -0.52214877  0.56821462    -0.22238602      0.11281236
## pH_PostAutolyse     -0.25739749  0.35556222     0.42020202      0.43376072
## FlourWeight         -0.08921599 -0.15371627    -0.21004201     -0.38039412
## PreKneadMins        -0.06760293 -0.42182121     0.18490260     -0.13320193
## NumAutolyseHours    -0.19834903  0.33398031    -0.11535156      0.61698838
## Yratio              -0.04421852 -0.32690476    -0.14376274     -0.58848454
## Kratio              -0.22650178  0.60484824    -0.03298921      0.49897953
## BSRatio              0.53560534 -0.41681828     0.34618847     -0.15459301
## BakedBSRatio        -0.32917624 -0.15371627     0.39674602     -0.10942844
## KneadMins           -0.22218711  0.53449371    -0.16198650      1.00000000
## CookingTime         -0.03538037  0.26073864    -0.59211740      0.32983179
## Score               -0.33673547  0.12488693    -0.42164825     -0.29411744
##                   InactiveTime_Hour DoughHydration SaltPercent pH_PreAutolyse
## Temperature_C           -0.19834903      0.4660860  0.13835585    -0.52214877
## Humidity                 0.33398031     -0.6852679 -0.05337261     0.56821462
## ProteinContent          -0.11535156      0.2903468 -0.20404010    -0.22238602
## ActiveTime_Hour          0.61698838     -0.4105789 -0.02733468     0.11281236
## InactiveTime_Hour        1.00000000     -0.6522371 -0.10178998     0.27269544
## DoughHydration          -0.65223713      1.0000000  0.08862660    -0.64758172
## SaltPercent             -0.10178998      0.0886266  1.00000000     0.11270632
## pH_PreAutolyse           0.27269544     -0.6475817  0.11270632     1.00000000
## pH_PostAutolyse          0.27946896     -0.3120939 -0.18029373     0.21887390
## FlourWeight             -0.16475498      0.1151939 -0.06411406     0.03654747
## PreKneadMins            -0.15290337      0.3960599  0.11936710    -0.37796606
## NumAutolyseHours         1.00000000     -0.6522371 -0.10178998     0.27269544
## Yratio                  -0.29601113      0.2365326 -0.17952019    -0.22513346
## Kratio                   0.56401372     -0.5396694 -0.13723643     0.53875394
## BSRatio                  0.03930285      0.1349013 -0.07508270    -0.27755163
## BakedBSRatio            -0.01601785      0.1151939 -0.06411406     0.07973994
## KneadMins                0.61698838     -0.4105789 -0.02733468     0.11281236
## CookingTime              0.11269775     -0.1456258  0.14890179    -0.28205737
## Score                   -0.01101451     -0.1944490 -0.04765609     0.12399871
##                   pH_PostAutolyse  FlourWeight PreKneadMins NumAutolyseHours
## Temperature_C         -0.25739749 -0.089215989  -0.06760293      -0.19834903
## Humidity               0.35556222 -0.153716271  -0.42182121       0.33398031
## ProteinContent         0.42020202 -0.210042013   0.18490260      -0.11535156
## ActiveTime_Hour        0.43376072 -0.380394118  -0.13320193       0.61698838
## InactiveTime_Hour      0.27946896 -0.164754980  -0.15290337       1.00000000
## DoughHydration        -0.31209392  0.115193919   0.39605990      -0.65223713
## SaltPercent           -0.18029373 -0.064114061   0.11936710      -0.10178998
## pH_PreAutolyse         0.21887390  0.036547473  -0.37796606       0.27269544
## pH_PostAutolyse        1.00000000 -0.120887026  -0.09088671       0.27946896
## FlourWeight           -0.12088703  1.000000000   0.03737175      -0.16475498
## PreKneadMins          -0.09088671  0.037371755   1.00000000      -0.15290337
## NumAutolyseHours       0.27946896 -0.164754980  -0.15290337       1.00000000
## Yratio                -0.26282382  0.513335673  -0.12789503      -0.29601113
## Kratio                 0.50660515 -0.003365573  -0.56096704       0.56401372
## BSRatio                0.24635676 -0.097590007   0.04376532       0.03930285
## BakedBSRatio           0.22577430 -0.083333333   0.68514884      -0.01601785
## KneadMins              0.43376072 -0.380394118  -0.13320193       0.61698838
## CookingTime           -0.42419598 -0.065842692  -0.14107766       0.11269775
## Score                 -0.28602019  0.340679911  -0.34499087      -0.01101451
##                        Yratio       Kratio     BSRatio BakedBSRatio   KneadMins
## Temperature_C     -0.04421852 -0.226501775  0.53560534  -0.32917624 -0.22218711
## Humidity          -0.32690476  0.604848238 -0.41681828  -0.15371627  0.53449371
## ProteinContent    -0.14376274 -0.032989213  0.34618847   0.39674602 -0.16198650
## ActiveTime_Hour   -0.58848454  0.498979532 -0.15459301  -0.10942844  1.00000000
## InactiveTime_Hour -0.29601113  0.564013722  0.03930285  -0.01601785  0.61698838
## DoughHydration     0.23653259 -0.539669395  0.13490131   0.11519392 -0.41057886
## SaltPercent       -0.17952019 -0.137236427 -0.07508270  -0.06411406 -0.02733468
## pH_PreAutolyse    -0.22513346  0.538753940 -0.27755163   0.07973994  0.11281236
## pH_PostAutolyse   -0.26282382  0.506605148  0.24635676   0.22577430  0.43376072
## FlourWeight        0.51333567 -0.003365573 -0.09759001  -0.08333333 -0.38039412
## PreKneadMins      -0.12789503 -0.560967042  0.04376532   0.68514884 -0.13320193
## NumAutolyseHours  -0.29601113  0.564013722  0.03930285  -0.01601785  0.61698838
## Yratio             1.00000000 -0.200409560 -0.06679524  -0.17111189 -0.58848454
## Kratio            -0.20040956  1.000000000 -0.20889184  -0.17837536  0.49897953
## BSRatio           -0.06679524 -0.208891835  1.00000000  -0.09759001 -0.15459301
## BakedBSRatio      -0.17111189 -0.178375362 -0.09759001   1.00000000 -0.10942844
## KneadMins         -0.58848454  0.498979532 -0.15459301  -0.10942844  1.00000000
## CookingTime        0.10515370 -0.071797875 -0.37411206  -0.44626714  0.32983179
## Score              0.78979488  0.059918212 -0.29815011  -0.37364893 -0.29411744
##                   CookingTime       Score
## Temperature_C     -0.03538037 -0.33673547
## Humidity           0.26073864  0.12488693
## ProteinContent    -0.59211740 -0.42164825
## ActiveTime_Hour    0.32983179 -0.29411744
## InactiveTime_Hour  0.11269775 -0.01101451
## DoughHydration    -0.14562581 -0.19444895
## SaltPercent        0.14890179 -0.04765609
## pH_PreAutolyse    -0.28205737  0.12399871
## pH_PostAutolyse   -0.42419598 -0.28602019
## FlourWeight       -0.06584269  0.34067991
## PreKneadMins      -0.14107766 -0.34499087
## NumAutolyseHours   0.11269775 -0.01101451
## Yratio             0.10515370  0.78979488
## Kratio            -0.07179787  0.05991821
## BSRatio           -0.37411206 -0.29815011
## BakedBSRatio      -0.44626714 -0.37364893
## KneadMins          0.32983179 -0.29411744
## CookingTime        1.00000000  0.40713984
## Score              0.40713984  1.00000000
# P-Value
res2$P
##                   Temperature_C     Humidity ProteinContent ActiveTime_Hour
## Temperature_C                NA 0.0019748879    0.226628589    0.2753058175
## Humidity            0.001974888           NA    0.064787160    0.0049092278
## ProteinContent      0.226628589 0.0647871596             NA    0.4291818672
## ActiveTime_Hour     0.275305817 0.0049092278    0.429181867              NA
## InactiveTime_Hour   0.331378394 0.0954166639    0.574708116    0.0007870606
## DoughHydration      0.016399592 0.0001120733    0.150183627    0.0372087291
## SaltPercent         0.500288418 0.7956755117    0.317404660    0.8945495729
## pH_PreAutolyse      0.006215549 0.0024591050    0.274865173    0.5832123654
## pH_PostAutolyse     0.204282349 0.0746472233    0.032572705    0.0268380374
## FlourWeight         0.664719157 0.4534191570    0.303065902    0.0552356187
## PreKneadMins        0.742812511 0.0318402750    0.365844188    0.5165348857
## NumAutolyseHours    0.331378394 0.0954166639    0.574708116    0.0007870606
## Yratio              0.830167518 0.1030892116    0.483521646    0.0015651656
## Kratio              0.265848490 0.0010631340    0.872894532    0.0094616732
## BSRatio             0.004803957 0.0341469000    0.083195637    0.4508160002
## BakedBSRatio        0.100577712 0.4534191570    0.044777074    0.5946301895
## KneadMins           0.275305817 0.0049092278    0.429181867             NaN
## CookingTime         0.863763031 0.1982633673    0.001438914    0.0998614411
## Score               0.092547760 0.5432738355    0.031917873    0.1447198309
##                   InactiveTime_Hour DoughHydration SaltPercent pH_PreAutolyse
## Temperature_C          0.3313783940   0.0163995922   0.5002884   0.0062155495
## Humidity               0.0954166639   0.0001120733   0.7956755   0.0024591050
## ProteinContent         0.5747081160   0.1501836271   0.3174047   0.2748651730
## ActiveTime_Hour        0.0007870606   0.0372087291   0.8945496   0.5832123654
## InactiveTime_Hour                NA   0.0003053762   0.6207476   0.1777217747
## DoughHydration         0.0003053762             NA   0.6668085   0.0003484022
## SaltPercent            0.6207475770   0.6668084588          NA   0.5835687088
## pH_PreAutolyse         0.1777217747   0.0003484022   0.5835687             NA
## pH_PostAutolyse        0.1667671641   0.1206216249   0.3781218   0.2827120444
## FlourWeight            0.4212289523   0.5752344769   0.7556783   0.8593121030
## PreKneadMins           0.4558398293   0.0451818957   0.5613727   0.0569387250
## NumAutolyseHours       0.0000000000   0.0003053762   0.6207476   0.1777217747
## Yratio                 0.1420302631   0.2446801874   0.3802055   0.2688247485
## Kratio                 0.0026909195   0.0044351904   0.5037954   0.0045160929
## BSRatio                0.8488208943   0.5111499035   0.7154589   0.1698185622
## BakedBSRatio           0.9380959964   0.5752344769   0.7556783   0.6985961158
## KneadMins              0.0007870606   0.0372087291   0.8945496   0.5832123654
## CookingTime            0.5835975065   0.4778111856   0.4678544   0.1627092088
## Score                  0.9574112472   0.3411660227   0.8171755   0.5461677270
##                   pH_PostAutolyse FlourWeight PreKneadMins NumAutolyseHours
## Temperature_C         0.204282349 0.664719157 0.7428125109     0.3313783940
## Humidity              0.074647223 0.453419157 0.0318402750     0.0954166639
## ProteinContent        0.032572705 0.303065902 0.3658441885     0.5747081160
## ActiveTime_Hour       0.026838037 0.055235619 0.5165348857     0.0007870606
## InactiveTime_Hour     0.166767164 0.421228952 0.4558398293     0.0000000000
## DoughHydration        0.120621625 0.575234477 0.0451818957     0.0003053762
## SaltPercent           0.378121834 0.755678333 0.5613726736     0.6207475770
## pH_PreAutolyse        0.282712044 0.859312103 0.0569387250     0.1777217747
## pH_PostAutolyse                NA 0.556361907 0.6588099724     0.1667671641
## FlourWeight           0.556361907          NA 0.8561710996     0.4212289523
## PreKneadMins          0.658809972 0.856171100           NA     0.4558398293
## NumAutolyseHours      0.166767164 0.421228952 0.4558398293               NA
## Yratio                0.194569105 0.007317516 0.5335265798     0.1420302631
## Kratio                0.008265535 0.986981407 0.0028704958     0.0026909195
## BSRatio               0.225051588 0.635303828 0.8318836791     0.8488208943
## BakedBSRatio          0.267428185 0.685679250 0.0001125044     0.9380959964
## KneadMins             0.026838037 0.055235619 0.5165348857     0.0007870606
## CookingTime           0.030790006 0.749295534 0.4918121909     0.5835975065
## Score                 0.156632389 0.088553916 0.0843391093     0.9574112472
##                         Yratio      Kratio     BSRatio BakedBSRatio
## Temperature_C     8.301675e-01 0.265848490 0.004803957 0.1005777124
## Humidity          1.030892e-01 0.001063134 0.034146900 0.4534191570
## ProteinContent    4.835216e-01 0.872894532 0.083195637 0.0447770738
## ActiveTime_Hour   1.565166e-03 0.009461673 0.450816000 0.5946301895
## InactiveTime_Hour 1.420303e-01 0.002690919 0.848820894 0.9380959964
## DoughHydration    2.446802e-01 0.004435190 0.511149904 0.5752344769
## SaltPercent       3.802055e-01 0.503795408 0.715458938 0.7556783330
## pH_PreAutolyse    2.688247e-01 0.004516093 0.169818562 0.6985961158
## pH_PostAutolyse   1.945691e-01 0.008265535 0.225051588 0.2674281853
## FlourWeight       7.317516e-03 0.986981407 0.635303828 0.6856792504
## PreKneadMins      5.335266e-01 0.002870496 0.831883679 0.0001125044
## NumAutolyseHours  1.420303e-01 0.002690919 0.848820894 0.9380959964
## Yratio                      NA 0.326276661 0.745785162 0.4032774816
## Kratio            3.262767e-01          NA 0.305781981 0.3833013621
## BSRatio           7.457852e-01 0.305781981          NA 0.6353038276
## BakedBSRatio      4.032775e-01 0.383301362 0.635303828           NA
## KneadMins         1.565166e-03 0.009461673 0.450816000 0.5946301895
## CookingTime       6.091887e-01 0.727431743 0.059725618 0.0222993285
## Score             1.605349e-06 0.771235402 0.139035728 0.0600675100
##                      KneadMins CookingTime        Score
## Temperature_C     0.2753058175 0.863763031 9.254776e-02
## Humidity          0.0049092278 0.198263367 5.432738e-01
## ProteinContent    0.4291818672 0.001438914 3.191787e-02
## ActiveTime_Hour            NaN 0.099861441 1.447198e-01
## InactiveTime_Hour 0.0007870606 0.583597507 9.574112e-01
## DoughHydration    0.0372087291 0.477811186 3.411660e-01
## SaltPercent       0.8945495729 0.467854406 8.171755e-01
## pH_PreAutolyse    0.5832123654 0.162709209 5.461677e-01
## pH_PostAutolyse   0.0268380374 0.030790006 1.566324e-01
## FlourWeight       0.0552356187 0.749295534 8.855392e-02
## PreKneadMins      0.5165348857 0.491812191 8.433911e-02
## NumAutolyseHours  0.0007870606 0.583597507 9.574112e-01
## Yratio            0.0015651656 0.609188680 1.605349e-06
## Kratio            0.0094616732 0.727431743 7.712354e-01
## BSRatio           0.4508160002 0.059725618 1.390357e-01
## BakedBSRatio      0.5946301895 0.022299328 6.006751e-02
## KneadMins                   NA 0.099861441 1.447198e-01
## CookingTime       0.0998614411          NA 3.898749e-02
## Score             0.1447198309 0.038987495           NA
corrplot(res2$r,type = 'upper', order = 'hclust',p.mat = res2$P, sig.level = 0.01, insig = 'blank', tl.srt = 45,tl.cex = 0.7)

# model3 <- lm(Score~., data = noodle_X)
# vif(model3)
# str(noodle_X)

KMO

data_fa <- noodle_X[,-19]
datamatrix <- cor(data_fa)
KMO(r=datamatrix)
## Error in solve.default(r) : 
##   system is computationally singular: reciprocal condition number = 3.89955e-32
## matrix is not invertible, image not found
## Kaiser-Meyer-Olkin factor adequacy
## Call: KMO(r = datamatrix)
## Overall MSA =  0.5
## MSA for each item = 
##     Temperature_C          Humidity    ProteinContent   ActiveTime_Hour 
##               0.5               0.5               0.5               0.5 
## InactiveTime_Hour    DoughHydration       SaltPercent    pH_PreAutolyse 
##               0.5               0.5               0.5               0.5 
##   pH_PostAutolyse       FlourWeight      PreKneadMins  NumAutolyseHours 
##               0.5               0.5               0.5               0.5 
##            Yratio            Kratio           BSRatio      BakedBSRatio 
##               0.5               0.5               0.5               0.5 
##         KneadMins       CookingTime 
##               0.5               0.5
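
The singular-matrix error and the uniform MSA of 0.5 suggest that some columns are exact linear functions of others. In the p-value table above, NumAutolyseHours against InactiveTime_Hour shows p = 0 and KneadMins against ActiveTime_Hour shows NaN, both consistent with perfectly correlated pairs. The following is a minimal sketch, on an invented toy data frame rather than the noodle data, of how such pairs might be located and dropped before computing KMO. The helper `find_perfect_pairs` and all column names are hypothetical.

```r
# Toy stand-in data (invented): 'mins' is 'hours' * 60, so the two are perfectly correlated.
toy <- data.frame(
  hours = c(0.5, 1, 1.5, 2, 2.5, 3),
  temp  = c(20, 19, 22, 21, 20, 23),
  salt  = c(1.0, 1.2, 0.9, 1.1, 1.3, 1.0)
)
toy$mins <- toy$hours * 60

# Hypothetical helper: list variable pairs whose |r| is numerically 1.
find_perfect_pairs <- function(df, tol = 1e-8) {
  r <- cor(df)
  idx <- which(abs(r) > 1 - tol & upper.tri(r), arr.ind = TRUE)
  data.frame(var1 = rownames(r)[idx[, "row"]],
             var2 = colnames(r)[idx[, "col"]],
             stringsAsFactors = FALSE)
}

dup_pairs <- find_perfect_pairs(toy)
print(dup_pairs)

# Drop the second member of each pair; the reduced correlation matrix can be inverted.
reduced <- toy[, !(names(toy) %in% dup_pairs$var2), drop = FALSE]
inv <- solve(cor(reduced))
```

With the redundant columns removed, the correlation matrix should no longer be singular, so the per-item MSA values become informative.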

Number of Factors

ev <- eigen(cor(data_fa))
ev$values
##  [1]  5.747511e+00  2.789818e+00  2.223570e+00  1.888014e+00  1.329005e+00
##  [6]  1.147362e+00  7.918106e-01  6.395797e-01  5.083986e-01  3.044951e-01
## [11]  2.221834e-01  1.541481e-01  1.077138e-01  8.468951e-02  4.041151e-02
## [16]  2.128848e-02 -1.531197e-16 -1.961984e-16
plot(ev$values)
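
One common rule of thumb for reading these eigenvalues is the Kaiser criterion, which keeps factors whose eigenvalue exceeds 1. The source only prints and plots the eigenvalues, so treating Kaiser as the intended rule is an assumption. A short sketch using the values printed above:

```r
# Eigenvalues copied from the ev$values output above.
ev_values <- c(5.747511e+00, 2.789818e+00, 2.223570e+00, 1.888014e+00, 1.329005e+00,
               1.147362e+00, 7.918106e-01, 6.395797e-01, 5.083986e-01, 3.044951e-01,
               2.221834e-01, 1.541481e-01, 1.077138e-01, 8.468951e-02, 4.041151e-02,
               2.128848e-02, -1.531197e-16, -1.961984e-16)

# Kaiser criterion: count eigenvalues greater than 1.
n_kaiser <- sum(ev_values > 1)
n_kaiser  # 6

# Eigenvalues that are zero up to rounding error flag redundant (collinear) variables.
n_zero <- sum(abs(ev_values) < 1e-10)
n_zero    # 2
```

The two near-zero eigenvalues are consistent with the singular correlation matrix reported in the KMO step.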

Run Analysis

# nfactors <- 6
# fit1 <- factanal(data_fa,nfactors,scores = c('regression'),rotation = 'varimax')
# print(fit1)
# 
# fa_var <- fa(r = data_fa, nfactors = 6, rotate = 'varimax', fm = 'pa')
# fa.diagram(fa_var)

Regression

# head(fa_var$scores)
# 
# regdata <- cbind(noodle_X[19], fa_var$scores)

Cluster Analysis

Optimal Cluster (Elbow)

set.seed(123)
fviz_nbclust(norm,kmeans, method = 'wss')

Using the elbow method, we can see the bend at around 3 clusters.
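
The curve behind the elbow method is the total within-cluster sum of squares for each candidate number of clusters. As a rough sketch of what that plot computes, the following uses base R only, with the built-in USArrests data (standardized) as a stand-in for `norm`. The stand-in data set and the range of k are assumptions for illustration.

```r
# Stand-in for `norm`: the built-in USArrests data, standardized.
x <- scale(USArrests)

set.seed(123)
k_values <- 1:8

# Total within-cluster sum of squares for each candidate k.
wss <- sapply(k_values, function(k) kmeans(x, centers = k, nstart = 25)$tot.withinss)

# The "elbow" is the k after which the curve flattens.
plot(k_values, wss, type = "b",
     xlab = "Number of clusters k",
     ylab = "Total within-cluster sum of squares")
```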

K-Means Cluster

k2 <- kmeans(norm,centers = 2, nstart = 25)
p2 <- fviz_cluster(k2, data = norm)+ggtitle('K = 2')

k3 <- kmeans(norm,centers = 3, nstart = 25)
p3 <- fviz_cluster(k3, data = norm) + ggtitle('K = 3')

k4 <- kmeans(norm,centers = 4, nstart = 25)
p4 <- fviz_cluster(k4, data = norm) + ggtitle('K = 4')

k5 <- kmeans(norm,centers = 5, nstart = 25)
p5 <- fviz_cluster(k5, data = norm) + ggtitle('K = 5')

grid.arrange(p2,p3,p4,p5,nrow=2)

The noodle data are plotted here with 2, 3, 4, and 5 clusters; based on the elbow method, 3 clusters is the ideal number.

Hierarchical Analysis

d <- dist(norm,method = 'euclidean')

hier <- hclust(d,method = 'complete')

plot(hier,cex=0.6, hang = -1)
rect.hclust(hier,k=4, border = 2:5)

hier2 <- hclust(d,method = 'ward.D2')
sub_grp <- cutree(hier2, k =3)
fviz_cluster(list(data = norm, cluster = sub_grp))

Evaluation

From both the K-Means and hierarchical analyses, we see that 3 clusters are ideal for our data. Looking at the cluster pattern, the last observation forms a cluster of its own, which is understandable because that trial produced a good noodle. The second cluster contains most of the early experiments, when I was still exploring the environmental variables. The first cluster contains some of the later-stage experiments, which include additives that I added, based on research, to alter the noodle texture and gluten formation.
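
One way to check how far the two clusterings agree is to cross-tabulate their assignments. This is a minimal sketch on the built-in USArrests data (standardized) as a stand-in for `norm`, since the noodle data are not reproduced here. The object names are illustrative.

```r
# Stand-in for `norm`: the built-in USArrests data, standardized.
x <- scale(USArrests)

set.seed(123)
km <- kmeans(x, centers = 3, nstart = 25)
hc <- cutree(hclust(dist(x, method = "euclidean"), method = "ward.D2"), k = 3)

# Rows: k-means clusters; columns: hierarchical clusters.
# Cluster labels are arbitrary, so agreement shows up as one dominant cell per row.
agreement <- table(kmeans = km$cluster, hclust = hc)
print(agreement)

# Cluster sizes, useful for spotting a single-observation cluster.
table(km$cluster)
```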

Dimension Reduction (PCA)

Visualization

# Calc eigenvalues & eigenvectors
noo_cov <- cov(norm)
noo_eigen <- eigen(noo_cov)
str(noo_eigen)
## List of 2
##  $ values : num [1:19] 761 122.8 25.3 20.5 12.9 ...
##  $ vectors: num [1:19, 1:19] 9.36e-03 2.84e-04 2.02e-05 2.93e-03 3.88e-02 ...
##  - attr(*, "class")= chr "eigen"
# Extract loadings
(phi <- noo_eigen$vectors[,1:2])
##                [,1]          [,2]
##  [1,]  9.360277e-03  1.052237e-01
##  [2,]  2.843096e-04 -1.906533e-03
##  [3,]  2.017734e-05  7.285331e-05
##  [4,]  2.931642e-03 -1.416346e-02
##  [5,]  3.876970e-02 -3.372028e-01
##  [6,] -2.647884e-04  2.289098e-03
##  [7,]  4.146067e-05  1.560003e-04
##  [8,] -2.992494e-04 -6.751756e-03
##  [9,]  3.492374e-03 -2.320913e-02
## [10,] -9.824281e-01 -1.790340e-01
## [11,] -7.955104e-03  6.819841e-02
## [12,]  3.876970e-02 -3.372028e-01
## [13,] -8.468391e-04  1.362244e-03
## [14,]  5.614122e-05 -1.845671e-03
## [15,]  4.309960e-05  2.271457e-04
## [16,]  3.554354e-05  1.734079e-04
## [17,]  1.758985e-01 -8.498077e-01
## [18,]  4.922025e-03 -4.035623e-02
## [19,] -2.630012e-02  1.454068e-02
phi <- -phi
row.names(phi) <- names(norm)
colnames(phi) <- c('PC1','PC2')
phi
##                             PC1           PC2
## Temperature_C     -9.360277e-03 -1.052237e-01
## Humidity          -2.843096e-04  1.906533e-03
## ProteinContent    -2.017734e-05 -7.285331e-05
## ActiveTime_Hour   -2.931642e-03  1.416346e-02
## InactiveTime_Hour -3.876970e-02  3.372028e-01
## DoughHydration     2.647884e-04 -2.289098e-03
## SaltPercent       -4.146067e-05 -1.560003e-04
## pH_PreAutolyse     2.992494e-04  6.751756e-03
## pH_PostAutolyse   -3.492374e-03  2.320913e-02
## FlourWeight        9.824281e-01  1.790340e-01
## PreKneadMins       7.955104e-03 -6.819841e-02
## NumAutolyseHours  -3.876970e-02  3.372028e-01
## Yratio             8.468391e-04 -1.362244e-03
## Kratio            -5.614122e-05  1.845671e-03
## BSRatio           -4.309960e-05 -2.271457e-04
## BakedBSRatio      -3.554354e-05 -1.734079e-04
## KneadMins         -1.758985e-01  8.498077e-01
## CookingTime       -4.922025e-03  4.035623e-02
## Score              2.630012e-02 -1.454068e-02

Before visualizing PC1 and PC2 and determining the useful variables for our analysis, we first calculated the eigenvalues and eigenvectors. From the loadings above, we can assume that PC1 corresponds to the overall noodle-making process, since its positive loadings include variables such as FlourWeight, PreKneadMins, and Yratio, with FlourWeight by far the largest (0.98). PC2 appears to capture the method of extended dough relaxation using Autolyse; variables with positive loadings on PC2 include KneadMins (0.85), NumAutolyseHours and InactiveTime_Hour (0.34 each), CookingTime, and pH_PreAutolyse.
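
The loadings come from an eigen-decomposition of the covariance matrix, so a variable measured on a large scale can dominate a component. The sketch below uses the built-in USArrests data as a stand-in for the noodle variables. It computes loadings and scores by hand, checks them against `prcomp`, and contrasts them with a correlation-based decomposition. The stand-in data and object names are assumptions for illustration.

```r
x <- as.matrix(USArrests)                      # stand-in for the noodle variables
x_centered <- scale(x, center = TRUE, scale = FALSE)

# Covariance-based PCA by hand: loadings are the leading eigenvectors.
eig_cov <- eigen(cov(x_centered))
phi_cov <- eig_cov$vectors[, 1:2]
rownames(phi_cov) <- colnames(x)
colnames(phi_cov) <- c("PC1", "PC2")

# Scores: project the centered data onto the loadings.
scores_cov <- x_centered %*% phi_cov

# prcomp gives the same components; the sign of each component may be flipped.
pca_cov <- prcomp(x, center = TRUE, scale. = FALSE)

# Correlation-based PCA puts every variable on the same scale first.
eig_cor <- eigen(cor(x))
phi_cor <- eig_cor$vectors[, 1:2]
rownames(phi_cor) <- colnames(x)
colnames(phi_cor) <- c("PC1", "PC2")

round(phi_cov, 3)
round(phi_cor, 3)
```

In this stand-in data the covariance-based PC1 is dominated by the variable with the largest variance, while the correlation-based loadings are spread more evenly. Comparing the two is a quick check on whether measurement scale is driving the interpretation.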

##   Trial        PC1       PC2
## 1     1  -7.541226 -1.322794
## 2     2 -10.128037 14.586391
## 3     3 -13.698070 35.432236
## 4     4  -6.647185 -5.508263
## 5     5 -10.166112 11.492440
## 6     6  -9.352824  7.039462

In the 2-dimensional graph above, PC1 (the x-axis) represents experiment success and the overall noodle-making process, with the last trial as the most desirable noodle. PC2 (the y-axis) represents another method that had success in the trials but takes significantly longer to replicate the desirable result: using the dough's ability to deform and realign its gluten structure through a process called Autolyse.

Scree Plot/ Optimal Components

##  [1] 0.804 0.130 0.027 0.022 0.014 0.004 0.001 0.000 0.000 0.000 0.000 0.000
## [13] 0.000 0.000 0.000 0.000 0.000 0.000 0.000
## Warning: Removed 1 rows containing missing values (geom_point).
## Warning: Removed 1 row(s) containing missing values (geom_path).

We constructed the scree plot to make the choice of components easier to gauge. PC1 explains 80.4% of the variability and PC2 explains 13.0%; combined, these two account for about 93% of the variability in the noodle trial data. The scree plot also shows that the first two components are the significant ones in terms of explaining data variability.
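
The combined figure can be checked directly from the proportions printed above. The helper `pve_from_eigen` is a hypothetical name, included only to show how such proportions are usually derived from eigenvalues.

```r
# Proportion of variance explained, copied from the output above.
pve <- c(0.804, 0.130, 0.027, 0.022, 0.014, 0.004, 0.001, rep(0, 12))

# Cumulative proportion: the second entry is PC1 + PC2.
cum_pve <- cumsum(pve)
round(cum_pve[1:5], 3)

# Hypothetical helper: proportion of variance explained from a vector of eigenvalues.
pve_from_eigen <- function(values) values / sum(values)
```

Because the printed proportions are rounded to three decimals, they sum to slightly more than 1.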

Reflection

Both PC1 and PC2 proved helpful in determining useful variables while reducing the overall dimension of the noodle trial data. PC1 and PC2 represent two methods that I have found useful for making ideal Lamian-style noodles. The three variables of importance in PC1 are FlourWeight, Yratio (inactive yeast ratio), and PreKneadMins. These are important variables to keep in mind when trying to replicate the noodle-making process. The amount of flour sets the baseline for all other measurements, such as hydration level and salt percentage. Inactive yeast has proven to be a useful additive that contributes to overall success, and it was shown to be significant earlier in the cluster, regression, and PCA analyses. PreKneadMins gives an idea of the ease of the overall noodle-making process. PC2 lists the variables that matter when using the Autolyse method. Since this method uses only the base ingredients of the noodle with no additives, it relies on extended time to let the flour, water, and salt relax and work their magic, becoming a uniform ball that is not only stretchable but also has great extensibility and elasticity.

The results from PC1 and PC2 overruled my earlier conclusion from the cluster analysis, which was to recommend a single lamian-making method using inactive yeast. I now plan to provide two methods, one with additives and one without.