1. Setting

System Under Test

We obtained the data through data.gov, from its list of 100+ interesting datasets: a fuel economy dataset for model year 2017 automobiles. The motivation for this analysis is that global warming has attracted growing attention since the beginning of this century. Some scientists believe that global warming is caused by the increase in greenhouse gases, so the volume of vehicle exhaust, which contributes to the greenhouse effect, needs to be substantially reduced.

In this study, a four-factor, multi-level experiment is performed to test whether the factors Cyl, Japanese, SmartWay, and Veh.Class affect the level of CO2 emissions. The four factors are listed below:

Cyl: number of cylinders.

Japanese: whether it is a Japanese car.

SmartWay: whether the car has an energy-saving system.

Veh.Class: the type of car.

# Read the data downloaded from the website and assign it to "car"
car <- read.csv("C:/Users/zhao/Desktop/alpha_2017.csv")
# Then display the head and tail of the dataset "car"
head(car)
##   Japanese  Make     Model Displ Cyl      Trans Drive     Fuel Cert.Region
## 1      Yes ACURA ACURA ILX   2.4   4      AMS-8   2WD Gasoline          CA
## 2      Yes ACURA ACURA ILX   2.4   4      AMS-8   2WD Gasoline          FA
## 3      Yes ACURA ACURA MDX   3.5   6 SemiAuto-9   2WD Gasoline          CA
## 4      Yes ACURA ACURA MDX   3.5   6 SemiAuto-9   2WD Gasoline          CA
## 5      Yes ACURA ACURA MDX   3.5   6 SemiAuto-9   2WD Gasoline          FA
## 6      Yes ACURA ACURA MDX   3.5   6 SemiAuto-9   2WD Gasoline          FA
##        Stnd           Stnd.Description Underhood.ID Veh.Class
## 1 L3ULEV125 California LEV-III ULEV125 HHNXV02.4SH3 small car
## 2    T3B125     Federal Tier 3 Bin 125 HHNXV02.4SH3 small car
## 3 L3ULEV125 California LEV-III ULEV125 HHNXV03.5VH3 small SUV
## 4 L3ULEV125 California LEV-III ULEV125 HHNXV03.5VH3 small SUV
## 5    T3B125     Federal Tier 3 Bin 125 HHNXV03.5VH3 small SUV
## 6    T3B125     Federal Tier 3 Bin 125 HHNXV03.5VH3 small SUV
##   Air.Pollution.Score City.MPG Hwy.MPG Cmb.MPG Greenhouse.Gas.Score
## 1                   6       25      35      29                    7
## 2                   6       25      35      29                    7
## 3                   6       19      27      22                    5
## 4                   6       20      27      23                    5
## 5                   6       19      27      22                    5
## 6                   6       20      27      23                    5
##   SmartWay Comb.CO2
## 1      Yes      309
## 2      Yes      309
## 3       No      404
## 4       No      391
## 5       No      404
## 6       No      391
tail(car)
##      Japanese  Make       Model Displ Cyl      Trans Drive
## 1185       No VOLVO VOLVO XC 90     2   4 SemiAuto-8   4WD
## 1186       No VOLVO VOLVO XC 90     2   4 SemiAuto-8   4WD
## 1187       No VOLVO VOLVO XC 90     2   4 SemiAuto-8   4WD
## 1188       No VOLVO VOLVO XC 90     2   4 SemiAuto-8   4WD
## 1189       No VOLVO VOLVO XC 90     2   4 SemiAuto-8   4WD
## 1190       No VOLVO VOLVO XC 90     2   4 SemiAuto-8   4WD
##                      Fuel Cert.Region           Stnd
## 1185             Gasoline          CA      L3ULEV125
## 1186             Gasoline          CA      L3ULEV125
## 1187             Gasoline          FA         T3B125
## 1188             Gasoline          FA         T3B125
## 1189 Gasoline/Electricity          CA L3SULEV30/PZEV
## 1190 Gasoline/Electricity          FA          T3B30
##                     Stnd.Description Underhood.ID    Veh.Class
## 1185      California LEV-III ULEV125 HVVXT02.0U3T standard SUV
## 1186      California LEV-III ULEV125 HVVXT02.0U3T standard SUV
## 1187          Federal Tier 3 Bin 125 HVVXT02.0U3T standard SUV
## 1188          Federal Tier 3 Bin 125 HVVXT02.0U3T standard SUV
## 1189 California LEV-III SULEV30/PZEV HVVXT02.0P3T standard SUV
## 1190           Federal Tier 3 Bin 30 HVVXT02.0P3T standard SUV
##      Air.Pollution.Score City.MPG Hwy.MPG Cmb.MPG Greenhouse.Gas.Score
## 1185                   6       20      25      22                    5
## 1186                   6       22      25      23                    5
## 1187                   6       20      25      22                    5
## 1188                   6       22      25      23                    5
## 1189                   9       22      25      23                    8
## 1190                   8       22      25      23                    8
##      SmartWay Comb.CO2
## 1185       No      399
## 1186       No      384
## 1187       No      399
## 1188       No      384
## 1189      Yes      238
## 1190      Yes      238
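
The loading step can also be sketched in a form that does not depend on a local path or on the R version. The output above shows text columns stored as factors, which was the default of read.csv before R 4.0.0; newer versions need stringsAsFactors = TRUE to get the same result. The sketch below is only an illustration: the temporary file, the name car_demo, and the reduced set of columns are assumptions, and the three rows are trimmed copies of rows shown above.

```r
# Write a tiny stand-in CSV to a temporary file (trimmed copies of rows 1, 3 and 1185).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Japanese,Cyl,SmartWay,Veh.Class,Comb.CO2",
  "Yes,4,Yes,small car,309",
  "Yes,6,No,small SUV,404",
  "No,4,No,standard SUV,399"
), csv_path)

# stringsAsFactors = TRUE reproduces the factor columns on R >= 4.0.0.
car_demo <- read.csv(csv_path, stringsAsFactors = TRUE)

# 'Cyl' is read as an integer, so it is converted to a factor explicitly.
car_demo$Cyl <- as.factor(car_demo$Cyl)

str(car_demo)
```

With the real file, the same two arguments would be passed to the read.csv call above, with the path adjusted to wherever the CSV was saved.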

Factors and Levels

In this study, the experiment involves four factors, each with multiple levels: Cyl, Japanese, SmartWay, and Veh.Class. The factors ‘Cyl’, ‘Japanese’, and ‘SmartWay’ have 2 levels each, and the factor ‘Veh.Class’ has 5 levels. The factors were selected based on our best guess. For ‘Cyl’, ‘SmartWay’, and ‘Veh.Class’, it is reasonable to expect that different levels affect vehicle exhaust. We test the factor ‘Japanese’ because many people argue that Japanese cars are more efficient and environmentally friendly. The summary and the structure of the dataset are listed below:

#Display the summary statistics of "car".
summary(car)
##  Japanese         Make                   Model          Displ      
##  No :976   BMW      :136   HONDA Accord     :  19   Min.   :1.400  
##  Yes:214   HYUNDAI  : 83   JEEP Cherokee    :  14   1st Qu.:2.000  
##            CHEVROLET: 72   JEEP Compass     :  14   Median :2.400  
##            KIA      : 69   JEEP Patriot     :  14   Mean   :2.514  
##            PORSCHE  : 69   CADILLAC ATS     :  12   3rd Qu.:3.000  
##            FORD     : 61   CHEVROLET Equinox:  12   Max.   :3.800  
##            (Other)  :700   (Other)          :1105                  
##       Cyl               Trans     Drive                       Fuel     
##  Min.   :4.000   SemiAuto-6:307   2WD:719   Diesel              :   8  
##  1st Qu.:4.000   SemiAuto-8:252   4WD:471   Ethanol             :   1  
##  Median :4.000   Man-6     :140             Ethanol/Gas         :  32  
##  Mean   :4.802   Auto-6    : 78             Gasoline            :1131  
##  3rd Qu.:6.000   AMS-7     : 59             Gasoline/Electricity:  18  
##  Max.   :6.000   CVT       : 52                                        
##                  (Other)   :302                                        
##  Cert.Region             Stnd    
##  CA:597      T3B125        :251  
##  FA:593      T3B110        :171  
##              U2            :161  
##              L3ULEV125     :117  
##              T3B30         :101  
##              L3SULEV30/PZEV: 71  
##              (Other)       :318  
##                             Stnd.Description       Underhood.ID 
##  Federal Tier 3 Bin 125             :251     HPRXV03.0C91:  40  
##  Federal Tier 3 Transitional Bin 110:171     HBMXV02.0B4X:  30  
##  California LEV-II ULEV             :161     HBMXV03.0B58:  30  
##  California LEV-III ULEV125         :117     HGMXJ03.6165:  30  
##  Federal Tier 3 Bin 30              :101     HJLXJ03.0FSP:  30  
##  California LEV-III SULEV30/PZEV    : 71     HBMXV03.0F10:  20  
##  (Other)                            :318     (Other)     :1010  
##         Veh.Class   Air.Pollution.Score    City.MPG        Hwy.MPG     
##  large car   : 88   Min.   :5.000       Min.   :13.00   Min.   :17.00  
##  midsize car :214   1st Qu.:6.000       1st Qu.:19.00   1st Qu.:26.00  
##  small car   :507   Median :6.000       Median :21.00   Median :29.00  
##  small SUV   :281   Mean   :6.399       Mean   :22.07   Mean   :29.64  
##  standard SUV:100   3rd Qu.:7.000       3rd Qu.:24.00   3rd Qu.:33.00  
##                     Max.   :9.000       Max.   :58.00   Max.   :53.00  
##                                                                        
##     Cmb.MPG      Greenhouse.Gas.Score SmartWay     Comb.CO2    
##  Min.   :14.00   Min.   : 2.000       No :976   Min.   : 51.0  
##  1st Qu.:22.00   1st Qu.: 5.000       Yes:214   1st Qu.:324.0  
##  Median :24.00   Median : 5.000                 Median :369.0  
##  Mean   :24.93   Mean   : 5.373                 Mean   :366.6  
##  3rd Qu.:27.00   3rd Qu.: 6.000                 3rd Qu.:406.0  
##  Max.   :71.00   Max.   :10.000                 Max.   :546.0  
## 
#Display the names found in "car".
names(car)
##  [1] "Japanese"             "Make"                 "Model"               
##  [4] "Displ"                "Cyl"                  "Trans"               
##  [7] "Drive"                "Fuel"                 "Cert.Region"         
## [10] "Stnd"                 "Stnd.Description"     "Underhood.ID"        
## [13] "Veh.Class"            "Air.Pollution.Score"  "City.MPG"            
## [16] "Hwy.MPG"              "Cmb.MPG"              "Greenhouse.Gas.Score"
## [19] "SmartWay"             "Comb.CO2"
#Display the structure of "car" and set 'Cyl' as factor.
car$Make<-as.character(car$Make)
car$Model<-as.character(car$Model)
car$Trans<-as.character(car$Trans)
car$Drive<-as.character(car$Drive)
car$Fuel<-as.character(car$Fuel)
car$Cert.Region<-as.character(car$Cert.Region)
car$Stnd<-as.character(car$Stnd)
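# Note: the misspelled name "Stnd.Descrption" adds a new 21st column; 'Stnd.Description' itself remains a factor.
car$Stnd.Descrption<-as.character(car$Stnd.Description)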
car$Underhood.ID <-as.character(car$Underhood.ID )
car$Cyl=as.factor(car$Cyl)
str(car)
## 'data.frame':    1190 obs. of  21 variables:
##  $ Japanese            : Factor w/ 2 levels "No","Yes": 2 2 2 2 2 2 2 2 2 2 ...
##  $ Make                : chr  "ACURA" "ACURA" "ACURA" "ACURA" ...
##  $ Model               : chr  "ACURA ILX" "ACURA ILX" "ACURA MDX" "ACURA MDX" ...
##  $ Displ               : num  2.4 2.4 3.5 3.5 3.5 3.5 3.5 3.5 3.5 3.5 ...
##  $ Cyl                 : Factor w/ 2 levels "4","6": 1 1 2 2 2 2 2 2 2 2 ...
##  $ Trans               : chr  "AMS-8" "AMS-8" "SemiAuto-9" "SemiAuto-9" ...
##  $ Drive               : chr  "2WD" "2WD" "2WD" "2WD" ...
##  $ Fuel                : chr  "Gasoline" "Gasoline" "Gasoline" "Gasoline" ...
##  $ Cert.Region         : chr  "CA" "FA" "CA" "CA" ...
##  $ Stnd                : chr  "L3ULEV125" "T3B125" "L3ULEV125" "L3ULEV125" ...
##  $ Stnd.Description    : Factor w/ 18 levels "California LEV-II LEV",..: 9 13 9 9 13 13 9 9 13 13 ...
##  $ Underhood.ID        : chr  "HHNXV02.4SH3" "HHNXV02.4SH3" "HHNXV03.5VH3" "HHNXV03.5VH3" ...
##  $ Veh.Class           : Factor w/ 5 levels "large car","midsize car",..: 3 3 4 4 4 4 4 4 4 4 ...
##  $ Air.Pollution.Score : int  6 6 6 6 6 6 6 6 6 6 ...
##  $ City.MPG            : int  25 25 19 20 19 20 18 19 18 19 ...
##  $ Hwy.MPG             : int  35 35 27 27 27 27 26 26 26 26 ...
##  $ Cmb.MPG             : int  29 29 22 23 22 23 21 22 21 22 ...
##  $ Greenhouse.Gas.Score: int  7 7 5 5 5 5 4 5 4 5 ...
##  $ SmartWay            : Factor w/ 2 levels "No","Yes": 2 2 1 1 1 1 1 1 1 1 ...
##  $ Comb.CO2            : int  309 309 404 391 404 391 424 404 424 404 ...
##  $ Stnd.Descrption     : chr  "California LEV-III ULEV125" "Federal Tier 3 Bin 125" "California LEV-III ULEV125" "California LEV-III ULEV125" ...

Continuous variables (if any)

In this dataset, the controllable variables cannot be considered continuous. A continuous variable is a variable that can take infinitely many possible values. [1] The factors chosen in our model, ‘Cyl’, ‘Japanese’, ‘SmartWay’, and ‘Veh.Class’, take only a few discrete values and are treated as categorical, so they cannot be considered continuous variables. The response variable, ‘Comb.CO2’, is trickier. Its values appear to be rounded to whole numbers. Even though the recorded values are integers, we treat it as a continuous variable, because CO2 emissions can in principle take infinitely many possible values.

Response variables

In this analysis, we consider only one response variable, ‘Comb.CO2’, which denotes the average CO2 emission level of each model. A large ‘Comb.CO2’ value indicates that the car is comparatively less environmentally friendly.

The Data: How is it organized and what does it look like?

This dataset contains the preliminary fuel economy values for 2017 model year vehicles from the Environmental Protection Agency’s National Vehicle and Fuel Emissions Laboratory in Ann Arbor, Michigan. The fuel economy data and fuel costs are updated weekly. As loaded, the table stores 20 variables: ‘Japanese’, ‘Make’, ‘Model’, ‘Displ’, ‘Cyl’, ‘Trans’, ‘Drive’, ‘Fuel’, ‘Cert.Region’, ‘Stnd’, ‘Stnd.Description’, ‘Underhood.ID’, ‘Veh.Class’, ‘Air.Pollution.Score’, ‘City.MPG’, ‘Hwy.MPG’, ‘Cmb.MPG’, ‘Greenhouse.Gas.Score’, ‘SmartWay’, and ‘Comb.CO2’. In our analysis, we take ‘Japanese’, ‘Veh.Class’, ‘Cyl’, and ‘SmartWay’ as our main factors, and ‘Comb.CO2’ as the response variable.

2. (Experimental) Design

How will the experiment be organized and conducted to test the hypothesis?

In this experimental design, we test whether the variation in our response variable can be explained by the four factors and their interaction terms. The null hypothesis is that ‘Cyl’, ‘Japanese’, ‘SmartWay’, ‘Veh.Class’, and the interaction terms have no significant effect on ‘Comb.CO2’ (i.e., the means at the different levels are equal). To test the hypothesis, we perform an analysis of variance (ANOVA) to see whether the mean ‘Comb.CO2’ differs among the levels.
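
For a pair of factors A and B, the hypotheses above can be written with the usual two-factor effects model. The notation is an assumption introduced here for illustration and does not come from the original analysis: y_{ijk} is the ‘Comb.CO2’ value of the k-th vehicle at level i of A and level j of B.

```latex
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk},
\qquad \varepsilon_{ijk} \sim N(0, \sigma^2)

H_0^{A}: \alpha_i = 0 \ \text{for all } i, \qquad
H_0^{B}: \beta_j = 0 \ \text{for all } j, \qquad
H_0^{AB}: (\alpha\beta)_{ij} = 0 \ \text{for all } i, j
```

Each null hypothesis is tested with an F statistic that compares the mean square of the corresponding term with the residual mean square.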

What is the rationale for this design?

The rationale for this study is that we want to see whether ‘Cyl’, ‘Japanese’, ‘SmartWay’, ‘Veh.Class’, and their interactions affect CO2 emissions. Based on our best guess, we believe that these four factors influence CO2 emissions. In this design, a four-factor, multi-level experiment is used to test whether the different factor levels affect CO2 emissions.

Randomize: What is the Randomization Scheme?

“Randomization is the use of a known, understood probabilistic mechanism for the assignment of treatments to units.”[2]

In order to reduce nuisance effects, (1) subjects can be picked randomly, (2) subjects can be randomly assigned to treatments, and (3) the order of the units can be randomized. Since the fuel economy data are the result of vehicle testing, the cars of each model should have been picked randomly under a known probabilistic mechanism to generate the data in our dataset. Our study does not include a randomization scheme of its own because we are analyzing an existing dataset.

Replicate: Are there replicates and/or repeated measures?

“Replication means an independent repeat run of each factor combination. Replication reflects sources of variability both between runs and (potentially) within runs”[3]

In this experiment, we analyze data observed on different car models, so no designed replicates or repeated measures are present.
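
Although there are no designed replicates, it can still be useful to see how many observations fall into each combination of factor levels, since empty or sparse cells affect which interaction effects can be estimated. The sketch below shows one way to do this with table(); the data frame toy and its values are hypothetical and only stand in for the real ‘car’ data.

```r
# Hypothetical toy data standing in for two factor columns of 'car'.
toy <- data.frame(
  Cyl      = factor(c(4, 4, 6, 6, 4, 6)),
  SmartWay = factor(c("Yes", "Yes", "No", "No", "No", "No"))
)

# Cross-tabulate the two factors: one count per factor-level combination.
cell_counts <- table(Cyl = toy$Cyl, SmartWay = toy$SmartWay)
print(cell_counts)

# Number of combinations that never occur in the data.
empty_cells <- sum(cell_counts == 0)
print(empty_cells)
```

On the real data, the analogous call would be table(car$Cyl, car$SmartWay), repeated for the other pairs of factors.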

Block: Did you use blocking in the design?

“Blocking is a design technique used to improve the precision with which comparisons among the factors of interest are made. Generally, a block is a set of relatively homogeneous experimental conditions.”[4]

In our analysis, blocking is not used, because we did not find a controllable nuisance factor in these data that affects our results.

3. (Statistical) Analysis

(Exploratory Data Analysis) Graphics and descriptive summary

In this section, we summarize our dataset. We also show boxplots and interaction plots to examine the main effects and the interaction effects of the four factors.

# Create a boxplot of 'Comb.CO2' by 'Japanese'.
boxplot(car$Comb.CO2~car$Japanese, xlab="Japanese Car", ylab="Comb.CO2")

# Calculate the mean of 'Comb.CO2' by 'Make'.
tapply(car$Comb.CO2,car$Make,mean)
##         ACURA          AUDI           BMW          BUCK      CADILLAC 
##      374.0769      371.4000      369.6912      379.4375      403.5455 
##     CHEVROLET      CHRYSLER         DODGE          FIAT          FORD 
##      327.9444      365.0000      437.4000      301.7500      397.9180 
##       GENESIS           GMC         HONDA       HYUNDAI      INFINITI 
##      443.0000      426.3636      325.4211      335.5181      393.3125 
##        JAGUAR          JEEP           KIA    LAND ROVER        LEXUS  
##      393.6000      384.8269      353.8986      389.6667      315.5000 
##       LINCOLN         LOTUS      MASERATI         MAZDA MERCEDES-BENZ 
##      437.6667      459.5000      474.6000      289.2857      383.5652 
##          MINI    MITSUBISHI        NISSAN       PORSCHE        SUBARU 
##      318.9444      320.0000      386.5833      392.0725      356.0625 
##        TOYOTA         VOLVO            VW 
##      290.6531      348.6875      353.1500
# Create a boxplot of 'Comb.CO2' by 'SmartWay'.
boxplot(car$Comb.CO2~car$SmartWay, xlab="SmartWay", ylab="Comb.CO2")

# Calculate the mean of 'Comb.CO2' by 'SmartWay'.
tapply(car$Comb.CO2,car$SmartWay,mean)
##       No      Yes 
## 388.0953 268.6262
# Create a boxplot of 'Comb.CO2' by 'Veh.Class'.
boxplot(car$Comb.CO2~car$Veh.Class, xlab="Vehicle Class", ylab="Comb.CO2")

# Calculate the mean of 'Comb.CO2' by 'Veh.Class'.
tapply(car$Comb.CO2,car$Veh.Class,mean)
##    large car  midsize car    small car    small SUV standard SUV 
##     388.5114     326.3224     352.6529     387.6441     445.2200
# Create a boxplot of 'Comb.CO2' by 'Cyl'.
boxplot(car$Comb.CO2~car$Cyl, xlab="Cylinders", ylab="Comb.CO2")

# Calculate the mean of 'Comb.CO2' by 'Cyl'.
tapply(car$Comb.CO2,car$Cyl,mean)
##        4        6 
## 331.5638 418.9979

From the boxplots and the lists of means above, we can see that Japanese cars tend to emit comparatively less CO2, and cars with the SmartWay system are more likely to be environmentally friendly. Surprisingly, midsize cars have the lowest mean CO2 emission level (326.3), lower than small cars (352.7). Finally, cars with 6 cylinders emit more CO2 on average than cars with 4 cylinders (419.0 versus 331.6).
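
The group means for a two-level factor such as ‘Japanese’ can be computed the same way as the means for the other factors. The sketch below uses a hypothetical toy data frame with made-up values; it only illustrates the call and does not reproduce the real results.

```r
# Hypothetical toy data standing in for the 'car' data frame (values are made up).
toy <- data.frame(
  Japanese = factor(c("Yes", "Yes", "No", "No", "No", "Yes")),
  Comb.CO2 = c(300, 320, 400, 380, 420, 310)
)

# Mean of the response within each level of the factor.
means_by_origin <- tapply(toy$Comb.CO2, toy$Japanese, mean)
print(means_by_origin)
```

On the real data, the corresponding call would be tapply(car$Comb.CO2, car$Japanese, mean).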

interaction.plot(car$Japanese, car$SmartWay, car$Comb.CO2)

interaction.plot(car$Japanese, car$Veh.Class, car$Comb.CO2)

interaction.plot(car$Japanese, car$Cyl, car$Comb.CO2)

interaction.plot(car$SmartWay, car$Veh.Class, car$Comb.CO2)

interaction.plot(car$SmartWay, car$Cyl, car$Comb.CO2)

interaction.plot(car$Cyl, car$Veh.Class, car$Comb.CO2)

From the interaction plots above, we find that the factors ‘SmartWay’ and ‘Japanese’ appear to partly substitute for each other. A possible reason is that Japanese cars already emit less CO2 than non-Japanese cars, so the SmartWay system has a smaller effect on them. For the other interaction plots, the effects of the two factors seem to point in the same direction.
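
What an interaction plot displays can also be expressed numerically as a table of cell means: an interaction is present when the effect of one factor differs across the levels of the other. The sketch below computes this difference of differences on a hypothetical toy data frame; the values are made up to mimic a weaker SmartWay effect for Japanese cars and are not taken from the dataset.

```r
# Hypothetical toy data: two observations per Japanese x SmartWay cell.
toy <- data.frame(
  Japanese = factor(rep(c("No", "Yes"), each = 4)),
  SmartWay = factor(rep(c("No", "No", "Yes", "Yes"), times = 2)),
  Comb.CO2 = c(400, 420, 260, 280, 350, 370, 270, 290)
)

# Cell means: rows are levels of Japanese, columns are levels of SmartWay.
cell_means <- tapply(toy$Comb.CO2, list(toy$Japanese, toy$SmartWay), mean)
print(cell_means)

# Effect of SmartWay (Yes minus No) within each level of Japanese.
smartway_effect <- cell_means[, "Yes"] - cell_means[, "No"]

# Difference of the two effects; zero would mean parallel lines in the plot.
interaction_contrast <- smartway_effect[["Yes"]] - smartway_effect[["No"]]
print(interaction_contrast)
```

In this toy example SmartWay lowers the mean by 140 for non-Japanese cars but only by 80 for Japanese cars, so the lines in the corresponding interaction plot would not be parallel.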

Testing

In this section, we test whether the variation in ‘Comb.CO2’ can be explained by the factors under consideration. We use analysis of variance (ANOVA) to test whether the factors ‘Japanese’, ‘Veh.Class’, ‘Cyl’, ‘SmartWay’, and their two-way interactions can explain the variation in ‘Comb.CO2’. If we reject the null hypothesis, we conclude that the differences in mean ‘Comb.CO2’ across the levels of ‘Japanese’, ‘Veh.Class’, ‘Cyl’, and ‘SmartWay’ are unlikely to be due to chance alone and can be attributed to the different factor levels. If we cannot reject the null hypothesis, the observed differences may be due to chance.
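
To make the test statistic concrete, the F value and p-value of a one-way ANOVA can be recomputed by hand from its sums of squares and degrees of freedom. The sketch below uses the four numbers reported in this section for the one-way model of ‘Japanese’; the variable names are introduced here for illustration only.

```r
# Values copied from the one-way ANOVA table for 'Japanese' in this section.
ss_treatment <- 159423    # Sum Sq for car$Japanese
df_treatment <- 1         # Df for car$Japanese
ss_residual  <- 5192344   # Sum Sq for Residuals
df_residual  <- 1188      # Df for Residuals

# Mean squares are sums of squares divided by their degrees of freedom.
ms_treatment <- ss_treatment / df_treatment
ms_residual  <- ss_residual / df_residual

# F is the ratio of the two mean squares; the p-value is the upper tail of F.
f_stat  <- ms_treatment / ms_residual
p_value <- pf(f_stat, df_treatment, df_residual, lower.tail = FALSE)

print(round(f_stat, 2))
print(signif(p_value, 3))
```

The rounded F value should agree with the 36.48 printed by summary() for that model.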

anova1= aov(car$Comb.CO2~car$Japanese)
summary(anova1)
##                Df  Sum Sq Mean Sq F value   Pr(>F)    
## car$Japanese    1  159423  159423   36.48 2.06e-09 ***
## Residuals    1188 5192344    4371                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova2= aov(car$Comb.CO2~car$Cyl)
summary(anova2)
##               Df  Sum Sq Mean Sq F value Pr(>F)    
## car$Cyl        1 2184855 2184855   819.6 <2e-16 ***
## Residuals   1188 3166912    2666                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova3= aov(car$Comb.CO2~car$Veh.Class)
summary(anova3)
##                 Df  Sum Sq Mean Sq F value Pr(>F)    
## car$Veh.Class    4 1230594  307648   88.46 <2e-16 ***
## Residuals     1185 4121173    3478                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova4= aov(car$Comb.CO2~car$SmartWay)
summary(anova4)
##                Df  Sum Sq Mean Sq F value Pr(>F)    
## car$SmartWay    1 2505117 2505117    1045 <2e-16 ***
## Residuals    1188 2846650    2396                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova5= aov(car$Comb.CO2~car$Japanese*car$Cyl)
summary(anova5)
##                        Df  Sum Sq Mean Sq F value   Pr(>F)    
## car$Japanese            1  159423  159423  62.473 6.14e-15 ***
## car$Cyl                 1 2165596 2165596 848.625  < 2e-16 ***
## car$Japanese:car$Cyl    1     209     209   0.082    0.775    
## Residuals            1186 3026538    2552                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova6= aov(car$Comb.CO2~car$Japanese*car$Veh.Class)
summary(anova6)
##                              Df  Sum Sq Mean Sq F value   Pr(>F)    
## car$Japanese                  1  159423  159423  46.473 1.48e-11 ***
## car$Veh.Class                 4 1101109  275277  80.245  < 2e-16 ***
## car$Japanese:car$Veh.Class    3   39849   13283   3.872  0.00904 ** 
## Residuals                  1181 4051386    3430                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova7= aov(car$Comb.CO2~car$Japanese*car$SmartWay)
summary(anova7)
##                             Df  Sum Sq Mean Sq F value   Pr(>F)    
## car$Japanese                 1  159423  159423  66.830 7.54e-16 ***
## car$SmartWay                 1 2349589 2349589 984.950  < 2e-16 ***
## car$Japanese:car$SmartWay    1   13564   13564   5.686   0.0173 *  
## Residuals                 1186 2829191    2385                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova8= aov(car$Comb.CO2~car$Cyl*car$Veh.Class)
summary(anova8)
##                         Df  Sum Sq Mean Sq  F value   Pr(>F)    
## car$Cyl                  1 2184855 2184855 1079.654  < 2e-16 ***
## car$Veh.Class            4  731173  182793   90.328  < 2e-16 ***
## car$Cyl:car$Veh.Class    4   47817   11954    5.907 0.000105 ***
## Residuals             1180 2387922    2024                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova9= aov(car$Comb.CO2~car$Cyl*car$SmartWay)
summary(anova9)
##                        Df  Sum Sq Mean Sq  F value  Pr(>F)    
## car$Cyl                 1 2184855 2184855 1379.720 < 2e-16 ***
## car$SmartWay            1 1277761 1277761  806.897 < 2e-16 ***
## car$Cyl:car$SmartWay    1   11062   11062    6.985 0.00833 ** 
## Residuals            1186 1878090    1584                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova10= aov(car$Comb.CO2~car$Veh.Class*car$SmartWay)
summary(anova10)
##                              Df  Sum Sq Mean Sq F value   Pr(>F)    
## car$Veh.Class                 4 1230594  307648 165.454  < 2e-16 ***
## car$SmartWay                  1 1855117 1855117 997.683  < 2e-16 ***
## car$Veh.Class:car$SmartWay    4   71936   17984   9.672 1.07e-07 ***
## Residuals                  1180 2194120    1859                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

According to the analyses of variance above, we can reject the null hypothesis that the main factors ‘Japanese’, ‘Cyl’, ‘SmartWay’, and ‘Veh.Class’ have no effect on CO2 emissions, each with p-value < 0.001. Among the interactions, ‘Japanese x Veh.Class’ (p-value < 0.01), ‘Japanese x SmartWay’ (p-value < 0.05), ‘Cyl x Veh.Class’ (p-value < 0.001), ‘Cyl x SmartWay’ (p-value < 0.01), and ‘Veh.Class x SmartWay’ (p-value < 0.001) are statistically significant. The interaction ‘Japanese x Cyl’ (p-value = 0.775) is not statistically significant.
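
The models above look at one factor or one pair of factors at a time. A single model with all four factors and all two-way interactions is one possible alternative. The sketch below fits such a model to simulated data: the data frame toy, the effect sizes, and the three vehicle classes are assumptions made for illustration, so the printed table says nothing about the real dataset.

```r
# Simulated stand-in data (hypothetical effect sizes, not the EPA values).
set.seed(1)
n <- 200
toy <- data.frame(
  Japanese  = factor(sample(c("No", "Yes"), n, replace = TRUE)),
  Cyl       = factor(sample(c(4, 6), n, replace = TRUE)),
  SmartWay  = factor(sample(c("No", "Yes"), n, replace = TRUE)),
  Veh.Class = factor(sample(c("small car", "midsize car", "small SUV"),
                            n, replace = TRUE))
)
toy$Comb.CO2 <- 350 + 80 * (toy$Cyl == "6") -
  100 * (toy$SmartWay == "Yes") + rnorm(n, sd = 40)

# (a + b + c + d)^2 expands to all main effects plus all two-way interactions.
fit <- aov(Comb.CO2 ~ (Japanese + Cyl + SmartWay + Veh.Class)^2, data = toy)

# The first element of summary() is the ANOVA table as a data frame.
anova_table <- summary(fit)[[1]]
print(anova_table)
```

Note that aov() reports sequential sums of squares, so with unbalanced data the result for a term depends on the terms entered before it. This is visible in the tables above, where the sum of squares for ‘Veh.Class’ is 1230594 on its own but 1101109 when it follows ‘Japanese’.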

Estimation

Diagnostics / Model Adequacy Checking

4. References to the literature

[1]. Definition of continuous variable. http://www.statisticshowto.com/continuous-variable/

[2]. Gary W. Oehlert. A First Course in Design and Analysis of Experiments (p.6).

[3]. Montgomery, Douglas C. Design and Analysis of Experiments, 8th Edition (p.13). Wiley. Kindle Edition.

[4]. Montgomery, Douglas C. Design and Analysis of Experiments, 8th Edition (p.13). Wiley. Kindle Edition.

5. Appendices

A summary of, or pointer to, the raw data

http://www.fueleconomy.gov/feg/EPAGreenGuide/pdf/all_alpha_17.pdf