Session 19: Difference Analysis Practice Solution

Author

Sungjin Kim

The Auto Concepts Survey Differences Analysis

Objective
The goal of this case is to identify the target market for each of the five automobile models under consideration by analyzing differences in desirability ratings across demographic groups. This analysis will guide Auto Concepts in developing focused marketing strategies tailored to each model’s potential buyers.


Deliverables

  1. Difference Analysis: Conduct the relevant difference analysis for each automobile type by each demographic variable. If necessary, conduct post-hoc tests.

  2. Target Market Profiles: Identify the most desirable demographic groups for each model.

  3. R Code and Results

    • Include well-documented R scripts used for the analysis (t-tests, ANOVA, post-hoc tests, etc.).

    • Present outputs of the analysis, such as tables, graphs, and statistical test results.

    • Ensure reproducibility by providing clear instructions or annotations within the R code.
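One way to keep the per-variable tests consistent and reproducible is to wrap the choice of test in a small helper: a Welch t-test when the demographic variable has two groups, and a one-way ANOVA followed by Tukey's HSD when it has more. The sketch below is not part of the original solution. The helper `run_difference_test()` and the simulated `toy` data frame are hypothetical and exist only so the block runs without `auto_concept.csv`.

```r
# Hypothetical helper: choose the difference test from the number of groups.
run_difference_test <- function(data, outcome, group) {
  g <- factor(data[[group]])
  y <- data[[outcome]]
  if (nlevels(g) == 2) {
    # Two groups: Welch two-sample t-test (the t.test default)
    list(type = "t-test", test = t.test(y ~ g))
  } else {
    # Three or more groups: one-way ANOVA, then Tukey HSD comparisons
    fit <- aov(y ~ g)
    list(type = "anova", test = summary(fit), posthoc = TukeyHSD(fit))
  }
}

# Simulated stand-in data (assumption: NOT the Auto Concepts survey)
set.seed(19)
toy <- data.frame(
  gender = rep(c(0, 1), each = 100),
  age    = rep(c(25, 35, 45, 55), times = 50),
  rating = sample(1:7, size = 200, replace = TRUE)
)

res_gender <- run_difference_test(toy, "rating", "gender")
res_age    <- run_difference_test(toy, "rating", "age")
print(res_gender$test)
print(res_age$posthoc)
```

With the real data, the same call could be repeated for each model and demographic column, for example `run_difference_test(auto_concept, "supercycle1seat", "marital")`.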

Solutions

1. Super Cycle

1.1 Difference Analysis

# Load necessary libraries
library(tidyverse)
library(knitr)
library(kableExtra)

# Load dataset
setwd("~/Documents/GitHub/Marketing-Research-2025-Spring")
auto_concept <- read_csv("auto_concept.csv")

#Analysis for supercycle1seat----------------
# T-test for "Super Cycle" by Gender
t_test_gender <- t.test(supercycle1seat ~ gender, data = auto_concept)
print(t_test_gender)

    Welch Two Sample t-test

data:  supercycle1seat by gender
t = 13.922, df = 982.09, p-value < 2.2e-16
alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
95 percent confidence interval:
 0.848863 1.127436
sample estimates:
mean in group 0 mean in group 1 
       3.076786        2.088636 
# Result: Males have significantly higher desirability ratings for "Super Cycle."

# T-test for "Super Cycle" by Marital Status
t_test_marital <- t.test(supercycle1seat ~ marital, data = auto_concept)
print(t_test_marital)

    Welch Two Sample t-test

data:  supercycle1seat by marital
t = 4.5988, df = 121.53, p-value = 1.05e-05
alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
95 percent confidence interval:
 0.4326959 1.0868138
sample estimates:
mean in group 0 mean in group 1 
       3.318182        2.558427 
# Result: Unmarried individuals have significantly higher desirability ratings for "Super Cycle."

# ANOVA for "Super Cycle" by Age
anova_age <- aov(supercycle1seat ~ factor(age), data = auto_concept)
summary(anova_age)
             Df Sum Sq Mean Sq F value Pr(>F)    
factor(age)   4  355.8   88.96   76.44 <2e-16 ***
Residuals   995 1158.0    1.16                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Perform Tukey post-hoc analysis for Age
tukey_age <- TukeyHSD(anova_age)
print(tukey_age)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = supercycle1seat ~ factor(age), data = auto_concept)

$`factor(age)`
            diff         lwr           upr     p adj
35-25 -3.9281250 -4.60763588 -3.2486141162 0.0000000
45-25 -3.2818182 -3.95585641 -2.6077799580 0.0000000
55-25 -2.9879310 -3.69114894 -2.2847131313 0.0000000
65-25 -3.6500000 -4.39193032 -2.9080696811 0.0000000
45-35  0.6463068  0.42970983  0.8629038069 0.0000000
55-35  0.9401940  0.64506324  1.2353246954 0.0000000
65-35  0.2781250 -0.10009098  0.6563409786 0.2620426
55-45  0.2938871  0.01158477  0.5761895272 0.0365466
65-45 -0.3681818 -0.73647492  0.0001112885 0.0501128
65-55 -0.6620690 -1.08138711 -0.2427508230 0.0001705
# Result: The 20-29 age group has the highest desirability for "Super Cycle."

# ANOVA for "Super Cycle" by Hometown Size
anova_town <- aov(supercycle1seat ~ factor(townsize), data = auto_concept)
summary(anova_town)
                  Df Sum Sq Mean Sq F value   Pr(>F)    
factor(townsize)   4    103  25.759   18.17 2.05e-14 ***
Residuals        995   1411   1.418                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Perform Tukey post-hoc analysis for Hometown Size
tukey_town <- TukeyHSD(anova_town)
print(tukey_town)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = supercycle1seat ~ factor(townsize), data = auto_concept)

$`factor(townsize)`
                diff        lwr       upr     p adj
55-5     -0.07763158 -0.6437193 0.4884561 0.9958056
300-5    -0.13313008 -0.6878989 0.4216387 0.9655628
750-5    -0.11237374 -0.6522475 0.4275000 0.9795391
1250-5    0.85156250  0.2621135 1.4410115 0.0008017
300-55   -0.05549850 -0.3697848 0.2587878 0.9889509
750-55   -0.03474216 -0.3219200 0.2524357 0.9974228
1250-55   0.92919408  0.5570952 1.3012930 0.0000000
750-300   0.02075634 -0.2434109 0.2849236 0.9995277
1250-300  0.98469258  0.6300508 1.3393343 0.0000000
1250-750  0.96393624  0.6330798 1.2947927 0.0000000
# Result: Individuals from hometowns with 1 million and more population have the highest desirability.

# ANOVA for "Super Cycle" by Income Level
anova_income <- aov(supercycle1seat ~ factor(income), data = auto_concept)
summary(anova_income)
                Df Sum Sq Mean Sq F value Pr(>F)    
factor(income)   4  186.8   46.70   35.02 <2e-16 ***
Residuals      995 1327.0    1.33                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Perform Tukey post-hoc analysis for Income
tukey_income <- TukeyHSD(anova_income)
print(tukey_income)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = supercycle1seat ~ factor(income), data = auto_concept)

$`factor(income)`
               diff        lwr         upr     p adj
63-25   -2.62342974 -3.3551382 -1.89172129 0.0000000
88-25   -2.95965104 -3.6665004 -2.25280170 0.0000000
125-25  -2.93703385 -3.6471694 -2.22689827 0.0000000
175-25  -2.94871795 -3.7127495 -2.18468643 0.0000000
88-63   -0.33622130 -0.6302434 -0.04219923 0.0156915
125-63  -0.31360411 -0.6154410 -0.01176719 0.0371366
175-63  -0.32528821 -0.7382739  0.08769745 0.1989430
125-88   0.02261719 -0.2126364  0.25787080 0.9989527
175-88   0.01093309 -0.3562125  0.37807868 0.9999901
175-125 -0.01168410 -0.3851174  0.36174921 0.9999879
# Result: Individuals with income under $50K have the highest desirability for "Super Cycle."

# ANOVA for "Super Cycle" by Education Level
anova_education <- aov(supercycle1seat ~ factor(edcation), data = auto_concept)
summary(anova_education)
                  Df Sum Sq Mean Sq F value Pr(>F)    
factor(edcation)   4  231.2   57.81   44.85 <2e-16 ***
Residuals        995 1282.6    1.29                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Perform Tukey post-hoc analysis for Education Level
tukey_education <- TukeyHSD(anova_education)
print(tukey_education)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = supercycle1seat ~ factor(edcation), data = auto_concept)

$`factor(edcation)`
            diff        lwr         upr     p adj
4-0  -2.78228228 -3.5977002 -1.96686435 0.0000000
6-0  -3.32262626 -4.0774919 -2.56776060 0.0000000
8-0  -3.46634225 -4.2095670 -2.72311754 0.0000000
10-0 -3.33267974 -4.1377084 -2.52765106 0.0000000
6-4  -0.54034398 -0.9466647 -0.13402327 0.0027082
8-4  -0.68405997 -1.0683220 -0.29979791 0.0000131
10-4 -0.55039746 -1.0436981 -0.05709680 0.0199054
8-6  -0.14371599 -0.3730042  0.08557223 0.4263998
10-6 -0.01005348 -0.3951006  0.37499366 0.9999941
10-8  0.13366252 -0.2280309  0.49535590 0.8508870
# Result: Individuals with less than high school education have the highest desirability for "Super Cycle."

1.2 Target Market Profiles

Target Market Profile for Super Cycle - 1 Seat All-Electric Motorcycle
Demographic        Segment
Gender             Males
Marital Status     Unmarried
Hometown Size      1 million and more
Age Group          20-29 years old
Education Level    Less than high school
Income Level       Under $50K
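The Tukey tables report pairwise differences rather than the group means themselves, so a table of means and group sizes can make a profile like the one above easier to verify. The following is a minimal sketch on simulated data, not the author's code. The `toy` data frame is a hypothetical stand-in for `auto_concept`.

```r
# Simulated stand-in data (assumption: NOT the Auto Concepts survey)
set.seed(19)
toy <- data.frame(
  age    = rep(c(25, 35, 45, 55, 65), times = 40),
  rating = sample(1:7, size = 200, replace = TRUE)
)

# Mean rating and group size for each age category
means <- tapply(toy$rating, toy$age, mean)
sizes <- table(toy$age)
group_summary <- data.frame(
  age         = names(means),
  mean_rating = as.numeric(means),
  n           = as.integer(sizes)
)

# Sort so the group with the highest mean rating comes first
group_summary <- group_summary[order(-group_summary$mean_rating), ]
print(group_summary)
```

Checking `n` alongside the mean also flags very small groups, whose wide confidence intervals in the Tukey output deserve cautious interpretation.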

2. Runabout Sport

2.1 Difference Analysis

#Analysis for runaboutsport2seat----------------
# T-test for "Runabout Sport" by Gender
t_test_gender <- t.test(runaboutsport2seat ~ gender, data = auto_concept)
print(t_test_gender)

    Welch Two Sample t-test

data:  runaboutsport2seat by gender
t = -0.49153, df = 969.6, p-value = 0.6232
alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
95 percent confidence interval:
 -0.2382751  0.1428205
sample estimates:
mean in group 0 mean in group 1 
       3.900000        3.947727 
# Result: No significant difference between males and females (p = 0.62); no gender preference for "Runabout Sport."

# T-test for "Runabout Sport" by Marital Status
t_test_marital <- t.test(runaboutsport2seat ~ marital, data = auto_concept)
print(t_test_marital)

    Welch Two Sample t-test

data:  runaboutsport2seat by marital
t = -2.3079, df = 122.26, p-value = 0.02269
alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
95 percent confidence interval:
 -0.87877336 -0.06729406
sample estimates:
mean in group 0 mean in group 1 
       3.500000        3.973034 
# Result: Married individuals have significantly higher desirability ratings for "Runabout Sport."

# ANOVA for "Runabout Sport" by Age
anova_age <- aov(runaboutsport2seat ~ factor(age), data = auto_concept)
summary(anova_age)
             Df Sum Sq Mean Sq F value Pr(>F)    
factor(age)   4    388   97.01   48.98 <2e-16 ***
Residuals   995   1971    1.98                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Perform Tukey post-hoc analysis for Age
tukey_age <- TukeyHSD(anova_age)
print(tukey_age)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = runaboutsport2seat ~ factor(age), data = auto_concept)

$`factor(age)`
            diff         lwr        upr     p adj
35-25  2.6031250  1.71667260  3.4895774 0.0000000
45-25  2.7295455  1.85023238  3.6088585 0.0000000
55-25  1.8637931  0.94641383  2.7811724 0.0000004
65-25  0.8900000 -0.07788135  1.8578814 0.0885565
45-35  0.1264205 -0.15614002  0.4089809 0.7381635
55-35 -0.7393319 -1.12434316 -0.3543206 0.0000019
65-35 -1.7131250 -2.20652469 -1.2197253 0.0000000
55-45 -0.8657524 -1.23402846 -0.4974762 0.0000000
65-45 -1.8395455 -2.32000032 -1.3590906 0.0000000
65-55 -0.9737931 -1.52081241 -0.4267738 0.0000131
# Result: The 30-49 age group has the highest desirability for "Runabout Sport."

# ANOVA for "Runabout Sport" by Hometown Size
anova_town <- aov(runaboutsport2seat ~ factor(townsize), data = auto_concept)
summary(anova_town)
                  Df Sum Sq Mean Sq F value Pr(>F)    
factor(townsize)   4  520.6  130.16   70.46 <2e-16 ***
Residuals        995 1838.1    1.85                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Perform Tukey post-hoc analysis for Hometown Size
tukey_town <- TukeyHSD(anova_town)
print(tukey_town)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = runaboutsport2seat ~ factor(townsize), data = auto_concept)

$`factor(townsize)`
                diff        lwr       upr     p adj
55-5      0.90000000  0.2538413 1.5461587 0.0014060
300-5     1.50569106  0.8724522 2.1389299 0.0000000
750-5     2.45555556  1.8393186 3.0717925 0.0000000
1250-5    2.40000000  1.7271755 3.0728245 0.0000000
300-55    0.60569106  0.2469501 0.9644320 0.0000438
750-55    1.55555556  1.2277574 1.8833537 0.0000000
1250-55   1.50000000  1.0752691 1.9247309 0.0000000
750-300   0.94986450  0.6483318 1.2513972 0.0000000
1250-300  0.89430894  0.4895044 1.2991135 0.0000000
1250-750 -0.05555556 -0.4332105 0.3220994 0.9945005
# Result: Individuals from hometowns of 500K to 1 million population have the highest desirability (not significantly different from 1 million and more, p = 0.99).

# ANOVA for "Runabout Sport" by Income Level
anova_income <- aov(runaboutsport2seat ~ factor(income), data = auto_concept)
summary(anova_income)
                Df Sum Sq Mean Sq F value Pr(>F)    
factor(income)   4  578.6  144.65   80.85 <2e-16 ***
Residuals      995 1780.2    1.79                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Perform Tukey post-hoc analysis for Income
tukey_income <- TukeyHSD(anova_income)
print(tukey_income)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = runaboutsport2seat ~ factor(income), data = auto_concept)

$`factor(income)`
              diff        lwr        upr     p adj
63-25    3.7627812  2.9153031  4.6102592 0.0000000
88-25    2.2697201  1.4510343  3.0884059 0.0000000
125-25   1.8062249  0.9837329  2.6287169 0.0000000
175-25   1.6410256  0.7561104  2.5259409 0.0000048
88-63   -1.4930611 -1.8336028 -1.1525194 0.0000000
125-63  -1.9565563 -2.3061493 -1.6069633 0.0000000
175-63  -2.1217555 -2.6000831 -1.6434280 0.0000000
125-88  -0.4634952 -0.7359702 -0.1910202 0.0000373
175-88  -0.6286945 -1.0539292 -0.2034597 0.0005498
175-125 -0.1651993 -0.5977165  0.2673180 0.8348774
# Result: Individuals with income between $50K and $75K have the highest desirability for "Runabout Sport."

# ANOVA for "Runabout Sport" by Education Level
anova_education <- aov(runaboutsport2seat ~ factor(edcation), data = auto_concept)
summary(anova_education)
                  Df Sum Sq Mean Sq F value Pr(>F)    
factor(edcation)   4    871   217.8   145.6 <2e-16 ***
Residuals        995   1488     1.5                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Perform Tukey post-hoc analysis for Education Level
tukey_education <- TukeyHSD(anova_education)
print(tukey_education)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = runaboutsport2seat ~ factor(edcation), data = auto_concept)

$`factor(edcation)`
           diff        lwr         upr     p adj
4-0   2.3513514  1.4731289  3.22957382 0.0000000
6-0   3.8636364  3.0506300  4.67664274 0.0000000
8-0   1.8521898  1.0517210  2.65265861 0.0000000
10-0  1.9941176  1.1270846  2.86115067 0.0000000
6-4   1.5122850  1.0746690  1.94990107 0.0000000
8-4  -0.4991616 -0.9130200 -0.08530315 0.0089702
10-4 -0.3572337 -0.8885290  0.17406160 0.3525056
8-6  -2.0114466 -2.2583949 -1.76449829 0.0000000
10-6 -1.8695187 -2.2842227 -1.45481476 0.0000000
10-8  0.1419279 -0.2476236  0.53147934 0.8573651
# Result: Individuals with some college education have the highest desirability for "Runabout Sport."

2.2 Target Market Profiles

Target Market Profile for Runabout Sport - 2 Seat All-Electric Sports Car
Demographic        Segment
Gender             Both Males and Females
Marital Status     Married
Hometown Size      500K to 1 million
Age Group          30-49 years old
Education Level    Some college
Income Level       $50K-$75K
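The hometown-size result for this model depends on how the Tukey table is read: 750 has the highest mean, but the 1250-750 comparison is not significant (adjusted p = 0.99), so the two groups are statistically tied. The sketch below shows one way to pull such ties out of a `TukeyHSD()` result. It runs on simulated data (the `toy` data frame and its group means are assumptions, not survey values) and is an illustration rather than the author's procedure.

```r
# Simulated stand-in data (assumption: NOT the Auto Concepts survey)
set.seed(19)
toy <- data.frame(
  townsize = factor(rep(c(5, 55, 300, 750, 1250), each = 60)),
  rating   = c(rnorm(60, 2), rnorm(60, 3), rnorm(60, 3.5),
               rnorm(60, 5), rnorm(60, 5))
)

fit <- aov(rating ~ townsize, data = toy)
tk  <- TukeyHSD(fit)$townsize   # matrix with columns diff, lwr, upr, p adj

# Row names look like "750-5"; split them into the two compared levels.
# This assumes the level labels themselves contain no "-".
pairs <- do.call(rbind, strsplit(rownames(tk), "-", fixed = TRUE))

# Level with the highest observed mean
means <- tapply(toy$rating, toy$townsize, mean)
top   <- names(which.max(means))

# Levels whose difference from the top level is not significant at 0.05
involves_top <- pairs[, 1] == top | pairs[, 2] == top
tied         <- involves_top & tk[, "p adj"] >= 0.05
tied_levels  <- setdiff(c(pairs[tied, ]), top)

cat("Top level:", top, "\n")
cat("Not significantly different from top:", tied_levels, "\n")
```

Any level listed as tied is a candidate for inclusion in the target profile alongside the top level.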

3. Runabout with Stowage

3.1 Difference Analysis

#Analysis for runaboutstowage2seat----------------
# T-test for "Runabout with Stowage" by Gender
t_test_gender <- t.test(runaboutstowage2seat ~ gender, data = auto_concept)
print(t_test_gender)

    Welch Two Sample t-test

data:  runaboutstowage2seat by gender
t = -0.506, df = 886.14, p-value = 0.613
alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
95 percent confidence interval:
 -0.3049197  0.1799197
sample estimates:
mean in group 0 mean in group 1 
         3.9375          4.0000 
# Result: Both Males and Females have similar desirability ratings for "Runabout with Stowage."

# T-test for "Runabout with Stowage" by Marital Status
t_test_marital <- t.test(runaboutstowage2seat ~ marital, data = auto_concept)
print(t_test_marital)

    Welch Two Sample t-test

data:  runaboutstowage2seat by marital
t = -4.7723, df = 137.47, p-value = 4.598e-06
alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
95 percent confidence interval:
 -1.2879375 -0.5333087
sample estimates:
mean in group 0 mean in group 1 
       3.154545        4.065169 
# Result: Married individuals have significantly higher desirability ratings for "Runabout with Stowage."

# ANOVA for "Runabout with Stowage" by Age
anova_age <- aov(runaboutstowage2seat ~ factor(age), data = auto_concept)
summary(anova_age)
             Df Sum Sq Mean Sq F value Pr(>F)    
factor(age)   4    478  119.52    37.5 <2e-16 ***
Residuals   995   3172    3.19                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Perform Tukey post-hoc analysis for Age
tukey_age <- TukeyHSD(anova_age)
print(tukey_age)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = runaboutstowage2seat ~ factor(age), data = auto_concept)

$`factor(age)`
            diff        lwr        upr     p adj
35-25  1.9406250  0.8160498  3.0652002 0.0000271
45-25  1.7818182  0.6663001  2.8973363 0.0001366
55-25  0.1551724 -1.0086374  1.3189822 0.9962388
65-25  0.5000000 -0.7278780  1.7278780 0.7998393
45-35 -0.1588068 -0.5172700  0.1996563 0.7452047
55-35 -1.7854526 -2.2738873 -1.2970179 0.0000000
65-35 -1.4406250 -2.0665639 -0.8146861 0.0000000
55-45 -1.6266458 -2.0938499 -1.1594417 0.0000000
65-45 -1.2818182 -1.8913350 -0.6723014 0.0000001
65-55  0.3448276 -0.3491345  1.0387897 0.6548228
# Result: The 30-49 age group has the highest desirability for "Runabout with Stowage."

# ANOVA for "Runabout with Stowage" by Hometown Size
anova_town <- aov(runaboutstowage2seat ~ factor(townsize), data = auto_concept)
summary(anova_town)
                  Df Sum Sq Mean Sq F value Pr(>F)    
factor(townsize)   4    390   97.40   29.73 <2e-16 ***
Residuals        995   3260    3.28                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Perform Tukey post-hoc analysis for Hometown Size
tukey_town <- TukeyHSD(anova_town)
print(tukey_town)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = runaboutstowage2seat ~ factor(townsize), data = auto_concept)

$`factor(townsize)`
               diff        lwr         upr     p adj
55-5      1.7210526  0.8605108  2.58159442 0.0000006
300-5     1.9959350  1.1525996  2.83927032 0.0000000
750-5     1.5808081  0.7601155  2.40150064 0.0000017
1250-5    0.1640625 -0.7319922  1.06011716 0.9873319
300-55    0.2748823 -0.2028820  0.75264667 0.5155883
750-55   -0.1402446 -0.5767998  0.29631074 0.9050396
1250-55  -1.5569901 -2.1226386 -0.99134166 0.0000000
750-300  -0.4151269 -0.8167024 -0.01355138 0.0386808
1250-300 -1.8318725 -2.3709834 -1.29276157 0.0000000
1250-750 -1.4167456 -1.9196992 -0.91379200 0.0000000
# Result: Individuals from hometowns of 10K to 500K population have the highest desirability.

# ANOVA for "Runabout with Stowage" by Income Level
anova_income <- aov(runaboutstowage2seat ~ factor(income), data = auto_concept)
summary(anova_income)
                Df Sum Sq Mean Sq F value Pr(>F)    
factor(income)   4  743.6  185.91   63.65 <2e-16 ***
Residuals      995 2906.1    2.92                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Perform Tukey post-hoc analysis for Income
tukey_income <- TukeyHSD(anova_income)
print(tukey_income)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = runaboutstowage2seat ~ factor(income), data = auto_concept)

$`factor(income)`
               diff        lwr        upr     p adj
63-25   -0.32018697 -1.4030113  0.7626374 0.9281830
88-25    1.83605961  0.7900232  2.8820960 0.0000183
125-25   1.59954102  0.5486414  2.6504406 0.0003338
175-25   0.02930403 -1.1013539  1.1599619 0.9999943
88-63    2.15624659  1.7211358  2.5913574 0.0000000
125-63   1.91972799  1.4730523  2.3664037 0.0000000
175-63   0.34949100 -0.2616690  0.9606510 0.5217993
125-88  -0.23651859 -0.5846605  0.1116233 0.3417445
175-88  -1.80675559 -2.3500788 -1.2634324 0.0000000
175-125 -1.57023699 -2.1228651 -1.0176088 0.0000000
# Result: Individuals with income between $76K and $150K have the highest desirability for "Runabout with Stowage."

# ANOVA for "Runabout with Stowage" by Education Level
anova_education <- aov(runaboutstowage2seat ~ factor(edcation), data = auto_concept)
summary(anova_education)
                  Df Sum Sq Mean Sq F value Pr(>F)    
factor(edcation)   4   1575   393.7   188.8 <2e-16 ***
Residuals        995   2075     2.1                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Perform Tukey post-hoc analysis for Education Level
tukey_education <- TukeyHSD(anova_education)
print(tukey_education)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = runaboutstowage2seat ~ factor(edcation), data = auto_concept)

$`factor(edcation)`
             diff         lwr        upr     p adj
4-0  -0.366366366 -1.40349862  0.6707659 0.8706504
6-0  -0.006464646 -0.96658030  0.9536510 1.0000000
8-0   2.535685320  1.59037582  3.4809948 0.0000000
10-0  0.573856209 -0.45006193  1.5977743 0.5420739
6-4   0.359901720 -0.15689869  0.8767021 0.3162507
8-4   2.902051687  2.41330774  3.3907956 0.0000000
10-4  0.940222576  0.31279214  1.5676530 0.0004378
8-6   2.542149967  2.25051766  2.8337823 0.0000000
10-6  0.580320856  0.09057837  1.0700633 0.0108762
10-8 -1.961829111 -2.42186790 -1.5017903 0.0000000
# Result: Individuals with a college degree have the highest desirability for "Runabout with Stowage."

3.2 Target Market Profiles

Target Market Profile for Runabout with Stowage - 2 Seat Hybrid
Demographic        Segment
Gender             Both Males and Females
Marital Status     Married
Hometown Size      10K to 500K
Age Group          30-49 years old
Education Level    College degree
Income Level       $76K-$150K

4. Economy Hybrid

4.1 Difference Analysis

#Analysis for economyhybrid4seat----------------

# T-test for "Economy Hybrid" by Gender
t_test_gender <- t.test(economyhybrid4seat ~ gender, data = auto_concept)
print(t_test_gender)

    Welch Two Sample t-test

data:  economyhybrid4seat by gender
t = 1.3349, df = 865.24, p-value = 0.1823
alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
95 percent confidence interval:
 -0.07199827  0.37816710
sample estimates:
mean in group 0 mean in group 1 
       3.530357        3.377273 
# Result: No significant difference between males and females (p = 0.18); no gender differentiation for "Economy Hybrid."

# T-test for "Economy Hybrid" by Marital Status
t_test_marital <- t.test(economyhybrid4seat ~ marital, data = auto_concept)
print(t_test_marital)

    Welch Two Sample t-test

data:  economyhybrid4seat by marital
t = -4.211, df = 142.24, p-value = 4.488e-05
alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
95 percent confidence interval:
 -1.0346077 -0.3735639
sample estimates:
mean in group 0 mean in group 1 
       2.836364        3.540449 
# Result: Married individuals show significantly higher desirability.

# ANOVA for "Economy Hybrid" by Age
anova_age <- aov(economyhybrid4seat ~ factor(age), data = auto_concept)
summary(anova_age)
             Df Sum Sq Mean Sq F value Pr(>F)    
factor(age)   4  860.7  215.16   94.65 <2e-16 ***
Residuals   995 2262.0    2.27                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Perform Tukey post-hoc analysis for Age
tukey_age <- TukeyHSD(anova_age)
print(tukey_age)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = economyhybrid4seat ~ factor(age), data = auto_concept)

$`factor(age)`
            diff         lwr        upr     p adj
35-25 -0.1062500 -1.05595146  0.8434515 0.9981011
45-25  1.3113636  0.36931090  2.2534164 0.0014183
55-25  2.6379310  1.65509605  3.6207660 0.0000000
65-25  0.5000000 -0.53694043  1.5369404 0.6802380
45-35  1.4176136  1.11489225  1.7203350 0.0000000
55-35  2.7441810  2.33169892  3.1566631 0.0000000
65-35  0.6062500  0.07764586  1.1348541 0.0152392
55-45  1.3265674  0.93201451  1.7211203 0.0000000
65-45 -0.8113636 -1.32609932 -0.2966279 0.0001760
65-55 -2.1379310 -2.72398059 -1.5518815 0.0000000
# Result: The 50-59 age group shows the highest desirability.

# ANOVA for "Economy Hybrid" by Hometown Size
anova_town <- aov(economyhybrid4seat ~ factor(townsize), data = auto_concept)
summary(anova_town)
                  Df Sum Sq Mean Sq F value   Pr(>F)    
factor(townsize)   4  180.5   45.14   15.27 4.03e-12 ***
Residuals        995 2942.1    2.96                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Perform Tukey post-hoc analysis for Hometown Size
tukey_town <- TukeyHSD(anova_town)
print(tukey_town)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = economyhybrid4seat ~ factor(townsize), data = auto_concept)

$`factor(townsize)`
               diff         lwr        upr     p adj
55-5      0.3947368 -0.42274642  1.2122201 0.6790720
300-5     1.4878049  0.68666709  2.2889427 0.0000046
750-5     0.9040404  0.12441245  1.6836684 0.0136362
1250-5    1.2812500  0.43003080  2.1324692 0.0004056
300-55    1.0930680  0.63920936  1.5469267 0.0000000
750-55    0.5093036  0.09459198  0.9240151 0.0073160
1250-55   0.8865132  0.34916777  1.4238585 0.0000715
750-300  -0.5837645 -0.96524653 -0.2022824 0.0003035
1250-300 -0.2065549 -0.71869053  0.3055808 0.8054321
1250-750  0.3772096 -0.10057794  0.8549971 0.1969274
# Result: Individuals from hometowns of 100K to 500K population show the highest desirability (not significantly different from 1 million and more, p = 0.81).

# ANOVA for "Economy Hybrid" by Income Level
anova_income <- aov(economyhybrid4seat ~ factor(income), data = auto_concept)
summary(anova_income)
                Df Sum Sq Mean Sq F value Pr(>F)    
factor(income)   4   1254  313.63     167 <2e-16 ***
Residuals      995   1868    1.88                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Perform Tukey post-hoc analysis for Income
tukey_income <- TukeyHSD(anova_income)
print(tukey_income)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = economyhybrid4seat ~ factor(income), data = auto_concept)

$`factor(income)`
              diff        lwr       upr     p adj
63-25   -0.3201870 -1.1883474 0.5479735 0.8518107
88-25   -0.1054162 -0.9440817 0.7332493 0.9970088
125-25   1.4911073  0.6485427 2.3336719 0.0000151
175-25   3.2930403  2.3865290 4.1995516 0.0000000
88-63    0.2147708 -0.1340818 0.5636233 0.4453025
125-63   1.8112943  1.4531695 2.1694190 0.0000000
175-63   3.6132273  3.1232263 4.1032282 0.0000000
125-88   1.5965235  1.3173988 1.8756482 0.0000000
175-88   3.3984565  2.9628441 3.8340689 0.0000000
175-125  1.8019330  1.3588603 2.2450057 0.0000000
# Result: Over $150K income group shows the highest desirability.

# ANOVA for "Economy Hybrid" by Education Level
anova_education <- aov(economyhybrid4seat ~ factor(edcation), data = auto_concept)
summary(anova_education)
                  Df Sum Sq Mean Sq F value Pr(>F)    
factor(edcation)   4    996  249.00   116.5 <2e-16 ***
Residuals        995   2127    2.14                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Perform Tukey post-hoc analysis for Education Level
tukey_education <- TukeyHSD(anova_education)
print(tukey_education)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = economyhybrid4seat ~ factor(edcation), data = auto_concept)

$`factor(edcation)`
           diff         lwr       upr     p adj
4-0  -0.3663664 -1.41634458 0.6836118 0.8756414
6-0   0.1389899 -0.83301778 1.1109976 0.9950701
8-0   1.0539335  0.09691535 2.0109516 0.0224751
10-0  3.7503268  2.71372637 4.7869272 0.0000000
6-4   0.5053563 -0.01784525 1.0285578 0.0641576
8-4   1.4202999  0.92550231 1.9150974 0.0000000
10-4  4.1166932  3.48149136 4.7518950 0.0000000
8-6   0.9149436  0.61969912 1.2101881 0.0000000
10-6  3.6113369  3.11552845 4.1071454 0.0000000
10-8  2.6963933  2.23065645 3.1621301 0.0000000
# Result: Graduate or professional degree holders show the highest desirability.

4.2 Target Market Profiles

Target Market Profile for Economy Hybrid - 4 Seat Hybrid
Demographic        Segment
Gender             Both Males and Females
Marital Status     Married
Hometown Size      100K to 500K
Age Group          50-59 years old
Education Level    Graduate or professional degree
Income Level       Over $150K

5. Economy Gasoline

5.1 Difference Analysis

#Analysis for economygas4seat----------------

# T-test for "Economy Gasoline" by Gender
t_test_gender <- t.test(economygas4seat ~ gender, data = auto_concept)
print(t_test_gender)

    Welch Two Sample t-test

data:  economygas4seat by gender
t = 7.7329, df = 997.49, p-value = 2.562e-14
alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
95 percent confidence interval:
 0.5039503 0.8466990
sample estimates:
mean in group 0 mean in group 1 
       3.507143        2.831818 
# Result: Males have significantly higher desirability ratings for "Economy Gasoline."

# T-test for "Economy Gasoline" by Marital Status
t_test_marital <- t.test(economygas4seat ~ marital, data = auto_concept)
print(t_test_marital)

    Welch Two Sample t-test

data:  economygas4seat by marital
t = -3.9866, df = 139.92, p-value = 0.0001075
alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
95 percent confidence interval:
 -0.8419347 -0.2837037
sample estimates:
mean in group 0 mean in group 1 
       2.709091        3.271910 
# Result: Married individuals show significantly higher desirability ratings.

# ANOVA for "Economy Gasoline" by Age
anova_age <- aov(economygas4seat ~ factor(age), data = auto_concept)
summary(anova_age)
             Df Sum Sq Mean Sq F value Pr(>F)    
factor(age)   4  415.8   104.0   61.13 <2e-16 ***
Residuals   995 1692.1     1.7                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Perform Tukey post-hoc analysis for Age
tukey_age <- TukeyHSD(anova_age)
print(tukey_age)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = economygas4seat ~ factor(age), data = auto_concept)

$`factor(age)`
           diff        lwr       upr     p adj
35-25 0.9062500 0.08485503 1.7276450 0.0220965
45-25 1.3022727 0.48749313 2.1170523 0.0001349
55-25 2.2000000 1.34994791 3.0500521 0.0000000
65-25 3.0400000 2.14315222 3.9368478 0.0000000
45-35 0.3960227 0.13419958 0.6578459 0.0003726
55-35 1.2937500 0.93699501 1.6505050 0.0000000
65-35 2.1337500 1.67656129 2.5909387 0.0000000
55-45 0.8977273 0.55647924 1.2389753 0.0000000
65-45 1.7377273 1.29253337 2.1829212 0.0000000
65-55 0.8400000 0.33312686 1.3468731 0.0000651
# Result: The 60+ age group shows the highest desirability for "Economy Gasoline."

# ANOVA for "Economy Gasoline" by Hometown Size
anova_town <- aov(economygas4seat ~ factor(townsize), data = auto_concept)
summary(anova_town)
                  Df Sum Sq Mean Sq F value Pr(>F)    
factor(townsize)   4  524.3  131.06   82.35 <2e-16 ***
Residuals        995 1583.6    1.59                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Perform Tukey post-hoc analysis for Hometown Size
tukey_town <- TukeyHSD(anova_town)
print(tukey_town)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = economygas4seat ~ factor(townsize), data = auto_concept)

$`factor(townsize)`
               diff         lwr        upr     p adj
55-5     -0.2565789 -0.85634214  0.3431842 0.7690691
300-5    -2.0989837 -2.68675474 -1.5112127 0.0000000
750-5    -1.7856061 -2.35759594 -1.2136162 0.0000000
1250-5   -1.8968750 -2.52138925 -1.2723608 0.0000000
300-55   -1.8424048 -2.17538742 -1.5094222 0.0000000
750-55   -1.5290271 -1.83328869 -1.2247655 0.0000000
1250-55  -1.6402961 -2.03453041 -1.2460617 0.0000000
750-300   0.3133777  0.03349562  0.5932597 0.0192537
1250-300  0.2021087 -0.17362998  0.5778475 0.5823269
1250-750 -0.1112689 -0.46180747  0.2392696 0.9087539
# Result: Individuals from hometowns under 10K population (coded 5) show the highest desirability (not significantly different from 10K to 100K, p = 0.77).

# ANOVA for "Economy Gasoline" by Income Level
anova_income <- aov(economygas4seat ~ factor(income), data = auto_concept)
summary(anova_income)
                Df Sum Sq Mean Sq F value   Pr(>F)    
factor(income)   4   46.7  11.687   5.642 0.000172 ***
Residuals      995 2061.2   2.072                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Perform Tukey post-hoc analysis for Income
tukey_income <- TukeyHSD(anova_income)
print(tukey_income)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = economygas4seat ~ factor(income), data = auto_concept)

$`factor(income)`
              diff         lwr       upr     p adj
63-25    0.7741747 -0.13774075 1.6860902 0.1393376
88-25    1.0468920  0.16595806 1.9278260 0.0105437
125-25   0.9215433  0.03651376 1.8065729 0.0364838
175-25   1.4102564  0.45805728 2.3624555 0.0005340
88-63    0.2727173 -0.09371722 0.6391519 0.2505234
125-63   0.1473686 -0.22880546 0.5235427 0.8216473
175-63   0.6360817  0.12138490 1.1507785 0.0068024
125-88  -0.1253487 -0.41854118 0.1678437 0.7694841
175-88   0.3633644 -0.09420276 0.8209315 0.1919235
175-125  0.4887131  0.02330969 0.9541165 0.0340396
# Result: Individuals with income over $150K show the highest desirability for "Economy Gasoline."

# ANOVA for "Economy Gasoline" by Education Level
anova_education <- aov(economygas4seat ~ factor(edcation), data = auto_concept)
summary(anova_education)
                  Df Sum Sq Mean Sq F value   Pr(>F)    
factor(edcation)   4    121  30.258   15.15 4.95e-12 ***
Residuals        995   1987   1.997                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Perform Tukey post-hoc analysis for Education Level
tukey_education <- TukeyHSD(anova_education)
print(tukey_education)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = economygas4seat ~ factor(edcation), data = auto_concept)

$`factor(edcation)`
           diff        lwr        upr     p adj
4-0   2.1411411  1.1262479  3.1560344 0.0000001
6-0   1.0731313  0.1336032  2.0126594 0.0158723
8-0   1.5178427  0.5928032  2.4428821 0.0000799
10-0  1.7281046  0.7261421  2.7300671 0.0000274
6-4  -1.0680098 -1.5737286 -0.5622911 0.0000001
8-4  -0.6232985 -1.1015624 -0.1450346 0.0035390
10-4 -0.4130366 -1.0270132  0.2009400 0.3519768
8-6   0.4447113  0.1593324  0.7300902 0.0002181
10-6  0.6549733  0.1757322  1.1342143 0.0018527
10-8  0.2102619 -0.2399124  0.6604362 0.7058469
# Result: Individuals with a high school education show the highest desirability for "Economy Gasoline," not significantly different from graduate or professional degree holders (p = 0.35).

5.2 Target Market Profiles

Target Market Profile for Economy Gasoline - 4 Seat Gasoline Economy
Demographic        Segment
Gender             Males
Marital Status     Married
Hometown Size      Under 10K
Age Group          60+ years old
Education Level    High school graduate or graduate or professional degree
Income Level       Over $150K
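With about 1,000 respondents (995 residual degrees of freedom in every ANOVA), even small mean differences reach significance, so it can help to check how much of the rating variance a demographic variable actually explains. Eta squared, the between-group sum of squares divided by the total, is one standard measure. The sketch below is an addition to the original solution. It applies a hypothetical helper `eta_squared()` to the `Sum Sq` values printed in the income ANOVA tables for Economy Gasoline and Economy Hybrid.

```r
# Eta squared = between-group sum of squares / total sum of squares.
# Inputs are the Sum Sq values printed in the ANOVA tables above.
eta_squared <- function(ss_between, ss_residual) {
  ss_between / (ss_between + ss_residual)
}

eta_gas_income    <- eta_squared(46.7, 2061.2)  # Economy Gasoline by income
eta_hybrid_income <- eta_squared(1254, 1868)    # Economy Hybrid by income

print(round(c(economy_gas = eta_gas_income,
              economy_hybrid = eta_hybrid_income), 3))
```

Income explains roughly 2% of the variance in Economy Gasoline ratings versus roughly 40% for Economy Hybrid, which suggests income is a much weaker segmentation variable for the gasoline model even though both F tests are significant.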