This is Project 1 for my Marketing Analytics MKTG3P98 course.
As a member of Toyota’s International Marketing Intelligence Team, my
role is to analyze consumer purchasing behaviors and brand attitudes
across different regions.
The purpose of this report is to:
- Evaluate Toyota’s brand performance in various markets.
- Identify Toyota’s competitors and their strengths.
- Discover market opportunities where Toyota can grow.

This analysis will guide Toyota’s strategic marketing decisions by providing data-driven insights based on consumer preferences. The findings are visualized in R using ggplot2 charts.
Before conducting the analysis, we first load the R packages used for:
- Data manipulation → dplyr, tidyr
- Data visualization → ggplot2
- Reading external datasets → readxl

These packages enable us to clean, transform, and visualize Toyota’s data efficiently.
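The loading step itself is not shown in the report, so the following is only a minimal sketch of how the four packages named above might be attached. The file name in the final comment is hypothetical, since the report does not state where Car_Total is read from, and the conversion to a plain data frame is an assumption based on the str() output shown later.

```r
# Packages named in the text above
pkgs <- c("dplyr", "tidyr", "ggplot2", "readxl")

# Find any that are not installed, instead of failing midway through the report
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Install before knitting: ", paste(missing_pkgs, collapse = ", "))
} else {
  # Attach all four packages
  invisible(lapply(pkgs, library, character.only = TRUE))
}

# Hypothetical file name (assumption, not taken from the report):
# Car_Total <- as.data.frame(readxl::read_excel("Car_Total.xlsx"))
```

Checking with requireNamespace() first gives one clear message listing everything that is missing, rather than stopping at the first failed library() call.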
Before running the analysis, we handle missing values in the dataset.
The dataset contained missing values in key variables such as Att_1 (an attitude score). Missing values in each numeric column were replaced with that column's mean, which keeps every respondent in the analysis instead of dropping incomplete rows. Mean imputation leaves each column's mean unchanged but reduces its variance, so it is a pragmatic choice for comparing average attitudes across markets rather than a guarantee against bias.
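To make the effect of mean imputation concrete, here is a small sketch on made-up scores (not taken from Car_Total). It shows that filling gaps with the mean removes the NAs and preserves the mean, while the standard deviation shrinks.

```r
# Toy attitude scores on a 1-7 scale with two missing responses (made-up data)
att <- c(6, 4, NA, 7, 5, NA, 3)

# Mean of the observed responses only
att_mean <- mean(att, na.rm = TRUE)

# Replace each NA with that mean
att_imputed <- ifelse(is.na(att), att_mean, att)

att_mean                 # 5
sum(is.na(att_imputed))  # 0

# The mean is unchanged, but the spread shrinks because the filled-in
# values sit exactly at the centre of the distribution
sd(att, na.rm = TRUE)
sd(att_imputed)
```

The same logic applies column by column when the imputation is run across a whole data frame.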
# Check for missing values in dataset
summary(Car_Total)
##      Resp               Att_1           Att_2           Enj_1
##  Length:1049        Min.   :1.000   Min.   :1.000   Min.   :1.000
##  Class :character   1st Qu.:4.000   1st Qu.:4.000   1st Qu.:5.000
##  Mode  :character   Median :5.000   Median :6.000   Median :6.000
##                     Mean   :4.882   Mean   :5.287   Mean   :5.378
##                     3rd Qu.:6.000   3rd Qu.:6.000   3rd Qu.:7.000
##                     Max.   :7.000   Max.   :7.000   Max.   :7.000
##      Enj_2         Perform_1       Perform_2       Perform_3
##  Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :1.000
##  1st Qu.:3.000   1st Qu.:4.000   1st Qu.:4.000   1st Qu.:3.000
##  Median :5.000   Median :5.000   Median :5.000   Median :5.000
##  Mean   :4.575   Mean   :4.947   Mean   :4.831   Mean   :4.217
##  3rd Qu.:6.000   3rd Qu.:6.000   3rd Qu.:6.000   3rd Qu.:6.000
##  Max.   :7.000   Max.   :7.000   Max.   :7.000   Max.   :7.000
##      WOM_1           WOM_2        Futu_Pur_1      Futu_Pur_2     Valu_Percp_1
##  Min.   :1.000   Min.   :1.00   Min.   :1.000   Min.   :1.000   Min.   :1.000
##  1st Qu.:4.000   1st Qu.:4.00   1st Qu.:5.000   1st Qu.:5.000   1st Qu.:5.000
##  Median :6.000   Median :6.00   Median :6.000   Median :6.000   Median :6.000
##  Mean   :5.286   Mean   :5.35   Mean   :5.321   Mean   :5.371   Mean   :5.411
##  3rd Qu.:7.000   3rd Qu.:6.00   3rd Qu.:6.000   3rd Qu.:6.000   3rd Qu.:6.000
##  Max.   :7.000   Max.   :7.00   Max.   :9.000   Max.   :7.000   Max.   :7.000
##   Valu_Percp_2    Pur_Proces_1    Pur_Proces_2      Residence
##  Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :1.000
##  1st Qu.:4.000   1st Qu.:5.000   1st Qu.:4.000   1st Qu.:1.000
##  Median :5.000   Median :6.000   Median :5.000   Median :1.000
##  Mean   :5.114   Mean   :5.256   Mean   :4.923   Mean   :1.474
##  3rd Qu.:6.000   3rd Qu.:6.000   3rd Qu.:6.000   3rd Qu.:2.000
##  Max.   :7.000   Max.   :7.000   Max.   :7.000   Max.   :5.000
##     Pay_Meth      Insur_Type           Gender               Age
##  Min.   :1.000   Length:1049        Length:1049        Min.   :18.00
##  1st Qu.:1.000   Class :character   Class :character   1st Qu.:23.00
##  Median :2.000   Mode  :character   Mode  :character   Median :34.00
##  Mean   :2.153                                         Mean   :35.22
##  3rd Qu.:3.000                                         3rd Qu.:48.00
##  Max.   :3.000                                         Max.   :60.00
##    Education        Region             Brand              Model
##  Min.   :1.000   Length:1049        Length:1049        Length:1049
##  1st Qu.:2.000   Class :character   Class :character   Class :character
##  Median :2.000   Mode  :character   Mode  :character   Mode  :character
##  Mean   :1.989
##  3rd Qu.:2.000
##  Max.   :3.000
##       MPG             Cyl           acc1          C_cost.          H_Cost
##  Min.   :14.00   Min.   :4.0   Min.   :3.600   Min.   : 7.00   Min.   : 6.000
##  1st Qu.:17.00   1st Qu.:4.0   1st Qu.:5.100   1st Qu.:10.00   1st Qu.: 8.000
##  Median :19.00   Median :6.0   Median :6.500   Median :12.00   Median :10.000
##  Mean   :19.58   Mean   :5.8   Mean   :6.202   Mean   :11.35   Mean   : 9.634
##  3rd Qu.:22.00   3rd Qu.:6.0   3rd Qu.:7.500   3rd Qu.:13.00   3rd Qu.:11.000
##  Max.   :26.00   Max.   :8.0   Max.   :8.500   Max.   :16.00   Max.   :14.000
##    Post.Satis
##  Min.   :2.00
##  1st Qu.:5.00
##  Median :6.00
##  Mean   :5.28
##  3rd Qu.:6.00
##  Max.   :7.00
# Compute mean for 'Att_1' column, ignoring NA values
meanATT1 <- mean(Car_Total$Att_1, na.rm = TRUE)
# Replace NAs in 'Att_1' only with that column's mean
Car_Total[is.na(Car_Total$Att_1), "Att_1"] <- meanATT1
# Replace remaining NAs in every numeric column with the respective column mean
Car_Total <- Car_Total %>%
  mutate(across(where(is.numeric), ~ ifelse(is.na(.), mean(., na.rm = TRUE), .)))
# Verify missing values are removed
summary(Car_Total)
##      Resp               Att_1           Att_2           Enj_1
##  Length:1049        Min.   :1.000   Min.   :1.000   Min.   :1.000
##  Class :character   1st Qu.:4.000   1st Qu.:4.000   1st Qu.:5.000
##  Mode  :character   Median :5.000   Median :6.000   Median :6.000
##                     Mean   :4.882   Mean   :5.287   Mean   :5.378
##                     3rd Qu.:6.000   3rd Qu.:6.000   3rd Qu.:7.000
##                     Max.   :7.000   Max.   :7.000   Max.   :7.000
##      Enj_2         Perform_1       Perform_2       Perform_3
##  Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :1.000
##  1st Qu.:3.000   1st Qu.:4.000   1st Qu.:4.000   1st Qu.:3.000
##  Median :5.000   Median :5.000   Median :5.000   Median :5.000
##  Mean   :4.575   Mean   :4.947   Mean   :4.831   Mean   :4.217
##  3rd Qu.:6.000   3rd Qu.:6.000   3rd Qu.:6.000   3rd Qu.:6.000
##  Max.   :7.000   Max.   :7.000   Max.   :7.000   Max.   :7.000
##      WOM_1           WOM_2        Futu_Pur_1      Futu_Pur_2     Valu_Percp_1
##  Min.   :1.000   Min.   :1.00   Min.   :1.000   Min.   :1.000   Min.   :1.000
##  1st Qu.:4.000   1st Qu.:4.00   1st Qu.:5.000   1st Qu.:5.000   1st Qu.:5.000
##  Median :6.000   Median :6.00   Median :6.000   Median :6.000   Median :6.000
##  Mean   :5.286   Mean   :5.35   Mean   :5.321   Mean   :5.371   Mean   :5.411
##  3rd Qu.:7.000   3rd Qu.:6.00   3rd Qu.:6.000   3rd Qu.:6.000   3rd Qu.:6.000
##  Max.   :7.000   Max.   :7.00   Max.   :9.000   Max.   :7.000   Max.   :7.000
##   Valu_Percp_2    Pur_Proces_1    Pur_Proces_2      Residence
##  Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :1.000
##  1st Qu.:4.000   1st Qu.:5.000   1st Qu.:4.000   1st Qu.:1.000
##  Median :5.000   Median :6.000   Median :5.000   Median :1.000
##  Mean   :5.114   Mean   :5.256   Mean   :4.923   Mean   :1.474
##  3rd Qu.:6.000   3rd Qu.:6.000   3rd Qu.:6.000   3rd Qu.:2.000
##  Max.   :7.000   Max.   :7.000   Max.   :7.000   Max.   :5.000
##     Pay_Meth      Insur_Type           Gender               Age
##  Min.   :1.000   Length:1049        Length:1049        Min.   :18.00
##  1st Qu.:1.000   Class :character   Class :character   1st Qu.:23.00
##  Median :2.000   Mode  :character   Mode  :character   Median :34.00
##  Mean   :2.153                                         Mean   :35.22
##  3rd Qu.:3.000                                         3rd Qu.:48.00
##  Max.   :3.000                                         Max.   :60.00
##    Education        Region             Brand              Model
##  Min.   :1.000   Length:1049        Length:1049        Length:1049
##  1st Qu.:2.000   Class :character   Class :character   Class :character
##  Median :2.000   Mode  :character   Mode  :character   Mode  :character
##  Mean   :1.989
##  3rd Qu.:2.000
##  Max.   :3.000
##       MPG             Cyl           acc1          C_cost.          H_Cost
##  Min.   :14.00   Min.   :4.0   Min.   :3.600   Min.   : 7.00   Min.   : 6.000
##  1st Qu.:17.00   1st Qu.:4.0   1st Qu.:5.100   1st Qu.:10.00   1st Qu.: 8.000
##  Median :19.00   Median :6.0   Median :6.500   Median :12.00   Median :10.000
##  Mean   :19.58   Mean   :5.8   Mean   :6.202   Mean   :11.35   Mean   : 9.634
##  3rd Qu.:22.00   3rd Qu.:6.0   3rd Qu.:7.500   3rd Qu.:13.00   3rd Qu.:11.000
##  Max.   :26.00   Max.   :8.0   Max.   :8.500   Max.   :16.00   Max.   :14.000
##    Post.Satis
##  Min.   :2.00
##  1st Qu.:5.00
##  Median :6.00
##  Mean   :5.28
##  3rd Qu.:6.00
##  Max.   :7.00
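summary() adds an NA's row only for columns that still contain missing values, so confirming their absence means scanning a long printout. A direct count per column is easier to read. The sketch below uses a made-up data frame as a stand-in for Car_Total.

```r
# Made-up data frame standing in for Car_Total
toy <- data.frame(
  Att_1 = c(6, NA, 5, 7),
  Att_2 = c(6, 6, NA, NA),
  Brand = c("Toyota", "Honda", "Ford", "Toyota")
)

# Number of missing values in each column
na_counts <- colSums(is.na(toy))
na_counts

# Single TRUE/FALSE answer for the whole data frame
has_na <- any(is.na(toy))
has_na
```

Running the same two calls on Car_Total before and after imputation would show directly which columns were affected.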
To enhance the clarity and interpretability of our analysis, we made the following modifications:
- Age Segmentation → Grouped consumers into Young Adults (under 30), Adults (30 to 49), and Mature Adults (50 and over). This helps Toyota understand which age groups prefer Toyota over competitors.
- Categorical Variables → Converted Region and Model into factors, making it easier to visualize trends.

These changes allow us to segment Toyota’s consumer base more effectively.
# Convert Age into three demographic groups
Car_Total$AgeGrp <- cut(Car_Total$Age,
                        breaks = c(0, 30, 50, Inf),
                        labels = c("Young Adults", "Adults", "Mature Adults"),
                        right = FALSE)  # intervals are [0,30), [30,50), [50,Inf)
# Convert 'Region' and 'Model' columns to categorical variables
Car_Total$Region <- as.factor(Car_Total$Region)
Car_Total$Model <- as.factor(Car_Total$Model)
# Verify the changes in data structure
str(Car_Total)
## 'data.frame': 1049 obs. of 32 variables:
## $ Resp : chr "Res1" "Res10" "Res100" "Res1000" ...
## $ Att_1 : num 6 6 6 6 6 3 2 7 2 6 ...
## $ Att_2 : int 6 6 7 6 6 1 2 7 1 6 ...
## $ Enj_1 : num 6 4 7 7 7 4 1 7 2 6 ...
## $ Enj_2 : num 6 4 3 6 6 3 2 6 1 5 ...
## $ Perform_1 : num 5 4 5 6 6 5 2 5 2 5 ...
## $ Perform_2 : num 6 4 6 6 6 6 2 6 2 5 ...
## $ Perform_3 : num 3 1 6 6 6 6 1 5 2 5 ...
## $ WOM_1 : num 3 5 3 6 4 2 6 6 7 3 ...
## $ WOM_2 : num 3 6 5 6 4 6 7 6 7 3 ...
## $ Futu_Pur_1 : num 3 6 6 6 4 6 6 6 7 6 ...
## $ Futu_Pur_2 : num 3 6 6 6 6 6 5 7 7 6 ...
## $ Valu_Percp_1: num 5 6 7 4 5 5 4 6 4 5 ...
## $ Valu_Percp_2: num 2 6 6 6 6 4 4 5 6 6 ...
## $ Pur_Proces_1: num 6 6 5 6 6 5 4 5 6 6 ...
## $ Pur_Proces_2: num 4 6 5 3 7 5 5 5 7 5 ...
## $ Residence : num 2 1 2 2 1 1 1 2 1 2 ...
## $ Pay_Meth : int 2 2 1 3 3 3 3 3 3 3 ...
## $ Insur_Type : chr "Collision" "Collision" "Collision" "Liability" ...
## $ Gender : chr "Male" "Male" "Female" "Female" ...
## $ Age : int 18 21 32 24 24 25 26 26 27 27 ...
## $ Education : int 2 2 1 2 2 2 2 2 2 2 ...
## $ Region : Factor w/ 4 levels "American","Asian",..: 3 3 1 2 2 2 2 2 2 2 ...
## $ Brand : chr "Ford" "Ford" "Toyota" "Toyota" ...
## $ Model : Factor w/ 14 levels "500x","Camaro",..: 6 6 13 3 3 3 3 3 3 3 ...
## $ MPG : int 15 15 24 26 26 26 26 26 26 26 ...
## $ Cyl : int 8 8 4 4 4 4 4 4 4 4 ...
## $ acc1 : num 5.5 5.5 8.2 8 8 8 8 8 8 8 ...
## $ C_cost. : num 16 16 10 7 7 7 7 7 7 7 ...
## $ H_Cost : num 14 14 8 6 6 6 6 6 6 6 ...
## $ Post.Satis : int 4 5 4 6 5 6 5 6 7 6 ...
## $ AgeGrp : Factor w/ 3 levels "Young Adults",..: 1 1 2 1 1 1 1 1 1 1 ...
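Because right = FALSE makes each interval closed on the left and open on the right, ages that fall exactly on a break (30 and 50) go into the higher group. The following sketch checks this on a few made-up ages, using the same breaks and labels as the chunk above.

```r
# Ages chosen to sit on either side of each break (made-up values)
ages <- c(18, 29, 30, 49, 50, 60)

# Same breaks, labels, and right = FALSE as in the report's cut() call
age_grp <- cut(ages,
               breaks = c(0, 30, 50, Inf),
               labels = c("Young Adults", "Adults", "Mature Adults"),
               right = FALSE)

# Show each age next to its assigned group, then the group sizes
data.frame(ages, age_grp)
table(age_grp)
```

Here 29 is a Young Adult, 30 and 49 are Adults, and 50 is a Mature Adult.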
Understanding where Toyota's customers are concentrated across regions is critical for market strategy.
This step focuses only on Toyota (not all brands). The bar chart counts the Toyota vehicles in the dataset by region, showing where Toyota's presence is strongest and where it is weaker. Because it plots counts rather than Toyota's share of each region's total, it indicates presence rather than market share. The results will help Toyota refine its marketing strategy per region.
# Filter dataset to focus only on Toyota's distribution across regions
toyota_distribution <- Car_Total %>%
  filter(Brand == "Toyota")
# Create a bar chart showing Toyota's presence in different regions
ggplot(toyota_distribution, aes(x = Region, fill = Region)) +
  theme_bw() +
  geom_bar() +
  labs(y = "Number of Toyota Cars", title = "Toyota's Market Presence Across Regions")
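The numbers behind a bar chart like this can also be tabulated directly. As a rough sketch on made-up rows (not the real Car_Total), the counts per region and each region's share of Toyota respondents could be computed as follows.

```r
# Made-up rows standing in for Car_Total
toy <- data.frame(
  Brand  = c("Toyota", "Toyota", "Toyota", "Honda", "Ford", "Toyota"),
  Region = c("Asian", "Asian", "European", "Asian", "American", "American")
)

# Keep Toyota rows only, then count them per region
toyota_only   <- toy[toy$Brand == "Toyota", ]
region_counts <- table(toyota_only$Region)

# Convert counts to each region's share of Toyota respondents
region_share <- round(prop.table(region_counts), 2)

region_counts
region_share
```

These shares describe how Toyota's own respondents are spread across regions. Estimating market share within a region would instead require dividing by the count of all brands in that region.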
To assess Toyota’s competitive positioning, we analyze brand attitudes across different regions.
- Brands with high attitude scores → Suggest stronger consumer loyalty.
- Brands with lower attitude scores → May need marketing improvements.
- Competitor positioning varies by region, meaning Toyota must focus on region-specific strategies.

Key Insights:
- Honda and Ford are Toyota’s biggest threats.
- Toyota is strong in the Middle East but weaker in Europe.
- Regional differences require tailored marketing approaches.

This step will guide Toyota’s competitive strategy in the coming months.
# Compute mean attitude scores for Toyota and its key competitors (Honda & Ford)
brand_region_table <- aggregate(Att_1 ~ Brand + Region, Car_Total, mean) %>%
  filter(Brand %in% c("Toyota", "Honda", "Ford")) # Focus only on Toyota & its competitors
# Line chart comparing Toyota vs. competitors across regions
ggplot(brand_region_table, aes(x = Region, y = Att_1, group = Brand)) +
  geom_line(aes(color = Brand)) +
  geom_point(aes(color = Brand)) +
  labs(y = "Attitude Mean", title = "Toyota vs. Competitors: Consumer Attitudes by Region")
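To illustrate what the aggregate() call produces, the sketch below applies the same formula to a handful of made-up scores. It also builds a wide brand-by-region table with tapply(), which is an optional alternative view and not part of the original analysis.

```r
# Made-up respondent-level scores on a 1-7 scale
toy <- data.frame(
  Brand  = c("Toyota", "Toyota", "Honda", "Honda", "Toyota", "Honda"),
  Region = c("Asian", "Asian", "Asian", "Asian", "European", "European"),
  Att_1  = c(4, 6, 5, 7, 3, 6)
)

# Long format: one row per brand-region pair, the shape ggplot expects
long_means <- aggregate(Att_1 ~ Brand + Region, data = toy, FUN = mean)
long_means

# Wide format: brands in rows, regions in columns, easier to read as a table
wide_means <- tapply(toy$Att_1, list(toy$Brand, toy$Region), mean)
wide_means
```

The long table feeds the line chart, while the wide table makes it easy to compare brands side by side within a region.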
By analyzing consumer attitudes, we identified growth opportunities where Toyota can improve its market presence.
- Regions where Toyota’s perception is weaker → Opportunity for rebranding.
- Competitor weaknesses in certain markets → Toyota can take advantage.
- Emerging consumer trends → Help Toyota adjust its strategy proactively.

Findings:
- Toyota performs best in the Middle East.
- Europe is an area for improvement.
- Toyota should target Ford customers in North America.
- Emphasizing Toyota’s hybrid technology could differentiate it from Honda.

Toyota should use these insights to strengthen its brand positioning.
# Identify regions where Toyota's mean attitude is below the average of the
# Toyota, Honda, and Ford brand-by-region means (not the all-brand global mean)
toyota_weak_regions <- brand_region_table %>%
  filter(Brand == "Toyota" & Att_1 < mean(brand_region_table$Att_1))
# Print Toyota's weakest regions
print(toyota_weak_regions)
##    Brand   Region    Att_1
## 1 Toyota    Asian 4.881199
## 2 Toyota European 4.800000
# Create a bar chart showing Toyota's lowest performance regions
ggplot(toyota_weak_regions, aes(x = Region, y = Att_1, fill = Region)) +
  theme_bw() +
  geom_bar(stat = "identity", position = "dodge") +
  labs(y = "Attitude Score", title = "Toyota's Weakest Markets: Consumer Attitudes")
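The filter above compares Toyota's regional means with the average of the rows in brand_region_table, which is an unweighted mean of brand-by-region means. As a small sketch on made-up numbers, the same comparison can be written so that the size of each gap is visible, not only which regions fall below the benchmark.

```r
# Made-up brand-by-region means, shaped like brand_region_table
means <- data.frame(
  Brand  = c("Toyota", "Toyota", "Honda", "Honda"),
  Region = c("Asian", "European", "Asian", "European"),
  Att_1  = c(5.0, 4.0, 6.0, 5.0)
)

# Benchmark: average of the brand-region means in the table
benchmark <- mean(means$Att_1)

# Toyota's gap to the benchmark in every region (negative = below benchmark)
toyota <- means[means$Brand == "Toyota", ]
toyota$Gap <- toyota$Att_1 - benchmark

# Regions where Toyota falls below the benchmark
weak <- toyota[toyota$Gap < 0, ]
weak
```

A different benchmark, such as the respondent-level mean across all brands, could give a different list of weak regions, so the choice is worth stating alongside the result.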
## Step 7: Toyota’s Key Findings
Based on this analysis, Toyota should focus on the following key areas:
- Strengths: Toyota leads in the Middle East with high consumer trust, and it is competitive with Honda in North America, where it maintains a strong presence.
- Weaknesses: Toyota struggles in the European market, where Ford and Honda are stronger, and it needs stronger differentiation from Honda to maintain leadership.
- Competitive threats: Honda is Toyota’s biggest rival, especially in global markets. Ford is a strong competitor in North America, posing a threat to Toyota’s growth.
- Recommendations: Improve brand perception in Europe to better compete with Ford. Attract Ford customers in North America through marketing and promotions. Leverage hybrid and fuel-efficient technology as Toyota’s unique selling point against Honda.
By leveraging these insights, Toyota can make data-driven decisions to maximize market growth and strengthen its global positioning.