Sales of the World’s Best-Selling Phones
Outline
1-Executive Summary
- Objective: This dataset provides a comprehensive overview of the top 120 best-selling mobile phones, showcasing the evolution of the market over time. It covers key aspects such as the manufacturer, model, form factor, year of release, and units sold in millions. With insights into iconic brands like Apple and Samsung, and detailed information on various phone styles—from classic bar phones to modern touchscreens—this dataset offers a valuable perspective on industry trends and consumer preferences. By examining this data, you can delve into the cell phone hall of fame and understand how technology and design have shaped purchasing behavior throughout the years.
- Key Influencing Factors:
- Manufacturer
- Form Factor
- Model
- Years
2-Introduction
Context:
Explore our dataset of the Top 120 best-selling mobile phones, featuring detailed information on manufacturers, models, form factors, and release years. Track sales in millions to uncover key trends and shifts in mobile technology over time. This concise overview highlights the evolution of consumer preferences and the impact of major brands in the market.
Methodology
Our methodology involves collecting data on the Top 120 best-selling mobile phones, including details on manufacturers, models, form factors, release years, and units sold. We performed data wrangling to clean and organize this information. Exploratory Data Analysis (EDA) was conducted with visualizations to uncover patterns and trends. Finally, predictive analysis using regression models was applied to forecast future sales trends and assess the impact of various factors on phone sales.
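The regression step mentioned above could look roughly like the following minimal sketch. It is not the model used in this report: the toy data frame and its values are hypothetical, and the column names `Year` and `UnitsSold` are only assumed to match the dataset.

```r
# Hypothetical toy data (not taken from the dataset):
# units sold, in millions, by release year
toy_sales <- data.frame(
  Year      = c(2001, 2002, 2003, 2004, 2005),
  UnitsSold = c(10, 12, 14, 16, 18)
)

# Fit a simple linear regression of units sold on release year
fit <- lm(UnitsSold ~ Year, data = toy_sales)

# Estimated change in units sold (millions) per additional year
slope <- unname(coef(fit)["Year"])

# Forecast units sold for a later year that is not in the toy data
forecast_2006 <- unname(predict(fit, newdata = data.frame(Year = 2006)))

print(slope)
print(forecast_2006)
```

With real data, further predictors such as manufacturer or form factor could be added to the formula, and `broom::tidy(fit)` would give the coefficients as a data frame.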
A-Data Collection
The dataset is a flat CSV file containing the top 120 best-selling mobile phones.
You can download the file from: https://www.kaggle.com/datasets/muhammadroshaanriaz/global-best-selling-phone-sales/data
Code to read the data:
# Use forward slashes in the file path
data <- read.csv("Your Path", header = TRUE, sep = ",")
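As a small illustration of what `read.csv` does to column headers, the sketch below writes a two-row sample file to a temporary location and reads it back. The header names `Form Factor` and `Smartphone?` are assumptions inferred from the `Form.Factor` and `Smartphone.` columns used later in the analysis, and the two rows are placeholders rather than real records.

```r
# Write a tiny sample CSV to a temporary file
# (header names and rows are illustrative assumptions, not real records)
sample_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Manufacturer,Model,Form Factor,Smartphone?,Year,UnitsSold",
  "MakerA,Model X,Bar,False,2003,25.0",
  "MakerB,Model Y,Touchscreen,True,2014,22.4"
), sample_path)

# read.csv() makes headers syntactically valid by default (check.names = TRUE),
# so the space and the "?" are each replaced by "."
sample_data <- read.csv(sample_path, header = TRUE, sep = ",")
print(names(sample_data))

# "True"/"False" strings are converted to logical values on import
print(class(sample_data$Smartphone.))

# Stop early if an expected column is missing
expected_cols <- c("Manufacturer", "Model", "Form.Factor",
                   "Smartphone.", "Year", "UnitsSold")
stopifnot(all(expected_cols %in% names(sample_data)))
```

Checking the column names right after import makes a wrong path or an unexpected file layout fail immediately instead of partway through the charts.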
B-Perform Data Wrangling
We preprocess the collected data to handle missing values, outliers, and inconsistencies. This step ensures that the data is clean, organized, and ready for analysis.
Libraries used:
library(Hmisc)
library(openxlsx)
library(tidyverse)
library(dplyr)
library(ggplot2)
library(plotly) # For plot_ly() and ggplotly() interactive charts
library(here)
library(janitor)
library(skimr)
library(SimDesign)
library(readr)
library(RColorBrewer) # For color palettes
library(gridExtra)
library(ggrepel) # For repelling labels
library(htmlwidgets)
library(broom)
The data is already clean and ready for analysis:
# Inspect the column types of the data frame
str(data)
glimpse(data)
# Rename the column 'UnitsSold' to 'Unit Sold Per Million'
data <- data %>%
rename("Unit Sold Per Million" = "UnitsSold")
# Calculate the number of missing values for each column
missing_values_per_column <- colSums(is.na(data))
print(missing_values_per_column)
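The wrangling checks described above (missing values, inconsistencies such as duplicated rows, and outliers) can be sketched in base R as follows. The toy data frame is hypothetical, and the 1.5 × IQR rule is one common outlier convention, not necessarily the one used for this report.

```r
# Hypothetical toy data with one duplicated row, one missing value,
# and one extreme value
toy <- data.frame(
  Model     = c("A", "B", "C", "C", "D", "E", "F"),
  UnitsSold = c(10, 12, 11, 11, NA, 13, 250),
  stringsAsFactors = FALSE
)

# Missing values per column
missing_count <- colSums(is.na(toy))

# Number of fully duplicated rows
n_duplicates <- sum(duplicated(toy))

# Drop duplicated rows and rows with missing sales
clean <- toy[!duplicated(toy) & !is.na(toy$UnitsSold), ]

# Flag outliers with the 1.5 * IQR rule
q <- quantile(clean$UnitsSold, c(0.25, 0.75))
fence <- 1.5 * (q[[2]] - q[[1]])
outliers <- clean[clean$UnitsSold < q[[1]] - fence |
                  clean$UnitsSold > q[[2]] + fence, ]

print(missing_count)
print(n_duplicates)
print(outliers)
```

In a best-sellers list, a flagged value is often a genuine blockbuster model rather than an error, so flagged rows are better reviewed than dropped automatically.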
C-Perform Exploratory Data Analysis and Visualization (Using Interactive Charts)
For exploratory data analysis (EDA), we began by visualizing key metrics to uncover underlying patterns and trends. This involved creating various plots to analyze the distribution of data and relationships between variables. Tools like ggplot2 in R were used to generate insightful visualizations that highlighted sales trends, market shifts, and form factor popularity. These visualizations helped us better understand the dataset and guided further analysis.
Note: Most of these charts are interactive. Hover over a chart to see more details, and click an entry in the legend to filter the chart by that factor.
Full code:
# Aggregate the data by 'Manufacturer'
manufacturer_sales <- aggregate(data[["Unit Sold Per Million"]],
by = list(Manufacturer = data[["Manufacturer"]]),
sum)
# Rename columns of the aggregated data
colnames(manufacturer_sales) <- c("Manufacturer", "Unit Sold Per Million")
# Order the results by 'Unit Sold Per Million'
manufacturer_sales <- manufacturer_sales[order(-manufacturer_sales$`Unit Sold Per Million`), ]
# Print the results
print(manufacturer_sales)
# Plot the results
ggplot(manufacturer_sales, aes(x = reorder(Manufacturer, -`Unit Sold Per Million`),
y = `Unit Sold Per Million`)) +
geom_bar(stat = "identity", fill = "steelblue") +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
labs(title = "Total Units Sold by Manufacturer",
x = "Manufacturer",
y = "Units Sold (Million)")
# Calculate total units sold by manufacturer
manufacturer_sales <- data %>%
group_by(Manufacturer) %>%
summarise(Unit_Sold_Per_Million = sum(`Unit Sold Per Million`)) %>%
arrange(desc(Unit_Sold_Per_Million)) # Sort in descending order
# Calculate percentages
total_units <- sum(manufacturer_sales$Unit_Sold_Per_Million)
manufacturer_sales <- manufacturer_sales %>%
mutate(Percentage = Unit_Sold_Per_Million / total_units * 100)
# Define a color palette for fantasy colors
color_palette <- c(
'#FF6347', '#FF4500', '#FFD700', '#32CD32', '#4169E1', '#8A2BE2',
'#FF1493', '#00FA9A', '#D2691E', '#DC143C', '#B22222', '#4B0082',
'#7FFF00', '#00CED1', '#FF69B4', '#8B4513'
)
# Create interactive bar chart with Plotly
fig15 <- plot_ly(
data = manufacturer_sales,
x = ~Manufacturer,
y = ~Unit_Sold_Per_Million,
type = 'bar',
text = ~paste('Units Sold: ', Unit_Sold_Per_Million, '<br>Percentage: ', round(Percentage, 1), '%'),
textposition = 'outside',
textfont = list(size = 18, color = 'black'), # Increase text size and set color
marker = list(
color = color_palette[1:nrow(manufacturer_sales)] # Apply fantasy colors
),
hoverinfo = 'text', # Use text for hover information
color = ~Manufacturer # Ensure different colors for each manufacturer
) %>%
layout(
title = list(
text = 'Total Units Sold by Manufacturer',
font = list(size = 22, color = '#4A4A4A') # Title font size and color
),
xaxis = list(
title = 'Manufacturer',
tickangle = 45,
tickfont = list(size = 18) # X-axis tick font size
),
yaxis = list(
title = 'Units Sold (Million)',
tickfont = list(size = 18) # Y-axis tick font size
),
margin = list(b = 150), # Adjust bottom margin for x-axis labels
barmode = 'group',
hoverlabel = list(
bgcolor = 'white',
font = list(size = 20) # Font size in hover labels
),
legend = list(
title = list(text = 'Manufacturer'),
orientation = 'h',
x = 0.5,
xanchor = 'center',
y = -0.2, # Position legend below the plot
font = list(size = 18, color = 'black') # Increase font size and set color to black
)
) %>%
# Update x-axis to ensure correct order
layout(
xaxis = list(
categoryorder = 'total descending' # Sort categories based on total values
)
)
# Display the plot
fig15
# Aggregate the data by 'Year'
yearly_sales <- aggregate(data[["Unit Sold Per Million"]],
by = list(Year = data[["Year"]]),
sum)
# Rename columns of the aggregated data
colnames(yearly_sales) <- c("Year", "Total Units Sold (Million)")
# Order the results by 'Year'
yearly_sales <- yearly_sales[order(yearly_sales$Year), ]
# Print the results
print(yearly_sales)
# Plot the results
ggplot(yearly_sales, aes(x = Year, y = `Total Units Sold (Million)`)) +
geom_line(color = "red") +
geom_point() +
labs(title = "Total Units Sold by Year",
x = "Year",
y = "Units Sold (Million)")
# Aggregate the data by 'Year' and 'Manufacturer'
yearly_manufacturer_sales <- data %>%
group_by(Year, Manufacturer) %>%
summarise(Total_Units_Sold = sum(`Unit Sold Per Million`), .groups = 'drop')
# Plot the results using facet_wrap
ggplot(yearly_manufacturer_sales, aes(x = Year, y = Total_Units_Sold, color = Manufacturer, group = Manufacturer)) +
geom_line() +
geom_point() +
facet_wrap(~ Manufacturer) +
labs(title = "Total Units Sold by Manufacturer Over Years",
x = "Year",
y = "Units Sold (Million)",
color = "Manufacturer") +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
# Filter data for Apple Manufacturer
apple_data <- data %>%
filter(Manufacturer == "Apple")
# Aggregate the data by 'Year' and 'Model'
apple_model_sales <- apple_data %>%
group_by(Year, Model) %>%
summarise(Total_Units_Sold = sum(`Unit Sold Per Million`), .groups = 'drop')
# Plot the results with improved colors, line types, and labels
ggplot(apple_model_sales, aes(x = Year, y = Total_Units_Sold, color = Model, group = Model)) +
geom_line(size = 1) +
geom_point(size = 3) +
geom_text(aes(label = Total_Units_Sold), vjust = -0.5, size = 3, check_overlap = TRUE) +
labs(title = "Performance of Apple Models Over Years",
x = "Year",
y = "Units Sold (Million)",
color = "Model",
linetype = "Model") +
theme(axis.text.x = element_text(angle = 45, hjust = 1),
axis.title = element_text(size = 12),
legend.title = element_text(size = 10),
legend.text = element_text(size = 8)) +
scale_color_brewer(palette = "Set1") +
scale_linetype_manual(values = c("solid", "dashed", "dotted", "dotdash"))
# Filter data for the specified manufacturers
selected_manufacturers <- c("Google", "HTC", "LeTV", "Palm", "Research in Motion (RIM)")
filtered_data <- data %>%
filter(Manufacturer %in% selected_manufacturers)
# Ensure 'Year' is numeric
filtered_data$Year <- as.numeric(as.character(filtered_data$Year))
# Aggregate the data by 'Year' and 'Model'
manufacturer_model_sales <- filtered_data %>%
group_by(Year, Manufacturer, Model) %>%
summarise(Total_Units_Sold = sum(`Unit Sold Per Million`), .groups = 'drop')
# Plot the results using facet_wrap and include model names with adjusted text size and angle
ggplot(manufacturer_model_sales, aes(x = Year, y = Total_Units_Sold, color = Model, group = Model)) +
geom_line(size = 1) +
geom_point(size = 3) +
geom_text(aes(label = Model), size = 4, angle = 45, hjust = 1, vjust = 1.5, check_overlap = TRUE) + # Adjust text size and angle
facet_wrap(~ Manufacturer, scales = "free_y") + # Create a separate panel for each manufacturer
labs(title = "For One Time (Trend)",
x = "Year",
y = "Units Sold (Million)",
color = "Model") +
theme(axis.text.x = element_text(angle = 45, hjust = 1),
axis.title = element_text(size = 12),
legend.title = element_text(size = 10),
legend.text = element_text(size = 8),
strip.text = element_text(size = 12)) + # Adjust facet labels size if needed
scale_color_brewer(palette = "Set1") +
scale_x_continuous(limits = c(1995, 2025), breaks = seq(1995, 2025, by = 5))
# Filter data for the specified manufacturers
selected_manufacturers <- c("Huawei", "Oppo", "Sony Ericsson")
filtered_data <- data %>%
filter(Manufacturer %in% selected_manufacturers)
# Ensure 'Year' is numeric
filtered_data$Year <- as.numeric(as.character(filtered_data$Year))
# Aggregate the data by 'Year' and 'Model'
manufacturer_model_sales <- filtered_data %>%
group_by(Year, Manufacturer, Model) %>%
summarise(Total_Units_Sold = sum(`Unit Sold Per Million`), .groups = 'drop')
# Plot the results using facet_wrap and include model names with adjusted text size and angle
ggplot(manufacturer_model_sales, aes(x = Year, y = Total_Units_Sold, color = Model, group = Model)) +
geom_line(size = 1) +
geom_point(size = 3) +
geom_text(aes(label = Model), size = 4, angle = 45, hjust = 1, vjust = 1.5, check_overlap = TRUE) + # Adjust text size and angle
facet_wrap(~ Manufacturer, scales = "free_y") + # Create a separate panel for each manufacturer
labs(title = "Successful Short-Term Plan (< 5 Years)",
x = "Year",
y = "Units Sold (Million)",
color = "Model") +
theme(axis.text.x = element_text(angle = 45, hjust = 1),
axis.title = element_text(size = 12),
legend.title = element_text(size = 10),
legend.text = element_text(size = 8),
strip.text = element_text(size = 12)) + # Adjust facet labels size if needed
scale_color_brewer(palette = "Set1") +
scale_x_continuous(limits = c(1995, 2025), breaks = seq(1995, 2025, by = 5))
# Filter data for the specified manufacturers
selected_manufacturers <- c("Apple", "Nokia", "Samsung", "LG", "Motorola", "Xiaomi")
filtered_data <- data %>%
filter(Manufacturer %in% selected_manufacturers)
# Ensure 'Year' is numeric
filtered_data$Year <- as.numeric(as.character(filtered_data$Year))
# Aggregate the data by 'Year' and 'Model'
manufacturer_model_sales <- filtered_data %>%
group_by(Year, Manufacturer, Model) %>%
summarise(Total_Units_Sold = sum(`Unit Sold Per Million`), .groups = 'drop')
# Plot the results using facet_wrap without the legend
ggplot(manufacturer_model_sales, aes(x = Year, y = Total_Units_Sold, color = Model, group = Model)) +
geom_line(size = 1) +
geom_point(size = 2) + # Adjust point size if needed
facet_wrap(~ Manufacturer, scales = "free_y") + # Create a separate panel for each manufacturer
labs(title = "For Long Term Plan",
x = "Year",
y = "Units Sold (Million)") +
theme(axis.text.x = element_text(angle = 45, hjust = 1),
axis.title = element_text(size = 12),
legend.position = "none", # Hide the legend
strip.text = element_text(size = 12)) + # Adjust facet labels size if needed
scale_x_continuous(limits = c(1995, 2025), breaks = seq(1995, 2025, by = 5))
# Filter data for the specified manufacturers
selected_manufacturers <- c("Apple", "Nokia", "Samsung")
filtered_data <- data %>%
filter(Manufacturer %in% selected_manufacturers)
# Ensure 'Year' is numeric
filtered_data$Year <- as.numeric(as.character(filtered_data$Year))
# Aggregate the data by 'Year' and 'Model'
manufacturer_model_sales <- filtered_data %>%
group_by(Year, Manufacturer, Model) %>%
summarise(Total_Units_Sold = sum(`Unit Sold Per Million`), .groups = 'drop')
# Plot the results using facet_wrap without the legend
ggplot(manufacturer_model_sales, aes(x = Year, y = Total_Units_Sold, color = Model, group = Model)) +
geom_line(size = 1) +
geom_point(size = 2) + # Adjust point size if needed
facet_wrap(~ Manufacturer, scales = "free_y") + # Create a separate panel for each manufacturer
labs(title = "For The Best Long Term Plan",
x = "Year",
y = "Units Sold (Million)") +
theme(axis.text.x = element_text(angle = 45, hjust = 1),
axis.title = element_text(size = 12),
legend.position = "none", # Hide the legend
strip.text = element_text(size = 12)) + # Adjust facet labels size if needed
scale_x_continuous(limits = c(1995, 2025), breaks = seq(1995, 2025, by = 5))
# Aggregate data by Form Factor
form_factor_sales <- data %>%
group_by(`Form.Factor`) %>%
summarise(Total_Units_Sold = sum(`Unit Sold Per Million`), .groups = 'drop')
# Reorder the rows based on Total Units Sold
form_factor_sales <- form_factor_sales %>%
arrange(desc(Total_Units_Sold))
# Define a color palette from RColorBrewer
color_palette <- brewer.pal(n = length(unique(form_factor_sales$`Form.Factor`)), name = "Set3")
# Print the aggregated data
print(form_factor_sales)
# Plotting the results with enhanced styling and grid lines removed
ggplot(form_factor_sales, aes(x = reorder(`Form.Factor`, -Total_Units_Sold), y = Total_Units_Sold, fill = `Form.Factor`)) +
geom_bar(stat = "identity") +
scale_fill_manual(values = color_palette) + # Apply the color palette
theme_minimal(base_size = 14) +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 12, color = "black"),
axis.title = element_text(size = 14, color = "black"),
legend.title = element_text(size = 14, face = "bold", color = "black"),
legend.text = element_text(size = 12, color = "black"),
strip.text = element_text(size = 14, face = "bold", color = "black"),
panel.grid.major = element_blank(), # Remove major grid lines
panel.grid.minor = element_blank(), # Remove minor grid lines
plot.title = element_text(face = "bold", size = 16, color = "darkblue", hjust = 0.5), # Center the title
legend.background = element_rect(fill = "lightgray", color = "black")
) +
labs(
title = "Total Units Sold by Form Factor",
x = "Form Factor",
y = "Units Sold (Million)"
)
# Aggregate data by Form Factor
form_factor_sales <- data %>%
group_by(`Form.Factor`) %>%
summarise(Total_Units_Sold = sum(`Unit Sold Per Million`), .groups = 'drop')
# Reorder the rows based on Total Units Sold
form_factor_sales <- form_factor_sales %>%
arrange(desc(Total_Units_Sold))
# Calculate the percentage of each form factor
total_units <- sum(form_factor_sales$Total_Units_Sold)
form_factor_sales <- form_factor_sales %>%
mutate(Percentage = Total_Units_Sold / total_units * 100)
# Define a color palette from RColorBrewer
color_palette <- brewer.pal(n = length(unique(form_factor_sales$`Form.Factor`)), name = "Set3")
# Create the interactive bar chart with Plotly
fig14 <- plot_ly(
data = form_factor_sales,
x = ~reorder(`Form.Factor`, -Total_Units_Sold),
y = ~Total_Units_Sold,
type = 'bar',
color = ~`Form.Factor`,
colors = color_palette,
text = ~paste('Total Units Sold: ', Total_Units_Sold, '<br>Percentage: ', round(Percentage, 1), '%'),
textposition = 'outside',
textfont = list(color = 'black'), # Set font color for text on bars
hoverinfo = 'text', # Display text on hover
showlegend = TRUE
) %>%
layout(
title = 'Total Units Sold (Million) by Form Factor',
xaxis = list(
title = 'Form Factor',
tickangle = 45,
showgrid = FALSE
),
yaxis = list(
title = 'Units Sold (Million)',
showgrid = FALSE # Remove y-axis gridlines
),
margin = list(b = 120), # Adjust bottom margin for x-axis labels
legend = list(
title = list(text = 'Form Factor'), # The legend title is given as a list with a 'text' element
font = list(size = 14, color = 'black')
),
annotations = list(
list(
text = 'Hover over bars to see details',
x = 0.5,
y = -0.15,
xref = 'paper',
yref = 'paper',
showarrow = FALSE,
font = list(size = 16, color = 'black'),
align = 'center'
)
)
)
# Display the interactive plot
fig14
# Filter data for 'Bar' and 'TouchScreen' form factors
filtered_data <- data %>%
filter(`Form.Factor` %in% c("Bar", "Touchscreen"))
# Ensure 'Year' is numeric
filtered_data$Year <- as.numeric(as.character(filtered_data$Year))
# Aggregate the data by 'Year' and 'Form.Factor'
yearly_sales <- filtered_data %>%
group_by(Year, `Form.Factor`) %>%
summarise(Total_Units_Sold = sum(`Unit Sold Per Million`), .groups = 'drop')
# Plotting the results
ggplot(yearly_sales, aes(x = Year, y = Total_Units_Sold, color = `Form.Factor`, linetype = `Form.Factor`)) +
geom_line(size = 1) + # Line for each form factor
geom_point(size = 3) + # Points on the line
scale_color_manual(values = c("Bar" = "blue", "Touchscreen" = "red")) + # Custom colors for each form factor
scale_linetype_manual(values = c("Bar" = "solid", "Touchscreen" = "dashed")) + # Custom line types
theme_minimal(base_size = 14) +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 12, color = "black"),
axis.title = element_text(size = 14, color = "black"),
legend.title = element_text(size = 14, face = "bold", color = "black"),
legend.text = element_text(size = 12, color = "black"),
plot.title = element_text(face = "bold", size = 16, color = "darkblue", hjust = 0.5)
) +
labs(
title = "Yearly Units Sold for Bar and TouchScreen Form Factors",
x = "Year",
y = "Total Units Sold (Million)",
color = "Form Factor",
linetype = "Form Factor"
)
# Filter data for 'Bar' and 'TouchScreen' form factors
filtered_data <- data %>%
filter(`Form.Factor` %in% c("Bar", "Touchscreen"))
# Ensure 'Year' is numeric
filtered_data$Year <- as.numeric(as.character(filtered_data$Year))
# Aggregate the data by 'Year' and 'Form.Factor'
yearly_sales <- filtered_data %>%
group_by(Year, `Form.Factor`) %>%
summarise(Total_Units_Sold = sum(`Unit Sold Per Million`), .groups = 'drop')
# Create a ggplot object
p <- ggplot(yearly_sales, aes(x = Year, y = Total_Units_Sold, color = `Form.Factor`, linetype = `Form.Factor`)) +
geom_line(size = 1) + # Line for each form factor
geom_point(size = 3) + # Points on the line
scale_color_manual(values = c("Bar" = "blue", "Touchscreen" = "red")) + # Custom colors for each form factor
scale_linetype_manual(values = c("Bar" = "solid", "Touchscreen" = "dashed")) + # Custom line types
theme_minimal(base_size = 14) +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 12, color = "black"),
axis.title = element_text(size = 14, color = "black"),
legend.title = element_text(size = 14, face = "bold", color = "black"),
legend.text = element_text(size = 12, color = "black"),
plot.title = element_text(face = "bold", size = 16, color = "darkblue", hjust = 0.5),
panel.grid.major = element_blank(), # Remove major grid lines
panel.grid.minor = element_blank(), # Remove minor grid lines
panel.background = element_rect(fill = "whitesmoke") # Set background color
) +
labs(
title = "Yearly Units Sold for Bar and TouchScreen Form Factors",
x = "Year",
y = "Total Units Sold (Million)",
color = "Form Factor",
linetype = "Form Factor"
)
# Convert ggplot object to a plotly object for interactivity
fig13 <- ggplotly(p)
# Display the interactive plot
fig13
# Filter data to include both smartphones and non-smartphones
filtered_data <- data %>%
filter(!is.na(Smartphone.))
# Ensure 'Year' is numeric
filtered_data$Year <- as.numeric(as.character(filtered_data$Year))
# Convert 'Smartphone.' to a factor with labels
filtered_data$Smartphone. <- factor(filtered_data$Smartphone., levels = c(FALSE, TRUE), labels = c("Non-Smartphone", "Smartphone"))
# Perform t-test
t_test_result <- t.test(`Unit Sold Per Million` ~ Smartphone., data = filtered_data)
# Display t-test results
print(t_test_result)
# Create interactive box plot with Plotly
fig12 <- plot_ly(
data = filtered_data,
x = ~Smartphone.,
y = ~`Unit Sold Per Million`,
type = 'box',
color = ~Smartphone.,
colors = c("Non-Smartphone" = "#FF6347", "Smartphone" = "#4682B4"), # Tomato and Steel Blue colors
boxmean = TRUE, # Show mean
text = ~paste('Units Sold: ', `Unit Sold Per Million`, '<br>Percentage: ', round(`Unit Sold Per Million` / sum(`Unit Sold Per Million`) * 100, 1), '%'),
hoverinfo = 'x+y+text'
) %>%
layout(
title = 'Units Sold by Smartphone Status',
xaxis = list(
title = 'Smartphone Status',
tickvals = c("Non-Smartphone", "Smartphone"),
ticktext = c("Non-Smartphone", "Smartphone")
),
yaxis = list(
title = 'Units Sold (Million)',
zeroline = FALSE
),
plot_bgcolor = 'lightgray', # Light gray plot background
paper_bgcolor = 'white', # White paper background
font = list(
family = "Arial, sans-serif",
size = 14,
color = "black"
),
boxmode = 'group'
)
# Display the plot
fig12
# Filter data to include both smartphones and non-smartphones
filtered_data <- data %>%
filter(!is.na(Smartphone.))
# Ensure 'Year' is numeric
filtered_data$Year <- as.numeric(as.character(filtered_data$Year))
# Aggregate the data by year and smartphone status
yearly_sales <- filtered_data %>%
group_by(Year, Smartphone.) %>%
summarise(Total_Units_Sold = sum(`Unit Sold Per Million`), .groups = 'drop')
# Convert Smartphone status to factor for better plot labeling
yearly_sales$Smartphone. <- factor(yearly_sales$Smartphone., labels = c("Non-Smartphone", "Smartphone"))
# Create ggplot object
p <- ggplot(yearly_sales, aes(x = Year, y = Total_Units_Sold, color = Smartphone., linetype = Smartphone.)) +
geom_line(size = 1.2) + # Line for each form factor
geom_point(size = 3) + # Points on the line
scale_color_manual(values = c("Non-Smartphone" = "#FF6347", "Smartphone" = "#4682B4")) + # Tomato and Steel Blue colors
scale_linetype_manual(values = c("Non-Smartphone" = "solid", "Smartphone" = "dashed")) + # Custom line types
labs(
title = "Trends in Units Sold Over Years by Smartphone Status",
x = "Year",
y = "Total Units Sold (Million)",
color = "Smartphone Status",
linetype = "Smartphone Status"
) +
theme_minimal(base_size = 16) +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 14, color = "black"),
axis.text.y = element_text(size = 14, color = "black"),
axis.title = element_text(size = 16, color = "black"),
plot.title = element_text(face = "bold", size = 18, color = "darkblue", hjust = 0.5),
legend.title = element_text(size = 14, face = "bold", color = "black"),
legend.text = element_text(size = 12, color = "black"),
panel.grid.major = element_blank(), # Remove major gridlines
panel.grid.minor = element_blank(), # Remove minor gridlines
panel.background = element_rect(fill = "white", color = NA), # Simple white background
plot.background = element_rect(fill = "lightgray", color = NA) # Light gray plot background
)
# Convert ggplot object to plotly for interactivity
fig11 <- ggplotly(p)
# Display the interactive plot
fig11
# Filter data for the selected manufacturers
selected_manufacturers <- c("Apple", "Nokia", "Samsung")
filtered_data <- data %>%
filter(Manufacturer %in% selected_manufacturers)
# Ensure 'Year' is numeric
filtered_data$Year <- as.numeric(as.character(filtered_data$Year))
# Aggregate data by Manufacturer, Smartphone status, and Form Factor
aggregated_data <- filtered_data %>%
group_by(Manufacturer, Smartphone., Form.Factor) %>%
summarise(Total_Units_Sold = sum(`Unit Sold Per Million`), .groups = 'drop')
# Convert Smartphone status to a factor for better labeling
aggregated_data$Smartphone. <- factor(aggregated_data$Smartphone., labels = c("Non-Smartphone", "Smartphone"))
# Plotting the results
p <- ggplot(aggregated_data, aes(x = Manufacturer, y = Total_Units_Sold, fill = interaction(Smartphone., Form.Factor))) +
geom_bar(stat = "identity", position = "dodge") +
labs(
title = "Total Units Sold by Manufacturer, Smartphone Status, and Form Factor",
x = "Manufacturer",
y = "Total Units Sold (Million)",
fill = "Smartphone Status and Form Factor"
) +
scale_fill_manual(values = c("Smartphone.Bar" = "#FF6347", "Smartphone.Touchscreen" = "#4682B4", "Non-Smartphone.Bar" = "#32CD32", "Non-Smartphone.Touchscreen" = "#FFD700")) + # Custom colors
theme_minimal(base_size = 16) +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 14, color = "black"),
axis.text.y = element_text(size = 14, color = "black"),
axis.title = element_text(size = 16, color = "black"),
plot.title = element_text(face = "bold", size = 18, color = "darkblue", hjust = 0.5),
legend.title = element_text(size = 14, face = "bold", color = "black"),
legend.text = element_text(size = 12, color = "black"),
panel.grid.major = element_blank(), # Remove major gridlines
panel.grid.minor = element_blank(), # Remove minor gridlines
panel.background = element_rect(fill = "white", color = NA), # Simple white background
plot.background = element_rect(fill = "lightgray", color = NA) # Light gray plot background
)
# Convert ggplot object to plotly for interactivity
fig10 <- ggplotly(p)
# Display the interactive plot
fig10
1-Total Units Sold Per Manufacturer
- Bar Chart: Total Units Sold (Million) by Manufacturer
- Sales Distribution: Displays total units sold, in millions, for each manufacturer.
- Comparison: Highlights which manufacturers have the highest sales figures.
- Performance Trends: Shows variations in sales performance among manufacturers.
- Data Summary: Aggregated sales data helps identify top-performing manufacturers.
# Load data from Google Drive
url <- "https://drive.google.com/uc?id=1W9UYyKDAV3UZWPppYs7N8R9ne5ws5iZ3&export=download"
data <- read.csv(url)
# Convert 'Year' column to numeric (just in case it's needed later)
data$Year <- as.numeric(as.character(data$Year))
# Aggregate the data by 'Manufacturer'
manufacturer_sales <- aggregate(data[["UnitsSold"]],
by = list(Manufacturer = data[["Manufacturer"]]),
sum)
# Rename columns of the aggregated data
colnames(manufacturer_sales) <- c("Manufacturer", "Unit_Sold_Per_Million")
# Replace "Research in Motion (RIM)" with "RIM" before converting to a factor;
# assigning a label that is not an existing factor level would produce NA
manufacturer_sales$Manufacturer <- as.character(manufacturer_sales$Manufacturer)
manufacturer_sales$Manufacturer[manufacturer_sales$Manufacturer == "Research in Motion (RIM)"] <- "RIM"
# Reorder the Manufacturer factor levels based on Unit_Sold_Per_Million
manufacturer_sales$Manufacturer <- factor(
manufacturer_sales$Manufacturer,
levels = manufacturer_sales$Manufacturer[order(-manufacturer_sales$Unit_Sold_Per_Million)]
)
# Print the results
print(manufacturer_sales)
## Manufacturer Unit_Sold_Per_Million
## 1 Apple 1669.3
## 2 Google 2.1
## 3 HTC 16.0
## 4 Huawei 113.8
## 5 LeTV 3.0
## 6 LG 92.0
## 7 Motorola 323.0
## 8 Nokia 2374.5
## 9 Oppo 16.7
## 10 Palm 2.0
## 11 RIM 15.0
## 12 Samsung 994.5
## 13 Sony Ericsson 45.0
## 14 Xiaomi 99.1
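# Build the bar chart (assumed reconstruction: the chunk that defines p2 is not shown in the report)
p2 <- ggplot(manufacturer_sales, aes(x = Manufacturer, y = Unit_Sold_Per_Million)) +
geom_bar(stat = "identity", fill = "steelblue") +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
labs(title = "Total Units Sold by Manufacturer", x = "Manufacturer", y = "Units Sold (Million)")
# Convert ggplot to an interactive plot using plotly
fig1 <- ggplotly(p2, width = 900, height = 600)
# Display the interactive plot
fig1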
Observations:
- Bar Chart: Total Units Sold (Million) by Manufacturer
- Top Performers: Nokia leads with 2,374.5 million units sold, followed by Apple with 1,669.3 million.
- Mid-Tier: Samsung, Motorola, and Huawei show moderate sales, at 994.5, 323.0, and 113.8 million units respectively.
- Lower Performers: Xiaomi (99.1 million), LG (92.0 million), Sony Ericsson (45.0 million), and Oppo (16.7 million) have significantly lower sales figures, indicating lower market impact.
- Trend Analysis: A clear gap exists between the top manufacturers and those with lower sales, suggesting a market concentrated among a few key players.
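The concentration noted above can be quantified directly from the printed totals. The sketch below copies those totals (in millions of units) into a named vector and computes each manufacturer's share; the 15.0 total is labeled RIM, following the rename step in the code, and the variable names are illustrative.

```r
# Totals in millions of units, copied from the printed table above
manufacturer_totals <- c(
  Apple = 1669.3, Google = 2.1, HTC = 16.0, Huawei = 113.8, LeTV = 3.0,
  LG = 92.0, Motorola = 323.0, Nokia = 2374.5, Oppo = 16.7, Palm = 2.0,
  RIM = 15.0, Samsung = 994.5, `Sony Ericsson` = 45.0, Xiaomi = 99.1
)

# Overall total and each manufacturer's percentage share
total_units <- sum(manufacturer_totals)
share_pct <- round(100 * manufacturer_totals / total_units, 1)

# Combined share of the three largest manufacturers
top3 <- sort(manufacturer_totals, decreasing = TRUE)[1:3]
top3_share <- round(100 * sum(top3) / total_units, 1)

print(sort(share_pct, decreasing = TRUE))
print(top3_share)
```

On these figures, Nokia, Apple, and Samsung together account for about 87% of the 5,766 million units in the table, with Nokia alone at roughly 41%.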
2-Total Units Sold Over Years
- Line Chart: Total Units Sold Per Year
- Trend: Identifies overall increase or decrease in sales.
- Growth: Highlights significant growth or decline periods.
- Peaks and Troughs: Notes years with highest and lowest sales.
- Stability: Shows if sales are stable or variable over time.
# Aggregate the data by 'Year'
yearly_total_sales <- data %>%
group_by(Year) %>%
summarise(Total_Units_Sold = sum(UnitsSold), .groups = 'drop')
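# Build the line chart (assumed reconstruction: the chunk that defines p2 is not shown in the report)
p2 <- ggplot(yearly_total_sales, aes(x = Year, y = Total_Units_Sold)) +
geom_line(color = "red") +
geom_point() +
labs(title = "Total Units Sold by Year", x = "Year", y = "Units Sold (Million)")
# Convert ggplot to an interactive plot using plotly
fig2 <- ggplotly(p2, width = 900, height = 600)
# Display the interactive plot
fig2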
Observations:
- Line Chart: Total Units Sold Per Year
- Trend: Shows fluctuating sales over the years with notable peaks and declines.
- Growth: Significant increases observed in 2003 and 2005, with sales falling to their lowest levels in 2021 and 2022.
- Peaks and Troughs: Peaks in 2003 (543.0 million) and 2005 (497.5 million); troughs in 2021 (13.5 million) and 2022 (10.9 million).
- Stability: Recent years show a drastic decline in sales, suggesting potential market or industry challenges.
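The peak and trough years can be picked out programmatically by ranking the yearly totals. The sketch below uses only the four yearly values quoted in the observations above; the remaining years are left out because their totals are not stated in the text, and the variable names are illustrative.

```r
# Yearly totals (millions) for the four years cited in the observations;
# other years are omitted because their values are not given in the text
cited_years <- data.frame(
  Year             = c(2003, 2005, 2021, 2022),
  Total_Units_Sold = c(543.0, 497.5, 13.5, 10.9)
)

# Rank years from highest to lowest total
ranked <- cited_years[order(-cited_years$Total_Units_Sold), ]

# Two highest and two lowest years
peak_years   <- head(ranked$Year, 2)
trough_years <- tail(ranked$Year, 2)

# How many times larger the highest total is than the lowest
peak_to_trough <- max(cited_years$Total_Units_Sold) / min(cited_years$Total_Units_Sold)

print(peak_years)
print(trough_years)
print(round(peak_to_trough, 1))
```

Applying the same ranking to the full `yearly_total_sales` data frame would cover every year in the dataset.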
3- Units Sold by each Manufacturer Over Years
- Line Chart: Total Units Sold by Manufacturer Models Over Years
- Individual Manufacturer Trends: Each facet shows the sales trend for a specific manufacturer, illustrating their performance over the years.
- Sales Patterns: Identifies if a manufacturer has consistent sales, growth, or decline, and how these patterns vary across different periods.
- Model Performance: Reveals if certain manufacturers have many models achieving high ranks and how these models perform over time.
- Strategic Insights: Provides insights into each manufacturer’s market strategy and plan based on their sales trends and model performance.
# Filter data for the specified manufacturers
selected_manufacturers <- c("Google", "HTC", "LeTV", "Palm", "Research in Motion (RIM)")
filtered_data <- data %>%
filter(Manufacturer %in% selected_manufacturers)
# Ensure 'Year' is numeric
filtered_data$Year <- as.numeric(as.character(filtered_data$Year))
# Aggregate the data by 'Year' and 'Model'
manufacturer_model_sales <- filtered_data %>%
group_by(Year, Manufacturer, Model) %>%
summarise(Total_Units_Sold = sum(UnitsSold), .groups = 'drop') # 'data' was reloaded above, so the column is still named UnitsSold
# Plot the results using facet_wrap and include model names with adjusted text size and angle
ggplot(manufacturer_model_sales, aes(x = Year, y = Total_Units_Sold, color = Model, group = Model)) +
geom_line(size = 1) +
geom_point(size = 3) +
geom_text(aes(label = Model), size = 4, angle = 45, hjust = 1, vjust = 1.5, check_overlap = TRUE) + # Adjust text size and angle
facet_wrap(~ Manufacturer, scales = "free_y") + # Create a separate panel for each manufacturer
labs(title = "For One Time (Trend)",
x = "Year",
y = "Units Sold (Million)",
color = "Model") +
theme(axis.text.x = element_text(angle = 45, hjust = 1),
axis.title = element_text(size = 12),
legend.title = element_text(size = 10),
legend.text = element_text(size = 8),
strip.text = element_text(size = 12)) + # Adjust facet labels size if needed
scale_color_brewer(palette = "Set1") +
scale_x_continuous(limits = c(1995, 2025), breaks = seq(1995, 2025, by = 5))
Observations:
- Line Chart: Total Units Sold by Manufacturer Over Years
- Manufacturer Trends: Each manufacturer shows distinct sales patterns over time.
- Peak Sales: Nokia had significant sales peaks in 2003, 2004, and 2005. Apple saw notable increases in 2013 and 2014.
- Decline: Recent years show declining sales for Samsung, from 2020 to 2024, while Apple’s sales remain relatively stable.
- Emerging Trends: Newer manufacturers like Xiaomi and Huawei have shown increasing sales in recent years, indicating a shift in market dynamics.
We will now classify and decompose market strategies for all manufacturers into four distinct groups:
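- One-Time Trend: Manufacturers with only one model achieving top rank, showing a one-time trend in their sales performance.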
- Short-Term Plan: Manufacturers with multiple models achieving top ranks over a period of less than 5 years, indicating a short-term strategic focus.
- Long-Term Plan: Manufacturers with multiple models achieving top ranks over more than 5 years, reflecting a long-term strategic approach.
- Best Market Strategy: Manufacturers demonstrating consistent top performance with multiple models over an extended period, showcasing the best market strategy.
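The grouping rules above can be sketched in a few lines of base R. This is a minimal illustration, not the code used to produce the report: the `phones` data frame and the `classify_strategy` helper are hypothetical, and treating a span of exactly 5 years as long-term is an assumption, since the definitions only distinguish "less than" and "more than" 5 years. The Best Market Strategy group is left out because no numeric rule is given for it.

```r
# Hypothetical toy data: one row per top-selling model (not the real dataset).
phones <- data.frame(
  Manufacturer = c("Palm", "Huawei", "Huawei", "Nokia", "Nokia", "Nokia"),
  Model        = c("P1", "H1", "H2", "N1", "N2", "N3"),
  Year         = c(2005, 2016, 2019, 1999, 2003, 2009),
  stringsAsFactors = FALSE
)

classify_strategy <- function(df) {
  # Number of ranked models and span of release years per manufacturer
  n_models <- tapply(df$Model, df$Manufacturer, length)
  span     <- tapply(df$Year, df$Manufacturer, function(y) max(y) - min(y))
  # Assumption: a span of exactly 5 years counts as long-term
  group <- ifelse(n_models == 1, "One-Time Trend",
                  ifelse(span < 5, "Short-Term Plan", "Long-Term Plan"))
  data.frame(
    Manufacturer = names(n_models),
    Models       = as.integer(n_models),
    Span_Years   = as.numeric(span),
    Group        = as.vector(group),
    stringsAsFactors = FALSE
  )
}

print(classify_strategy(phones))
```

Applied to the real data frame, the same helper would only need the `Manufacturer`, `Model`, and `Year` columns.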
4- One-Time Trend
Manufacturers with only one model achieving top rank, showing a one-time trend in their sales performance.
Show Code
# Filter data for the specified manufacturers
selected_manufacturers <- c("Google", "HTC", "LeTV", "Palm", "Research in Motion (RIM)")
filtered_data <- data %>%
filter(Manufacturer %in% selected_manufacturers)
# Ensure 'Year' is numeric
filtered_data$Year <- as.numeric(as.character(filtered_data$Year))
# Aggregate the data by 'Year', 'Manufacturer', and 'Model'
manufacturer_model_sales <- filtered_data %>%
group_by(Year, Manufacturer, Model) %>%
summarise(Total_Units_Sold = sum(`Unit Sold Per Million`), .groups = 'drop')
# Plot one panel per manufacturer and label each point with its model name
ggplot(manufacturer_model_sales, aes(x = Year, y = Total_Units_Sold, color = Model, group = Model)) +
geom_line(linewidth = 1) + # 'linewidth' replaces the deprecated 'size' for lines (ggplot2 >= 3.4.0)
geom_point(size = 3) +
geom_text(aes(label = Model), size = 4, angle = 45, hjust = 1, vjust = 1.5, check_overlap = TRUE) + # Adjust text size and angle
facet_wrap(~ Manufacturer, scales = "free_y") + # Create a separate panel for each manufacturer
labs(title = "For One Time (Trend)",
x = "Year",
y = "Units Sold (Million)",
color = "Model") +
theme(axis.text.x = element_text(angle = 45, hjust = 1),
axis.title = element_text(size = 12),
legend.title = element_text(size = 10),
legend.text = element_text(size = 8),
strip.text = element_text(size = 12)) + # Adjust facet labels size if needed
scale_color_brewer(palette = "Set1") +
scale_x_continuous(limits = c(1995, 2025), breaks = seq(1995, 2025, by = 5))
Observations:
- Manufacturers with One Model: Each entry represents a manufacturer that had a model with notable sales during a specific year, but no other models achieved top rank.
- Limited Longevity: These models had a brief period of success, suggesting a short-lived market impact.
- Yearly Sales: The models listed achieved top sales in a particular year but did not sustain this level of performance in subsequent years.
- Strategic Insight: Indicates that these manufacturers may have had a brief strategic push or successful product launch, without a sustained long-term strategy for multiple top-ranked models.
5- Short Term Plan
Classification: Short-Term Plan - This data highlights manufacturers that achieved top sales with multiple models over a period of less than 5 years. Specifically:
Show Code
# Filter data for the specified manufacturers
selected_manufacturers <- c("Huawei", "Oppo", "Sony Ericsson")
filtered_data <- data %>%
filter(Manufacturer %in% selected_manufacturers)
# Ensure 'Year' is numeric
filtered_data$Year <- as.numeric(as.character(filtered_data$Year))
# Aggregate the data by 'Year', 'Manufacturer', and 'Model'
manufacturer_model_sales <- filtered_data %>%
group_by(Year, Manufacturer, Model) %>%
summarise(Total_Units_Sold = sum(`Unit Sold Per Million`), .groups = 'drop')
# Plot the results using facet_wrap without the legend
ggplot(manufacturer_model_sales, aes(x = Year, y = Total_Units_Sold, color = Model, group = Model)) +
geom_line(linewidth = 1) + # 'linewidth' replaces the deprecated 'size' for lines (ggplot2 >= 3.4.0)
geom_point(size = 2) + # Adjust point size if needed
facet_wrap(~ Manufacturer, scales = "free_y") + # Create a separate panel for each manufacturer
labs(title = "Succeeded Short-Term Plan (< 5 Years)",
x = "Year",
y = "Units Sold (Million)") +
theme(axis.text.x = element_text(angle = 45, hjust = 1),
axis.title = element_text(size = 12),
legend.position = "none", # Hide the legend
strip.text = element_text(size = 12)) + # Adjust facet labels size if needed
scale_x_continuous(limits = c(1995, 2025), breaks = seq(1995, 2025, by = 5))
Observations:
- Multiple Top Models: Sony Ericsson and Huawei each had several models achieving top sales within a brief period.
- Timeframe: The successful models from Sony Ericsson span from 2004 to 2006, and Huawei’s models span from 2016 to 2019.
- Short-Term Success: This indicates a focused strategic effort to capitalize on market trends and rapidly achieve top sales within a short timeframe.
- Market Strategy: Demonstrates a strategic push with multiple successful models in a short period, reflecting aggressive marketing or innovation strategies.
6- Long Term Plan
Classification: Long-Term Plan - This data highlights manufacturers that achieved top sales with multiple models over a period of more than 5 years. Specifically:
Show Code
# Filter data for the specified manufacturers
selected_manufacturers <- c("Apple", "Nokia", "Samsung", "LG", "Motorola", "Xiaomi")
filtered_data <- data %>%
filter(Manufacturer %in% selected_manufacturers)
# Ensure 'Year' is numeric
filtered_data$Year <- as.numeric(as.character(filtered_data$Year))
# Aggregate the data by 'Year', 'Manufacturer', and 'Model'
manufacturer_model_sales <- filtered_data %>%
group_by(Year, Manufacturer, Model) %>%
summarise(Total_Units_Sold = sum(`Unit Sold Per Million`), .groups = 'drop')
# Plot the results using facet_wrap without the legend
ggplot(manufacturer_model_sales, aes(x = Year, y = Total_Units_Sold, color = Model, group = Model)) +
geom_line(linewidth = 1) + # 'linewidth' replaces the deprecated 'size' for lines (ggplot2 >= 3.4.0)
geom_point(size = 2) + # Adjust point size if needed
facet_wrap(~ Manufacturer, scales = "free_y") + # Create a separate panel for each manufacturer
labs(title = "For Long Term Plan",
x = "Year",
y = "Units Sold (Million)") +
theme(axis.text.x = element_text(angle = 45, hjust = 1),
axis.title = element_text(size = 12),
legend.position = "none", # Hide the legend
strip.text = element_text(size = 12)) + # Adjust facet labels size if needed
scale_x_continuous(limits = c(1995, 2025), breaks = seq(1995, 2025, by = 5))
Observations:
- Extended Success: Motorola, Nokia, Apple, and Samsung show multiple models with top sales over an extended period, reflecting a sustained market presence.
- Long-Term Timeframe: These manufacturers maintained high sales performance with various models across several years, indicating effective long-term strategies.
- Consistent Market Leadership: Continuous top rankings suggest robust and evolving product strategies, adapting to market changes while maintaining competitive performance.
- Strategic Advantage: Demonstrates a successful long-term market strategy, balancing innovation and market demand to ensure ongoing success across multiple product releases.
7- Best Market Strategy
Classification: Best Market Strategy - This data highlights Apple, Nokia, and Samsung as the manufacturers with the most effective market strategies, achieving top sales with multiple models over an extended period. Specifically:
Show Code
# Filter data for the specified manufacturers
selected_manufacturers <- c("Apple", "Nokia", "Samsung")
filtered_data <- data %>%
filter(Manufacturer %in% selected_manufacturers)
# Ensure 'Year' is numeric
filtered_data$Year <- as.numeric(as.character(filtered_data$Year))
# Aggregate the data by 'Year' and 'Model'
manufacturer_model_sales <- filtered_data %>%
group_by(Year, Manufacturer, Model) %>%
summarise(Total_Units_Sold = sum(`Unit Sold Per Million`), .groups = 'drop')
# Plot the results using facet_wrap without the legend
ggplot(manufacturer_model_sales, aes(x = Year, y = Total_Units_Sold, color = Model, group = Model)) +
geom_line(linewidth = 1) + # 'linewidth' replaces the deprecated 'size' for lines (ggplot2 >= 3.4.0)
geom_point(size = 2) + # Adjust point size if needed
facet_wrap(~ Manufacturer, scales = "free_y") + # Create a separate panel for each manufacturer
labs(title = "For The Best Long Term Plan",
x = "Year",
y = "Units Sold (Million)") +
theme(axis.text.x = element_text(angle = 45, hjust = 1),
axis.title = element_text(size = 12),
legend.position = "none", # Hide the legend
strip.text = element_text(size = 12)) + # Adjust facet labels size if needed
scale_x_continuous(limits = c(1995, 2025), breaks = seq(1995, 2025, by = 5))
Observations:
- Nokia: Maintained high sales with a diverse range of models, including the Nokia 3210, 3310, and 1100, among others, demonstrating a strong market presence across different years.
- Samsung: Showcased ongoing market leadership with various successful models such as the Galaxy S series and Galaxy A series, illustrating a sustained ability to adapt and capture market demand.
- Long-Term Success: Both brands have demonstrated a strategic ability to leverage multiple top-selling models over several years, securing a leading position in the market through innovation and consistent performance.
- Best Market Strategy: Apple consistently achieves top sales with long-term success through innovative models and regular updates, including iPhone 4, iPhone 6 & 6 Plus, iPhone 7 & 7 Plus, and iPhone 12 series.
Let’s examine why Nokia’s sales dropped and how Apple and Samsung managed to stay stable. This will help identify strategies to improve market stability for other manufacturers.
8-Total Units Sold by Form Factor
This plot examines Total Units Sold by Form Factor to determine if different form factors impact sales performance.
Show Code
library(RColorBrewer) # Make sure to include RColorBrewer for color palettes
# Aggregate data by Form Factor
form_factor_sales <- data %>%
group_by(`Form.Factor`) %>%
summarise(Total_Units_Sold = sum(UnitsSold), .groups = 'drop')
# Reorder the rows based on Total Units Sold
form_factor_sales <- form_factor_sales %>%
arrange(desc(Total_Units_Sold))
# Calculate the percentage of each form factor
total_units <- sum(form_factor_sales$Total_Units_Sold)
form_factor_sales <- form_factor_sales %>%
mutate(Percentage = Total_Units_Sold / total_units * 100)
fig8
Observations:
- Touchscreen phones have the highest total units sold, significantly outperforming other form factors.
- Bar phones follow as the second most sold form factor.
- Flip phones, sliders, keyboard bars, and taco phones show considerably lower sales figures.
- The dominance of touchscreen phones suggests a strong consumer preference for this form factor.
9-Yearly units sold for Bar and Touchscreen
Let’s dive deeper to examine how the sales of bar and touchscreen phones have evolved over the years to better understand their impact on market trends.
Show Code
# Filter data for 'Bar' and 'Touchscreen' form factors
filtered_data <- data %>%
filter(`Form.Factor` %in% c("Bar", "Touchscreen"))
# Ensure 'Year' is numeric
filtered_data$Year <- as.numeric(as.character(filtered_data$Year))
# Aggregate the data by 'Year' and 'Form.Factor'
yearly_sales <- filtered_data %>%
group_by(Year, `Form.Factor`) %>%
summarise(Total_Units_Sold = sum(UnitsSold), .groups = 'drop')
fig9
Observations:
- Bar Form Factor Trends:
  - Significant increase in sales from 1999 to 2007.
  - Sales peaked in 2005, then declined sharply after 2007.
  - Failed to achieve high sales in later years.
- Touchscreen Form Factor Trends:
  - Steady rise in sales from 2008 to 2019.
  - Peak in 2019, followed by a gradual decline.
  - Consistently high sales figures compared to other form factors.
- Market Impact:
  - Nokia’s primary focus on bar phones aligns with the decline in bar phone sales, contributing to its decreased market performance.
  - Apple’s exclusive focus on touchscreens supports its sustained success and alignment with consumer preferences.
  - Samsung’s diverse portfolio across form factors allowed it to adapt and remain strong in the market.
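Peak years like those cited above can be read off the aggregated table rather than the plot. The following is a minimal base R sketch under stated assumptions: the `yearly_sales` values are made-up numbers shaped like the aggregated table in this section (only the column names are taken from the report), and `peak_by_form` is a hypothetical name.

```r
# Hypothetical yearly totals in millions; NOT the real dataset.
yearly_sales <- data.frame(
  Year             = c(2003, 2005, 2007, 2009, 2015, 2019, 2021),
  Form.Factor      = c("Bar", "Bar", "Bar",
                       "Touchscreen", "Touchscreen", "Touchscreen", "Touchscreen"),
  Total_Units_Sold = c(150, 400, 120, 60, 250, 440, 300),
  stringsAsFactors = FALSE
)

# For each form factor, keep the row with the largest yearly total.
peak_by_form <- do.call(rbind, lapply(
  split(yearly_sales, yearly_sales$Form.Factor),
  function(d) d[which.max(d$Total_Units_Sold), ]
))
rownames(peak_by_form) <- NULL

print(peak_by_form)
```

Running the same steps on the real `yearly_sales` table would give the peak year and peak volume for each form factor directly.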
10-Trend in Units Sold Over Years by Smartphone Status
Next, we’ll explore the trends in units sold over the years, categorized by smartphone status, and test whether mean units sold differ between the two groups.
Show Code
# Filter data to include both smartphones and non-smartphones
filtered_data <- data %>%
filter(!is.na(Smartphone.))
# Ensure 'Year' is numeric
filtered_data$Year <- as.numeric(as.character(filtered_data$Year))
# Convert 'Smartphone.' to a factor with labels
filtered_data$Smartphone. <- factor(filtered_data$Smartphone., levels = c(FALSE, TRUE), labels = c("Non-Smartphone", "Smartphone"))
# Perform t-test
t_test_result <- t.test(UnitsSold ~ Smartphone., data = filtered_data)
# Display t-test results in R Markdown
print(t_test_result)
##
## Welch Two Sample t-test
##
## data: UnitsSold by Smartphone.
## t = 2.9613, df = 49.125, p-value = 0.004708
## alternative hypothesis: true difference in means between group Non-Smartphone and group Smartphone is not equal to 0
## 95 percent confidence interval:
## 12.17153 63.55956
## sample estimates:
## mean in group Non-Smartphone mean in group Smartphone
## 74.87143 37.00588
# Filter data to include both smartphones and non-smartphones
filtered_data <- data %>%
filter(!is.na(Smartphone.))
# Ensure 'Year' is numeric
filtered_data$Year <- as.numeric(as.character(filtered_data$Year))
# Aggregate the data by year and smartphone status
yearly_sales <- filtered_data %>%
group_by(Year, Smartphone.) %>%
summarise(Total_Units_Sold = sum(UnitsSold), .groups = 'drop')
# Convert Smartphone status to factor for better plot labeling
yearly_sales$Smartphone. <- factor(yearly_sales$Smartphone., labels = c("Non-Smartphone", "Smartphone"))
# Convert ggplot to an interactive plot using plotly
fig10
fig11
Observations:
- From 1996 to the early 2000s, sales of non-smartphones were significantly higher compared to smartphones, showing a clear market preference for non-smartphone devices during this period.
- Starting around 2003, there was a noticeable increase in smartphone sales, with a substantial rise beginning in 2005 and continuing to grow in the following years. This indicates a shift in consumer preference towards smartphones.
- By 2019, smartphone sales reached their peak at 446.2 million units, illustrating the strong market trend towards smartphones. Sales have since declined in the early 2020s, which might be attributed to market saturation or evolving consumer preferences.
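The Welch two-sample t-test printed earlier in this section compares mean units sold per model between the two groups without assuming equal variances. As a rough sketch of what `t.test()` computes in that case, the statistic, degrees of freedom, and p-value can be reproduced by hand. The two vectors below are made-up values, not the report's data, and `welch_t` is a hypothetical helper.

```r
# Hypothetical unit sales in millions for two groups; NOT the real dataset.
non_smart <- c(250, 150, 130, 75, 60, 50, 35, 30)
smart     <- c(220, 80, 45, 40, 35, 30, 25, 20, 15, 10)

welch_t <- function(x, y) {
  vx <- var(x) / length(x)  # squared standard error of mean(x)
  vy <- var(y) / length(y)  # squared standard error of mean(y)
  t_stat <- (mean(x) - mean(y)) / sqrt(vx + vy)
  # Welch-Satterthwaite approximation for the degrees of freedom
  df <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))
  p_value <- 2 * pt(-abs(t_stat), df)  # two-sided p-value
  c(t = t_stat, df = df, p_value = p_value)
}

manual  <- welch_t(non_smart, smart)
builtin <- t.test(non_smart, smart)  # var.equal = FALSE by default, i.e. Welch

print(manual)
print(c(t = unname(builtin$statistic),
        df = unname(builtin$parameter),
        p_value = builtin$p.value))
```

Read this way, the report's own output (t = 2.9613, df = 49.125, p-value = 0.004708) says that the mean for non-smartphones (74.87 million) is higher than the mean for smartphones (37.01 million), and that a difference this large would be unlikely if the two group means were equal.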
11-Units sold By Manufacturer (Smartphone status - Form Factor)
Next, we will analyze the plot showing units sold by manufacturer, segmented by smartphone status and form factor. This will help us understand how different manufacturers perform across various form factors and whether their smartphone status influences their sales performance.
Show Code
# Load data from Google Drive
url <- "https://drive.google.com/uc?id=1W9UYyKDAV3UZWPppYs7N8R9ne5ws5iZ3&export=download"
data <- read.csv(url)
# Filter data for selected manufacturers
selected_manufacturers <- c("Apple", "Nokia", "Samsung")
filtered_data <- data %>%
filter(Manufacturer %in% selected_manufacturers)
# Convert 'Year' column to numeric
filtered_data$Year <- as.numeric(as.character(filtered_data$Year))
# Aggregate data by manufacturer, smartphone status, and form factor
aggregated_data <- filtered_data %>%
group_by(Manufacturer, Smartphone., Form.Factor) %>%
summarise(Total_Units_Sold = sum(UnitsSold), .groups = 'drop')
# Convert 'Smartphone.' column to factor for better display
aggregated_data$Smartphone. <- factor(aggregated_data$Smartphone., labels = c("Non-Smartphone", "Smartphone"))
fig12
Observations:
- Apple’s strongest sales are in touchscreen smartphones, indicating a clear and focused strategy on this type.
- Nokia’s highest sales are in non-smartphone bar models, suggesting a targeted approach towards simpler devices.
- Samsung has a mixed strategy, achieving significant sales in both smartphones with touchscreens and non-smartphone bar models.
- The most effective market play appears to be in the smartphone with touchscreen category, as it leads the sales across manufacturers.
12-Total Units Sold per Manufacturer
This pie chart illustrates the total units sold for each manufacturer, showcasing the percentage of overall sales contributed by each one. It provides a clear visual representation of how sales are distributed among different manufacturers.
Show Code
# Calculate the number of models per manufacturer that achieved a top rank
model_count_by_manufacturer <- data %>%
group_by(Manufacturer) %>%
summarise(Number_of_Models_with_Top_Rank = n()) %>%
mutate(Percentage = Number_of_Models_with_Top_Rank / sum(Number_of_Models_with_Top_Rank) * 100)
# Calculate the total units sold per manufacturer
units_sold_by_manufacturer <- data %>%
group_by(Manufacturer) %>%
summarise(Total_Units_Sold = sum(UnitsSold)) %>%
mutate(Percentage = Total_Units_Sold / sum(Total_Units_Sold) * 100)
# Define an interactive color palette
manufacturer_colors <- c("#FFEB3B", "#81D4FA", "#A5D6A7", "#FFD700", "#FFB74D", "#4FC3F7", "#C8E6C9")
# Pie chart for total units sold with hover information
pie_chart_units <- plot_ly(
units_sold_by_manufacturer,
labels = ~Manufacturer,
values = ~Total_Units_Sold,
type = 'pie',
text = ~paste("Total Units Sold: ", Total_Units_Sold, "<br>Percentage: ", round(Percentage, 2), "%"),
hoverinfo = 'text',
textinfo = 'label+percent',
insidetextorientation = 'radial',
marker = list(colors = manufacturer_colors, line = list(color = '#FFFFFF', width = 1))
) %>%
layout(
title = list(
text = 'Total Units Sold by Manufacturer',
font = list(size = 16, color = "#333333")
),
showlegend = TRUE,
legend = list(
font = list(size = 12, color = "#333333")
)
)
pie_chart_units
Observations:
- Nokia leads with the highest sales, contributing approximately 41.18% of the total sales.
- Apple follows, accounting for about 28.95% of the total units sold.
- Samsung holds a significant share with 17.25% of the total sales.
- Other manufacturers such as Motorola, Xiaomi, and Huawei have smaller shares, with 5.60%, 1.72%, and 1.97% respectively.
- Manufacturers like Google, HTC, LeTV, Oppo, Palm, Research in Motion (RIM), and Sony Ericsson each contribute less than 1% of the total units sold.
Show Code
# Count the number of models per manufacturer
model_count_by_manufacturer <- data %>%
group_by(Manufacturer) %>%
summarise(Number_of_Models_with_Top_Rank = n()) %>%
mutate(Percentage = Number_of_Models_with_Top_Rank / sum(Number_of_Models_with_Top_Rank) * 100)
# Define interactive colors for the chart
manufacturer_colors <- c("#FFEB3B", "#81D4FA", "#A5D6A7", "#FFD700", "#FFB74D", "#4FC3F7", "#C8E6C9")
# Create a doughnut chart
doughnut_chart <- plot_ly(
model_count_by_manufacturer,
labels = ~Manufacturer,
values = ~Number_of_Models_with_Top_Rank,
type = 'pie',
textinfo = 'label+percent',
insidetextorientation = 'radial',
hole = 0.4, # Create a "doughnut" effect by adding a hole in the middle
marker = list(colors = manufacturer_colors, line = list(color = '#FFFFFF', width = 1))
) %>%
layout(
title = list(
text = 'Number of Models with Top Rank for Each Manufacturer',
font = list(size = 16, color = "#333333", family = "Arial")
),
showlegend = TRUE,
legend = list(
font = list(size = 12, color = "#333333"),
orientation = "h", # Horizontal orientation for legend
xanchor = "center", # Center the legend horizontally
yanchor = "top", # Anchor the legend to the top
x = 0.5, # Position the legend in the middle of the chart
y = -0.1, # Position the legend just below the chart
traceorder = "normal", # Display items in the order they appear in the data
itemclick = "toggleothers" # Click on a legend item to toggle the visibility of others
)
)
# Display the doughnut chart
doughnut_chart
Observations:
- Samsung leads with the highest number of models at 43, which constitutes 35.8% of the total models.
- Nokia follows with 26 models, making up 21.7% of the total.
- Apple has 16 models, representing 13.3% of the total.
- LG and Xiaomi have 7 models each, with each accounting for 5.83% of the total.
- Huawei has 6 models, contributing 5% to the total.
- Motorola has 5 models, which is 4.17% of the total.
- Sony Ericsson has 3 models, making up 2.5% of the total.
- Oppo, Google, HTC, LeTV, Palm, and Research in Motion (RIM) each have 1 model, contributing 0.83% to the total.
Perform Predictive Analysis Using Regression Models
We conduct predictive analysis using simple and multiple regression models to forecast sales performance. The simple model assesses the impact of manufacturer alone, while the multiple regression model incorporates additional factors such as smartphone status, form factor, and year for a more detailed prediction.
Show Code
# Define the calculate_metrics function
calculate_metrics <- function(model, data) {
# Predictions from the model
predictions <- predict(model, newdata = data)
# Actual values
actuals <- data$UnitsSold
# Calculate RMSE
rmse_value <- sqrt(mean((actuals - predictions)^2))
# Calculate R-squared
residual_sum_of_squares <- sum((actuals - predictions)^2)
total_sum_of_squares <- sum((actuals - mean(actuals))^2)
r_squared_value <- 1 - (residual_sum_of_squares / total_sum_of_squares)
# Calculate MAPE
mape_value <- mean(abs((actuals - predictions) / actuals)) * 100
# Return metrics as a named vector
return(c(RMSE = rmse_value, R_squared = r_squared_value, MAPE = mape_value))
}
# Ensure models are defined
if (exists("simple_model") && exists("multiple_model") && exists("data1")) {
# Create a data frame to store metrics
metrics_summary <- data.frame(
Model = c("Simple Regression", "Multiple Regression"),
RMSE = numeric(2),
R_squared = numeric(2),
MAPE = numeric(2),
stringsAsFactors = FALSE
)
# Calculate metrics for the simple model
simple_metrics <- calculate_metrics(simple_model, data1)
metrics_summary[1, 2:4] <- simple_metrics
# Calculate metrics for the multiple model
multiple_metrics <- calculate_metrics(multiple_model, data1)
metrics_summary[2, 2:4] <- multiple_metrics
# Print the summary of metrics
print("Model Performance Summary:")
print(metrics_summary)
} else {
print("Models or data not found. Please ensure that simple_model, multiple_model, and data1 are defined.")
}
## [1] "Models or data not found. Please ensure that simple_model, multiple_model, and data1 are defined."
The simple regression model shows an RMSE of 44.44, an R-squared of 0.41, and a MAPE of 163.14%, indicating only a moderate fit with large percentage errors. The multiple regression model performs better, with an RMSE of 37.64, an R-squared of 0.58, and a MAPE of 107.23%, reflecting improved accuracy and a more detailed understanding of sales factors, although its percentage errors remain high.
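The fitting step itself is not shown in this section, so the following is only a sketch of how two such models could be fitted and compared. The formulas are assumptions inferred from the description above (manufacturer alone versus manufacturer plus smartphone status, form factor, and year). The `toy` data frame is fully synthetic, and `simple_fit`, `multiple_fit`, and `fit_metrics` are hypothetical names.

```r
# Fully synthetic data; only the column names mirror those used in this report.
toy <- expand.grid(
  Manufacturer = c("A", "B", "C"),
  Form.Factor  = c("Bar", "Touchscreen"),
  Smartphone.  = c(FALSE, TRUE),
  Year         = c(2005, 2010, 2015),
  stringsAsFactors = TRUE
)
# Deterministic response with a small repeating perturbation (always positive)
toy$UnitsSold <- 50 + 20 * (toy$Manufacturer == "A") - 10 * toy$Smartphone. +
  0.5 * (toy$Year - 2005) + (seq_len(nrow(toy)) %% 7)

# Assumed model formulas, inferred from the text rather than taken from the report
simple_fit   <- lm(UnitsSold ~ Manufacturer, data = toy)
multiple_fit <- lm(UnitsSold ~ Manufacturer + Smartphone. + Form.Factor + Year,
                   data = toy)

fit_metrics <- function(model, data) {
  pred   <- predict(model, newdata = data)
  actual <- data$UnitsSold
  c(RMSE      = sqrt(mean((actual - pred)^2)),
    R_squared = 1 - sum((actual - pred)^2) / sum((actual - mean(actual))^2),
    MAPE      = mean(abs((actual - pred) / actual)) * 100)
}

print(rbind(Simple   = fit_metrics(simple_fit, toy),
            Multiple = fit_metrics(multiple_fit, toy)))
```

Because the multiple model contains every term of the simple one, its in-sample RMSE cannot be larger. Metrics computed on the rows used for fitting tend to be optimistic, so a held-out subset would give a fairer comparison.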
Conclusion
- Best Models:
  - Multiple Regression Model: Superior for predictive accuracy with lower RMSE and higher R-squared value.
- Reasons for Nokia’s Decrease in Sales:
  - Nokia’s focus on bar phones, which have declined in popularity, contrasts with the increasing demand for touchscreens.
- Enhancement Strategies for Each Manufacturer:
  - Apple: Continue innovating and focusing on touchscreen smartphones, which have been highly successful.
  - Samsung: Utilize its diverse portfolio to adapt to evolving market trends and consumer preferences.
  - Nokia: Expand into or enhance touchscreen technology to align with current market demands.
- Successful Strategies:
  - Apple: Maintains top sales through continuous innovation in touchscreen smartphones.
  - Samsung: Achieves strong market performance with a mixed strategy across various form factors and technologies.
- Areas for New Strategies:
  - Nokia: Needs to reassess its focus on bar phones and explore opportunities in the touchscreen market or other emerging technologies.
  - Other Manufacturers: Should evaluate their product portfolios and market strategies to address shifting consumer preferences and competitive pressures.
Appendix
- Data Access: You can access and download the data (Global Best Selling Phone Sales Data) from the following URL: https://www.kaggle.com/datasets/muhammadroshaanriaz/global-best-selling-phone-sales/data
- Full Report: For a more detailed read, access my full report via this link: View Full Report.