This report was compiled from R Markdown in RStudio. It presents detailed sales data for BMW vehicles from 2010 to 2024 across global regions. The data set includes attributes such as model, year, engine size, mileage, transmission type, fuel type, price, and sales volume.
Source: Kaggle.com
Compiled file: .git
Let’s load some data
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.1 ✔ stringr 1.5.2
## ✔ ggplot2 4.0.0 ✔ tibble 3.3.0
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.1.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(readxl)  # provides read_excel()
BMW_sales_data_2010_2024_ <- read_excel("C:/Users/Mareko/Downloads/archive (7)/BMW sales data (2010-2024).xlsx")
View(BMW_sales_data_2010_2024_)
print(BMW_sales_data_2010_2024_)
## # A tibble: 50,000 × 12
## Model Year Region Color Fuel_Type Transmission Engine_Size_L Mileage_KM
## <chr> <dbl> <chr> <chr> <chr> <chr> <chr> <dbl>
## 1 i8 2010 South A… Black Hybrid Automatic 4.7 16020
## 2 i8 2010 Middle … Grey Petrol Automatic 4.0 98514
## 3 X6 2010 Europe Red Hybrid Manual 3.8 128477
## 4 i8 2010 North A… White Electric Automatic 2.5 75457
## 5 X6 2010 Africa Silv… Petrol Manual 1.7 176650
## 6 M5 2010 South A… White Diesel Manual 2.8 121393
## 7 3 Series 2010 Asia Black Petrol Manual 2.1 107572
## 8 5 Series 2010 Europe Red Petrol Manual 1.8 194101
## 9 i3 2010 Africa Blue Petrol Manual 3.6 91061
## 10 i3 2010 Africa Blue Diesel Manual 1.8 120482
## # ℹ 49,990 more rows
## # ℹ 4 more variables: Price_USD <dbl>, Sales_Volume <dbl>,
## # Sales_Classification <chr>, New_Count <dbl>
## # A tibble: 1 × 0
## Model Year Region Color
## Length:50000 Min. :2010 Length:50000 Length:50000
## Class :character 1st Qu.:2013 Class :character Class :character
## Mode :character Median :2017 Mode :character Mode :character
## Mean :2017
## 3rd Qu.:2021
## Max. :2024
## Fuel_Type Transmission Engine_Size_L Mileage_KM
## Length:50000 Length:50000 Length:50000 Min. : 3
## Class :character Class :character Class :character 1st Qu.: 50178
## Mode :character Mode :character Mode :character Median :100389
## Mean :100307
## 3rd Qu.:150630
## Max. :199996
## Price_USD Sales_Volume Sales_Classification New_Count
## Min. : 30000 Min. : 100 Length:50000 Min. :1
## 1st Qu.: 52435 1st Qu.:2588 Class :character 1st Qu.:1
## Median : 75012 Median :5087 Mode :character Median :1
## Mean : 75035 Mean :5068 Mean :1
## 3rd Qu.: 97628 3rd Qu.:7537 3rd Qu.:1
## Max. :119998 Max. :9999 Max. :1
The full data set is too large to work with in this report, so I reduced it to a 100-row subset:
## # A tibble: 100 × 12
## Model Year Region Color Fuel_Type Transmission Engine_Size_L Mileage_KM
## <chr> <dbl> <chr> <chr> <chr> <chr> <chr> <dbl>
## 1 i8 2010 South A… Black Hybrid Automatic 4.7 16020
## 2 i8 2010 Middle … Grey Petrol Automatic 4.0 98514
## 3 X6 2010 Europe Red Hybrid Manual 3.8 128477
## 4 i8 2010 North A… White Electric Automatic 2.5 75457
## 5 X6 2010 Africa Silv… Petrol Manual 1.7 176650
## 6 M5 2010 South A… White Diesel Manual 2.8 121393
## 7 3 Series 2010 Asia Black Petrol Manual 2.1 107572
## 8 5 Series 2010 Europe Red Petrol Manual 1.8 194101
## 9 i3 2010 Africa Blue Petrol Manual 3.6 91061
## 10 i3 2010 Africa Blue Diesel Manual 1.8 120482
## # ℹ 90 more rows
## # ℹ 4 more variables: Price_USD <dbl>, Sales_Volume <dbl>,
## # Sales_Classification <chr>, New_Count <dbl>
## Model Year Region Color
## Length:100 Min. :2010 Length:100 Length:100
## Class :character 1st Qu.:2010 Class :character Class :character
## Mode :character Median :2010 Mode :character Mode :character
## Mean :2010
## 3rd Qu.:2010
## Max. :2010
## Fuel_Type Transmission Engine_Size_L Mileage_KM
## Length:100 Length:100 Length:100 Min. : 3514
## Class :character Class :character Class :character 1st Qu.: 46205
## Mode :character Mode :character Mode :character Median :105041
## Mean :103430
## 3rd Qu.:152912
## Max. :199290
## Price_USD Sales_Volume Sales_Classification New_Count
## Min. : 30946 Min. : 184 Length:100 Min. :1
## 1st Qu.: 48255 1st Qu.:2706 Class :character 1st Qu.:1
## Median : 67198 Median :5824 Mode :character Median :1
## Mean : 72359 Mean :5499 Mean :1
## 3rd Qu.: 99819 3rd Qu.:8095 3rd Qu.:1
## Max. :117253 Max. :9996 Max. :1
In effect, the subset is a 2010 sample for my presentation: the summary above shows that both the minimum and the maximum of Year are 2010.
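The subsetting code itself is not shown in this report. A minimal sketch of one way to get such a subset, assuming the first 100 rows were kept with `head()` and stored as `BMWSD` (the name used in the code further down); the toy data frame `sales` is a made-up stand-in for the real spreadsheet:

```r
# Toy stand-in for the full data set, ordered by year as the real file appears to be.
sales <- data.frame(
  Model = rep(c("i8", "X6", "M5"), times = 100),
  Year  = rep(2010:2012, each = 100)
)

# Keep only the first 100 rows (assumed approach; the original code is not shown).
BMWSD <- head(sales, 100)

# Because the rows are ordered by year, the subset covers a single year.
nrow(BMWSD)
range(BMWSD$Year)
```

Taking the first rows of a year-ordered file is what limits the subset to 2010. A random draw, for example with `dplyr::slice_sample(n = 100)`, would instead spread the 100 rows across all years.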
MILEAGE(KM), PRICE(USD), and SALES VOLUME
The following shows the mean, median, mode, and standard deviation of Mileage_KM.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 3514 46205 105041 103430 152912 199290
# Mean
mean_Mileage <- mean(BMWSD$Mileage_KM)
cat("Mean of BMW mileage:", round(mean_Mileage,2), "KM\n")
## Mean of BMW mileage: 103430 KM
# Median
median_Mileage <- median(BMWSD$Mileage_KM)
cat("Median of BMW mileage:", round(median_Mileage,2), "KM\n")
## Median of BMW mileage: 105041 KM
# Mode (most frequent value)
mode_Mileage <- names(sort(-table(BMWSD$Mileage_KM)))[1]
mode_Mileage <- as.numeric(mode_Mileage)
cat("Mode of BMW mileage:", round(mode_Mileage,2), "KM\n")
## Mode of BMW mileage: 191728 KM
# Standard deviation
sd_Mileage <- sd(BMWSD$Mileage_KM, na.rm = TRUE)
cat("Standard deviation of BMW mileage:", round(sd_Mileage,2), "KM")
## Standard deviation of BMW mileage: 61848.08 KM
The following shows the mean, median, mode, and standard deviation of Price_USD.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 30946 48255 67198 72359 99819 117253
# Mean
mean_Price <- mean(BMWSD$Price_USD)
cat("Mean of BMW price:", round(mean_Price,2), "USD\n")
## Mean of BMW price: 72358.59 USD
# Median
median_Price <- median(BMWSD$Price_USD)
cat("Median of BMW price:", round(median_Price,2), "USD\n")
## Median of BMW price: 67198 USD
# Mode (most frequent value)
mode_Price <- names(sort(-table(BMWSD$Price_USD)))[1]
mode_Price <- as.numeric(mode_Price)
cat("Mode of BMW price:", round(mode_Price,2), "USD\n")
## Mode of BMW price: 30946 USD
# Standard deviation
sd_Price <- sd(BMWSD$Price_USD, na.rm = TRUE)
cat("Standard deviation of BMW price:", round(sd_Price,2), "USD")
## Standard deviation of BMW price: 27004.97 USD
The following shows the mean, median, mode, and standard deviation of Sales_Volume.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 184 2706 5824 5499 8095 9996
# Mean
mean_Volume <- mean(BMWSD$Sales_Volume)
cat("Mean of BMW sales volume:", round(mean_Volume,2), "Unit\n")
## Mean of BMW sales volume: 5499.05 Unit
# Median
median_Volume <- median(BMWSD$Sales_Volume)
cat("Median of BMW sales volume:", round(median_Volume,2), "Unit\n")
## Median of BMW sales volume: 5824 Unit
# Mode (most frequent value)
mode_Volume <- names(sort(-table(BMWSD$Sales_Volume)))[1]
mode_Volume <- as.numeric(mode_Volume)
cat("Mode of BMW sales volume:", round(mode_Volume,2), "Unit\n")
## Mode of BMW sales volume: 184 Unit
# Standard deviation
sd_Volume <- sd(BMWSD$Sales_Volume, na.rm = TRUE)
cat("Standard deviation of BMW sales volume:", round(sd_Volume,2), "Unit")
## Standard deviation of BMW sales volume: 2857.56 Unit
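The same four statistics are computed three times above, and the reported modes for price and sales volume equal the minimum values, which suggests that no value repeats in those columns. A possible helper, sketched here with hypothetical names (`stat_mode`, `describe`) and toy vectors, collects the statistics in one place and returns `NA` for the mode when every value is unique:

```r
# Mode helper: returns NA when no value occurs more than once,
# since every value is then tied and there is no meaningful mode.
stat_mode <- function(x) {
  counts <- table(x)
  if (max(counts) == 1) return(NA_real_)
  as.numeric(names(counts)[which.max(counts)])
}

# Collect the four statistics used above into one named vector.
describe <- function(x) {
  c(mean   = mean(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    mode   = stat_mode(x),
    sd     = sd(x, na.rm = TRUE))
}

# Toy vectors (not taken from the BMW data).
describe(c(10, 20, 20, 30))
describe(c(10, 20, 30, 40))   # all values unique, so the mode is NA
```

With the real data this would be called as `describe(BMWSD$Price_USD)`, and likewise for the other two columns.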
x <- BMWSD$Price_USD
hist(x,
     main = "Histogram of BMW Prices with Mean",
     xlab = "Price (USD)",
     col = "#FF7315",
     border = "black")
# Add the mean line
abline(v = mean(x, na.rm = TRUE), col = "blue", lwd = 2)
# Add lines at ±1 standard deviation
abline(v = mean(x, na.rm = TRUE) + sd(x, na.rm = TRUE), col = "red", lty = 2)
abline(v = mean(x, na.rm = TRUE) - sd(x, na.rm = TRUE), col = "red", lty = 2)
x <- BMWSD$Sales_Volume
hist(x,
     main = "Histogram of BMW Sales Volume with Mean",
     xlab = "Sales Volume (units)",
     col = "#FF7315",
     border = "black")
# Add the mean line and lines at ±1 standard deviation
abline(v = mean(x, na.rm = TRUE), col = "blue", lwd = 2)
abline(v = mean(x, na.rm = TRUE) + sd(x, na.rm = TRUE), col = "red", lty = 2)
abline(v = mean(x, na.rm = TRUE) - sd(x, na.rm = TRUE), col = "red", lty = 2)
Bar chart and scatterplot showing the relationship between car price and mileage (km).
price_car <- BMWSD$Price_USD
cars <- BMWSD$Model
mileage <- BMWSD$Mileage_KM
barplot(price_car, names.arg = cars, col = "orange", main = "BMW Sales 2010", border = "white")
plot(mileage, price_car, col = "orange", main = "BMW Sales 2010")
# Fit a linear regression of price on mileage and draw the fitted line
model <- lm(price_car ~ mileage)
abline(model, col = "red", lwd = 2)
The scatterplot shows almost no correlation between mileage and price in this sample. \(R^2 = 0.0164\) means that the fitted line explains only about 1.6% of the variation in price, so the linear relationship is very weak. My interpretation is that sales of the 2010 BMW series depend less on a car's condition than on its rarity and popularity: BMW performs well by building cars that are comfortable for their drivers, and buyers do not look at mileage (km) alone.
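The \(R^2\) value can be read from the fitted model rather than from the plot. For a regression with a single predictor, \(R^2\) is the square of the Pearson correlation, so \(R^2 = 0.0164\) corresponds to \(|r| = \sqrt{0.0164} \approx 0.13\). The sketch below uses made-up toy data, not the BMW sample, to show the relationship:

```r
# Toy data (made up for illustration, not the BMW sample).
set.seed(1)
mileage   <- runif(100, min = 0, max = 200000)
price_car <- 70000 + rnorm(100, sd = 25000)   # price generated independently of mileage

model <- lm(price_car ~ mileage)

r2 <- summary(model)$r.squared   # share of price variance explained by mileage
r  <- cor(mileage, price_car)    # Pearson correlation coefficient

# For a one-predictor linear model, R^2 equals the squared correlation.
all.equal(r2, r^2)
```

Applied to the real data, `summary(model)$r.squared` should reproduce the value quoted above.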
The following table shows how often each transmission type appears for each model in the 2010 BMW sales sample.
## Automatic Manual Total
## 3 Series 3 2 5
## 5 Series 4 7 11
## 7 Series 6 4 10
## M3 6 4 10
## M5 5 5 10
## X1 5 9 14
## X3 5 2 7
## X5 6 4 10
## X6 2 5 7
## i3 2 4 6
## i8 9 1 10
## Total 53 47 100
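The code that produced this table is not shown. A minimal sketch of one way to build such a cross-tabulation in base R, assuming `table()` plus `addmargins()` (the original may have used something else); the `toy` data frame is made up, and with the real data the two columns would be `BMWSD$Model` and `BMWSD$Transmission`:

```r
# Toy records (made up for illustration).
toy <- data.frame(
  Model        = c("i8", "i8", "X1", "X1", "X1", "M5"),
  Transmission = c("Automatic", "Automatic", "Manual", "Manual", "Automatic", "Manual")
)

# Cross-tabulate model by transmission, then append row and column totals.
freq <- table(toy$Model, toy$Transmission)
freq_total <- addmargins(freq)

# addmargins() labels the totals "Sum"; rename them to "Total".
dimnames(freq_total) <- lapply(dimnames(freq_total),
                               function(n) sub("^Sum$", "Total", n))

freq_total
```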
# Automatic-transmission counts per model, taken from the table above
Automatic = c(3,4,6,2,9,6,5,5,5,6,2)
Model_Series = c("3 Series","5 Series","7 Series","i3","i8","M3","M5","X1","X3","X5","X6")
pct <- round(Automatic/sum(Automatic)*100)
Model_Series <- paste(Model_Series,"=", pct)
Model_Series <- paste(Model_Series, "%",sep = "")
pie(Automatic, labels = Model_Series, col = heat.colors(length(Model_Series)), main = "Percentage Automatic", border = "white")
# Manual-transmission counts per model, taken from the table above
Manual = c(2,7,4,4,1,4,5,9,2,4,5)
Model_Series = c("3 Series","5 Series","7 Series","i3","i8","M3","M5","X1","X3","X5","X6")
pct <- round(Manual/sum(Manual)*100)
Model_Series <- paste(Model_Series,"=", pct)
Model_Series <- paste(Model_Series, "%",sep = "")
pie(Manual, labels = Model_Series, col = heat.colors(length(Model_Series)), main = "Percentage Manual", border = "white")
In the “Percentage Automatic” chart, the models with the largest shares are:
i8 (17%): the most common model in the automatic category. This suggests that the automatic i8 is in high demand, possibly because of the model's high performance and driving comfort.
7 Series, M3, and X5 (11% each): these share second place, reflecting consumer preference for premium sedans, sports models, and large SUVs with automatic transmission.
M5, X1, and X3 (9% each): these share third place, showing that sports cars and crossovers also hold a significant share of the automatic segment.
In the manual-transmission chart, the models with the largest shares are:
X1 (19%): dominates the manual segment, suggesting that this compact SUV is popular among drivers who prefer manual control.
5 Series (15%): in second place, showing that executive sedans with manual transmission still have a loyal following.
M5 and X6 (11% each): these share third place, indicating that the sports sedan and the large SUV with manual transmission are also fairly popular.
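The percentages shown in the pie charts can be recomputed and ranked directly from the counts in the transmission table. A short sketch, using the counts from that table and lowercase variable names chosen here for illustration:

```r
# Counts copied from the transmission table above.
models <- c("3 Series", "5 Series", "7 Series", "i3", "i8", "M3",
            "M5", "X1", "X3", "X5", "X6")
automatic <- c(3, 4, 6, 2, 9, 6, 5, 5, 5, 6, 2)
manual    <- c(2, 7, 4, 4, 1, 4, 5, 9, 2, 4, 5)

# Share of each model within its transmission type, rounded to whole percent.
pct_auto   <- setNames(round(automatic / sum(automatic) * 100), models)
pct_manual <- setNames(round(manual / sum(manual) * 100), models)

# Rank models from largest to smallest share.
sort(pct_auto, decreasing = TRUE)
sort(pct_manual, decreasing = TRUE)
```

Sorting the named vectors makes ties visible, which is easy to miss when reading slices off a pie chart.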
The following table shows the frequency of each fuel type for each BMW model in the 2010 sales sample.
##          Diesel Electric Hybrid Petrol Total
## 3 Series      3        0      1      1     5
## 5 Series      3        3      3      2    11
## 7 Series      2        2      2      4    10
## M3            1        3      3      3    10
## M5            3        4      2      1    10
## X1            7        3      2      2    14
## X3            2        3      2      0     7
## X5            6        0      2      2    10
## X6            0        1      2      4     7
## i3            1        0      2      3     6
## i8            2        2      3      3    10
## Total        30       21     24     25   100
# Diesel counts per model, taken from the table above
Diesel = c(3,3,2,1,2,1,3,7,2,6,0)
Model_Series = c("3 Series","5 Series","7 Series","i3","i8","M3","M5","X1","X3","X5","X6")
pct <- round(Diesel/sum(Diesel)*100)
Model_Series <- paste(Model_Series,"=", pct)
Model_Series <- paste(Model_Series, "%",sep = "")
pie(Diesel, labels = Model_Series, col = heat.colors(length(Model_Series)), main = "Diesel")
# Hybrid counts per model
Hybd = c(1,3,2,2,3,3,2,2,2,2,2)
Model_Series = c("3 Series","5 Series","7 Series","i3","i8","M3","M5","X1","X3","X5","X6")
pct <- round(Hybd/sum(Hybd)*100)
Model_Series <- paste(Model_Series,"=", pct)
Model_Series <- paste(Model_Series, "%",sep = "")
pie(Hybd, labels = Model_Series, col = heat.colors(length(Model_Series)), main = "Hybrid")
# Petrol counts per model
Petrol = c(1,2,4,3,3,3,1,2,0,2,4)
Model_Series = c("3 Series","5 Series","7 Series","i3","i8","M3","M5","X1","X3","X5","X6")
pct <- round(Petrol/sum(Petrol)*100)
Model_Series <- paste(Model_Series,"=", pct)
Model_Series <- paste(Model_Series, "%",sep = "")
pie(Petrol, labels = Model_Series, col = heat.colors(length(Model_Series)), main = "Petrol")
# Electric counts per model
Electric = c(0,3,2,0,2,3,4,3,3,0,1)
Model_Series = c("3 Series","5 Series","7 Series","i3","i8","M3","M5","X1","X3","X5","X6")
pct <- round(Electric/sum(Electric)*100)
Model_Series <- paste(Model_Series,"=", pct)
Model_Series <- paste(Model_Series, "%",sep = "")
pie(Electric, labels = Model_Series, col = heat.colors(length(Model_Series)), main = "Electric")
The Transmission variable has two categories: Automatic (53%) and Manual (47%). For Fuel_Type, Diesel is the most common at 30% of the total, followed by Petrol (25%), Hybrid (24%), and Electric (21%).
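Overall, the X-series models (X1, X3, X5, and X6) account for 38 of the 100 records in the sample, and the X1 is the most frequent single model (14). The 3 Series is the least frequent (5). Cars with automatic transmission slightly outnumber manual ones, and diesel is the most common fuel type.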
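These summary figures can be checked against the totals of the two frequency tables. A short sketch using the "Total" row and column values from those tables; the variable names are chosen here for illustration:

```r
# Column totals copied from the two frequency tables above.
transmission <- c(Automatic = 53, Manual = 47)
fuel <- c(Diesel = 30, Electric = 21, Hybrid = 24, Petrol = 25)

# Convert counts to percentages of the 100 sampled records.
round(prop.table(transmission) * 100)
round(prop.table(fuel) * 100)

# Most common fuel type in the sample.
names(which.max(fuel))

# Records in the X-series family, using the row totals for X1, X3, X5 and X6.
x_series <- sum(c(X1 = 14, X3 = 7, X5 = 10, X6 = 7))
x_series
```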