The data set I am using is the BMW Global Sales Analysis, which covers 2010 to 2024. It contains 50,000 rows and 11 columns:
(1) Model, (2) Year, (3) Region, (4) Color, (5) Fuel_Type, (6) Transmission, (7) Engine_Size_L, (8) Mileage_KM, (9) Price_USD, (10) Sales_Volume, and (11) Sales_Classification.
## 'data.frame': 50000 obs. of 11 variables:
##  $ Model               : chr "5 Series" "i8" "5 Series" "X3" ...
##  $ Year                : int 2016 2013 2022 2024 2020 2017 2022 2014 2016 2019 ...
##  $ Region              : chr "Asia" "North America" "North America" "Middle East" ...
##  $ Color               : chr "Red" "Red" "Blue" "Blue" ...
##  $ Fuel_Type           : chr "Petrol" "Hybrid" "Petrol" "Petrol" ...
##  $ Transmission        : chr "Manual" "Automatic" "Automatic" "Automatic" ...
##  $ Engine_Size_L       : num 3.5 1.6 4.5 1.7 2.1 1.9 1.8 1.6 1.7 3 ...
##  $ Mileage_KM          : int 151748 121671 10991 27255 122131 171362 196741 121156 48073 35700 ...
##  $ Price_USD           : int 98740 79219 113265 60971 49898 42926 55064 102778 116482 96257 ...
##  $ Sales_Volume        : int 8300 3428 6994 4047 3080 1232 7949 632 8944 4411 ...
##  $ Sales_Classification: chr "High" "Low" "Low" "Low" ...
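
The structure listing above has the shape of output from base R's `str()`. As a minimal sketch of how such a listing is produced, the block below rebuilds only the first four rows, with values copied from that listing, and inspects them. The object name `bmw_sample` is hypothetical, and the way the full data set was loaded is not shown in the source, so it is not reproduced here.

```r
# Illustrative subset only: the first four rows, copied from the str() listing.
# `bmw_sample` is a hypothetical name, not the object used in the analysis.
bmw_sample <- data.frame(
  Model                = c("5 Series", "i8", "5 Series", "X3"),
  Year                 = c(2016L, 2013L, 2022L, 2024L),
  Region               = c("Asia", "North America", "North America", "Middle East"),
  Color                = c("Red", "Red", "Blue", "Blue"),
  Fuel_Type            = c("Petrol", "Hybrid", "Petrol", "Petrol"),
  Transmission         = c("Manual", "Automatic", "Automatic", "Automatic"),
  Engine_Size_L        = c(3.5, 1.6, 4.5, 1.7),
  Mileage_KM           = c(151748L, 121671L, 10991L, 27255L),
  Price_USD            = c(98740L, 79219L, 113265L, 60971L),
  Sales_Volume         = c(8300L, 3428L, 6994L, 4047L),
  Sales_Classification = c("High", "Low", "Low", "Low"),
  stringsAsFactors     = FALSE  # keep text columns as chr, matching the listing
)

# Row and column counts (4 rows and 11 columns for this subset).
dim(bmw_sample)

# Compact summary of each column's type and leading values.
str(bmw_sample)
```

Calling `str()` on the full data frame in the same way would report all 50,000 observations. The `L` suffix marks integer literals, so that `Year`, `Mileage_KM`, `Price_USD`, and `Sales_Volume` come out as `int` while `Engine_Size_L` stays `num`.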