This project uses the BMW_sales_data dataset from Kaggle.
The dataset contains BMW sales from 2010 to 2024. It records model and trim details, including Year, Fuel_Type, Transmission, Engine_Size_L, and Price_USD.
The dataset has 11 columns. The analysis examines how mileage, engine size, and market region relate to price, and how sales patterns differ across regions.

    ## # A tibble: 10 × 12
    ##    Model     Year Region   Color Fuel_Type Transmission Engine_Size_L Mileage_KM
    ##    <chr>    <int> <chr>    <chr> <chr>     <chr>                <dbl>      <dbl>
    ##  1 5 Series  2016 Asia     Red   Petrol    Manual                 3.5     151748
    ##  2 i8        2013 North A… Red   Hybrid    Automatic              1.6     121671
    ##  3 5 Series  2022 North A… Blue  Petrol    Automatic              4.5      10991
    ##  4 X3        2024 Middle … Blue  Petrol    Automatic              1.7      27255
    ##  5 7 Series  2020 South A… Black Diesel    Manual                 2.1     122131
    ##  6 5 Series  2017 Middle … Silv… Diesel    Manual                 1.9     171362
    ##  7 i8        2022 Europe   White Diesel    Manual                 1.8     196741
    ##  8 M5        2014 Asia     Black Diesel    Automatic              1.6     121156
    ##  9 X3        2016 South A… White Diesel    Automatic              1.7      48073
    ## 10 i8        2019 Europe   White Electric  Manual                 3        35700
    ## # ℹ 4 more variables: Price_USD <dbl>, Sales_Volume <dbl>,
    ## #   Sales_Classification <chr>, Fuel_type <chr>

## Brief Overview

We use ggplot and plotly visuals, together with a simple statistical summary, to answer one question: how do mileage, engine size, and region explain the variation in BMW prices?

Layout:

- Line chart: average price by region and year
- Scatter plot: Price vs Mileage (size = engine size, color = region)
- Interactive scatter plot: explore the data with hover details
- Pie charts: fuel type shares by region
- 3D plot (plotly): Price vs Engine Size vs Mileage
- Box plot: price distribution by region and by Sales_Classification
- Stats: five-number summaries by region and a linear model
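
The statistics step and the static scatter plot in the layout above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data frame `bmw` is a synthetic stand-in with randomly generated values (only the column names follow the tibble preview), and the model formula is an assumption about how mileage, engine size, and region would enter a linear model.

```r
# Synthetic stand-in data (NOT the Kaggle file): column names follow the
# tibble preview, values are randomly generated for illustration only.
set.seed(42)
n <- 200
bmw <- data.frame(
  Region        = sample(c("Asia", "Europe", "North America"), n, replace = TRUE),
  Engine_Size_L = round(runif(n, 1.5, 5.0), 1),
  Mileage_KM    = round(runif(n, 0, 200000)),
  Price_USD     = round(runif(n, 30000, 120000))
)

# Five-number summary (min, lower hinge, median, upper hinge, max) per region.
five_num <- tapply(bmw$Price_USD, bmw$Region, fivenum)
print(five_num)

# Linear model: price explained by mileage, engine size, and region.
# Region is a character column, so lm() treats it as a factor.
fit <- lm(Price_USD ~ Mileage_KM + Engine_Size_L + Region, data = bmw)
print(summary(fit))

# Optional static scatter plot, built only if ggplot2 is installed.
# Call print(p) to render it.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(
    bmw,
    ggplot2::aes(x = Mileage_KM, y = Price_USD,
                 size = Engine_Size_L, color = Region)
  ) +
    ggplot2::geom_point(alpha = 0.6)
}
```

Because the synthetic prices are drawn independently of the predictors, the fitted coefficients here carry no meaning; the sketch only shows the shape of the calls. With the real data, `bmw` would be replaced by the loaded Kaggle table.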