The dataset is the “Brazilian E-Commerce Public Dataset by Olist”. It was generated by Olist Store from orders placed between 2016 and 2018 across multiple marketplaces in Brazil, and it is shared publicly on Kaggle.
It contains 9 files and 52 columns. The files are:
olist_customers_dataset.csv
olist_geolocation_dataset.csv
olist_order_items_dataset.csv
olist_order_payments_dataset.csv
olist_order_reviews_dataset.csv
olist_orders_dataset.csv
olist_products_dataset.csv
olist_sellers_dataset.csv
product_category_name_translation.csv
The key features in the dataset are order status, price, payment amount, delivery charges, and review score.
The key questions fall into two main categories:
1. Customer’s Perspective - to facilitate the product selection process by providing the following information:
2. Portal Admin’s Perspective - to provide key insights on the performance of the portal, which include:
The data is divided into multiple files and the following schema was used to prepare the final dataset:
Dataset Preparation includes the following:
Treatment for Missing Values
New variable creation
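The preparation steps above can be sketched roughly as follows. This is a minimal illustration, not the code used for the app: the sample rows are made up, the column names follow the Olist files, and both the missing-value rule (a blank freight value treated as 0.0) and the derived variables (`item_total`, `delivery_days`) are assumptions chosen for the example.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Made-up rows mimicking a few columns of the Olist files (illustrative only).
orders = [
    {"order_id": "o1", "order_status": "delivered",
     "order_purchase_timestamp": "2017-10-02 10:56:33",
     "order_delivered_customer_date": "2017-10-10 21:25:13"},
    {"order_id": "o2", "order_status": "shipped",
     "order_purchase_timestamp": "2018-07-24 20:41:37",
     "order_delivered_customer_date": ""},  # not delivered yet
]
order_items = [
    {"order_id": "o1", "product_id": "p1", "price": "29.99", "freight_value": "8.72"},
    {"order_id": "o2", "product_id": "p2", "price": "118.70", "freight_value": ""},
]


def prepare(orders, order_items):
    """Join items to orders, treat missing values, and add derived columns."""
    orders_by_id = {o["order_id"]: o for o in orders}
    rows = []
    for item in order_items:
        order = orders_by_id.get(item["order_id"])
        if order is None:
            continue  # inner join: drop items without a matching order
        price = float(item["price"])
        # Assumed treatment: a blank freight value is taken as 0.0.
        freight = float(item["freight_value"] or 0.0)
        purchased = datetime.strptime(order["order_purchase_timestamp"], FMT)
        delivered_raw = order["order_delivered_customer_date"]
        # Assumed new variable: delivery time in whole days (None if undelivered).
        delivery_days = None
        if delivered_raw:
            delivered = datetime.strptime(delivered_raw, FMT)
            delivery_days = (delivered - purchased).days
        rows.append({
            "order_id": item["order_id"],
            "product_id": item["product_id"],
            "order_status": order["order_status"],
            "price": price,
            "freight_value": freight,
            "item_total": round(price + freight, 2),  # assumed new variable
            "delivery_days": delivery_days,
        })
    return rows


prepared = prepare(orders, order_items)
for row in prepared:
    print(row)
```

On the real files, the lists would come from `csv.DictReader` over `olist_orders_dataset.csv` and `olist_order_items_dataset.csv`; the same join pattern extends to the payments, reviews, and customers files through `order_id` and `customer_id`.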
a) Overall gross merchandise value (GMV) shows an increasing trend over time, as shown in the graph below:
The chart above also allows further exploration of average order value and active users over the past 30 days.
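For readers who want to see how such metrics might be derived, here is a small sketch. It assumes GMV is the sum of item prices per purchase month, average order value is GMV divided by the number of distinct orders, and an active user is a distinct `customer_unique_id` with a purchase in the 30 days up to a reference date. These definitions and the sample rows are assumptions for illustration; the app may define them differently.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Made-up rows, one per order item (illustrative only).
rows = [
    {"order_id": "o1", "customer_unique_id": "c1", "purchased": "2018-01-05", "price": 100.0},
    {"order_id": "o1", "customer_unique_id": "c1", "purchased": "2018-01-05", "price": 50.0},
    {"order_id": "o2", "customer_unique_id": "c2", "purchased": "2018-01-20", "price": 30.0},
    {"order_id": "o3", "customer_unique_id": "c1", "purchased": "2018-02-03", "price": 200.0},
]


def monthly_gmv_and_aov(rows):
    """Return {month: {"gmv": ..., "aov": ...}} keyed by YYYY-MM."""
    gmv = defaultdict(float)
    order_ids = defaultdict(set)
    for r in rows:
        month = r["purchased"][:7]  # "YYYY-MM"
        gmv[month] += r["price"]
        order_ids[month].add(r["order_id"])
    return {
        m: {"gmv": gmv[m], "aov": gmv[m] / len(order_ids[m])}
        for m in sorted(gmv)
    }


def active_users(rows, as_of, window_days=30):
    """Count distinct customers who purchased in the window ending at as_of."""
    end = datetime.strptime(as_of, "%Y-%m-%d")
    start = end - timedelta(days=window_days)
    return len({
        r["customer_unique_id"]
        for r in rows
        if start < datetime.strptime(r["purchased"], "%Y-%m-%d") <= end
    })


summary = monthly_gmv_and_aov(rows)
users = active_users(rows, as_of="2018-02-10")
print(summary)
print(users)
```

Counting `customer_unique_id` rather than `customer_id` matters in this dataset, because `customer_id` is assigned per order and would overstate the number of distinct customers.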
b) The top 10 products are:
c) The states that are doing well are:
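Rankings like those in (b) and (c) could be produced along the following lines. The sketch assumes that "top" and "doing well" mean highest total revenue (sum of item prices); the ranking criterion and the sample rows are assumptions, and `top_n` is a hypothetical helper name.

```python
from collections import defaultdict

# Made-up prepared rows (illustrative only).
rows = [
    {"product_id": "p1", "customer_state": "SP", "price": 120.0},
    {"product_id": "p2", "customer_state": "RJ", "price": 80.0},
    {"product_id": "p1", "customer_state": "SP", "price": 120.0},
    {"product_id": "p3", "customer_state": "MG", "price": 300.0},
    {"product_id": "p2", "customer_state": "SP", "price": 80.0},
]


def top_n(rows, key, n=10):
    """Rank the values of `key` by total revenue, highest first."""
    totals = defaultdict(float)
    for r in rows:
        totals[r[key]] += r["price"]
    # Sort by revenue descending, breaking ties alphabetically for stability.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


top_products = top_n(rows, "product_id", n=2)
top_states = top_n(rows, "customer_state")
print(top_products)
print(top_states)
```

Swapping the summed field for an order count or an average review score would give a different notion of "top", so the criterion is worth stating alongside any ranking.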
How the above insights can be turned into actionable items:
Find the app here: https://ushani.shinyapps.io/eCommerce_Performance
Find the code here: https://github.com/UshaniE/Analysis-of-eCommerce
References:
https://www.kaggle.com/olistbr/brazilian-ecommerce