Please check my Portfolio for more projects.
Rickaby hardware is a company which supplies the hardware peripherals to different clients such as nomad stores etc. They have a head office in Mumbai India, and they have different regional offices in different cities of India. Dumisani Sagas is the sales director of this company, and he Is managing the business from the head office. So, what is happening with this company is that sales are declining and Dumisani as a sales director is having a lot of struggle tracking where the business is failing due to the size of the company which is big. When Dumisani calls his reginal mangers asking for information on why the sales are declining, unfortunately, his reginal managers are always giving him fuzzy and unclear information which can help him understand the root cause. In addition to that, they always give him big excel files of all transactions which is very difficult for Dumisani to transform all those excel sheets into insights. However, Dumisani calls for an emergency meeting with the sales director, marketing, customer, data analysis and IT teams.
Purpose
Stakeholders & Data Warehousing
Stakeholders were chosen based on each team’s impact towards profit and revenue of the company. To map down stakeholder’s impact, i made reference to the company’s data warehousing structure.
Depicted form the
diagram, stakeholders are:
- Sales Director.
- Marketing Team.
- Customer Service Team.
- Data miners and Analysis team.
- IT team (Software Engineers).
Result (what do we aim at the end of this project?)
Success criteria (should be able to measure success of the project.)
Due to the amount of data available, Dumisani wants his analysis team use the following tools.
Dumisani asked the data analysis team to come up with the following:
Revenue breakdown by cities.
Revenue breakdown by years and months.
Show top 5 customers by revenue and sales quantity.
Top 5 products by revenue number.
NB) He wants a visualized version of Revenue and Profit Analysis dashboard where he can click on different regions.
To perform the analysis, i used a Dataset from GitHub.This dataset has 5 tables which are:
Customers
Products
Transactions
Markets(Cities)
Date
To model the data, i used a star schema whereby transactions table will the fact table referencing the other 4 dimension tables. See the diagram below showing the relationship between all the tables.
Transactions Join with Customer table
# SELECT customer_code From Customers INNER JOIN Transactions ON Cusomers.customer_code = Transactions.customer_code
Transactions Join with Products table
# SELECT product_code From Products INNER JOIN Transactions ON Products.product_code = Transactions.product_code
Transactions Join with Markets table
# SELECT market_code From Markets INNER JOIN Transactions ON Markets.market_code = Transactions.market_code
Transactions Join with Date table
# SELECT order_date From Date INNER JOIN Transactions ON Date.order_date = Transactions.order_date