product sales Analysis
Proffessional Data analyst assignment
First 6 rows of the data | |||||||
week | sales_method | customer_id | nb_sold | revenue | years_as_customer | nb_site_visits | state |
---|---|---|---|---|---|---|---|
2 | 2e72d641-95ac-497b-bbf8-4861764a7097 | 10 | 93.93494 | 0 | 24 | Arizona | |
6 | Email + Call | 3998a98d-70f5-44f7-942e-789bb8ad2fe7 | 15 | 225.47000 | 1 | 28 | Kansas |
5 | Call | d1de9884-8059-4065-b10f-86eef57e4a44 | 11 | 52.55000 | 6 | 26 | Wisconsin |
4 | 78aa75a4-ffeb-4817-b1d0-2f030783c5d7 | 11 | 93.93494 | 3 | 25 | Indiana | |
3 | 10e6d446-10a5-42e5-8210-1b5438f70922 | 9 | 90.49000 | 0 | 28 | Illinois | |
6 | Call | 6489e678-40f2-4fed-a48e-d0dff9c09205 | 13 | 65.01000 | 10 | 24 | Mississippi |
Data validation
- The dataset had 15000 rows and 8 columns
- customer_id has 15000 unique customer ids without missing values thus matching the description
- nb_sold is a numeric variable with no missing values and matches the data description
- revenue has 1074 missing values that were replaced by the mean
- sales method variable had 5 non unique values with no missing values
in the column but had duplicate categories with non unique names so I
lumped the categories
em + call
andEmail + Call
into one category calledEmail + Call
, also lumbed the categoryemail
andEmail
to one category calledEmail
. at the end i ended up with 3 categories as required by the description - there are 39 years since the company has been formed so for the
variable
years_as_customer
i removed all rows or observations with years greater than 39 as these where outliers - week is made up of 6 unique categories that is
1-6
and it matches the description so no data cleaning was needed nb_site_visits
is a numeric variable with none missing values and does not need cleaningstate
has 50 unique categories and no cleaning was required- after validating , the dataset remained with 14998 rows and 8 columns
How many customers were there for each approach?
comments
- the most sales were made through the email (7465 people) followed by call(4961) and lastly email+call(2572 people)
What does the spread of the revenue look like overall? And for each method?
comments
- from the histogram we note that, most sales fell between 40-60 and 80-115 with more sales concentrated on 60 and 90 specifically
Comments
- the plot suggests that the revenue per each individual is quite spread out ranging from approximately 50 to 200
correlations
Comments
- number of new products sold seem to increase as the weeks progress , we see that in the high correlation between the two variables
- revenue also significantly increases with number of new products sold since there is a high correlation
- with an increase in number of site visits , number of new products sold also increase and so does revenue to some extent
comments
- the plots suggest that calls were the least productive by looking and the range of sales revenue for the method
- this also concludes our initial hypothesis of why calls had a smaller percentage of total revenue
proportion of revenue
comments
- the plot suggest that calls were the lowest revenue generating activity ,making up only around 17% of the total revenue produced this period
- Email + call accounted for around 31% which was better than calls alone
- Emails was responsible for the greatest proportion of revenue
Was there any difference in revenue over time for each of the methods?
comments
- we can note that email produced the highest amount only in the first week which accounted to a value of 247519.25. revenue gradually decreased over the next weeks
- Email + call method started much lower but ended the period much higher than other methods claiming the top spot
- the calls method started out lower and ended up lower than other other methods,there was litle to no change in the revenue generated from this method
comments
- the graphs desplay average revenue for each sales method over time.
- Email + call have the highest or up above in terms of the trend in average revenue over time
- Email Alone comes in second then calls alone comes last
Based on the Data, which method would you recommend we continue to use?
- Based on the data and charts above, I would recommend that this organization pivots away from using the calling approach on it’s own as it appears to be producing the worst results out of all of the sales methods.
- I recommend the team doubles down on the Call + Email approach since from graphs the ‘Email + Call’ approach seems to be working really well and so I would recommend the business focuses on that outreach to remain relevant and timely while also decreasing the amount of time spent on the sales method.
- The ROI for Email + Call can be seen on both the graphs as it appears to trend upward on both avg weekly revenue and total weekly revenue over time. Focus on Email + Calling! If possible focus on Email + Call, and then Email alone
metrics to focus on
** rows showing maximum values** | |
sales_method | maximum average |
---|---|
6 | |
Email + Call | 205.9621 |
127.9681 | |
5 | |
Email + Call | 174.3793 |
4 | |
Email + Call | 169.9397 |
2 | |
Email + Call | 145.5017 |
3 | |
Email + Call | 144.9161 |
1 | |
Email + Call | 124.8799 |
- since the company is focused on increasing revenue and profits by selling more products, I would suggest they focus on the metric of average weekly revenue generated from each method as their main performance indicator.
- this would be the best metric to focus on because it can be measured over time and can be analyzed regularly on a weekly basis, which will also allow them to pivot should results and sales not fall in line with their goal.
- looking at the weekly average revenue per sales method, we can see that the ‘Email + Call’ approach typically trended upwards week over week, with only a slight decline between week 2 and 3, and that was completely made up in the following weeks.
- By analyzing this suggested metric of average weekly revenue per sales method, we could observe if this trend continues for the three different approaches.
Final Summary & Recommendations
- First, I recommend the company increases the focus on the ‘Email + Call’ method because of the success it displayed in total average revenue generated per sale and also because it showed a consistent increase across most weeks, only decreasing slightly between weeks 2 and 3.
- Next, because Email also displayed somewhat positive results that trended upwards across the 6 week period, I would suggest the company conveys stronger value in messages with more targeted emails into their strategy as this could be a way to add some extra revenue to the pipeline with little to no extra effort and a light lift on their end.
- I would recommend the team focuses on the metric of average revenue generated on a weekly basis for each sales method as this can be measured over time, allowing trends to appear and be analyzed to predict future outcomes.
- I would suggest that the team decreases the focus on the ‘Call’ method because it showed poor results that were significantly lower than the other approaches
- Improve Data Collection practices controlling for getting missing data