Enter your name here: Labdhi Ghelani
Step 1: Import the Global Orders 2016 data set using the read.csv function (10 points)
## [1] "/Users/labdhighelani/Desktop/OneDrive - Northeastern University/Academics/Communication Visualization/Week 3/Assignment"
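The import in Step 1 might look like the sketch below. The file and the column names (`Market`, `Sales`, `Quantity`) are assumptions made for illustration; the example writes a tiny stand-in CSV to a temporary path so that it runs without the real data set.

```r
# Hypothetical stand-in for the assignment file: write a tiny CSV to a temp
# path so the example runs without the real data set.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Market,Sales,Quantity",
  "Canada,20.5,2",
  "APAC,310.0,5",
  "EU,95.25,3"
), csv_path)

# read.csv returns a data frame; stringsAsFactors = FALSE keeps Market as text
orders <- read.csv(csv_path, stringsAsFactors = FALSE)

str(orders)   # check column names and types
head(orders)  # preview the first rows
```

With the real file, `csv_path` would simply be the path to the downloaded CSV, either absolute or relative to the working directory printed above.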
Step 2 - Create a horizontal barplot of Total Sales by Market, ordered descending on total sales (i.e. the market with highest total sales should be on the top) (30 points)
##   Market Sales
## 3 Canada   384
## 1 Africa  4587
## 4   EMEA  5029
## 7     US  9994
## 5     EU 10000
## 6  LATAM 10294
## 2   APAC 11002
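A minimal sketch of how the ordered horizontal bar plot could be produced from these totals in base R. The data frame is typed in from the table above; with the raw data, the totals would presumably come from something like `aggregate(Sales ~ Market, data = orders, FUN = sum)`, where `orders` is an assumed name for the imported data frame.

```r
# Totals copied from the table above, in their original (alphabetical) row order
totals <- data.frame(
  Market = c("Africa", "APAC", "Canada", "EMEA", "EU", "LATAM", "US"),
  Sales  = c(4587, 11002, 384, 5029, 10000, 10294, 9994)
)

# barplot(horiz = TRUE) draws the first bar at the bottom, so sort ascending
# to place the market with the highest total at the top.
totals <- totals[order(totals$Sales), ]

par(mar = c(4, 6, 2, 1))  # widen the left margin for the market labels
barplot(totals$Sales, names.arg = totals$Market, horiz = TRUE, las = 1,
        xlab = "Total Sales", main = "Total Sales by Market")
```

Sorting ascending looks backwards at first, but it is what puts the largest market on top of a horizontal plot.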
Answer the following question: Does any market appear to be an outlier in total sales?
Canada, with total sales of 384, lies farthest from the other markets, although it is not a significant outlier. It is highlighted in blue in the bar plot.
Can you visually indicate this market as a separate color from the others?
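One way to highlight a single market is to pass `barplot` a vector with one color per bar. The sketch below assumes the totals are already sorted ascending; the grey used for the other bars is an arbitrary choice.

```r
# Totals from the bar plot step, already sorted ascending by Sales
totals <- data.frame(
  Market = c("Canada", "Africa", "EMEA", "US", "EU", "LATAM", "APAC"),
  Sales  = c(384, 4587, 5029, 9994, 10000, 10294, 11002)
)

# One color per bar: blue for Canada, grey for every other market
bar_cols <- ifelse(totals$Market == "Canada", "blue", "grey70")

par(mar = c(4, 6, 2, 1))  # widen the left margin for the market labels
barplot(totals$Sales, names.arg = totals$Market, horiz = TRUE, las = 1,
        col = bar_cols, xlab = "Total Sales", main = "Total Sales by Market")
```

Building the color vector from the `Market` column keeps the highlight attached to Canada even if the sort order changes.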
Step 3 - Create a line chart of total sales by year for each market (30 points)
Answer the following question: Does the same market appear to be an outlier in your line graph as well? Can you visually indicate this market as a separate color from the other markets in your graph?
Yes. The line chart shows that Canada has the lowest total sales over the years; it is drawn in blue, as in the bar plot.
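A line chart of this kind could be sketched in base R as follows. The yearly records, the years themselves, and the column names `Year`, `Market` and `Sales` are hypothetical and serve only to make the example runnable.

```r
# Hypothetical yearly records; the real column names and values may differ
orders <- data.frame(
  Year   = rep(2013:2015, each = 4),
  Market = rep(c("APAC", "Canada", "APAC", "Canada"), times = 3),
  Sales  = c(300, 10, 250, 15, 320, 12, 280, 20, 350, 18, 300, 25)
)

# Sum sales per year and market into a Year x Market matrix
by_year <- tapply(orders$Sales, list(orders$Year, orders$Market), sum)
years   <- as.integer(rownames(by_year))

# Blue for Canada, grey for the rest, matching the bar plot
line_cols <- ifelse(colnames(by_year) == "Canada", "blue", "grey60")

# matplot draws one line per matrix column
matplot(years, by_year, type = "l", lty = 1, lwd = 2, col = line_cols,
        xaxt = "n", xlab = "Year", ylab = "Total Sales",
        main = "Total Sales by Year and Market")
axis(1, at = years)  # label whole years only
legend("right", legend = colnames(by_year), col = line_cols, lty = 1, lwd = 2)
```

Reusing the same color rule as the bar plot keeps the highlighted market consistent across charts.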
Step 4 - Create a box plot of total sales by market (30 points)
Does the general pattern you observe match that of the earlier steps 2 and 3?
Answer below:
Yes. Steps 2 and 3 showed that the Canada market had the lowest sales, and the box plot reflects the same pattern.
What other insights can you draw from your box plot above?
Answer below:
Each market is represented by a box with whiskers, and the line inside each box marks the median. The medians for Africa, Canada, EMEA and US are similar, at approximately 60 in total sales, and the first quartile for each of these markets starts near 10. Each box spans the middle 50% of the sales. The IQR is close to 150, which means the middle 50% of the sales lie roughly between 20 and 150. APAC has the highest maximum value. Outliers are detected for every market; those for APAC begin at around 800. We can also conclude that Africa, Canada and EMEA have distributions (center, variation and shape) more similar to each other than to those of APAC, EU, LATAM and US.
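The quantities read off the box plot (median, quartiles, IQR, outliers) can also be computed directly. The sketch below uses a small hypothetical data set, so its numbers are not those of the assignment data.

```r
# Hypothetical order-level sales for two markets (illustration only)
orders <- data.frame(
  Market = rep(c("APAC", "Canada"), each = 5),
  Sales  = c(20, 60, 150, 400, 900, 10, 30, 60, 90, 200)
)

# Five-number summary per market: min, Q1, median, Q3, max
five_num <- tapply(orders$Sales, orders$Market, quantile)

# IQR = Q3 - Q1, the height of each box
iqr_by_market <- tapply(orders$Sales, orders$Market, IQR)

# boxplot.stats() returns the points that fall beyond the whiskers,
# using the default 1.5 * IQR rule
outliers <- lapply(split(orders$Sales, orders$Market),
                   function(x) boxplot.stats(x)$out)

print(five_num)
print(iqr_by_market)
print(outliers)
```

Checking the computed quartiles against the plot is a quick way to confirm values that are hard to read from the axis.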
Can you visually indicate this market as a separate color from the others in your boxplot graph?
The Canada market is distinguished in the box plot with a separate red color.
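A minimal sketch of a box plot with one highlighted group, assuming columns named `Market` and `Sales`. The values are hypothetical.

```r
# Hypothetical order-level sales (illustration only)
orders <- data.frame(
  Market = rep(c("APAC", "Canada", "EU"), each = 4),
  Sales  = c(20, 150, 400, 900, 10, 30, 60, 90, 25, 80, 200, 500)
)

# boxplot() orders the groups by factor level, so build the colors from the
# levels rather than from the rows
orders$Market <- factor(orders$Market)
box_cols <- ifelse(levels(orders$Market) == "Canada", "red", "grey80")

bp <- boxplot(Sales ~ Market, data = orders, col = box_cols,
              ylab = "Sales", main = "Sales by Market")
```

The returned object `bp` also holds the plotted statistics (`bp$stats`) and any outliers (`bp$out`), which helps when reading values off the chart is imprecise.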
Step 5 - Can you generate a different type of plot other than what was produced in steps 2, 3, & 4?
Not a bar plot, line graph, or box plot (Extra credit 10 points)
## [1] 0.3135772
The scatterplot shows a weak positive linear relationship between item Quantity and Sales, with a correlation coefficient of 0.3135772. A regression line is added to show the fitted relationship between Quantity and Sales.
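The correlation and the regression line described above could be obtained along these lines. The data points are hypothetical, so the resulting coefficient will differ from the 0.3135772 reported for the real data.

```r
# Hypothetical Quantity/Sales pairs (illustration only)
orders <- data.frame(
  Quantity = c(1, 2, 2, 3, 4, 5, 6, 7),
  Sales    = c(40, 15, 120, 60, 30, 210, 90, 150)
)

# Pearson correlation coefficient between the two columns
r <- cor(orders$Quantity, orders$Sales)

# Least-squares line: Sales = intercept + slope * Quantity
fit <- lm(Sales ~ Quantity, data = orders)

plot(orders$Quantity, orders$Sales, pch = 19, col = "steelblue",
     xlab = "Quantity", ylab = "Sales", main = "Sales vs. Quantity")
abline(fit, col = "red", lwd = 2)  # overlay the fitted regression line

print(r)
```

For a simple regression with one predictor, the squared correlation equals the R-squared reported by `summary(fit)`, which gives a quick consistency check.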