---
title: "Mini Project 1"
output: html_notebook
---
Myra Hallman

Marketing Analytics

24 Feb 2019

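```{r}
# tidyverse attaches ggplot2 and dplyr, so they need not be loaded separately
library(scales)
library(tidyverse)

# Load the sales data and drop rows with missing values
sales.df <- read.csv("LaptopSales.csv")
sales.df <- na.omit(sales.df)

# Preview the first rows; View() only works in an interactive session
head(sales.df)
```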


# Price Questions

•	At what price are the laptops actually selling?

The prices range from under $200 to over $750, and sale prices are approximately normally distributed. The average retail price of laptops sold is just under $500.

```{r}
ggplot(sales.df) +
  geom_histogram(aes(x=Retail.Price), binwidth=20) +
  labs(title = "Sell Price of Laptops", x="Retail Price", y="Number Sold")
```
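
The range and average quoted above can also be read off numerically rather than from the histogram. A minimal sketch, assuming a small made-up vector `toy.prices` that stands in for `sales.df$Retail.Price` (the values are hypothetical, not taken from the data):

```{r}
# Hypothetical prices standing in for sales.df$Retail.Price
toy.prices <- c(195, 420, 455, 480, 495, 505, 530, 560, 610, 760)

price.range <- range(toy.prices)  # smallest and largest sale price
price.mean  <- mean(toy.prices)   # average sale price
price.sd    <- sd(toy.prices)     # spread around the average

price.range
price.mean
price.sd
```

Running the same three calls on `sales.df$Retail.Price` would give the figures that the histogram only shows approximately.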




•	Does price change with time? (Hint: Make sure that the date column is recognized as such. The software should then enable different temporal aggregation choices, e.g., plotting the data by weekly or monthly aggregates, or even by day of week.)

The smoothed line graph below shows that the price of laptops does change over time. The average price peaks between July and August and drops sharply in April. This suggests seasonality in the demand for laptops.

```{r}
ggplot(sales.df) +
  geom_smooth(aes(x=as.Date(Date), y=Retail.Price)) +
  labs(title = "Price by Date", x="Month", y="Retail Price") 
```
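
The hint in the question mentions monthly and day-of-week aggregates. One way to get them is sketched below with base R. The data frame `toy.sales` is made up, and the "month/day/year hour:minute" layout of the `Date` text is an assumption; the `format` string would need to match whatever layout the real column uses.

```{r}
# Hypothetical sales rows; the date layout is an assumption
toy.sales <- data.frame(
  Date = c("1/5/2008 9:15", "1/20/2008 14:02", "2/3/2008 11:30",
           "2/17/2008 16:45", "3/9/2008 10:05"),
  Retail.Price = c(500, 520, 480, 460, 510)
)

# Parse the text column into a Date, stating the format explicitly;
# characters after the matched date part are ignored
toy.sales$Sale.Date <- as.Date(toy.sales$Date, format = "%m/%d/%Y")

# Derive aggregation keys: calendar month and day of week
toy.sales$Month   <- format(toy.sales$Sale.Date, "%Y-%m")
toy.sales$Weekday <- weekdays(toy.sales$Sale.Date)

# Mean price per month
monthly.mean <- tapply(toy.sales$Retail.Price, toy.sales$Month, mean)
monthly.mean
```

Swapping `toy.sales$Month` for `toy.sales$Weekday` in the `tapply()` call gives the day-of-week view instead.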



•	Are prices consistent across retail outlets?

The prices are close, but not entirely consistent overall. In the graph below, the majority of stores have a mean price of approximately $520. Five of the stores, however, sell laptops at a discounted mean price of approximately $470, roughly 10% lower.


```{r}
pricegroup.df <- group_by(sales.df, Store.Postcode) 
price.by.postcode.df <- summarize(pricegroup.df, mean_price=mean(Retail.Price), 
                                  max_price=max(Retail.Price), min_price=min(Retail.Price)) 
ggplot(price.by.postcode.df) +
  geom_point (aes(x=Store.Postcode, y=mean_price)) +
  labs(title = "Mean Price by Store Location", x="Store Postcode", y="Mean Price") 
```
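
To see which stores sit below the rest without reading values off the plot, each store's mean can be compared with the average of the store means. This is only a sketch: `toy.stores`, its postcodes, and its prices are made up for illustration.

```{r}
# Hypothetical store-level data; postcodes and prices are made up
toy.stores <- data.frame(
  Store.Postcode = c("A", "A", "B", "B", "C", "C"),
  Retail.Price   = c(520, 530, 515, 525, 460, 480)
)

# Mean price per store
store.means <- tapply(toy.stores$Retail.Price, toy.stores$Store.Postcode, mean)

# Gap between each store's mean and the average of the store means
gap <- store.means - mean(store.means)

# Stores with the largest discount come first
sort(gap)
```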




•	How does price change with configuration?

According to the data, and as visualized in the graph below, the configuration number and retail price are closely related: as the configuration number increases, so does the retail price.


```{r}
ggplot(sales.df) +
  geom_smooth(aes(x=Retail.Price, y=Configuration)) +
  labs(title = "Price Compared to Configuration", x="Retail Price", y="Configuration Number")
```
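
How closely the two move together can be put into a single number with a rank correlation. The sketch below uses Spearman's method on the assumption that configuration numbers are ordered labels rather than measured quantities; `toy.config` and its values are hypothetical.

```{r}
# Hypothetical configuration/price pairs, made up for illustration
toy.config <- data.frame(
  Configuration = c(10, 50, 120, 200, 310, 400),
  Retail.Price  = c(310, 355, 400, 470, 540, 600)
)

# Spearman's rank correlation: values near 1 mean price rises with configuration
rho <- cor(toy.config$Configuration, toy.config$Retail.Price,
           method = "spearman")
rho
```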


# Revenue Questions


•	How do the sales volume in each store relate to Acell’s revenues?

The table below shows the relationship between each store's revenue and Acell's overall revenue. One column gives each store's revenue, and another gives that revenue as a percentage of Acell's overall revenue. The stores contribute at very different levels. The bar graph below the table shows each store postcode and its contribution to the overall revenue.


```{r}
totalrevenuegroup.df <- group_by(sales.df, Store.Postcode)
revenue.by.business.df <- summarize(totalrevenuegroup.df,
                              Total_Revenue = sum(Retail.Price),
                              # Share of all units sold (sales volume)
                              percentage_rows = n() / nrow(sales.df) * 100,
                              # Share of Acell's total revenue
                              percentage_revenue = sum(Retail.Price) / sum(sales.df$Retail.Price) * 100)

revenue.by.business.df
```
```{r}
ggplot(revenue.by.business.df) +
  geom_col(aes(x = Store.Postcode, y = percentage_revenue)) +
  labs(title = "Store Revenue", x = "Store Postcode", y = "Percent of Acell's Total Revenue")
```




•	How does this relationship depend on the configuration?

The bar graph below, which sums the configuration numbers sold at each store postcode, looks very similar to the graph above. This suggests that the greater share of Acell's revenue comes from stores that primarily sell higher configuration numbers. The data is very consistent with this conclusion.

```{r}
ggplot(sales.df) +
  geom_bar (aes(x=Store.Postcode, y=Configuration), stat="identity")  +
  labs(title = "Store Compared to Configuration", x="Store Postcode", y="Configuration")
```




Below is another graph that shows how each store's total retail price (its revenue) is split across configuration groups, as proportions.


```{r}
sales.df <- mutate(sales.df, Configuration.Small=round(Configuration/100))

# fun replaces the deprecated fun.y argument (ggplot2 3.3.0 and later)
ggplot(sales.df) +
  geom_bar(aes(x=Store.Postcode, y=Retail.Price, fill=factor(Configuration.Small)), stat="summary",
           fun='sum', position = 'fill') +
  labs(title = "Retail Price and Store Postcode", x="Store Postcode", y="Share of Retail Price",
       fill="Configuration Group")
```



# Configuration Questions


•	What are the details of each configuration? How does this relate to price?

Below are three examples of how different configurations relate to screen size, battery life, and RAM. They are consistent with the increase in price as the configuration number grows: with a higher configuration, the features tend to improve as well. Not all features improve at once, but when at least one feature improves, the price can be expected to rise.



The bar graph below shows how screen size varies with configuration. There are two different screen sizes, and the larger screen size is found in the configurations with higher numbers.


```{r}
Screensize.df <- summarize(group_by(sales.df, Configuration), mean.screen.size.inches=mean(Screen.Size..Inches., na.rm = TRUE))

ggplot(Screensize.df) +
  geom_bar(aes(x=Configuration, y=mean.screen.size.inches), stat="identity") +
  labs(title = "Screen Size and Configuration", x="Configuration", y="Mean Screen Size in Inches")

```



Battery life comes in 4-, 5-, or 6-hour options. It grows steadily as the configuration number rises, and once it reaches the maximum of 6 hours it drops back to 4 hours.



```{r}
battery.life.df <- summarize(group_by(sales.df, Configuration), mean.battery.life=mean(Battery.Life..Hours., na.rm = TRUE))

ggplot(battery.life.df) +
  geom_bar(aes(x=Configuration, y=mean.battery.life), stat="identity") +
  labs(title = "Battery Life and Configuration", x="Configuration", y="Mean Battery Life in Hours")
```


RAM size also comes in three options: 1 GB, 2 GB, and 4 GB. These also grow incrementally as the configuration number grows, and the options cycle six times across the range of configurations.


```{r}
RAMsize.df <- summarize(group_by(sales.df, Configuration), mean.RAM.GB=mean(RAM..GB., na.rm = TRUE))

ggplot(RAMsize.df) +
  geom_bar(aes(x=Configuration, y=mean.RAM.GB), stat="identity") +
  labs(title = "RAM Size and Configuration", x="Configuration", y="Mean RAM Size in GB")
```
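
The three charts above plot a mean per configuration, so another way to see the details is to list the distinct feature combinations directly. A minimal sketch, assuming a made-up data frame `toy.specs` that reuses the column names from `sales.df`; the values are hypothetical.

```{r}
# Hypothetical rows; column names follow sales.df, values are made up
toy.specs <- data.frame(
  Configuration        = c(1, 1, 2, 2, 3),
  Screen.Size..Inches. = c(15, 15, 15, 15, 17),
  Battery.Life..Hours. = c(4, 4, 5, 5, 4),
  RAM..GB.             = c(1, 1, 1, 1, 2)
)

# Keep one row per distinct combination of configuration and features
spec.table <- unique(toy.specs)
rownames(spec.table) <- NULL

# TRUE when every configuration maps to exactly one set of features
one.spec.each <- !any(duplicated(spec.table$Configuration))

spec.table
one.spec.each
```

If `one.spec.each` came out `FALSE` on the real data, the per-configuration means would be averaging over different specifications and should be read with care.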




•	Do all stores sell all configurations?

Yes, all stores sell all configurations, as the list below shows. The differences lie in the number of each configuration sold through each store.



```{r}
configurationsgroup.df <- group_by(sales.df, Store.Postcode, Configuration) 
configurations.by.business.df <- summarize(configurationsgroup.df, 
                                           count_rows=n(),
                                           percentage_rows=n()/nrow(configurationsgroup.df)*100)

(configurations.by.business.df)

```
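
The question can also be checked programmatically instead of by scanning the list. The sketch below cross-tabulates store against configuration and looks for empty cells. The data frame `toy.sold` is made up, and one store/configuration pair is deliberately left out to show what a gap looks like.

```{r}
# Hypothetical sales; store B never sells configuration 2
toy.sold <- data.frame(
  Store.Postcode = c("A", "A", "A", "B", "B"),
  Configuration  = c(1, 2, 3, 1, 3)
)

# Units sold for every store/configuration pair, including pairs with zero sales
counts <- table(toy.sold$Store.Postcode, toy.sold$Configuration)

# TRUE only if every store sold every configuration at least once
all.sold <- all(counts > 0)

# List the pairs with no sales (Var1 = store, Var2 = configuration)
gaps <- subset(as.data.frame(counts), Freq == 0)

all.sold
gaps
```

On the real data, an empty `gaps` table would back up the answer above.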





#END
Worked with Jannika, Kaisa, Kia