Loading in Data

library(readr)
labtest <- read_csv("C:/Users/cbado/OneDrive/Documents/R Code/labtest.csv")

labtest$region <- as.factor(labtest$region)

Question 1

library(ggplot2)
library(cowplot)

ggplot(labtest, aes(x="", y=sales)) + geom_boxplot(fill="#0076B6")+ggtitle("Sales for All Sectors & Regions")+xlab("All Sectors & Regions")+ylab("Sales (thousands)")+scale_y_continuous(labels=scales::dollar_format())

ggplot(labtest, aes(x="", y=profit)) + geom_boxplot(fill="#B0B7BC")+ggtitle("Profit for All Sectors & Regions")+xlab("All Sectors & Regions")+ylab("Profit (thousands)")+scale_y_continuous(labels=scales::dollar_format())

ggplot(labtest, aes(x="", y=newinvest)) + geom_boxplot(fill="#32CD32")+ggtitle("New Investment for All Sectors & Regions")+xlab("All Sectors & Regions")+ylab("New Investment (thousands)")+scale_y_continuous(labels=scales::dollar_format())

ggplot(labtest, aes(x=region, y=sales, fill=region)) + geom_boxplot()+ggtitle("Sales by Region")+xlab("Region")+ylab("Sales (thousands)")+scale_y_continuous(labels=scales::dollar_format())+scale_fill_manual(values=c("#FFC906","#223971"))+labs(fill = "Region")

ggplot(labtest, aes(x=region, y=profit, fill=region)) + geom_boxplot()+ggtitle("Profit by Region")+xlab("Region")+ylab("Profit (thousands)")+scale_y_continuous(labels=scales::dollar_format())+scale_fill_manual(values=c("#FFC906","#223971"))+labs(fill = "Region")

ggplot(labtest, aes(x=region, y=newinvest, fill=region)) + geom_boxplot()+ggtitle("New Investments by Region")+xlab("Region")+ylab("New Investment (thousands)")+scale_y_continuous(labels=scales::dollar_format())+scale_fill_manual(values=c("#FFC906","#223971"))+labs(fill = "Region")

The generated boxplots suggest the following insights:

  • Sales
    • Sales range between ~55,000,000 and ~102,500,000 dollars, with a median value of ~68,000,000 dollars.
  • Profits
    • Profits range between ~50,000 and ~4,300,000 dollars, with a median value of ~850,000 dollars. There are a few high-value profit outliers. Based on the medians, the profit margin is about 1.25%.
  • New Investments
    • New Investments range between ~6,500,000 and ~27,000,000 dollars, with a median value of ~14,750,000 dollars. Based on the medians, new investment is about 0.2x sales and 17x profit for the reported period.
  • Regional Comparisons
    • Sales, profit, and new investment are all greater in Region 2. However, Region 2 has a wider IQR for both sales and new investment, and its range of new investments is much greater. Region 1 profit margins appear stronger, and high-value profit outliers are more likely in Region 1.
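
The ratios quoted in the bullets above can be reproduced with a few lines of arithmetic. The sketch below uses the approximate medians read off the boxplots, not exact values from the data. The helper `regional_medians` is a hypothetical name introduced here for illustration; it assumes the data frame has columns named `sales`, `profit`, `newinvest`, and `region`, as in `labtest`.

```r
# Approximate medians read off the boxplots (in dollars, as quoted in the text)
median_sales     <- 68e6
median_profit    <- 850e3
median_newinvest <- 14.75e6

# Ratios quoted in the bullets
profit_margin    <- median_profit / median_sales      # profit as a share of sales
invest_to_sales  <- median_newinvest / median_sales   # new investment relative to sales
invest_to_profit <- median_newinvest / median_profit  # new investment relative to profit

print(c(profit_margin    = profit_margin,
        invest_to_sales  = invest_to_sales,
        invest_to_profit = invest_to_profit))

# Hypothetical helper: exact medians per region computed from the data itself.
# Assumes columns named sales, profit, newinvest, and region.
regional_medians <- function(df) {
  aggregate(cbind(sales, profit, newinvest) ~ region, data = df, FUN = median)
}
```

Calling `regional_medians(labtest)` would replace the eyeballed values with exact per-region medians, which makes the regional comparison easier to check.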

Question 2

ggplot(labtest, aes(x=sales, y=profit, color=region)) + geom_point()+ggtitle("Profit by Sales")+xlab("Sales")+ylab("Profit")+scale_y_continuous(labels=scales::dollar_format())+scale_x_continuous(labels=scales::dollar_format())+scale_color_manual(values=c("#FFC906","#223971"))+labs(color = "Region")+stat_smooth(method = "lm", formula = y ~ x, size = 1)

ggplot(labtest, aes(x=newinvest, y=sales, color=region)) + geom_point()+ggtitle("Sales by New Investment")+xlab("New Investment")+ylab("Sales")+scale_y_continuous(labels=scales::dollar_format())+scale_x_continuous(labels=scales::dollar_format())+scale_color_manual(values=c("#FFC906","#223971"))+labs(color = "Region")+stat_smooth(method = "lm", formula = y ~ x, size = 1)

ggplot(labtest, aes(x=newinvest, y=profit, color=region)) + geom_point()+ggtitle("Profit by New Investment")+xlab("New Investment")+ylab("Profit")+scale_y_continuous(labels=scales::dollar_format())+scale_x_continuous(labels=scales::dollar_format())+scale_color_manual(values=c("#FFC906","#223971"))+labs(color = "Region")+stat_smooth(method = "lm", formula = y ~ x, size = 1)

The generated scatter plots suggest the following insights:

  • Profitability
    • On average, above sales of ~$62,000,000, Region 1 appears to earn increasingly larger profits than Region 2. However, for any given sales amount, profit varies more in Region 1 than in Region 2.
  • New Investments
    • New investments contribute to sales growth at a roughly equal rate in Regions 1 and 2. However, sales return on investment is higher in Region 2, and more high-value investments are being made there.
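
The comparisons above are read off the fitted lines, so it can help to look at the fitted coefficients directly. The following is a minimal sketch, assuming the same simple linear fit per region that `stat_smooth(method = "lm")` draws when points are coloured by `region`. The function name `region_slopes` and its arguments are hypothetical, introduced only for this illustration.

```r
# Hypothetical helper: fit y ~ x separately within each level of `group`
# and return one row of (intercept, slope) per level.
region_slopes <- function(df, x, y, group = "region") {
  fits <- lapply(split(df, df[[group]]), function(d) {
    coef(lm(reformulate(x, response = y), data = d))
  })
  out <- do.call(rbind, fits)
  colnames(out) <- c("intercept", "slope")
  out
}
```

For example, `region_slopes(labtest, "newinvest", "sales")` would return one slope per region. Similar slopes would support the "equal rate" reading, while a gap between intercepts would show up as one region's line sitting above the other.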

Question 3

library(DataExplorer)
library(tidyverse)

plot_correlation(labtest %>% select(profit, sales, newinvest))

The correlation map of profit, sales, and new investment suggests that new investment is positively associated with sales, while its association with profit is only about half as strong, on average. Sales and profit are also less than perfectly correlated, so profitability does not necessarily grow in step with sales, perhaps because of the added costs of meeting higher sales volumes.
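
To read the exact coefficients behind the map, the same three columns can be passed to base R's `cor()`. This is a sketch that assumes the map shows Pearson correlations, which is the `cor()` default; the helper name `correlation_table` is hypothetical. A correlation coefficient measures how tightly two variables move together, not how many dollars one changes per dollar of the other.

```r
# Hypothetical helper: Pearson correlation matrix for the chosen numeric
# columns, rounded to two decimals for easier reading.
correlation_table <- function(df, cols = c("profit", "sales", "newinvest")) {
  round(cor(df[, cols], method = "pearson"), 2)
}
```

`correlation_table(labtest)` would return a 3 x 3 matrix whose off-diagonal entries can be compared directly with the shading in the plot.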

Question 4

Region <- c("1", "2")
Count <- c(nrow(labtest[labtest$region == "1",]), nrow(labtest[labtest$region == "2",]))

q4 <- data.frame(Region, Count)

ggplot(q4, aes(x = Region, y = Count, fill = Region)) +
  geom_bar(stat = "identity") + ggtitle("Number of Records by Region") +
  xlab("Region") + ylab("Number of Records") +
  scale_fill_manual(values = c("#FFC906", "#FF4300")) +
  geom_text(aes(label = Count), position = position_stack(vjust = 0.5))
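
The record counts can also be built without hard-coding the region labels. This is a minimal sketch using base R's `table()`; the helper name `count_by_region` is hypothetical, and it assumes the data frame has a `region` column as in `labtest`.

```r
# Hypothetical helper: one row per region with the number of records,
# in the same Region/Count layout as the q4 data frame.
count_by_region <- function(df) {
  as.data.frame(table(Region = df$region), responseName = "Count")
}
```

With this helper, `q4 <- count_by_region(labtest)` could feed the same bar chart and would pick up any additional regions present in the data.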