Business Analytics Lab Worksheet 04 (bsad_lab04)

About

Qualitative Descriptive Analytics aims to gather an in-depth understanding of the underlying reasons and motivations for an event or observation. It is typically represented with visuals or charts.

Quantitative Descriptive Analytics focuses on investigating a phenomenon via statistical, mathematical, and computational techniques. It aims to quantify an event with metrics and numbers.

In this lab, we will explore both types of analytics using the data set provided.

Setup

Remember to always set your working directory to the source file location. Go to ‘Session’, scroll down to ‘Set Working Directory’, and click ‘To Source File Location’. Read the instructions below carefully and follow them to complete the tasks and answer any questions. Submit your work to RPubs as detailed in previous notes.

Note

For your assignment you may be using different data sets from the ones included here. Read the instructions on Sakai carefully.

Task 1: Quantitative Analysis

Begin by reading in the data from the ‘marketing.csv’ file, and viewing it to make sure it is read in correctly.

mydata = read.csv(file="data/marketing.csv")
head(mydata)
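
Beyond head(), it can help to confirm that each column was read with the expected type. The sketch below uses a small made-up table (the values are illustrative only, not the contents of marketing.csv) read from an inline string so that it runs without the file. With the real data you would call str(mydata) directly.

```r
# Hypothetical stand-in for marketing.csv; the numbers are invented for illustration.
csv_text <- "case_number,sales,radio,paper,tv,pos
1,11125,65,35,250,0.0
2,16000,74,62,270,1.5
3,20450,89,89,280,3.0"

toydata <- read.csv(text = csv_text)

str(toydata)                      # column names and types
dim(toydata)                      # number of rows and columns
all(sapply(toydata, is.numeric))  # TRUE when every column was parsed as a number
```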

Now calculate the Range, Min, Max, Mean, STDEV, and Variance for each variable. Below is an example of how to compute the items for the variable ‘sales’.

Sales

# Extract each variable as its own vector
sales = mydata$sales
radio = mydata$radio
paper = mydata$paper
tv = mydata$tv
pos = mydata$pos
# Maximum of each variable
max_s = max(sales)
max_r = max(radio)
max_pap = max(paper)
max_tv = max(tv)
max_pos = max(pos)
max_s

[1] 20450

max_r

[1] 89

max_pap

[1] 89

max_tv

[1] 280

max_pos

[1] 3

# Minimum of each variable
min_s = min(sales)
min_r = min(radio)
min_pap = min(paper)
min_tv = min(tv)
min_pos = min(pos)
min_s

[1] 11125

min_r

[1] 65

min_pap

[1] 35

min_tv

[1] 250

min_pos

[1] 0

#Range
max_s-min_s

[1] 9325

max_r-min_r

[1] 24

max_pap-min_pap

[1] 54

max_tv-min_tv

[1] 30

max_pos-min_pos

[1] 3
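
Base R also provides range(), which returns the minimum and maximum together; wrapping it in diff() gives the spread as a single number. A minimal sketch, using only the minimum and maximum sales values printed earlier in this worksheet:

```r
# Only the two extreme sales values reported in this worksheet are used here.
sales_extremes <- c(11125, 20450)

range(sales_extremes)        # minimum and maximum as a length-2 vector
diff(range(sales_extremes))  # maximum minus minimum: 9325
```

With the full data, diff(range(sales)) should give the same result as max_s-min_s.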

#Mean
mean(sales)

[1] 16717.2

mean(radio)

[1] 76.1

mean(paper)

[1] 62.3

mean(tv)

[1] 266.6

mean(pos)

[1] 1.535

#Standard Deviation
sd(sales)

[1] 2617.052

sd(radio)

[1] 7.354912

sd(paper)

[1] 15.35921

sd(tv)

[1] 11.3388

sd(pos)

[1] 0.7499298

#Variance
var(sales)

[1] 6848961

var(radio)

[1] 54.09474

var(paper)

[1] 235.9053

var(tv)

[1] 128.5684

var(pos)

[1] 0.5623947

#Repeat the above calculations for radio, paper, tv, and pos.

An easy way to calculate many of these statistics for all of the variables at once is the summary() function. Below is an example.

summary(mydata)

  case_number        sales           radio           paper             tv             pos       
 Min.   : 1.00   Min.   :11125   Min.   :65.00   Min.   :35.00   Min.   :250.0   Min.   :0.000  
 1st Qu.: 5.75   1st Qu.:15175   1st Qu.:70.00   1st Qu.:53.75   1st Qu.:255.0   1st Qu.:1.200  
 Median :10.50   Median :16658   Median :74.50   Median :62.50   Median :270.0   Median :1.500  
 Mean   :10.50   Mean   :16717   Mean   :76.10   Mean   :62.30   Mean   :266.6   Mean   :1.535  
 3rd Qu.:15.25   3rd Qu.:18874   3rd Qu.:81.75   3rd Qu.:75.50   3rd Qu.:276.2   3rd Qu.:1.800  
 Max.   :20.00   Max.   :20450   Max.   :89.00   Max.   :89.00   Max.   :280.0   Max.   :3.000

#Repeat the above for the variable sales. There are some statistics not calculated with the summary() function. Specify which.
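
The summary() output reports the minimum, quartiles, median, mean, and maximum, but not the range, standard deviation, or variance. One way to get those for every column at once is sapply() with a small helper function. The sketch below uses a made-up data frame, and describe_column is a hypothetical helper name, not part of base R.

```r
# Made-up values for illustration; replace toydata with mydata in the lab.
toydata <- data.frame(
  sales = c(11125, 16000, 20450),
  radio = c(65, 74, 89)
)

# Hypothetical helper: the Task 1 statistics for one numeric vector.
describe_column <- function(x) {
  c(min      = min(x),
    max      = max(x),
    range    = max(x) - min(x),
    mean     = mean(x),
    sd       = sd(x),
    variance = var(x))
}

# One column of statistics per variable.
stats <- sapply(toydata, describe_column)
round(stats, 2)
```

The result is a matrix with one row per statistic and one column per variable, so a single value can be looked up by name, for example stats["sd", "sales"].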

Task 2: Qualitative Analysis

Now, we will produce a basic plot of the ‘sales’ variable. Here we use the plot function, passing it the variable we want to plot.

plot(sales)

We can customize the plot by adding labels to the x- and y-axes.

#xlab labels the x axis, ylab labels the y axis
plot(sales, type="b", xlab = "Case Number", ylab = "Sales in $1,000")

There are further ways to customize plots, such as changing the colors of the lines, adding a heading, or even making them interactive.
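
As a minimal sketch of some of those options, the call below adds a title with main, colors the line with col, and changes the point symbol with pch. The sales vector is made up for illustration.

```r
# Made-up sales figures for illustration only.
toy_sales <- c(11125, 15200, 13800, 18900, 20450)

plot(toy_sales,
     type = "b",             # points joined by lines
     col  = "blue",          # line and point color
     pch  = 19,              # solid circles
     main = "Sales by Case", # heading above the plot
     xlab = "Case Number",
     ylab = "Sales")
```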

Now, let's plot the sales graph alongside the radio, paper, and tv graphs, which you will code. Make sure to run the code in the same chunk so that all four plots appear on the same layout.

#Layout allows us to see all 4 graphs on one screen
layout(matrix(1:4,2,2))
#Example of how to plot the sales variable
plot(sales, type="b", xlab = "Case Number", ylab = "Sales in $1,000") 
#Plot of Radio. Label properly
plot(radio, type = "b",xlab = "Case Number", ylab = "Radio in 10")

#Plot of Paper. Label properly
plot(paper, type = "b",xlab = "Case Number", ylab = "Paper in 20")
#Plot of TV. Label properly
plot(tv, type = "b",xlab = "Case Number", ylab = "tv in 15")

When looking at these plots, it is hard to see a particular trend. One way to observe any possible trend in the sales data is to re-order the data from low to high. The 20 monthly case studies are in no particular chronological order, and the 20 case numbers are simply independent, sequentially generated identifiers. Since each case is independent, we can reorder them.

#Re-order sales from low to high, and save the re-ordered data in a new set. Because whole rows are re-ordered, the values in the other columns follow their sales value.
newdata = mydata[order(sales),]
head(newdata)
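
To see why the other columns follow along, note that order() returns row positions rather than sorted values, and indexing the data frame with those positions moves whole rows. A small sketch with made-up values:

```r
# Made-up values for illustration only.
toydata <- data.frame(
  case_number = 1:4,
  sales       = c(300, 100, 400, 200),
  radio       = c(30, 10, 40, 20)
)

idx <- order(toydata$sales)  # row positions from lowest to highest sales
idx                          # 2 4 1 3

sorted <- toydata[idx, ]     # reorder whole rows, so radio stays paired with its sales value
sorted
```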

# Redefine the new variables 
newsales = newdata$sales
newradio = newdata$radio
newtv = newdata$tv
newpaper = newdata$paper

#Repeat the 4-graph layout with proper labeling, this time using the four new variables for sales, radio, tv, and paper.
#Layout allows us to see all 4 graphs on one screen
layout(matrix(1:4,2,2))
#Plot of re-ordered sales. The x axis is now the position after sorting, not the original case number.
plot(newsales, type="b", xlab = "Case (ordered by sales)", ylab = "Sales in $1,000")
#Plot of Radio. Label properly
plot(newradio, type = "b", xlab = "Case (ordered by sales)", ylab = "Radio in 10")
#Plot of Paper. Label properly
plot(newpaper, type = "b", xlab = "Case (ordered by sales)", ylab = "Paper in 20")
#Plot of TV. Label properly
plot(newtv, type = "b", xlab = "Case (ordered by sales)", ylab = "tv in 15")

Share your observations on what the new plots reveal in terms of trending relationships.

The important detail is that, after sorting the data by sales, the plots show an approximately linear trend.
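
A visual impression of a linear trend can be checked numerically with cor(), which returns the Pearson correlation coefficient (values near 1 or -1 indicate a strong linear relationship). The sketch below uses made-up vectors; in the lab, the same call could be applied to pairs such as newsales and newradio.

```r
# Made-up values for illustration only; not taken from marketing.csv.
toy_sales <- c(100, 200, 300, 400, 500)
toy_radio <- c(11, 19, 32, 38, 52)

r <- cor(toy_sales, toy_radio)  # Pearson correlation by default
r
```

Reordering the rows does not change this value, because the correlation depends only on how the values are paired, not on the order in which the pairs appear.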

Task 3: Standardized Z-Value

Given a sales value of $25000, calculate the corresponding z-value or z-score using the mean and standard deviation calculations conducted in task 1. We know that z-score = (x - mean)/sd.

#  Show calculations here
sales_mean= mean(sales)
sales_sd=sd(sales)
z_s_sales=(25000 - sales_mean)/sales_sd
z_s_sales

[1] 3.164935

Based on the z-value, how would you rate a $25000 sales value: poor, average, good, or very good performance? Explain your logic.

This z-score indicates very good performance. The average sales value is 16717.2, and a sales value of $25000 lies about 3.16 standard deviations above it. Looking up 3.16 in a z-table gives a cumulative probability between 0.9990 and 0.9993, so, assuming sales are approximately normally distributed, this value falls well within the top 1% of sales. That is why we consider it very good.
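
Instead of reading a printed z-table, the cumulative probability can be computed with pnorm(). The sketch below reuses the mean and standard deviation of sales as printed in Task 1 (so they are rounded) and assumes sales are approximately normally distributed, which this worksheet does not verify.

```r
# Mean and standard deviation of sales as printed in Task 1.
sales_mean <- 16717.2
sales_sd   <- 2617.052

z <- (25000 - sales_mean) / sales_sd  # about 3.16
p_below <- pnorm(z)                   # share of a standard normal distribution below z
p_above <- 1 - p_below                # share above z

z
p_below
p_above
```

Under the normality assumption, p_above is the fraction of cases expected to exceed $25000 in sales.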

Business Analytics Lab Worksheet 04 (bsad_lab04)

CME Group Foundation Business Analytics Lab

Brent Predovich