R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

Note: this analysis was performed using the open-source software R and RStudio.

library(readr)
ad_sales <- read_csv('https://raw.githubusercontent.com/utjimmyx/regression/master/advertising.csv')
## New names:
## • `` -> `...1`
## Rows: 200 Columns: 6
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (6): ...1, X1, TV, radio, newspaper, sales
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
plot(sales ~ TV, data = ad_sales)

plot(sales ~ radio, data = ad_sales)

This is the end of part 1 for my exploratory analysis.

library(ggplot2)
head(ad_sales)
## # A tibble: 6 × 6
##    ...1    X1    TV radio newspaper sales
##   <dbl> <dbl> <dbl> <dbl>     <dbl> <dbl>
## 1     1     1 230.   37.8      69.2  22.1
## 2     2     2  44.5  39.3      45.1  10.4
## 3     3     3  17.2  45.9      69.3   9.3
## 4     4     4 152.   41.3      58.5  18.5
## 5     5     5 181.   10.8      58.4  12.9
## 6     6     6   8.7  48.9      75     7.2
ggplot(data = ad_sales, aes(x = TV)) +
  geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
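
The message above asks for an explicit bin width. A minimal sketch of how that might look, assuming a width of 25 (an arbitrary value chosen for illustration) and using a hypothetical stand-in data frame, `ad_head`, built from the six TV values printed by head(ad_sales). With the full data, `ad_sales` would take its place.

```r
library(ggplot2)

# Hypothetical stand-in: the six TV values shown by head(ad_sales),
# as displayed (the tibble printout rounds to three significant figures).
ad_head <- data.frame(TV = c(230, 44.5, 17.2, 152, 181, 8.7))

# Setting binwidth explicitly replaces the default of 30 bins
# and silences the stat_bin() message.
p_hist <- ggplot(data = ad_head, aes(x = TV)) +
  geom_histogram(binwidth = 25)

print(p_hist)
```

Trying a few widths is usually worthwhile, since a histogram's shape can change noticeably with the bin width.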

Question 1: Is there a relationship between x and y? If so, what does the relationship look like?

Yes. Judging from the scatter plots, the relationship looks positive, roughly linear, and strongly correlated.
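
A visual impression of correlation can be checked numerically with cor(). The sketch below is only illustrative: `ad_head` is a hypothetical stand-in holding the six TV and sales values printed by head(ad_sales), so the result will differ from what the full 200-row `ad_sales` data gives.

```r
# Hypothetical stand-in: six rows from head(ad_sales), as displayed
# (rounded to three significant figures by the tibble printout).
ad_head <- data.frame(
  TV    = c(230, 44.5, 17.2, 152, 181, 8.7),
  sales = c(22.1, 10.4, 9.3, 18.5, 12.9, 7.2)
)

# Pearson correlation: ranges from -1 to 1; the sign gives the direction
# and the magnitude gives the strength of the linear association.
r_tv <- cor(ad_head$TV, ad_head$sales)
print(r_tv)
```

With the full data, the same call would be cor(ad_sales$TV, ad_sales$sales).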

Question 2: What is the meaning of a coefficient? Is there a relationship between TV advertising and Sales? If so, what does the relationship look like?

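A coefficient is a numerical or constant factor that multiplies a variable in an algebraic equation. In a regression, the slope coefficient estimates how much the response (sales) changes for a one-unit increase in the predictor (TV advertising). Yes, there is a relationship between TV advertising and sales: it is positive and linear.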
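
To make the idea of a coefficient concrete, here is a minimal sketch of fitting a simple regression with lm() and reading off the slope. It assumes a hypothetical stand-in data frame, `ad_head`, built from the six rows printed by head(ad_sales). The estimates from six rounded rows are only illustrative and will not match a fit on the full data.

```r
# Hypothetical stand-in: six rows from head(ad_sales), as displayed
# (rounded to three significant figures by the tibble printout).
ad_head <- data.frame(
  TV    = c(230, 44.5, 17.2, 152, 181, 8.7),
  sales = c(22.1, 10.4, 9.3, 18.5, 12.9, 7.2)
)

# Simple linear regression: sales = intercept + slope * TV
fit <- lm(sales ~ TV, data = ad_head)

# coef() returns the intercept and the slope on TV.
print(coef(fit))

# The slope is the estimated change in sales per one-unit increase in TV.
slope_tv <- coef(fit)[["TV"]]
```

With the full data, lm(sales ~ TV, data = ad_sales) followed by summary() would also report standard errors and R-squared.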

Question 3: Which marketing questions can we address with a simple regression analysis? Any limitations?

Simple regression can help address marketing questions such as how advertising spend affects sales, how price relates to sales volume, and how promotions affect customer behavior. However, it assumes a linear relationship, shows association rather than causation, and can oversimplify marketing dynamics because it uses a single predictor and ignores other influencing factors. It is also sensitive to outliers and data quality, and may not capture non-linear or changing market conditions. Once more predictors are added, multicollinearity becomes a concern as well.

Question 4: Can you plot the relationship between radio advertising and Sales? If so, what does the relationship look like?

Yes. The relationship between radio advertising and sales is also positive and linear, but the points are not as tightly correlated as they are for TV.

Question 5: Refer to the readings in Chapter 3 (Exploring Data) of the book Practical Data Science with R, Second Edition, and use at least one of the ggplot2 methods to explore the variable radio ads.

Chapter 3 of Practical Data Science with R, Second Edition, uses the ggplot2 package to visualize and explore data. For the radio variable, ggplot() can draw a histogram or density plot to explore its distribution, or a scatter plot to explore its relationship with another variable such as sales. For example, ggplot(ad_sales, aes(x = radio, y = sales)) + geom_point() plots radio ad spend against sales, which helps in visually assessing any trend or pattern.
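
A minimal sketch of two such ggplot2 views of radio follows. It assumes a hypothetical stand-in data frame, `ad_head`, holding the six radio and sales values printed by head(ad_sales). Six rows are far too few to show the pattern in the full data, so the plots only demonstrate the calls. With the real data, `ad_sales` would be passed instead.

```r
library(ggplot2)

# Hypothetical stand-in: six rows from head(ad_sales), as displayed.
ad_head <- data.frame(
  radio = c(37.8, 39.3, 45.9, 41.3, 10.8, 48.9),
  sales = c(22.1, 10.4, 9.3, 18.5, 12.9, 7.2)
)

# Single-variable view: density plot of radio ad spend.
p_density <- ggplot(ad_head, aes(x = radio)) +
  geom_density()

# Two-variable view: radio against sales, with a least-squares line.
p_scatter <- ggplot(ad_head, aes(x = radio, y = sales)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

print(p_density)
print(p_scatter)
```

The density plot answers "how is radio spend distributed?", while the scatter plot answers "how does sales move with radio spend?".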