Defining The Question

A Kenyan entrepreneur has created an online cryptography course and would like to advertise it on her blog, which currently targets audiences from various countries. In the past, she ran adverts for a related course on the same blog and collected data in the process. She would now like to employ your services as a Data Science Consultant to help her identify which individuals are most likely to click on her adverts.

Metric of Success

To provide an accurate depiction of the people most likely to view the client's advertisements, and to provide recommendations to the client based on the results of the univariate and bivariate analysis conducted on the dataset.

Understanding the context

Clicks on adverts indicate how appealing an advert is to the people who see it; highly targeted ads are more likely to receive clicks. In this case, the number of clicks on our client's blog tells us how many people would be interested in the online cryptography course.

Experimental Design

Steps taken:

* Loading and previewing the dataset
* Cleaning the dataset
* Exploratory data analysis (univariate and bivariate)
* Modelling

Loading the Dataset

advert <- read.csv("/home/binti/Downloads/R/advertising.csv")
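It's worth confirming that the file loaded as expected; a quick sketch:

dim(advert)
#Expecting 1000 rows and 10 columns, as confirmed by str() below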

Previewing the top of our dataset

head(advert)
##   Daily.Time.Spent.on.Site Age Area.Income Daily.Internet.Usage
## 1                    68.95  35    61833.90               256.09
## 2                    80.23  31    68441.85               193.77
## 3                    69.47  26    59785.94               236.50
## 4                    74.15  29    54806.18               245.89
## 5                    68.37  35    73889.99               225.58
## 6                    59.99  23    59761.56               226.74
##                           Ad.Topic.Line           City Male    Country
## 1    Cloned 5thgeneration orchestration    Wrightburgh    0    Tunisia
## 2    Monitored national standardization      West Jodi    1      Nauru
## 3      Organic bottom-line service-desk       Davidton    0 San Marino
## 4 Triple-buffered reciprocal time-frame West Terrifurt    1      Italy
## 5         Robust logistical utilization   South Manuel    0    Iceland
## 6       Sharable client-driven software      Jamieberg    1     Norway
##             Timestamp Clicked.on.Ad
## 1 2016-03-27 00:53:11             0
## 2 2016-04-04 01:39:02             0
## 3 2016-03-13 20:35:42             0
## 4 2016-01-10 02:31:19             0
## 5 2016-06-03 03:36:18             0
## 6 2016-05-19 14:30:17             0

Previewing the tail of our dataset

tail(advert)
##      Daily.Time.Spent.on.Site Age Area.Income Daily.Internet.Usage
## 995                     43.70  28    63126.96               173.01
## 996                     72.97  30    71384.57               208.58
## 997                     51.30  45    67782.17               134.42
## 998                     51.63  51    42415.72               120.37
## 999                     55.55  19    41920.79               187.95
## 1000                    45.01  26    29875.80               178.35
##                             Ad.Topic.Line          City Male
## 995         Front-line bifurcated ability  Nicholasland    0
## 996         Fundamental modular algorithm     Duffystad    1
## 997       Grass-roots cohesive monitoring   New Darlene    1
## 998          Expanded intangible solution South Jessica    1
## 999  Proactive bandwidth-monitored policy   West Steven    0
## 1000      Virtual 5thgeneration emulation   Ronniemouth    0
##                     Country           Timestamp Clicked.on.Ad
## 995                 Mayotte 2016-04-04 03:57:48             1
## 996                 Lebanon 2016-02-11 21:49:00             1
## 997  Bosnia and Herzegovina 2016-04-22 02:07:01             1
## 998                Mongolia 2016-02-01 17:24:57             1
## 999               Guatemala 2016-03-24 02:35:54             0
## 1000                 Brazil 2016-06-03 21:43:21             1

Dataset Columns

names(advert)
##  [1] "Daily.Time.Spent.on.Site" "Age"                     
##  [3] "Area.Income"              "Daily.Internet.Usage"    
##  [5] "Ad.Topic.Line"            "City"                    
##  [7] "Male"                     "Country"                 
##  [9] "Timestamp"                "Clicked.on.Ad"

Cleaning Data

Finding the total missing values in our dataset.

colSums(is.na(advert))
## Daily.Time.Spent.on.Site                      Age              Area.Income 
##                        0                        0                        0 
##     Daily.Internet.Usage            Ad.Topic.Line                     City 
##                        0                        0                        0 
##                     Male                  Country                Timestamp 
##                        0                        0                        0 
##            Clicked.on.Ad 
##                        0
#There are no missing values in our dataset

Checking for duplicates across our rows.

sum(duplicated(advert))
## [1] 0
#There are no duplicates in this dataset.

The dataset had neither missing values nor duplicated rows.

Exploring the dataset

Checking the descriptive statistics of the dataset

summary(advert)
##  Daily.Time.Spent.on.Site      Age         Area.Income    Daily.Internet.Usage
##  Min.   :32.60            Min.   :19.00   Min.   :13996   Min.   :104.8       
##  1st Qu.:51.36            1st Qu.:29.00   1st Qu.:47032   1st Qu.:138.8       
##  Median :68.22            Median :35.00   Median :57012   Median :183.1       
##  Mean   :65.00            Mean   :36.01   Mean   :55000   Mean   :180.0       
##  3rd Qu.:78.55            3rd Qu.:42.00   3rd Qu.:65471   3rd Qu.:218.8       
##  Max.   :91.43            Max.   :61.00   Max.   :79485   Max.   :270.0       
##  Ad.Topic.Line          City                Male         Country         
##  Length:1000        Length:1000        Min.   :0.000   Length:1000       
##  Class :character   Class :character   1st Qu.:0.000   Class :character  
##  Mode  :character   Mode  :character   Median :0.000   Mode  :character  
##                                        Mean   :0.481                     
##                                        3rd Qu.:1.000                     
##                                        Max.   :1.000                     
##   Timestamp         Clicked.on.Ad
##  Length:1000        Min.   :0.0  
##  Class :character   1st Qu.:0.0  
##  Mode  :character   Median :0.5  
##                     Mean   :0.5  
##                     3rd Qu.:1.0  
##                     Max.   :1.0

Checking the structure of the dataframe

str(advert)
## 'data.frame':    1000 obs. of  10 variables:
##  $ Daily.Time.Spent.on.Site: num  69 80.2 69.5 74.2 68.4 ...
##  $ Age                     : int  35 31 26 29 35 23 33 48 30 20 ...
##  $ Area.Income             : num  61834 68442 59786 54806 73890 ...
##  $ Daily.Internet.Usage    : num  256 194 236 246 226 ...
##  $ Ad.Topic.Line           : chr  "Cloned 5thgeneration orchestration" "Monitored national standardization" "Organic bottom-line service-desk" "Triple-buffered reciprocal time-frame" ...
##  $ City                    : chr  "Wrightburgh" "West Jodi" "Davidton" "West Terrifurt" ...
##  $ Male                    : int  0 1 0 1 0 1 0 1 1 1 ...
##  $ Country                 : chr  "Tunisia" "Nauru" "San Marino" "Italy" ...
##  $ Timestamp               : chr  "2016-03-27 00:53:11" "2016-04-04 01:39:02" "2016-03-13 20:35:42" "2016-01-10 02:31:19" ...
##  $ Clicked.on.Ad           : int  0 0 0 0 0 0 0 1 0 0 ...
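One thing to note from the structure: Timestamp is stored as a character vector. Should time-of-day or seasonal click patterns become relevant, it can be parsed into a proper date-time; a minimal sketch using base R (ad_times is an illustrative new variable):

# Parse the character timestamps into POSIXct date-times
ad_times <- as.POSIXct(advert$Timestamp, format = "%Y-%m-%d %H:%M:%S")
# Extract, say, the hour of day for time-based analysis
head(format(ad_times, "%H"))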

Checking for Outliers

Checking for outliers in the dataset. Boxplots give a visual summary of each distribution and flag points beyond the whiskers as potential outliers.

boxplot(advert$Area.Income,
        main ="Area Income",
        col = "orange",
        border  = 'brown',
        horizontal = TRUE,
        notch = TRUE)

#There are a few outliers in the area income column.
boxplot(advert$Daily.Time.Spent.on.Site,
        main ="Daily Time Spent on Site",
        col = "orange",
        border  = 'brown',
        horizontal = TRUE,
        notch = TRUE)

#There are no outliers in the daily time spent on site column. 
boxplot(advert$Age,
        main ="Age",
        col = "orange",
        border  = 'brown',
        horizontal = TRUE,
        notch = TRUE)

#There are no outliers in the age column.
boxplot(advert$Daily.Internet.Usage,
        main ="Daily Internet Usage",
        col = "orange",
        border  = 'brown',
        horizontal = TRUE,
        notch = TRUE)

#There are no outliers in the daily internet usage column
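To back the boxplot reading with numbers, we can count the Area Income points that fall beyond the 1.5 × IQR whiskers; a minimal sketch using base R's boxplot.stats():

# Values flagged as outliers by the same 1.5*IQR rule boxplot() uses
income_outliers <- boxplot.stats(advert$Area.Income)$out
length(income_outliers)  # how many such points
range(income_outliers)   # given the quartiles above, the upper fence exceeds the max, so these are all low incomes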

Exploratory Data Analysis

Univariate Analysis

Measures of Central Tendency

Mean of the numeric columns

colMeans(advert[sapply(advert,is.numeric)])
## Daily.Time.Spent.on.Site                      Age              Area.Income 
##                  65.0002                  36.0090               55000.0001 
##     Daily.Internet.Usage                     Male            Clicked.on.Ad 
##                 180.0001                   0.4810                   0.5000

Median of our numeric columns

ad_time_median <- median(advert$Daily.Time.Spent.on.Site)
print(ad_time_median)
## [1] 68.215
ad_age_median <- median(advert$Age)
ad_age_median
## [1] 35
ad_income_median <- median(advert$Area.Income)
ad_income_median
## [1] 57012.3
ad_internet_usage_median <- median(advert$Daily.Internet.Usage)
ad_internet_usage_median
## [1] 183.13

Mode of our numeric columns.

Let’s create the mode function

getmode <- function(v) {
   uniqv <- unique(v)                          # distinct values in v
   uniqv[which.max(tabulate(match(v, uniqv)))] # value with the highest count
}

Finding the mode in the age column

getmode(advert$Age)
## [1] 31
getmode(advert$Daily.Time.Spent.on.Site)
## [1] 62.26
getmode(advert$Area.Income)
## [1] 61833.9
getmode(advert$Daily.Internet.Usage)
## [1] 167.22
getmode(advert$City)
## [1] "Lisamouth"
getmode(advert$Ad.Topic.Line)
## [1] "Cloned 5thgeneration orchestration"
getmode(advert$Male)
## [1] 0
getmode(advert$Country)
## [1] "Czech Republic"
getmode(advert$Timestamp)
## [1] "2016-03-27 00:53:11"

Minimum values in the numeric columns

min(advert$Age)
## [1] 19
min(advert$Daily.Time.Spent.on.Site)
## [1] 32.6
min(advert$Area.Income)
## [1] 13996.5
min(advert$Daily.Internet.Usage)
## [1] 104.78

Maximum values in the numeric columns

max(advert$Age)
## [1] 61
max(advert$Daily.Time.Spent.on.Site)
## [1] 91.43
max(advert$Area.Income)
## [1] 79484.8
max(advert$Daily.Internet.Usage)
## [1] 269.96

Range in the numeric columns

range(advert$Age)
## [1] 19 61
range(advert$Daily.Time.Spent.on.Site)
## [1] 32.60 91.43
range(advert$Area.Income)
## [1] 13996.5 79484.8
range(advert$Daily.Internet.Usage)
## [1] 104.78 269.96

Summary

* The youngest respondent is 19 years old and the oldest 61.
* The least time spent on her site is 32.6 minutes and the most 91.4 minutes.
* The lowest income earner among the respondents earns 13,996 while the highest earns 79,485.
* Daily internet usage ranges from roughly 105 to 270 minutes.

Quantiles in the columns

quantile(advert$Age)
##   0%  25%  50%  75% 100% 
##   19   29   35   42   61
quantile(advert$Daily.Time.Spent.on.Site)
##      0%     25%     50%     75%    100% 
## 32.6000 51.3600 68.2150 78.5475 91.4300
quantile(advert$Area.Income)
##       0%      25%      50%      75%     100% 
## 13996.50 47031.80 57012.30 65470.64 79484.80
quantile(advert$Daily.Internet.Usage)
##       0%      25%      50%      75%     100% 
## 104.7800 138.8300 183.1300 218.7925 269.9600

Variance of the numeric columns.

This shows how the data values are dispersed around the mean.

var(advert$Age)
## [1] 77.18611
var(advert$Daily.Time.Spent.on.Site)
## [1] 251.3371
var(advert$Area.Income)
## [1] 179952406
var(advert$Daily.Internet.Usage)
## [1] 1927.415

Finding the standard deviation of the columns.

sd(advert$Age)
## [1] 8.785562
sd(advert$Daily.Time.Spent.on.Site)
## [1] 15.85361
sd(advert$Area.Income)
## [1] 13414.63
sd(advert$Daily.Internet.Usage)
## [1] 43.90234
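The same statistic can be computed for every numeric column at once, mirroring the colMeans() call used earlier; a compact sketch:

# Standard deviation of each numeric column in one call
sapply(advert[sapply(advert, is.numeric)], sd)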

Frequency Distribution

Frequency distribution of the age column

table(advert$Age)
## 
## 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 
##  6  6  6 13 19 21 27 37 33 48 48 39 60 38 43 39 39 50 36 37 30 36 32 26 23 21 
## 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 
## 30 18 13 16 18 20 12 15 10  9  7  2  6  4  2  4  1
# Most respondents fall within the 25-42 age bracket. The most common age is 31, with 60 respondents.

Histogram

Plotting histograms for the columns

hist(advert$Age, col  = "Cyan")

#Most respondents fall in the age bracket 25-40.
hist(advert$Area.Income, col = "Purple")

#The respondents mostly earn between 50K - 70K
hist(advert$Daily.Time.Spent.on.Site, col = "gold")

hist(advert$Daily.Internet.Usage, col = "pink")

Plotting count plots for categorical data

Count plots for the categorical data were plotted, and it was observed that:

library(ggplot2)
ggplot(advert, aes(x=Male)) + geom_bar(fill=rgb(0.4,0.1,0.5))

There were slightly more female than male users that visited the site: 48.1% of the respondents were male.

ggplot(advert, aes(x=factor(`Clicked.on.Ad`))) + geom_bar( fill=rgb(0.6,0.4,0.4))

The number of users that clicked on the advert is exactly equal to the number that did not.
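A quick tabulation confirms the balance; a small sketch:

# Exactly 500 of the 1000 users clicked on the ad
table(advert$Clicked.on.Ad)
prop.table(table(advert$Clicked.on.Ad))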

Bivariate Analysis

Ggplots

library(ggplot2)

ggplot(data = advert, aes(x = Area.Income, fill = factor(Clicked.on.Ad)))+
        geom_histogram(bins = 20, col = "orange")+
        labs(title = "Income Distribution", x = "Area Income", y = "Frequency", fill = "Clicked on Ad")+
        scale_fill_brewer(palette = "Set1")

ggplot(data = advert, aes(x = Age, fill = factor(Clicked.on.Ad)))+
        geom_histogram(bins = 20, col = "orange")+
        labs(title = "Age Distribution", x = "Age", y = "Frequency", fill = "Clicked on Ad")+
        scale_fill_brewer(palette = "Set1")

ggplot(data = advert, aes(x = Daily.Time.Spent.on.Site, fill = factor(Clicked.on.Ad)))+
        geom_histogram(bins = 20, col = "orange")+
        labs(title = "Daily Time Spent on Site", x = "Time Spent on Site", y = "Frequency", fill = "Clicked on Ad")+
        scale_fill_brewer(palette = "Set1")

Covariance

Covariance is a statistical representation of the degree to which two variables vary together.

cov(advert$Age, advert$Daily.Time.Spent.on.Site)
## [1] -46.17415
#There is a negative relationship between age and time spent on site: as age increases, the daily time spent on the site decreases, and vice versa.
cov(advert$Age, advert$Daily.Internet.Usage)
## [1] -141.6348
#There is a negative relationship between the age and the daily internet usage as well.
cov(advert$Area.Income,advert$Daily.Time.Spent.on.Site)
## [1] 66130.81
#There is a positive relationship between income and daily time spent on site: higher earners tend to spend more time on the site. Note that the raw covariance looks large mainly because income is on a much bigger scale, so its magnitude should not be read as strength.
cov(advert$Age,advert$Area.Income)
## [1] -21520.93
#There is a negative relationship between the age and income variables.

Correlation matrix

cor(advert$Age, advert$Daily.Time.Spent.on.Site)
## [1] -0.3315133
cor(advert$Age,advert$Daily.Internet.Usage)
## [1] -0.3672086
cor(advert$Area.Income,advert$Daily.Internet.Usage)
## [1] 0.3374955
cor(advert$Area.Income,advert$Daily.Time.Spent.on.Site)
## [1] 0.3109544
cor(advert$Age,advert$Area.Income)
## [1] -0.182605
cor(advert[, c("Age","Daily.Time.Spent.on.Site","Daily.Internet.Usage")])
##                                 Age Daily.Time.Spent.on.Site
## Age                       1.0000000               -0.3315133
## Daily.Time.Spent.on.Site -0.3315133                1.0000000
## Daily.Internet.Usage     -0.3672086                0.5186585
##                          Daily.Internet.Usage
## Age                                -0.3672086
## Daily.Time.Spent.on.Site            0.5186585
## Daily.Internet.Usage                1.0000000
cor(advert[,unlist(lapply(advert, is.numeric))])
##                          Daily.Time.Spent.on.Site         Age  Area.Income
## Daily.Time.Spent.on.Site               1.00000000 -0.33151334  0.310954413
## Age                                   -0.33151334  1.00000000 -0.182604955
## Area.Income                            0.31095441 -0.18260496  1.000000000
## Daily.Internet.Usage                   0.51865848 -0.36720856  0.337495533
## Male                                  -0.01895085 -0.02104406  0.001322359
## Clicked.on.Ad                         -0.74811656  0.49253127 -0.476254628
##                          Daily.Internet.Usage         Male Clicked.on.Ad
## Daily.Time.Spent.on.Site           0.51865848 -0.018950855   -0.74811656
## Age                               -0.36720856 -0.021044064    0.49253127
## Area.Income                        0.33749553  0.001322359   -0.47625463
## Daily.Internet.Usage               1.00000000  0.028012326   -0.78653918
## Male                               0.02801233  1.000000000   -0.03802747
## Clicked.on.Ad                     -0.78653918 -0.038027466    1.00000000

Plotting a correlation heatmap for the numerical variables

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(MASS)
## 
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
## 
##     select
library(ggcorrplot)
# Selecting the Numerical Variables of the dataset
corr <- dplyr::select(advert,Age,Area.Income,Clicked.on.Ad,Daily.Internet.Usage,Daily.Time.Spent.on.Site,Male )
# Plotting the Correlation Heatmap
library(ggcorrplot)
ggcorrplot(cor(corr), hc.order = F,type = 
"lower", lab = T,
  ggtheme = ggplot2::theme_gray,
  colors = c("#00798c", "violet", "#edae49"))

Here, it was noted that:

There was a strong negative correlation between the Daily Internet Usage and Clicked on Ad variables: the more time one spends on the internet each day, the less likely they are to click on the blog's ads. The same can be said for the Daily Time Spent on Site and Clicked on Ad variables.

The Clicked on Ad variable had a moderately strong positive correlation (0.49) with the Age variable: older users were more likely to click on the ad, as we observed above in our analysis.

The Clicked on Ad variable was also negatively correlated with Area Income (-0.48): the higher one's income, the less likely they were to click on the ad.

Scatter Plots

Scatter plots are used when we want to see a graphical representation of two different variables. They show how the variables are correlated.

Let’s plot a scatter plot of age against area income.

ggplot(advert, aes(Area.Income,Age))+geom_point(aes(colour= factor(`Clicked.on.Ad`)))+
  labs(title = "Scatter Plot of Age Distribution vs Area Income",
       x = "Area Income",
       y = "Age")

The scatter plot of Area Income against Age showed that the majority of users who did not click on the ad were high income earners, many of them aged between 20 and 40 years.

Scatter plot for Income and Daily Internet Usage

ggplot(advert, aes(Area.Income, Daily.Internet.Usage))+
  geom_point(aes(colour= factor(`Clicked.on.Ad`)))+
  labs(title = "Scatter Plot of Area Income vs Daily Internet Usage",
       x = "Area Income",
       y = "Daily Internet Usage")

Scatter Plot of Age Distribution vs Time Spent on Site

ggplot(advert, aes(Age, Daily.Time.Spent.on.Site))+
  geom_point(aes(colour= factor(`Clicked.on.Ad`)))+
  labs(title = "Scatter Plot of Age Distribution vs Time Spent on Site",
       x = "Age",
       y = "Time Spent on Site")

Plotting Age against Time Spent on Site, we see that the younger demographic is less tolerant of ads despite spending significant amounts of time on the site.

A possible reason is that younger people are more tech-savvy and therefore more likely to recognise and avoid ads while using the internet, compared to their older counterparts.

Scatter plot for Income Distribution and Daily time spent on site.

ggplot(advert, aes(Daily.Time.Spent.on.Site, Area.Income))+
  geom_point(aes(colour= factor(`Clicked.on.Ad`)))+
  labs(title = "Time spent on site vs Income",
       x = "Daily Time Spent on Site",
       y = "Income Distribution")

The people least likely to click on the ad were the higher income earners, despite the fact that they seemed to spend over an hour a day on the site.

The same pattern holds for daily internet usage: plotted against income, those who spend over 200 minutes online per day and earn more than 50,000 are the least likely to click on ads.

Scatter plot for Age and Daily Internet Usage

ggplot(advert, aes(Age, Daily.Internet.Usage))+
  geom_point(aes(colour= factor(`Clicked.on.Ad`)))+
  labs(title = "Scatter Plot of Age Distribution vs Daily Usage",
       x = "Age",
       y = "Daily Usage")

Modelling

Multiple Linear Regression

The scatter plots have established the relationships between the variables, so we can now move on to modelling.

We'll begin by selecting the variables to use in the model.

input <- advert[,c("Clicked.on.Ad","Daily.Time.Spent.on.Site", "Age","Area.Income","Daily.Internet.Usage")]
head(input)
##   Clicked.on.Ad Daily.Time.Spent.on.Site Age Area.Income Daily.Internet.Usage
## 1             0                    68.95  35    61833.90               256.09
## 2             0                    80.23  31    68441.85               193.77
## 3             0                    69.47  26    59785.94               236.50
## 4             0                    74.15  29    54806.18               245.89
## 5             0                    68.37  35    73889.99               225.58
## 6             0                    59.99  23    59761.56               226.74

Applying the lm() function

multiple_lm <- lm(Clicked.on.Ad ~ Daily.Time.Spent.on.Site + Age + Area.Income + Daily.Internet.Usage, data = input)
multiple_lm
## 
## Call:
## lm(formula = Clicked.on.Ad ~ Daily.Time.Spent.on.Site + Age + 
##     Area.Income + Daily.Internet.Usage, data = input)
## 
## Coefficients:
##              (Intercept)  Daily.Time.Spent.on.Site                       Age  
##                2.293e+00                -1.275e-02                 9.017e-03  
##              Area.Income      Daily.Internet.Usage  
##               -6.170e-06                -5.276e-03

Let’s begin with the model assessment

summary(multiple_lm)
## 
## Call:
## lm(formula = Clicked.on.Ad ~ Daily.Time.Spent.on.Site + Age + 
##     Area.Income + Daily.Internet.Usage, data = input)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.63848 -0.11736 -0.03329  0.04825  1.02093 
## 
## Coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)               2.293e+00  5.722e-02   40.07   <2e-16 ***
## Daily.Time.Spent.on.Site -1.275e-02  5.064e-04  -25.18   <2e-16 ***
## Age                       9.017e-03  8.297e-04   10.87   <2e-16 ***
## Area.Income              -6.169e-06  5.361e-07  -11.51   <2e-16 ***
## Daily.Internet.Usage     -5.276e-03  1.869e-04  -28.22   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2108 on 995 degrees of freedom
## Multiple R-squared:  0.8232, Adjusted R-squared:  0.8225 
## F-statistic:  1158 on 4 and 995 DF,  p-value: < 2.2e-16

All four predictors are statistically significant, and the model explains about 82% of the variance (adjusted R-squared 0.8225). To summarize the statistical objects in tidy tibbles, we use the broom package.

library(broom)

tidy(multiple_lm)
## # A tibble: 5 × 5
##   term                        estimate   std.error statistic   p.value
##   <chr>                          <dbl>       <dbl>     <dbl>     <dbl>
## 1 (Intercept)               2.29       0.0572           40.1 8.00e-210
## 2 Daily.Time.Spent.on.Site -0.0127     0.000506        -25.2 1.28e-108
## 3 Age                       0.00902    0.000830         10.9 4.48e- 26
## 4 Area.Income              -0.00000617 0.000000536     -11.5 7.39e- 29
## 5 Daily.Internet.Usage     -0.00528    0.000187        -28.2 3.45e-129

Let’s check our model’s confidence intervals now.

library(MASS)
confint(multiple_lm)
##                                  2.5 %        97.5 %
## (Intercept)               2.180655e+00  2.405221e+00
## Daily.Time.Spent.on.Site -1.374279e-02 -1.175543e-02
## Age                       7.388911e-03  1.064533e-02
## Area.Income              -7.221617e-06 -5.117457e-06
## Daily.Internet.Usage     -5.642504e-03 -4.908785e-03

Let’s generate the ANOVA table

anova(multiple_lm)
## Analysis of Variance Table
## 
## Response: Clicked.on.Ad
##                           Df  Sum Sq Mean Sq F value    Pr(>F)    
## Daily.Time.Spent.on.Site   1 139.920 139.920 3150.14 < 2.2e-16 ***
## Age                        1  16.793  16.793  378.08 < 2.2e-16 ***
## Area.Income                1  13.721  13.721  308.91 < 2.2e-16 ***
## Daily.Internet.Usage       1  35.372  35.372  796.35 < 2.2e-16 ***
## Residuals                995  44.195   0.044                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Predicting the response variable

pred <- predict(multiple_lm, input)
head(pred)
##            1            2            3            4            5            6 
## -0.003039961  0.105091778  0.025161263 -0.026268696  0.090933941  0.170612160
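As a quick baseline before cross-validating, we can compute the in-sample root mean squared error of these predictions; a sketch:

# In-sample RMSE: typical gap between prediction and actual outcome
sqrt(mean((pred - input$Clicked.on.Ad)^2))

This should land close to the residual standard error of about 0.21 reported by summary() above.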

Cross Validating the multiple linear regression model

library(caret)
## Loading required package: lattice
multiple_lm2 <- train(Clicked.on.Ad ~ Daily.Time.Spent.on.Site + Age + Area.Income + Daily.Internet.Usage, data = input,
               method = "lm", 
               trControl = trainControl(method = "cv", 
                                        number = 10, 
                                        verboseIter = FALSE))
## Warning in train.default(x, y, weights = w, ...): You are trying to do
## regression and your outcome only has two possible values Are you trying to do
## classification? If so, use a 2 level factor as your outcome column.
summary(multiple_lm2)
## 
## Call:
## lm(formula = .outcome ~ ., data = dat)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.63848 -0.11736 -0.03329  0.04825  1.02093 
## 
## Coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)               2.293e+00  5.722e-02   40.07   <2e-16 ***
## Daily.Time.Spent.on.Site -1.275e-02  5.064e-04  -25.18   <2e-16 ***
## Age                       9.017e-03  8.297e-04   10.87   <2e-16 ***
## Area.Income              -6.169e-06  5.361e-07  -11.51   <2e-16 ***
## Daily.Internet.Usage     -5.276e-03  1.869e-04  -28.22   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2108 on 995 degrees of freedom
## Multiple R-squared:  0.8232, Adjusted R-squared:  0.8225 
## F-statistic:  1158 on 4 and 995 DF,  p-value: < 2.2e-16
multiple_lm2
## Linear Regression 
## 
## 1000 samples
##    4 predictor
## 
## No pre-processing
## Resampling: Cross-Validated (10 fold) 
## Summary of sample sizes: 900, 900, 900, 900, 900, 900, ... 
## Resampling results:
## 
##   RMSE       Rsquared   MAE      
##   0.2103976  0.8231955  0.1444346
## 
## Tuning parameter 'intercept' was held constant at a value of TRUE

From the above result, we note that our model has an RMSE of 0.2104. We'll now use the train object as input to the predict method.

pred2 <- predict(multiple_lm2, input)
head(pred2)
##            1            2            3            4            5            6 
## -0.003039961  0.105091778  0.025161263 -0.026268696  0.090933941  0.170612160
# Residuals: difference between the fitted values and the actual labels
error <- pred2 - advert$Clicked.on.Ad
error
##             1             2             3             4             5 
## -3.039961e-03  1.050918e-01  2.516126e-02 -2.626870e-02  9.093394e-02 
## ... (output truncated: one residual per observation, 1 through 1000)
rmse_xval <- sqrt(mean(error^2))

rmse_xval
## [1] 0.2102259
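
The same computation, packaged as a small helper so it can be reused for the other models below (a convenience sketch; the name rmse is our own):

rmse <- function(actual, predicted) {
  # Root-mean-squared error: square root of the mean squared residual
  sqrt(mean((actual - predicted)^2))
}
rmse(advert$Clicked.on.Ad, pred2) # identical to rmse_xval above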

K-Nearest Neighbours

Features in our dataset have different ranges when compared to one another. If the distance formula were applied to the unmodified features, the features with larger ranges could dominate or mask those with smaller ranges. Because of this, it is important to prepare the data with feature scaling.
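
As a quick illustration with hypothetical values in the rough ranges of our features (the numbers below are made up for demonstration), the income column dominates the unscaled Euclidean distance almost entirely:

# Two hypothetical visitors: (time on site, age, area income, internet usage)
a <- c(69, 35, 61800, 256)
b <- c(80, 31, 68400, 194)
sqrt(sum((a - b)^2)) # ~6600: essentially just the income difference

# After min-max scaling each feature over assumed (min, max) ranges,
# all four features contribute comparably to the distance.
rng <- rbind(c(32, 19, 14000, 105), c(91, 61, 79500, 270)) # assumed ranges
a_sc <- (a - rng[1, ]) / (rng[2, ] - rng[1, ])
b_sc <- (b - rng[1, ]) / (rng[2, ] - rng[1, ])
sqrt(sum((a_sc - b_sc)^2))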

# Randomizing the order of our data so the original file order cannot bias the split
random <- runif(1000)
advert_random <- advert[order(random), ]

Let’s select the first 6 rows from advert_random

head(advert_random)
##    Daily.Time.Spent.on.Site Age Area.Income Daily.Internet.Usage
## 1                     68.95  35    61833.90               256.09
## 5                     68.37  35    73889.99               225.58
## 9                     74.53  30    68862.00               221.51
## 13                    69.57  48    51636.92               113.12
## 17                    55.39  37    23936.86               129.41
## 21                    77.22  30    64802.33               224.44
##                            Ad.Topic.Line            City Male
## 1     Cloned 5thgeneration orchestration     Wrightburgh    0
## 5          Robust logistical utilization    South Manuel    0
## 9         Configurable coherent function      West Colin    1
## 13 Centralized content-based focus group  West Katiefurt    1
## 17    Customizable multi-tasking website  West Dylanberg    0
## 21 Object-based reciprocal knowledgebase Port Jacqueline    1
##                  Country           Timestamp Clicked.on.Ad
## 1                Tunisia 2016-03-27 00:53:11             0
## 5                Iceland 2016-06-03 03:36:18             0
## 9                Grenada 2016-04-18 09:33:42             0
## 13                 Egypt 2016-06-03 01:14:41             1
## 17 Palestinian Territory 2016-01-30 19:20:41             1
## 21              Cameroon 2016-01-05 07:52:48             0

Using min-max normalization, we rescale each of the numeric features to the [0, 1] range.

normal <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}
normal(1:4)
## [1] 0.0000000 0.3333333 0.6666667 1.0000000
advert_new <- as.data.frame(lapply(advert_random[1:4], normal))
summary(advert_new)
##  Daily.Time.Spent.on.Site      Age          Area.Income    
##  Min.   :0.0000           Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.3189           1st Qu.:0.2381   1st Qu.:0.5044  
##  Median :0.6054           Median :0.3810   Median :0.6568  
##  Mean   :0.5507           Mean   :0.4050   Mean   :0.6261  
##  3rd Qu.:0.7810           3rd Qu.:0.5476   3rd Qu.:0.7860  
##  Max.   :1.0000           Max.   :1.0000   Max.   :1.0000  
##  Daily.Internet.Usage
##  Min.   :0.0000      
##  1st Qu.:0.2061      
##  Median :0.4743      
##  Mean   :0.4554      
##  3rd Qu.:0.6902      
##  Max.   :1.0000

Creating test and train datasets

train <- advert_new[1:800, ]           # first 800 rows for training
test <- advert_new[801:1000, ]         # remaining 200 rows for testing
train_sp <- advert_random[1:800, 10]   # column 10: Clicked.on.Ad labels
test_sp <- advert_random[801:1000, 10]
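
As an aside, the split above depends on the earlier shuffle; here is a sketch of a self-contained, reproducible alternative using set.seed() and sample() (not what was run here; the _alt names are our own):

set.seed(42) # hypothetical seed, purely for reproducibility
idx <- sample(seq_len(nrow(advert_new)), size = 800)
train_alt <- advert_new[idx, ]
test_alt <- advert_new[-idx, ]
train_sp_alt <- advert_random$Clicked.on.Ad[idx]
test_sp_alt <- advert_random$Clicked.on.Ad[-idx]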

Calling the class package, which provides the KNN algorithm. Our confusion matrix is given by table(test_sp, model).

library(class)
model <- knn(train = train, test = test, cl = train_sp, k = 10)
table(factor(model))
## 
##   0   1 
## 101  99
table(test_sp,model)
##        model
## test_sp  0  1
##       0 92  1
##       1  9 98

Out of 200 test observations, the confusion matrix shows 190 correct predictions (92 + 98), giving the KNN model an accuracy of 95%.
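
The accuracy can be computed directly from the confusion matrix, and a quick sweep over candidate values of k (a sketch; the best k depends on the particular split) shows how sensitive the result is to that choice:

tab <- table(test_sp, model)
sum(diag(tab)) / sum(tab) # (92 + 98) / 200 = 0.95

# Test accuracy for k = 1..20
accs <- sapply(1:20, function(k) {
  mean(knn(train = train, test = test, cl = train_sp, k = k) == test_sp)
})
which.max(accs) # k with the highest test accuracy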

Decision Trees

Complex decisions are often made simpler by decision trees, which break a decision down into smaller, much simpler ones using a divide-and-conquer strategy. A decision tree essentially identifies a set of if-else conditions that split the data according to the values of the features. We will now fit a decision tree model to our dataset.

Calling the necessary libraries:

library(rpart.plot)
## Loading required package: rpart
library(mlbench)
library(rpart)

Data partition: we predict the class using the rpart() function with the "class" method. rpart() uses the Gini index measure to split the nodes.
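
For reference, the Gini impurity of a node with class proportions p is 1 - sum(p^2); splits that most reduce this impurity are preferred. A tiny sketch:

gini <- function(p) 1 - sum(p^2)
gini(c(0.5, 0.5)) # 0.50: a maximally mixed node
gini(c(0.9, 0.1)) # 0.18: a much purer node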

dt <- rpart(Clicked.on.Ad ~ Daily.Time.Spent.on.Site + Age + Area.Income + Daily.Internet.Usage, data = input, method = "class")
rpart.plot(dt) 

Inspecting feature importance

data.frame(dt$variable.importance)
##                          dt.variable.importance
## Daily.Internet.Usage                   339.7809
## Daily.Time.Spent.on.Site               279.0247
## Age                                    126.2649
## Area.Income                            119.2524
pr <- predict(dt, input, type = "class")
table(pr, advert$Clicked.on.Ad)
##    
## pr    0   1
##   0 485  28
##   1  15 472

Out of a total of 1000 observations, the decision tree predicts 957 correctly (485 + 472). The model therefore achieves an accuracy of 95.7%.
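
The figure can be verified directly from the predictions:

mean(pr == advert$Clicked.on.Ad) # (485 + 472) / 1000 = 0.957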

To train our model, we use caret with the random forest ("ranger") method. Note that Clicked.on.Ad is still numeric at this point, so caret fits a regression forest and warns that the outcome looks binary:

library(caret)
set.seed(15)

model <- train(Clicked.on.Ad ~ Daily.Time.Spent.on.Site + Age + Area.Income + Daily.Internet.Usage ,data = input,method = "ranger") 
## Warning in train.default(x, y, weights = w, ...): You are trying to do
## regression and your outcome only has two possible values Are you trying to do
## classification? If so, use a 2 level factor as your outcome column.
model
## Random Forest 
## 
## 1000 samples
##    4 predictor
## 
## No pre-processing
## Resampling: Bootstrapped (25 reps) 
## Summary of sample sizes: 1000, 1000, 1000, 1000, 1000, 1000, ... 
## Resampling results across tuning parameters:
## 
##   mtry  splitrule   RMSE       Rsquared   MAE       
##   2     variance    0.1773138  0.8741314  0.06594256
##   2     extratrees  0.1728051  0.8812118  0.07273495
##   3     variance    0.1831404  0.8657094  0.06243689
##   3     extratrees  0.1737184  0.8794404  0.06842673
##   4     variance    0.1930556  0.8510197  0.06265898
##   4     extratrees  0.1755956  0.8766744  0.06692780
## 
## Tuning parameter 'min.node.size' was held constant at a value of 5
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were mtry = 2, splitrule = extratrees
##  and min.node.size = 5.
plot(model)
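
A sketch of the classification variant the warning asks for, run on a copy of the data so the regression fit above is left untouched (assumes the ranger package is installed; model_cls is our own name):

input_cls <- input
input_cls$Clicked.on.Ad <- factor(input_cls$Clicked.on.Ad) # 2-level factor outcome
model_cls <- train(Clicked.on.Ad ~ Daily.Time.Spent.on.Site + Age +
                     Area.Income + Daily.Internet.Usage,
                   data = input_cls, method = "ranger")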

Support Vector Machines

library(caret)
intrain <- createDataPartition(y = advert$Clicked.on.Ad, p= 0.7, list = FALSE)
training <- advert[intrain,]
testing <- advert[-intrain,]

Checking the dimensions of the training and testing data frames.

dim(training)
## [1] 700  10
dim(testing)
## [1] 300  10

Let’s convert our target variable to a factor so the model treats this as a classification problem.

training[["Clicked.on.Ad"]] = factor(training[["Clicked.on.Ad"]])

Using the trainControl() function to set up the resampling scheme: 10-fold cross-validation repeated 5 times.

trctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 5)
svm_Linear <- train(Clicked.on.Ad ~ Daily.Time.Spent.on.Site + Age +
                      Area.Income + Daily.Internet.Usage,
                    data = training, method = "svmLinear",
                    trControl = trctrl,
                    preProcess = c("center", "scale"),
                    tuneLength = 10)

Results of the trained model

svm_Linear
## Support Vector Machines with Linear Kernel 
## 
## 700 samples
##   4 predictor
##   2 classes: '0', '1' 
## 
## Pre-processing: centered (4), scaled (4) 
## Resampling: Cross-Validated (10 fold, repeated 5 times) 
## Summary of sample sizes: 630, 630, 630, 630, 630, 630, ... 
## Resampling results:
## 
##   Accuracy   Kappa    
##   0.9697143  0.9394286
## 
## Tuning parameter 'C' was held constant at a value of 1

We’ll use the predict() method to generate predictions on the test set.

test_pred <- predict(svm_Linear, newdata = testing)
test_pred
##   [1] 0 0 0 1 1 1 0 1 1 1 1 0 0 0 0 0 0 1 0 0 1 0 1 1 0 0 0 1 0 0 1 1 0 1 1 1 1
##  [38] 1 1 0 1 1 0 1 0 0 1 0 1 1 1 1 0 1 0 0 1 1 1 1 0 1 0 1 1 0 1 0 1 0 0 1 0 1
##  [75] 0 1 0 0 1 1 0 1 0 1 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 1 0 0 1 0 1 1 0 0 0 0 0
## [112] 1 0 0 0 0 0 1 1 1 0 1 1 0 0 0 0 0 1 1 0 1 0 1 1 0 1 1 0 1 1 0 1 0 0 1 0 0
## [149] 1 0 1 0 0 1 0 0 1 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 1 0 1 0 0 1 0 1 1 0 1
## [186] 0 1 1 0 1 1 1 0 0 0 1 0 1 0 0 0 1 0 1 1 0 0 0 1 1 1 0 1 0 1 0 1 0 1 1 0 1
## [223] 0 0 0 0 1 1 1 1 1 1 0 0 1 0 0 0 0 1 1 1 1 1 1 1 1 0 1 0 0 1 0 1 0 1 0 0 0
## [260] 0 1 0 0 0 0 1 0 1 1 1 0 1 1 1 1 0 0 1 1 1 1 0 0 0 0 0 1 0 1 1 1 0 1 0 0 1
## [297] 1 1 1 1
## Levels: 0 1

Checking the accuracy of our model using a confusion matrix.

confusionMatrix(table(test_pred, testing$Clicked.on.Ad))
## Confusion Matrix and Statistics
## 
##          
## test_pred   0   1
##         0 147   9
##         1   3 141
##                                           
##                Accuracy : 0.96            
##                  95% CI : (0.9312, 0.9792)
##     No Information Rate : 0.5             
##     P-Value [Acc > NIR] : <2e-16          
##                                           
##                   Kappa : 0.92            
##                                           
##  Mcnemar's Test P-Value : 0.1489          
##                                           
##             Sensitivity : 0.9800          
##             Specificity : 0.9400          
##          Pos Pred Value : 0.9423          
##          Neg Pred Value : 0.9792          
##              Prevalence : 0.5000          
##          Detection Rate : 0.4900          
##    Detection Prevalence : 0.5200          
##       Balanced Accuracy : 0.9600          
##                                           
##        'Positive' Class : 0               
## 

The SVM model achieves an accuracy level of 96%.
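
Since the tuning parameter C was held constant at 1, a natural follow-up (a sketch, not run here; svm_tuned is our own name) is to search over a small grid of C values with the same resampling scheme:

grid <- expand.grid(C = c(0.25, 0.5, 1, 2, 4))
svm_tuned <- train(Clicked.on.Ad ~ Daily.Time.Spent.on.Site + Age +
                     Area.Income + Daily.Internet.Usage,
                   data = training, method = "svmLinear",
                   trControl = trctrl, preProcess = c("center", "scale"),
                   tuneGrid = grid)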

Conclusion and Recommendations

  • Our data shows that most of the respondents fall within the 25-41 age bracket, so the course should be tailored to attract more people within that range. The youngest respondent is 19 years old and the oldest is 61.
  • Our client should consider targeting people with an area income of between 50,000 and 70,000, since they appear to be the most interested and are also in a position to afford the course.
  • Most people spend 70-85 minutes on the site, so the course sessions should be designed to fit within a similar time span, or even a shorter one, to keep people interested.
  • Since SVM gave the highest accuracy level, the client should implement it.
  • To figure out which visitors will click on her adverts, the client should prioritize the following features: Daily Internet Usage, Daily Time Spent on Site, Age, and the income distribution of the area.