ANLY 545 51 Late Spring HW7 Hanwen Zhang

Exercise 8.3. The TV dataset is a large sample of TV viewers whose behavior had been recorded for the Neilsen ratings. This data set contains sample television audience data from Neilsen Media Research for the week starting November 6, 1995.

library(effects)

## Warning: package 'effects' was built under R version 3.4.4

## Loading required package: carData

## Warning: package 'carData' was built under R version 3.4.4

## lattice theme set by effectsTheme()
## See ?effectsTheme for details.

library(vcdExtra)

## Warning: package 'vcdExtra' was built under R version 3.4.2

## Loading required package: vcd

## Loading required package: grid

## Loading required package: gnm

## 
## Attaching package: 'vcdExtra'

## The following object is masked from 'package:carData':
## 
##     Burt

library(nnet)
library(reshape2)
library(dplyr)

## Warning: package 'dplyr' was built under R version 3.4.2

## 
## Attaching package: 'dplyr'

## The following object is masked from 'package:vcdExtra':
## 
##     summarise

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(tidyr)

## Warning: package 'tidyr' was built under R version 3.4.2

## 
## Attaching package: 'tidyr'

## The following object is masked from 'package:reshape2':
## 
##     smiths

library(ca)

data('TV', package = 'vcdExtra')
TV.df <- as.data.frame(TV)
TV.df.new <- cbind(Day = row.names(TV.df), TV.df)


TV.df.tidy <- melt(TV.df.new, id.vars = 'Day', variable.name = 'Time', value.name = 'Freq')
separate.df <- data.frame(do.call('rbind', strsplit(as.character(TV.df.tidy$Time), '.', fixed=TRUE)))
TV.df.tidy <- TV.df.tidy %>%
                select(Day, Freq) %>%
                cbind(separate.df) %>%
                select(1,3,4,2)
colnames(TV.df.tidy) <- c('Day', 'Time', 'Network', 'Freq')

head(TV.df.tidy, 10)

##          Day Time Network Freq
## 1     Monday 8:00     ABC  146
## 2    Tuesday 8:00     ABC  244
## 3  Wednesday 8:00     ABC  233
## 4   Thursday 8:00     ABC  174
## 5     Friday 8:00     ABC  294
## 6     Monday 8:15     ABC  151
## 7    Tuesday 8:15     ABC  181
## 8  Wednesday 8:15     ABC  161
## 9   Thursday 8:15     ABC  183
## 10    Friday 8:15     ABC  281

Collapse the TV data to a 5x3 two-way table (ignore the effect of time) whereas the rows are the days of the week and the columns are networks.

Treating Network as a three-level response variable, fit a generalized logit model to explain the variation in viewing in relation to Day and Time. The TV data is a three-way table, so you will need to convert it to a frequency data frame first.

TV.collapse <- xtabs(Freq ~ Day + Network, data = TV.df.tidy)
TV.collapse

##            Network
## Day          ABC  CBS  NBC
##   Friday    2737 1479 1998
##   Monday    2847 2923 2629
##   Thursday  1766 1335 5886
##   Tuesday   3110 2403 2568
##   Wednesday 2434 1283 2212

Fit the main-effects model, Network ~ Day + Time, with multinom (). Note that you will have to supply the weights argument because each row of TV.df represents the number of viewers in the Freq variable.

TV.multinom <- multinom(Network ~ Day + Time, data = TV.df.tidy, weights = Freq, Hess = TRUE)

## # weights:  48 (30 variable)
## initial  value 41318.808177 
## iter  10 value 38827.142232
## iter  20 value 38783.255700
## iter  30 value 38756.002114
## final  value 38752.186202 
## converged

summary(TV.multinom)

## Call:
## multinom(formula = Network ~ Day + Time, data = TV.df.tidy, weights = Freq, 
##     Hess = TRUE)
## 
## Coefficients:
##     (Intercept) DayMonday DayThursday DayTuesday DayWednesday  Time10:15
## CBS  -0.7081710 0.6728257   0.3491797  0.3888542  0.002870348 0.03169297
## NBC  -0.2707374 0.2496646   1.5221143  0.1340624  0.230785005 0.08270754
##      Time10:30   Time8:00  Time8:15   Time8:30     Time8:45   Time9:00
## CBS 0.07541006 0.29940443 0.3704153 0.37640181  0.005114158 -0.1881158
## NBC 0.16189706 0.02609372 0.0640008 0.03292065 -0.067113285 -0.1845470
##        Time9:15   Time9:30    Time9:45
## CBS -0.06707874 -0.1415254  0.05537347
## NBC -0.23486967 -0.3156046 -0.13569974
## 
## Std. Errors:
##     (Intercept)  DayMonday DayThursday DayTuesday DayWednesday  Time10:15
## CBS  0.05337986 0.04192254  0.04867865 0.04239848   0.04746323 0.06298673
## NBC  0.04749421 0.04015187  0.04011727 0.03985384   0.04174680 0.05518659
##      Time10:30   Time8:00   Time8:15   Time8:30   Time8:45   Time9:00
## CBS 0.06280538 0.06242284 0.06406244 0.06212389 0.06966745 0.06156368
## NBC 0.05493324 0.05700439 0.05883179 0.05716166 0.06146401 0.05398216
##       Time9:15   Time9:30   Time9:45
## CBS 0.06231276 0.06165164 0.06772202
## NBC 0.05592424 0.05531281 0.06195764
## 
## Residual Deviance: 77504.37 
## AIC: 77564.37

Prepare an effects plot for the fitted probabilities in this model.

plot(Effect(c("Day", "Time"), TV.multinom), style= "lines", key.args= list(x= 0.05, y= .9))

Interpret these results in comparison to the correspondence analysis in part(a).

TV.ca2 <- ca(TV.collapse)
plot(TV.ca2)

ANLY 545 51 Late Spring HW7 Hanwen Zhang

Hanwen.Zhang

6/8/2018