The TV dataset is a large sample of TV viewers whose behavior had been recorded for the Neilsen ratings. This data set contains sample television audience data from Neilsen Media Research for the week starting November 6, 1995.
TV data three way table
Collapse the TV data to a 5x3 two-way table (ignore the effect of time) whereas the rows are the days of the week and the columns are networks.
data("TV", package="vcdExtra")
tv.df <- as.data.frame.table(TV)
tv.f <- tv.df[, c(1,3,4)]
tv.data <- group_by(tv.f, Day, Network)
tv.data1 <- dplyr::summarise(tv.data, count=sum(Freq))
tv.table <- dcast(tv.data1, Day ~ Network, value.var="count")
tv.table
## Day ABC CBS NBC
## 1 Monday 2847 2923 2629
## 2 Tuesday 3110 2403 2568
## 3 Wednesday 2434 1283 2212
## 4 Thursday 1766 1335 5886
## 5 Friday 2737 1479 1998
test1 <- polr(Network ~ Day, weight=Freq, data=tv.df)
summary(test1)
##
## Re-fitting to get Hessian
## Call:
## polr(formula = Network ~ Day, data = tv.df, weights = Freq)
##
## Coefficients:
## Value Std. Error t value
## DayTuesday -0.08849 0.02814 -3.145
## DayWednesday -0.03144 0.03127 -1.006
## DayThursday 1.16947 0.02952 39.618
## DayFriday -0.20969 0.03073 -6.824
##
## Intercepts:
## Value Std. Error t value
## ABC|CBS -0.4935 0.0201 -24.5290
## CBS|NBC 0.5985 0.0202 29.6220
##
## Residual Deviance: 78431.92
## AIC: 78443.92
Main Effect model
Fit the main-effects model, Newtork ~ Day+ Time, with multinom(), Note that you will have to supply the weights argument because each row of TV.df represents the number of viewers in the Freq variable.
multinom
test <- multinom(Network ~ Day +Time, weight=Freq, data=tv.df)
## # weights: 48 (30 variable)
## initial value 41318.808177
## iter 10 value 38935.947713
## iter 20 value 38818.222728
## iter 30 value 38756.956301
## final value 38752.186202
## converged
## Call:
## multinom(formula = Network ~ Day + Time, data = tv.df, weights = Freq)
##
## Coefficients:
## (Intercept) DayTuesday DayWednesday DayThursday DayFriday Time8:15
## CBS 0.264060911 -0.2839729 -0.66995721 -0.3236498 -0.6728270 0.07101115
## NBC 0.005020656 -0.1156021 -0.01887989 1.2724501 -0.2496648 0.03790697
## Time8:30 Time8:45 Time9:00 Time9:15 Time9:30 Time9:45
## CBS 0.076997762 -0.29429040 -0.4875213 -0.3664838 -0.4409308 -0.2440312
## NBC 0.006827139 -0.09320663 -0.2106409 -0.2609636 -0.3416990 -0.1617936
## Time10:00 Time10:15 Time10:30
## CBS -0.29940481 -0.2677118 -0.2239947
## NBC -0.02609353 0.0566143 0.1358038
##
## Std. Errors:
## (Intercept) DayTuesday DayWednesday DayThursday DayFriday Time8:15
## CBS 0.04983795 0.03797920 0.04355769 0.04501328 0.04192254 0.06398208
## NBC 0.04832104 0.03808235 0.04004444 0.03845629 0.04015187 0.06059674
## Time8:30 Time8:45 Time9:00 Time9:15 Time9:30 Time9:45
## CBS 0.06204563 0.06958338 0.06154875 0.06229549 0.06164592 0.06774630
## NBC 0.05898492 0.06314121 0.05595830 0.05782973 0.05724928 0.06371518
## Time10:00 Time10:15 Time10:30
## CBS 0.06242284 0.06296253 0.06277900
## NBC 0.05700440 0.05711720 0.05687154
##
## Residual Deviance: 77504.37
## AIC: 77564.37
Polr(level)
#level the time
tv.df1 <- tv.df
levels(tv.df1$Time) <- c(rep("8:00-8:59",5),rep("9:00-9:59",4), rep("10:00-10:44",3))
tv.day.logit <- polr(Network ~ Day+ Time, weight=Freq, data=tv.df1)
summary(tv.day.logit)
##
## Re-fitting to get Hessian
## Call:
## polr(formula = Network ~ Day + Time, data = tv.df1, weights = Freq)
##
## Coefficients:
## Value Std. Error t value
## DayTuesday -0.09104 0.02816 -3.233
## DayWednesday -0.03337 0.03128 -1.067
## DayThursday 1.16384 0.02954 39.395
## DayFriday -0.21310 0.03076 -6.928
## Time9:00-9:59 -0.09659 0.02187 -4.417
## Time10:00-10:44 0.14328 0.02631 5.446
##
## Intercepts:
## Value Std. Error t value
## ABC|CBS -0.5030 0.0230 -21.8660
## CBS|NBC 0.5907 0.0231 25.6235
##
## Residual Deviance: 78354.19
## AIC: 78370.19
Effects plot
Prepare an effects plots for the fitted probabilities in this model
full.plot <- cbind(tv.df1, predict(tv.day.logit, type="probs"))
head(full.plot)
## Day Time Network Freq ABC CBS NBC
## 1 Monday 8:00-8:59 ABC 146 0.3768401 0.2666802 0.3564797
## 2 Tuesday 8:00-8:59 ABC 244 0.3984462 0.2656752 0.3358786
## 3 Wednesday 8:00-8:59 ABC 233 0.3847089 0.2664302 0.3488609
## 4 Thursday 8:00-8:59 ABC 174 0.1588482 0.2016588 0.6394930
## 5 Friday 8:00-8:59 ABC 294 0.4280322 0.2627491 0.3092187
## 6 Monday 8:00-8:59 ABC 151 0.3768401 0.2666802 0.3564797
plotdat <- melt(full.plot, id.vars=c("Network", "Day", "Time","Freq"),
measure.vars=c("ABC", "CBS","NBC") , variable.names="Time", value="Probability")
head(plotdat)
## Network Day Time Freq variable value
## 1 ABC Monday 8:00-8:59 146 ABC 0.3768401
## 2 ABC Tuesday 8:00-8:59 244 ABC 0.3984462
## 3 ABC Wednesday 8:00-8:59 233 ABC 0.3847089
## 4 ABC Thursday 8:00-8:59 174 ABC 0.1588482
## 5 ABC Friday 8:00-8:59 294 ABC 0.4280322
## 6 ABC Monday 8:00-8:59 151 ABC 0.3768401
# the value is from 0 to 1000, i only took 100 to 500 for better visualization.
gg <- ggplot(plotdat, aes(x=Freq, y=value, colour=variable)) +geom_line (size=2.5) + theme_bw() +
xlim(80,500)+ geom_point(color="black", size=1.5) +facet_grid(Network~Day) +labs (title="TV stations view Probability Graph")
direct.label(gg)

plot(Effect("Day", tv.day.logit))
##
## Re-fitting to get Hessian

Interpret result
Interpret these results in comparison to the correspondence analysis in part (a)
From the summary we can see detailed description such as, * one- unit increase in DayTUesday will decrease the log odds of cbs vs abc 0.28
- one- unit increase in DayWednesday will decrease the log odds of cbs vs abc 0.66 …..
in general, we can see cbs is lower than abc and nbc in Monday, Tuesday, Wednsday and Friday. abc has exceptionally higher view’s on Friday. and on Thursday Nbc dominates the chanel, and ABC has the littlest viewers.