One waiter recorded information about each tip he received over a period of a few months working in one restaurant. We will conduct estimation and simulation to see if the size of the party or sex of the bill payer has an effect of the amount of tip the waiter has received in dollars.
library(Zelig)
Loading required package: survival
library(ZeligChoice)
library(faraway)
Attaching package: ‘faraway’
The following object is masked from ‘package:survival’:
rats
library(dplyr)
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
library(tidyr)
library(survival)
library(reshape2)
Attaching package: ‘reshape2’
The following object is masked from ‘package:tidyr’:
smiths
library(tidyverse)
Loading tidyverse: ggplot2
Loading tidyverse: tibble
Loading tidyverse: readr
Loading tidyverse: purrr
Conflicts with tidy packages ------------------------------------------------------------------------
filter(): dplyr, stats
lag(): dplyr, stats
reduce(): purrr, Zelig
data(tips)
head(tips)
z <- zelig(tip ~ sex + smoker + size, model = "poisson", data = tips, cite = F)
non-integer x = 1.010000non-integer x = 1.660000non-integer x = 3.500000non-integer x = 3.310000non-integer x = 3.610000non-integer x = 4.710000non-integer x = 3.120000non-integer x = 1.960000non-integer x = 3.230000non-integer x = 1.710000non-integer x = 1.570000non-integer x = 3.020000non-integer x = 3.920000non-integer x = 1.670000non-integer x = 3.710000non-integer x = 3.500000non-integer x = 3.350000non-integer x = 4.080000non-integer x = 2.750000non-integer x = 2.230000non-integer x = 7.580000non-integer x = 3.180000non-integer x = 2.340000non-integer x = 4.300000non-integer x = 1.450000non-integer x = 2.500000non-integer x = 2.450000non-integer x = 3.270000non-integer x = 3.600000non-integer x = 3.070000non-integer x = 2.310000non-integer x = 2.240000non-integer x = 2.540000non-integer x = 3.060000non-integer x = 1.320000non-integer x = 5.600000non-integer x = 2.050000non-integer x = 2.500000non-integer x = 2.600000non-integer x = 5.200000non-integer x = 1.560000non-integer x = 4.340000non-integer x = 3.510000non-integer x = 1.500000non-integer x = 1.760000non-integer x = 6.730000non-integer x = 3.210000non-integer x = 1.980000non-integer x = 3.760000non-integer x = 2.640000non-integer x = 3.150000non-integer x = 2.470000non-integer x = 2.010000non-integer x = 2.090000non-integer x = 1.970000non-integer x = 3.140000non-integer x = 2.200000non-integer x = 1.250000non-integer x = 3.080000non-integer x = 2.710000non-integer x = 3.400000non-integer x = 1.830000non-integer x = 2.030000non-integer x = 5.170000non-integer x = 5.850000non-integer x = 3.500000non-integer x = 4.300000non-integer x = 3.250000non-integer x = 4.730000non-integer x = 1.500000non-integer x = 1.500000non-integer x = 2.500000non-integer x = 2.500000non-integer x = 3.480000non-integer x = 4.080000non-integer x = 1.640000non-integer x = 4.060000non-integer x = 4.290000non-integer x = 3.760000non-integer x = 2.550000non-integer x = 3.500000non-integer x = 5.070000non-integer x = 1.500000non-integer x = 1.800000non-integer x = 2.920000non-integer x = 2.310000non-integer x = 1.680000non-integer x = 2.500000non-integer x = 2.520000non-integer x = 4.200000non-integer x = 1.480000non-integer x = 2.180000non-integer x = 1.500000non-integer x = 2.830000non-integer x = 1.500000non-integer x = 3.250000non-integer x = 1.250000non-integer x = 2.750000non-integer x = 3.500000non-integer x = 6.700000non-integer x = 2.300000non-integer x = 1.500000non-integer x = 1.360000non-integer x = 1.630000non-integer x = 1.730000non-integer x = 2.500000non-integer x = 2.740000non-integer x = 5.140000non-integer x = 3.750000non-integer x = 2.610000non-integer x = 3.500000non-integer x = 2.500000non-integer x = 3.480000non-integer x = 2.240000non-integer x = 4.500000non-integer x = 1.610000non-integer x = 3.160000non-integer x = 5.150000non-integer x = 3.180000non-integer x = 3.110000non-integer x = 3.550000non-integer x = 3.680000non-integer x = 5.650000non-integer x = 3.500000non-integer x = 6.500000non-integer x = 3.500000non-integer x = 3.500000non-integer x = 1.500000non-integer x = 4.190000non-integer x = 2.560000non-integer x = 2.020000non-integer x = 1.440000non-integer x = 2.010000non-integer x = 2.500000non-integer x = 3.230000non-integer x = 3.410000non-integer x = 2.030000non-integer x = 2.230000non-integer x = 5.160000non-integer x = 2.500000non-integer x = 6.500000non-integer x = 1.100000non-integer x = 1.500000non-integer x = 1.440000non-integer x = 3.090000non-integer x = 2.200000non-integer x = 3.480000non-integer x = 1.920000non-integer x = 1.580000non-integer x = 2.500000non-integer x = 2.720000non-integer x = 2.880000non-integer x = 3.390000non-integer x = 1.470000non-integer x = 1.250000non-integer x = 1.170000non-integer x = 4.670000non-integer x = 5.920000non-integer x = 1.750000
summary(z)
Model:
Call:
z5$zelig(formula = tip ~ sex + smoker + size, data = tips)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.59926 -0.48489 -0.07297 0.31122 2.88786
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.47958 0.11944 4.015 5.94e-05
sexMale 0.04969 0.07847 0.633 0.527
smokerYes 0.06887 0.07706 0.894 0.371
size 0.20947 0.03461 6.052 1.43e-09
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 140.51 on 243 degrees of freedom
Residual deviance: 105.55 on 240 degrees of freedom
AIC: Inf
Number of Fisher Scoring iterations: 4
Next step: Use 'setx' method
From the graph below we see a positive relationship between the size of the party and the amount of tip in dollars the waiter received
a.range = min(tips$size):max(tips$size)
x <- setx(z, size = a.range)
s <- sim(z, x = x)
ci.plot(s)
With every unit increase in standard deviation change in the size of the party we have a nearly .64 dollar increase in the tips in dollars.
x <- setx(z, size = mean(tips$size))
x1 <- setx(z, size = mean(tips$size) + sd(tips$size))
s <- sim(z, x = x, x1 = x1)
summary(s)
sim x :
-----
ev
mean sd 50% 2.5% 97.5%
[1,] 2.915416 0.1650956 2.910553 2.616773 3.257411
pv
mean sd 50% 2.5% 97.5%
[1,] 2.854 1.720361 3 0 7
sim x1 :
-----
ev
mean sd 50% 2.5% 97.5%
[1,] 3.554509 0.2023653 3.554731 3.170115 3.953984
pv
mean sd 50% 2.5% 97.5%
[1,] 3.596 1.897995 3 0 8
fd
mean sd 50% 2.5% 97.5%
[1,] 0.6390932 0.1125904 0.6370569 0.4211652 0.8603845
We are looking at the effect of 1 SD increase in size on the party. We will be able to compare if whether 1 standard deviation increase in size has a direct relation using a simluation.
xl <- setx(z, size = quantile(tips$size, .25))
xl1 <- setx(z, size = quantile(tips$size, .25) + sd(tips$size))
sl <- sim(z, x = xl, x1 = xl1)
fdl <- sl$get_qi(xvalue="x1", qi="fd")
xm <- setx(z, size = quantile(tips$size, .50))
xm1 <- setx(z, size = quantile(tips$size, .50) + sd(tips$size))
sm <- sim(z, x = xm, x1 = xm1)
fdm <- sm$get_qi(xvalue="x1", qi="fd")
xh <- setx(z, size = quantile(tips$size, .75))
xh1 <- setx(z, size = quantile(tips$size, .75) + sd(tips$size))
sh <- sim(z, x = xh, x1 = xh1)
fdh <- sh$get_qi(xvalue="x1", qi="fd")
Raw data
d <- as.data.frame(cbind(fdl, fdm, fdh))
head(d)
tidd <- d %>%
gather(quantile, dif, 1:3)
head(tidd)
tidd %>%
group_by(quantile) %>%
summarise(mean = mean(dif), sd = sd(dif))
library(ggplot2)
ggplot(tidd, aes(dif)) + geom_histogram() + facet_grid(~ quantile)
Conclusion: