Length: The longer a host has been on Airbnb, the more likely the host is to remain on Airbnb.
Guests: The fewer guests a listing accommodates, the more likely the host is to remain on Airbnb.
Rating: The higher the rating, the more likely the host is to remain on Airbnb.
Rev_AbB: The more revenue a listing generates for Airbnb in one year through its property fee, the more likely the host is to remain on Airbnb.
Type_Hm, Zestimatek, ListPrice, and the location dummies (Loc_1 to Loc_5) are not statistically significant at the 5% level; Loc_2, Loc_4, and Loc_5 are only marginally significant at the 10% level.
Advice: To entice owners to keep listing accommodations on Airbnb, the results suggest four actions. New hosts who have not been on the platform long need more support to stay. Rating has the strongest association with remaining, so Airbnb should help hosts earn satisfying ratings. Airbnb should allow hosts to increase property fees within a reasonable limit. Airbnb can also encourage hosts to divide a house into multiple rooms, so that each listing accommodates fewer guests and guests have enough independent space.
Read the data
host <- read.csv("~/Desktop/R/host.csv", header = TRUE, sep = ",")
Logistic regression
# glm() fits generalized linear models; family = "binomial" specifies logistic regression
myresult <- glm(choice ~ Length + Type_Hm + Guests + Zestimatek + ListPrice +
                  Rating + Rev_AbB + Loc_1 + Loc_2 + Loc_3 + Loc_4 + Loc_5,
                data = host, family = "binomial")
summary(myresult)
##
## Call:
## glm(formula = choice ~ Length + Type_Hm + Guests + Zestimatek +
## ListPrice + Rating + Rev_AbB + Loc_1 + Loc_2 + Loc_3 + Loc_4 +
## Loc_5, family = "binomial", data = host)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -2.4179  -0.8402   0.5268   0.7375   1.8472
##
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.997e+00  1.067e+00  -2.809 0.004972 **
## Length       2.565e-03  1.112e-03   2.306 0.021100 *
## Type_Hm      5.730e-01  4.467e-01   1.283 0.199646
## Guests      -2.506e-01  1.134e-01  -2.210 0.027078 *
## Zestimatek  -9.171e-04  2.709e-03  -0.339 0.734950
## ListPrice   -8.241e-05  3.621e-03  -0.023 0.981844
## Rating       6.160e-01  1.629e-01   3.782 0.000156 ***
## Rev_AbB      2.102e-03  8.402e-04   2.502 0.012341 *
## Loc_1       -5.853e-02  4.919e-01  -0.119 0.905282
## Loc_2        9.109e-01  5.220e-01   1.745 0.080989 .
## Loc_3        9.114e-01  5.742e-01   1.587 0.112462
## Loc_4        1.005e+00  5.190e-01   1.937 0.052764 .
## Loc_5        8.897e-01  5.317e-01   1.673 0.094262 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 470.54 on 399 degrees of freedom
## Residual deviance: 399.44 on 387 degrees of freedom
## AIC: 425.44
##
## Number of Fisher Scoring iterations: 4
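The overall fit can be checked by hand from the two deviances printed in the summary. The following is a minimal sketch that uses only numbers copied from that output; the variable names are illustrative, and the pseudo R-squared line assumes the response is ungrouped binary data, so that the deviance equals minus twice the log-likelihood.

```r
# Values copied from the summary(myresult) output
null_dev  <- 470.54; null_df  <- 399
resid_dev <- 399.44; resid_df <- 387

# Likelihood-ratio (deviance) test of the full model against the intercept-only model
lr_stat <- null_dev - resid_dev   # drop in deviance
lr_df   <- null_df - resid_df     # number of predictors added
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# McFadden's pseudo R-squared (assumes ungrouped binary data)
mcfadden_r2 <- 1 - resid_dev / null_dev

# AIC = residual deviance + 2 * number of estimated coefficients (predictors + intercept)
aic <- resid_dev + 2 * (lr_df + 1)

print(c(lr_stat = lr_stat, lr_df = lr_df, p_value = p_value,
        mcfadden_r2 = mcfadden_r2, aic = aic))
```

The deviance falls by 71.10 on 12 degrees of freedom, so the predictors jointly improve on the intercept-only model, and the AIC line reproduces the 425.44 reported in the summary. The pseudo R-squared of about 0.15 indicates that a large share of the variation in `choice` is still unexplained.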
# odds ratios only
exp(coef(myresult))
## (Intercept)      Length     Type_Hm      Guests  Zestimatek   ListPrice
##  0.04994121  1.00256786  1.77354464  0.77834137  0.99908330  0.99991759
##      Rating     Rev_AbB       Loc_1       Loc_2       Loc_3       Loc_4
##  1.85155282  1.00210452  0.94314780  2.48648155  2.48782433  2.73233475
##       Loc_5
##  2.43442559
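Odds ratios are easier to read as a percent change in the odds of remaining per one-unit increase in a predictor. A small sketch for the four predictors that are significant at the 5% level, using coefficients copied from the summary output (the names `beta`, `odds_ratio`, and `pct_change` are illustrative):

```r
# Coefficients copied from the summary output (predictors significant at the 5% level)
beta <- c(Length = 2.565e-03, Guests = -2.506e-01,
          Rating = 6.160e-01, Rev_AbB = 2.102e-03)

odds_ratio <- exp(beta)                # multiplicative change in odds per unit
pct_change <- (odds_ratio - 1) * 100   # the same change expressed in percent

print(round(cbind(odds_ratio, pct_change), 3))
```

Holding the other predictors fixed, a one-point higher rating multiplies the odds of remaining by about 1.85 (an increase of roughly 85%), while each additional guest multiplies the odds by about 0.78 (a decrease of roughly 22%). The per-unit effects of Length and Rev_AbB look small only because those variables are measured in small units.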
plot(myresult$fitted.values)
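The fitted values being plotted are predicted probabilities: the linear predictor passed through the inverse logit. The sketch below works this out for one hypothetical host. The coefficients are copied from the summary output, but every predictor value in `x` is invented for illustration, and the units are assumptions because the data dictionary is not shown here.

```r
# Coefficients copied from the summary output
beta <- c("(Intercept)" = -2.997, Length = 2.565e-03, Type_Hm = 5.730e-01,
          Guests = -2.506e-01, Zestimatek = -9.171e-04, ListPrice = -8.241e-05,
          Rating = 6.160e-01, Rev_AbB = 2.102e-03, Loc_1 = -5.853e-02,
          Loc_2 = 9.109e-01, Loc_3 = 9.114e-01, Loc_4 = 1.005, Loc_5 = 8.897e-01)

# Hypothetical host: all values below are made up for illustration only
x <- c("(Intercept)" = 1, Length = 365, Type_Hm = 1, Guests = 2,
       Zestimatek = 300, ListPrice = 100, Rating = 4.5, Rev_AbB = 500,
       Loc_1 = 0, Loc_2 = 1, Loc_3 = 0, Loc_4 = 0, Loc_5 = 0)

eta  <- sum(beta * x[names(beta)])   # linear predictor (log-odds)
prob <- plogis(eta)                  # inverse logit: 1 / (1 + exp(-eta))

print(c(log_odds = eta, probability = prob))
```

With the fitted object available, `predict(myresult, newdata = new_host, type = "response")` performs the same calculation, where `new_host` is a hypothetical one-row data frame with the same column names as `host`.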
