model2 <- glm.nb(
  formula = carb ~ mpg + cyl + wt + qsec + vs + am,
  data = df,
  control = glm.control(
    maxit = 1000,    # keep up to 1000
    epsilon = 1e-6,  # keep between 1e-8 and 1e-1
    trace = TRUE     # Enable debugging to track what happens during the optimization process
  )
)
If convergence fails, try increasing maxit (the maximum number of iterations) and relaxing the convergence tolerance by increasing epsilon (for example, from 1e-8 to 1e-6).
A very high value of Theta (the dispersion parameter, which controls the variance of the negative binomial distribution) implies that the variance is approaching that of a Poisson distribution. This suggests either that overdispersion is minimal or that the optimization algorithm is struggling to estimate Theta correctly.
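To make the link between Theta and the Poisson limit concrete, here is a minimal sketch of the variance function in the parameterisation used by glm.nb, Var(Y) = mu + mu^2 / theta. The helper name nb_variance and the example values are illustrative only and are not part of the analysis itself.

```r
# Variance of a negative binomial count with mean mu and dispersion theta,
# in the parameterisation used by MASS::glm.nb: Var(Y) = mu + mu^2 / theta.
# nb_variance is a hypothetical helper name used only for illustration.
nb_variance <- function(mu, theta) {
  mu + mu^2 / theta
}

mu <- 3
thetas <- c(1, 10, 100, 1e4, 1e6)

# Ratio of the NB variance to the Poisson variance (which equals mu).
ratio <- nb_variance(mu, thetas) / mu
print(data.frame(theta = thetas, variance_ratio = ratio))
```

As theta grows, the ratio shrinks towards 1, which is why a very large estimated Theta means the fitted model is practically a Poisson model.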
hist(df$carb)
var(df$carb)/mean(df$carb)
[1] 0.9275986
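The raw variance-to-mean ratio ignores the predictors. A complementary check is the Pearson dispersion statistic of a Poisson fit, sketched here under the assumption that df is the built-in mtcars data (the variables and the ratio printed here appear consistent with it, but that is an assumption). Values well above 1 point to overdispersion, while values near or below 1 do not.

```r
# Assumption: df is the built-in mtcars data set.
df <- mtcars

pois_fit <- glm(carb ~ mpg + cyl + wt + qsec + vs + am,
                family = poisson, data = df)

# Pearson chi-square divided by the residual degrees of freedom.
pearson_disp <- sum(residuals(pois_fit, type = "pearson")^2) /
  df.residual(pois_fit)
print(pearson_disp)
```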
Either way, look at the output.
Common things to check for:
No Extreme Outliers
boxplot(df)
A good rule of thumb is to treat points lying more than 3 * IQR beyond the first or third quartile as extreme outliers.
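The 3 * IQR rule can also be applied numerically rather than read off the boxplot. The sketch below is one possible way to do it; extreme_outliers is a hypothetical helper name, and the use of mtcars as the data frame is an assumption.

```r
# Flag values lying more than k * IQR beyond the first or third quartile.
# extreme_outliers is a hypothetical helper; k = 3 follows the rule of thumb.
extreme_outliers <- function(x, k = 3) {
  q <- quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[2] - q[1]
  x[!is.na(x) & (x < q[1] - k * iqr | x > q[2] + k * iqr)]
}

# Toy vector with one obvious extreme value.
x <- c(1:10, 100)
print(extreme_outliers(x))

# Applied column by column to a numeric data frame (mtcars assumed here).
print(lapply(mtcars, extreme_outliers))
```

Columns that return an empty vector have no values outside the 3 * IQR fences.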
No Multicollinearity
vif(glm(carb ~ mpg + cyl + wt + qsec + vs + am, data = df))
     mpg      cyl       wt     qsec       vs       am
6.721984 9.710670 6.483070 5.585198 4.502740 4.042162
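All VIFs are below the commonly used cutoff of 10, so multicollinearity is not treated as an issue here and no variables have to be removed. Note that cyl (9.71) is close to that cutoff.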
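For readers who want to see what a VIF measures, the sketch below computes it by hand in base R: each predictor is regressed on the others and VIF = 1 / (1 - R^2). It assumes df is the built-in mtcars data, and manual_vif is an illustrative name. For a Gaussian model with only numeric predictors this is expected to agree with vif(), but treat it as a cross-check rather than a replacement.

```r
# Assumption: df is the built-in mtcars data set.
df <- mtcars
predictors <- c("mpg", "cyl", "wt", "qsec", "vs", "am")

# VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
# predictor j on all the other predictors.
manual_vif <- sapply(predictors, function(p) {
  others <- setdiff(predictors, p)
  fit <- lm(reformulate(others, response = p), data = df)
  1 / (1 - summary(fit)$r.squared)
})
print(round(manual_vif, 3))
```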
Fit a Poisson model to generate starting values
poisson_model <- glm(carb ~ mpg + cyl + wt + qsec + vs + am, family = poisson, data = df)
Use coefficients from Poisson model as starting values
model3 <- glm.nb(
  formula = carb ~ mpg + cyl + wt + qsec + vs + am,
  data = df,
  start = coef(poisson_model)
)
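After refitting with starting values, it helps to inspect the convergence information directly instead of relying on the printed trace. The following is a rough sketch, assuming df is the built-in mtcars data and that the MASS package is installed; the object names poisson_fit, nb_fit and fit_warnings are illustrative.

```r
library(MASS)  # provides glm.nb

# Assumption: df is the built-in mtcars data set.
df <- mtcars
form <- carb ~ mpg + cyl + wt + qsec + vs + am

poisson_fit <- glm(form, family = poisson, data = df)

# Collect warnings instead of letting them scroll past.
fit_warnings <- character(0)
nb_fit <- withCallingHandlers(
  glm.nb(form, data = df, start = coef(poisson_fit)),
  warning = function(w) {
    fit_warnings <<- c(fit_warnings, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

print(nb_fit$converged)  # did the final IWLS fit converge?
print(nb_fit$theta)      # estimated dispersion parameter
print(nb_fit$SE.theta)   # its approximate standard error
print(nb_fit$th.warn)    # warning from the theta estimation, if any
print(fit_warnings)
```

A huge theta together with an "iteration limit reached" style warning is the pattern described earlier: the data show little or no overdispersion, so a Poisson model may be the more sensible choice.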
If the model includes too many predictors relative to the number of observations, it may be overparameterized. Consider simplifying the model by removing less important predictors.
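One way to look for candidates to remove is sketched below. It uses a Poisson fit to stay within base R and assumes df is the built-in mtcars data. The reduced model that drops vs and am is purely hypothetical and only shows the mechanics of the comparison.

```r
# Assumption: df is the built-in mtcars data set.
df <- mtcars
full_fit <- glm(carb ~ mpg + cyl + wt + qsec + vs + am,
                family = poisson, data = df)

# Likelihood-ratio test for dropping each predictor in turn.
# Large p-values mark candidates for removal.
drop_table <- drop1(full_fit, test = "Chisq")
print(drop_table)

# Compare the full model with a smaller, hypothetical candidate by AIC.
reduced_fit <- update(full_fit, . ~ . - vs - am)
print(AIC(full_fit, reduced_fit))
```

With only 32 observations and six predictors, a lower AIC for the smaller model would support simplifying.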
stargazer(model1, model2, model3, model4, type = "text")