Chapter 7 Models Family Choice
In this chapter, we choose the model family that best suits the characteristics of the data.
Internalizing problems are computed as the sum of 10 SDQ items, yielding discrete scores that range from 0 to 20. Given the distribution of the data (see Figure 1.9), we model the scores with a Negative Binomial distribution.
7.1 Zero Inflated Negative Binomial
As in the case of externalizing problems, we evaluate whether a Zero-Inflated model may be appropriate. We compare the number of observed zeros with the number of zeros expected under a Negative Binomial mixed-effects model that includes the same predictors used for externalizing problems. Using R formula syntax, we have
# model formula
internalizing_sum ~ gender + mother * father + (1|ID_class)
Comparing the number of observed zeros and expected zeros, we get
my_check_zeroinflation(fit_int_nb)
## # Check for zero-inflation
##
## Observed zeros: 226
## Predicted zeros: 195
## Ratio: 0.86
## Model is underfitting zeros (probable zero-inflation).
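The internals of `my_check_zeroinflation()` are not shown here, so the following is only a minimal sketch of the idea behind such a check, written in base R. It assumes the expected number of zeros is the sum, over observations, of the Negative Binomial probability of a zero. The data, the means `mu`, and the size parameter `theta` are all simulated and hypothetical; in a real check, `mu` and `theta` would come from the fitted model.

```r
set.seed(1)

# Hypothetical data: NB counts plus extra (structural) zeros.
n     <- 500
mu    <- exp(0.5 + 0.3 * rbinom(n, 1, 0.5))  # stand-in for fitted means
theta <- 1.5                                 # stand-in for the NB size parameter
y     <- rnbinom(n, mu = mu, size = theta)
y[runif(n) < 0.15] <- 0                      # force about 15% of scores to zero

# Observed zeros: a simple count.
observed_zeros <- sum(y == 0)

# Expected zeros under the NB model: sum of P(Y_i = 0 | mu_i, theta).
expected_zeros <- sum(dnbinom(0, mu = mu, size = theta))

# A ratio below 1 means the NB model predicts fewer zeros than observed.
ratio <- expected_zeros / observed_zeros
cat("Observed zeros:", observed_zeros, "\n")
cat("Expected zeros:", round(expected_zeros), "\n")
cat("Ratio:", round(ratio, 2), "\n")
```

Applied to the output above, the same ratio is 195 / 226, which rounds to the reported 0.86.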
Results indicate that the model is slightly under-fitting the number of zeros. Thus, we can fit a Zero Inflated Negative Binomial (ZINB) model and compare the performance of the two models. Using R formula syntax, we have
# formula for p
p ~ gender + (1|ID_class)
# formula for mu
mu ~ gender + mother * father + (1|ID_class)
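The two formulas above can be passed to a mixed-model fitting function. The sketch below shows one way this might look with the glmmTMB package, whose notation (`zi=~`, `disp=~`) appears in the model output in this chapter. Several things are assumptions rather than details taken from the analysis: the data frame `d` is simulated as a stand-in for `data_cluster`, the factor levels of `mother` and `father` are placeholders, and the `nbinom2` parameterization is a guess at the Negative Binomial variant used. The fit is skipped if glmmTMB is not installed.

```r
set.seed(42)

# Hypothetical stand-in for data_cluster; every value is simulated.
n_class <- 20
n_per   <- 30
n_obs   <- n_class * n_per
d <- data.frame(
  ID_class = factor(rep(seq_len(n_class), each = n_per)),
  gender   = factor(sample(c("F", "M"), n_obs, replace = TRUE)),
  mother   = factor(sample(c("a", "b", "c", "d"), n_obs, replace = TRUE)),
  father   = factor(sample(c("a", "b", "c", "d"), n_obs, replace = TRUE))
)
u  <- rnorm(n_class, sd = 0.3)                     # class-level random intercepts
mu <- exp(1 + 0.2 * (d$gender == "M") + u[as.integer(d$ID_class)])
d$internalizing_sum <- rnbinom(n_obs, mu = mu, size = 2)

fit_nb   <- NULL
fit_zinb <- NULL
if (requireNamespace("glmmTMB", quietly = TRUE)) {
  # Negative Binomial mixed model (no zero inflation).
  fit_nb <- glmmTMB::glmmTMB(
    internalizing_sum ~ gender + mother * father + (1 | ID_class),
    family = glmmTMB::nbinom2, data = d
  )
  # ZINB: same formula for mu, plus a formula for the zero-inflation part p.
  fit_zinb <- glmmTMB::glmmTMB(
    internalizing_sum ~ gender + mother * father + (1 | ID_class),
    ziformula = ~ gender + (1 | ID_class),
    family = glmmTMB::nbinom2, data = d
  )
}
```

Because the simulated scores contain no structural zeros, the zero-inflation part of this toy fit may be poorly identified and can produce convergence warnings; the sketch only illustrates how the two formulas map onto the arguments.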
Below we report results of the analysis of deviance.
anova(fit_int_nb, fit_int_zinb)
## Data: data_cluster
## Models:
## fit_int_nb: internalizing_sum ~ gender + mother * father + (1 | ID_class), zi=~0, disp=~1
## fit_int_zinb: internalizing_sum ~ gender + mother * father + (1 | ID_class), zi=~gender + (1 | ID_class), disp=~1
##              Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)
## fit_int_nb   19 3680.7 3770.8 -1821.3   3642.7
## fit_int_zinb 22 3669.0 3773.3 -1812.5   3625.0 17.677      3  0.0005127 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Overall, the likelihood ratio test (χ²(3) = 17.68, p < .001) and the AIC both favor the ZINB model over the Negative Binomial model. Note, however, that the BIC, which penalizes the three additional parameters more heavily, prefers the model without zero inflation. Nevertheless, we use ZINB models in the following analyses to be consistent with the analysis of externalizing problems.
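The figures in the table above can be cross-checked with a few lines of base R, which also shows why AIC and BIC disagree. This is a sketch that works only from the rounded values printed in the table. It assumes the `deviance` column is minus twice the log-likelihood, and it backs the BIC penalty per parameter out of the reported BIC rather than using the sample size, which is not stated here.

```r
# Values copied from the anova() table.
dev_nb   <- 3642.7; df_nb   <- 19
dev_zinb <- 3625.0; df_zinb <- 22
chisq    <- 17.677              # reported likelihood ratio statistic

# Likelihood ratio test: chi-square with df equal to the extra parameters.
df_diff <- df_zinb - df_nb
p_value <- pchisq(chisq, df = df_diff, lower.tail = FALSE)

# AIC adds a penalty of 2 per parameter to the deviance.
aic_nb   <- dev_nb   + 2 * df_nb
aic_zinb <- dev_zinb + 2 * df_zinb

# BIC adds log(n) per parameter; recover log(n) from the reported NB BIC.
log_n    <- (3770.8 - dev_nb) / df_nb
bic_zinb <- dev_zinb + df_zinb * log_n

cat("p-value:", signif(p_value, 4), "\n")
cat("AIC (NB, ZINB):", aic_nb, aic_zinb, "\n")
cat("Implied log(n):", round(log_n, 2), "\n")
cat("BIC (ZINB):", round(bic_zinb, 1), "\n")
```

The three extra parameters reduce the deviance by about 17.7. AIC charges 6 for them, so it improves; BIC charges three times the implied log(n), which is more than the reduction, so it gets slightly worse.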