Method in R for Performing a Visual Check of the Model Assumptions

Intro

check_model() is an R function for visually checking whether various model assumptions hold (normality of random effects, normality of residuals, homogeneity of variance, linear relationship, multicollinearity).

R Code

check_model(x, ...)

# S3 method for default
check_model(
  x,
  dot_size = 2,
  line_size = 0.8,
  panel = TRUE,
  check = "all",
  alpha = 0.2,
  dot_alpha = 0.8,
  colors = c("#3aaf85", "#1b6ca8", "#cd201f"),
  theme = "see::theme_lucid",
  detrend = FALSE,
  verbose = TRUE,
  ...
)

Arguments

Value

The data frame that is used for plotting.

Note

This function only prepares the data for plotting. To generate the plots, the see package needs to be installed. Additionally, the function suppresses all potential warning messages. If you notice any suspicious plots, please consult the dedicated functions (such as check_collinearity(), check_normality(), and so forth) to obtain warning messages and further information.
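As a minimal sketch of that follow-up step, the dedicated checks can be called directly on a fitted model so that their messages are printed. This assumes the functions come from the performance package (the package name is not stated above) and that it is installed; the model is the one from the Examples section, and check_heteroscedasticity() is included here as one further check of the same kind.

```r
# Fit the same linear model that is used in the Examples section.
m <- lm(mpg ~ wt + cyl + gear + disp, data = mtcars)

# Assumption: the dedicated check functions live in the 'performance' package.
# The guard keeps the script runnable when the package is not installed.
if (requireNamespace("performance", quietly = TRUE)) {
  # Unlike check_model(), these calls print their own messages and warnings.
  print(performance::check_normality(m))
  print(performance::check_collinearity(m))
  print(performance::check_heteroscedasticity(m))
} else {
  message("Package 'performance' is not installed; skipping the checks.")
}
```

Running the individual checks this way is mainly useful once a panel from check_model() looks suspicious, since each function reports on a single assumption.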

Linearity Assumption

The linearity plot is used to check the assumption of a linear relationship. However, note that the spread of the dots also indicates possible heteroscedasticity (i.e. non-constant variance), which is why the plot has the alias "ncv". Some caution is needed when interpreting these plots. Although they are helpful for checking model assumptions, they do not necessarily reveal a so-called “lack of fit”, e.g. missing interactions or non-linear relationships. It is therefore always advisable to also look at effect plots, including partial residuals.
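The idea behind a linearity plot can be sketched in base R: residuals are plotted against fitted values together with a smoothed trend line, and partial residuals are computed per predictor. This is only an illustration under stated assumptions, not the code that check_model() runs; linearity_data() is a hypothetical helper, and the lowess smoother is an arbitrary choice made here.

```r
# Same linear model as in the Examples section.
m <- lm(mpg ~ wt + cyl + gear + disp, data = mtcars)

# Hypothetical helper: residuals vs. fitted values plus a lowess trend,
# sorted by fitted value so the trend can be drawn as a line.
linearity_data <- function(model) {
  fit <- fitted(model)
  res <- residuals(model)
  ord <- order(fit)
  smooth <- lowess(fit, res)  # lowess() returns its values sorted by x
  data.frame(fitted = fit[ord], residual = res[ord], trend = smooth$y)
}

ld <- linearity_data(m)

# Partial residuals: one column per model term (wt, cyl, gear, disp).
partial <- residuals(m, type = "partial")

# Draw the plots only in an interactive session.
if (interactive()) {
  plot(ld$fitted, ld$residual, xlab = "Fitted values", ylab = "Residuals")
  lines(ld$fitted, ld$trend, col = "#1b6ca8")
  abline(h = 0, lty = 2)

  plot(mtcars$wt, partial[, "wt"], xlab = "wt", ylab = "Partial residuals (wt)")
}
```

A trend line that stays close to the horizontal reference line is consistent with linearity, while a funnel shape in the spread of the dots hints at non-constant variance. A curved pattern in a partial residual plot points to a non-linear effect of that particular predictor.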

Residuals for (Generalized) Linear Models

The plots for checking homogeneity of variance and normality of residuals (QQ-plot) use standardized Pearson residuals for generalized linear models (GLMs) and standardized residuals for linear models. The plots for checking normality of residuals (with an overlaid normal curve) and for checking the linearity assumption use the default residuals for lm and glm (which are deviance residuals for glm).
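The residual types named above can be obtained in base R as follows. This is a sketch for orientation only: the source does not state which calls are used internally, and the Poisson model on mtcars is an assumption chosen purely for illustration.

```r
# Linear model: standardized residuals.
m_lm <- lm(mpg ~ wt + cyl, data = mtcars)
std_lm <- rstandard(m_lm)

# GLM (a Poisson model, chosen only as an example): standardized Pearson residuals.
m_glm <- glm(carb ~ wt + hp, data = mtcars, family = poisson())
std_pearson <- rstandard(m_glm, type = "pearson")

# Default residuals: observed minus fitted for lm, deviance residuals for glm.
raw_lm <- residuals(m_lm)
dev_glm <- residuals(m_glm)  # same as residuals(m_glm, type = "deviance")

# Show the first few values of each residual type side by side.
print(head(cbind(std_lm, raw_lm, std_pearson, dev_glm)))
```

The standardized versions are scaled to a common variance, which makes them better suited to QQ-plots and to homogeneity checks than the raw residuals.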

Examples

if (FALSE) {
  m <- lm(mpg ~ wt + cyl + gear + disp, data = mtcars)
  check_model(m)

  if (require("lme4")) {
    m <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
    check_model(m, panel = FALSE)
  }

  if (require("rstanarm")) {
    m <- stan_glm(mpg ~ wt + gear, data = mtcars, chains = 2, iter = 200)
    check_model(m)
  }
}
