geom_smooth(mapping = NULL, data = NULL, stat = "smooth",
position = "identity", ..., method = "auto", formula = y ~ x,
se = TRUE, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE)
method smoothing method (function) to use, e.g. "lm", "glm", "gam", "loess", "rlm".
For method = "auto" the smoothing method is chosen based on the size of the largest group (across all panels). loess is used for fewer than 1,000 observations; otherwise
gam is used with formula = y ~ s(x, bs = "cs"). Somewhat anecdotally, loess gives a better appearance, but is O(n^2) in memory, so does not work for larger datasets.
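The selection rule quoted above can be sketched in a few lines of R. The helper name choose_smooth_method is hypothetical and is not part of ggplot2; it only restates the documented threshold.

```r
# Hypothetical helper that mirrors the documented rule for method = "auto".
# It is NOT a ggplot2 function; it only restates the 1,000-observation threshold.
choose_smooth_method <- function(group_sizes) {
  # The decision depends on the largest group across all panels.
  if (max(group_sizes) < 1000) "loess" else "gam"
}

choose_smooth_method(c(120, 450))   # every group is small, so loess
choose_smooth_method(c(120, 2500))  # one group has 1,000 or more rows, so gam
```

Note that a single large group is enough to switch every panel to gam, because the rule looks at the largest group rather than at each group separately.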
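Local regression model (loess), the default when \(n<1000\); the degree of smoothing is controlled by the span parameter.
Its memory use is \(O(n^2)\), so it is not suitable for large datasets.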
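As a rough illustration of how span changes the loess curve, the sketch below draws the same data with two span values. The built-in mtcars data set and the values 0.5 and 1 are assumptions chosen only for the example.

```r
library(ggplot2)

# mtcars ships with R and has only 32 rows, far below the 1,000-row threshold.
base_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# Smaller span: each local fit uses fewer neighbours, so the curve is wigglier.
p_wiggly <- base_plot +
  geom_smooth(method = "loess", formula = y ~ x, span = 0.5)

# Larger span: each local fit uses more neighbours, so the curve is smoother.
p_smooth <- base_plot +
  geom_smooth(method = "loess", formula = y ~ x, span = 1)
```

Printing p_wiggly and p_smooth side by side makes the difference visible; span is passed on to the underlying loess fit.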
Generalized additive model (gam) [2], the default when \(n \ge 1000\); provided by the mgcv package.
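A minimal sketch of requesting the gam smoother explicitly, using the same formula that method = "auto" is documented to use. The simulated data frame big and its size of 2,000 rows are assumptions for illustration, and the mgcv package must be installed.

```r
library(ggplot2)

# Simulated data (an assumption for this example): 2,000 noisy points on a sine curve.
set.seed(1)
n <- 2000
big <- data.frame(x = runif(n, min = 0, max = 10))
big$y <- sin(big$x) + rnorm(n, sd = 0.3)

# Ask for the gam smoother explicitly, with a cubic regression spline with shrinkage.
p_gam <- ggplot(big, aes(x = x, y = y)) +
  geom_point(alpha = 0.2) +
  geom_smooth(method = "gam", formula = y ~ s(x, bs = "cs"))
```

Stating the method and formula explicitly avoids depending on the automatic choice, which changes with the size of the data.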
Linear model (lm): fits a linear model, which is usually a straight line (as with the default formula y ~ x).
Robust linear model (rlm) [3]: makes the linear fit more robust by down-weighting outliers rather than discarding them; provided by the MASS package.
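The difference between lm and rlm can be seen on a small simulated example. The data frame d, the true slope of 2, and the single corrupted point are all assumptions made for this sketch.

```r
library(MASS)

# Simulated data (an assumption for this example): a line with slope 2 plus noise.
set.seed(42)
d <- data.frame(x = 1:30)
d$y <- 2 * d$x + rnorm(30)
d$y[30] <- 0  # corrupt the last point so that it becomes a gross outlier

fit_lm  <- lm(y ~ x, data = d)   # ordinary least squares
fit_rlm <- rlm(y ~ x, data = d)  # robust fit (Huber M-estimation by default)

coef(fit_lm)["x"]   # slope pulled away from 2 by the outlier
coef(fit_rlm)["x"]  # slope stays much closer to 2
fit_rlm$w[30]       # final weight of the outlier, well below 1
```

The outlier still takes part in the robust fit, but with a small weight. In a plot, the same comparison should be available by passing method = "lm" and method = MASS::rlm to two geom_smooth layers (with se = FALSE to keep the example simple).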
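Generalized linear model (glm): an extension of ordinary least squares (OLS) regression. OLS assumes that the response is continuous and normally distributed, and that the expected value of the response is a linear function of the predictors. The generalized linear model relaxes both assumptions. First, the response may be count data (non-negative integers) or categorical data, with a distribution drawn from the exponential family. Second, it is a function of the expected response (the link function), rather than the expected response itself, that is linear in the predictors. Fitting a GLM therefore requires specifying both the distribution family and the link function [4].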
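As a sketch of specifying the distribution family and the link function, the example below fits a logistic regression, first directly with glm and then through geom_smooth. The use of mtcars with am as a binary response is an assumption for illustration, and method.args is assumed to be available (ggplot2 2.0.0 or later).

```r
library(ggplot2)

# am in mtcars is binary (0 = automatic, 1 = manual), so a binomial family fits.
# The family and its link are stated explicitly, as a GLM requires.
fit_logit <- glm(am ~ wt, data = mtcars, family = binomial(link = "logit"))
coef(fit_logit)

# The same model as a smoother: extra arguments for glm() go in method.args.
p_logit <- ggplot(mtcars, aes(x = wt, y = am)) +
  geom_point() +
  geom_smooth(
    method = "glm",
    formula = y ~ x,
    method.args = list(family = binomial(link = "logit"))
  )
```

Without method.args, glm falls back to the gaussian family with the identity link, which reproduces an ordinary linear fit.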