Chapter 8: Regression Analysis

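Symbols and functions used in R formulas
~	separates the response variable (left) from the predictor variables (right)
+	separates predictor variables
:	denotes an interaction between predictor variables (used mostly in one-way and two-way analyses)
*	shorthand for all possible interactions, e.g. y ~ x * z expands to y ~ x + z + x:z
^	denotes interactions up to a given degree, e.g. (x + z + w)^2 expands to x + z + w + x:z + x:w + z:w
.	stands for all variables in the data frame other than the dependent variable
-	removes a variable from the equation
-1	suppresses the intercept
I()	interprets the elements inside the parentheses arithmetically (in effect defining a new variable)
function	mathematical functions can be used in formulas, e.g. log(y) ~ x
summary()	displays detailed results for the fitted model
coefficients()	lists the model parameters (intercept and slopes) of the fitted model
confint()	provides confidence intervals for the model parameters (95% by default)
fitted()	lists the predicted values of the fitted model
residuals()	lists the residuals of the fitted model
anova()	generates an ANOVA table for a fitted model
vcov()	lists the covariance matrix of the model parameters
AIC()	prints Akaike's Information Criterion (a measure of how well a statistical model fits)
plot()	generates diagnostic plots for evaluating the fitted model
predict()	uses the fitted model to predict response values for a new dataset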
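
To make the formula symbols above concrete, here is a minimal sketch using the built-in mtcars data. The choice of variables (mpg, hp, wt, qsec) and the object names are my own assumptions for illustration and do not come from the book.

```r
# Illustrative only: mtcars and the chosen variables are assumptions for this sketch
data(mtcars)

# '*' is shorthand: hp * wt expands to hp + wt + hp:wt
f_star  <- lm(mpg ~ hp * wt, data = mtcars)
f_colon <- lm(mpg ~ hp + wt + hp:wt, data = mtcars)
same_fit <- isTRUE(all.equal(coef(f_star), coef(f_colon)))

# '^' limits the interaction degree: main effects plus all two-way interactions
second_order <- attr(terms(mpg ~ (hp + wt + qsec)^2), "term.labels")
print(second_order)

# '.' means every column except the response; '-' then removes one of them
small <- mtcars[, c("mpg", "hp", "wt", "qsec")]
f_dot <- lm(mpg ~ . - qsec, data = small)
print(names(coef(f_dot)))

# '-1' drops the intercept
f_noint <- lm(mpg ~ hp - 1, data = mtcars)
print(names(coef(f_noint)))

# I() makes '^' mean arithmetic squaring instead of a formula operator
f_quad <- lm(mpg ~ hp + I(hp^2), data = mtcars)
print(names(coef(f_quad)))

# A few of the extractor functions from the table, applied to one fit
print(confint(f_quad))        # 95% confidence intervals by default
print(head(residuals(f_quad)))
print(AIC(f_star, f_quad))    # lower AIC suggests the better trade-off
```

Looking at names(coef(...)) is a quick way to check how R actually expanded a formula before reading the full summary() output.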

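As my studies went on, I came to realize how important it is to learn to consult the official documentation. Reference books are not one hundred percent correct, and as the years pass packages are updated and keep changing, so a function may no longer behave the way the book describes.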

Note: all of the data below come from the datasets introduced in the book.

# Fitting regression models with lm()
# Simple linear regression
data(women)
fit <- lm(weight ~ height,
          data = women)
summary(fit)
# Multiple R-squared: the value of R^2 (share of the variance in weight explained by the model)
women$weight

fitted(fit)

residuals(fit)

plot(women$height, women$weight,
     xlab = 'Height (in inches)',
     ylab = 'Weight (in pounds)')
abline(fit)

# Polynomial regression: add a quadratic term to improve the predictive accuracy of the regression
fit2 <- lm(weight ~ height + I(height ^ 2), data = women)
summary(fit2)
plot(women$height, women$weight,
     xlab = 'Height (in inches)',
     ylab = 'Weight (in pounds)')
lines(women$height, fitted(fit2))

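# NOTE: spread and lty.smooth are written as in the book; newer versions of car
# may expect these smoother options inside smooth = list(...) (see ?scatterplot)
library(car)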
scatterplot(
  weight ~ height,
  data = women,
  spread = FALSE,
  lty.smooth = 2,
  pch = 19,
  main = 'Women Age 30-39',
  xlab = 'Height (in inches)',
  ylab = 'Weight (lbs.)'
)
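# lm() needs a data frame, but state.x77 is a matrix, so convert it first
states <-
  as.data.frame(state.x77[, c('Murder', 'Population',
                              'Illiteracy', 'Income', 'Frost')])
# In multiple regression, a good first step is to check the correlations among the variables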
cor(states)
library(car)
scatterplotMatrix(states,
                  spread = FALSE,
                  lty.smooth = 2,
                  main = 'Scatter Plot Matrix')
# Fit the multiple regression model with lm()
fit <- lm(Murder ~ Population + Illiteracy + Income + Frost,
          data = states)
summary(fit)
# Multiple linear regression with a significant interaction term
fit <- lm(mpg ~ hp + wt + hp:wt, data = mtcars)
summary(fit)
library(effects)
plot(effect('hp:wt', fit,
            xlevels = list(wt = c(2.2, 3.2, 4.2))),
     multiline = TRUE)
# Regression diagnostics are skipped here; I don't expect to need them in the near term

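In the book, the code is plot(effect('hp:wt', fit, list(wt = c(2.2, 3.2, 4.2))), multiline = TRUE). In the call plot(effect(term, mod, xlevels)), the argument name xlevels = has to be written out explicitly; otherwise the call fails with: Error in vcov.(mod, complete = FALSE) : could not find function "vcov.". Presumably the unnamed third argument is no longer matched to xlevels in the current version of the effects package. See the official documentation for details.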

An excerpt from the documentation on xlevels: More generally, xlevels can be a named list of values at which to set each numeric predictor. For example, xlevels=list(x1=c(2, 4, 7), x2=5) would use the values 2, 4 and 7 for the levels of x1, use 5 equally spaced levels for the levels of x2, and use the default for any other numeric predictors.
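
To see what fixing wt at 2.2, 3.2 and 4.2 means in practice, the sketch below rebuilds a similar picture with base R only: it calls predict() on a grid of hp values at each of the three wt levels. This is a swapped-in approximation for illustration, not a description of how the effects package works internally. The grid size of 50 points and the names wt_levels, grid and mpg_hat are my own assumptions.

```r
# Base-R sketch of "prediction lines at chosen wt levels" (not the effects package itself)
data(mtcars)
fit <- lm(mpg ~ hp + wt + hp:wt, data = mtcars)

# The three wt levels used in the xlevels argument above
wt_levels <- c(2.2, 3.2, 4.2)

# hp varies over its observed range; 50 grid points is an arbitrary choice
grid <- expand.grid(
  hp = seq(min(mtcars$hp), max(mtcars$hp), length.out = 50),
  wt = wt_levels
)
grid$mpg_hat <- predict(fit, newdata = grid)

# Draw one predicted line per wt level on a shared set of axes
plot(range(grid$hp), range(grid$mpg_hat), type = "n",
     xlab = "hp", ylab = "Predicted mpg")
for (i in seq_along(wt_levels)) {
  sub <- grid[grid$wt == wt_levels[i], ]
  lines(sub$hp, sub$mpg_hat, lty = i)
}
legend("topright", legend = paste("wt =", wt_levels),
       lty = seq_along(wt_levels))
```

If the lines are not parallel, the relationship between hp and mpg changes with wt, which is what the hp:wt interaction term expresses.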

Finally, here is the resulting plot. Once I have gotten better at this, I will come back and laugh at myself.

posted on 2020-02-02 00:29 Canvas2018