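Using a simple example of a logistic regression model fitted to the mtcars
dataset, and the algebra described in the linked question, I can produce a heat map with the decision boundary using: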
library(ggplot2)
library(tidyverse)
data("mtcars")
m1 = glm(am ~ hp + wt, data = mtcars, family = binomial)
# Generate combinations of hp and wt across their observed range. Only
# generating 50 values of each here, which is not a lot but since each
# combination is included, you get 50 x 50 rows
pred_df = expand.grid(
  hp = seq(min(mtcars$hp), max(mtcars$hp), length.out = 50),
  wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 50)
)
pred_df$pred_p = predict(m1, pred_df, type = "response")
# For a given value of hp (predictor1), find the value of
# wt (predictor2) that will give predicted p = 0.5
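find_boundary = function(hp_val, coefs) {
  # [[ ]] drops the coefficient names so the result is a plain numeric vector
  beta_0 = coefs[['(Intercept)']]
  beta_1 = coefs[['hp']]
  beta_2 = coefs[['wt']]
  # Solve beta_0 + beta_1 * hp + beta_2 * wt = 0 for wt and return it visibly
  (-beta_0 - beta_1 * hp_val) / beta_2
}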
# Find the boundary value of wt for each of the 50 values of hp
# Using the algebra in the linked question you can instead find
# the slope and intercept of the boundary, so you could potentially
# skip this step
boundary_df = pred_df %>%
  select(hp) %>%
  distinct() %>%
  mutate(wt = find_boundary(hp, coef(m1)))
ggplot(pred_df, aes(x = hp, y = wt)) +
  geom_tile(aes(fill = pred_p)) +
  geom_line(data = boundary_df)
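This produces a heat map of the predicted probabilities over hp and wt, with the decision boundary drawn as a line.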
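Note that this only takes the model's fixed effects into account, so if you want to account for random effects in some way, this could get more complicated.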
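As the comment in the code mentions, the boundary is a straight line in hp, so its slope and intercept can be read off the coefficients instead of computing wt for every hp value. Here is a minimal sketch, assuming the same model as above. The names `boundary_intercept` and `boundary_slope` are mine, and I plot the raw points rather than the heat map to keep it short:

```r
library(ggplot2)

# Same model as in the answer above
m1 <- glm(am ~ hp + wt, data = mtcars, family = binomial)
b <- coef(m1)

# Setting b0 + b1 * hp + b2 * wt = 0 and solving for wt gives
#   wt = (-b0 / b2) + (-b1 / b2) * hp
boundary_intercept <- unname(-b["(Intercept)"] / b["wt"])
boundary_slope <- unname(-b["hp"] / b["wt"])

# Draw the boundary directly from its intercept and slope
p <- ggplot(mtcars, aes(x = hp, y = wt)) +
  geom_point(aes(colour = factor(am))) +
  geom_abline(intercept = boundary_intercept, slope = boundary_slope)
p
```

Unlike `geom_line()` on a precomputed data frame, `geom_abline()` extends across the whole panel, which may or may not be what you want.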
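Interesting question! People will be able to help more easily if you provide example data, e.g. a similar logistic regression model fitted to one of the example datasets in R, along with the code for your plot. I think you can find the boundary line by finding the points where the fitted linear predictor (on the log-odds scale) is 0, so you could use some pretty basic algebra to work out the equation for the value of 'predictor2' that satisfies this for a given value of 'predictor1'. – Marius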
In fact, I just found that someone has written out the algebra here: https://stats.stackexchange.com/a/159977/5443 – Marius
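To sanity-check the idea that the boundary sits where the linear predictor is 0, one can plug a few boundary points back into the model and confirm that the predicted probability comes out at 0.5. This is only a rough sketch using the mtcars model; `hp_vals`, `wt_boundary` and `check_df` are names I made up for the illustration:

```r
m1 <- glm(am ~ hp + wt, data = mtcars, family = binomial)
b <- coef(m1)

# A handful of hp values across the observed range
hp_vals <- seq(min(mtcars$hp), max(mtcars$hp), length.out = 5)

# wt value at which b0 + b1 * hp + b2 * wt = 0
wt_boundary <- (-b[["(Intercept)"]] - b[["hp"]] * hp_vals) / b[["wt"]]
check_df <- data.frame(hp = hp_vals, wt = wt_boundary)

# Linear predictor (log-odds) and predicted probability at those points
check_df$eta <- predict(m1, check_df, type = "link")
check_df$p <- predict(m1, check_df, type = "response")
print(check_df)
```

Up to floating-point error, `eta` should be 0 and `p` should be 0.5 in every row, since the inverse logit of 0 is 0.5.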