R - An error while looping a linear regression
I want to run a loop that fits a linear regression for each category of one of the variables, produces the predictions from each regression, and then subtracts the sum of the target variable from the sum of the predictions. Here is my toy data and code:
df <- read.table(text = "target birds wolfs snakes
3 9 7 a
3 8 4 b
1 2 8 c
1 2 3 a
1 8 3 a
6 1 2 a
6 7 1 b
6 1 5 c
5 9 7 c
3 8 7 c
4 2 7 b
1 2 3 b
7 6 3 c
6 1 1 a
6 3 9 a
6 1 1 b", header = TRUE)
I wrote the code below, which is meant to produce the calculation described above, but it fails with an error.
Here is the code:
b <- list()
for (i in c("a", "b", "c")) {
  lmmodel <- lm(target ~ birds + wolfs, data = subset(df, snakes == i))
  b[i] <- sum(predict(lmmodel, newdata = subset(df, snakes == i))) - sum(df$target[which(df$snakes == 'a'), ])
}
b <- as.numeric(b)
b
I got this error:
Error in df$target[which(df$snakes == "a"), ] : incorrect number of dimensions
How can I solve this issue?
The problem arises from a mixture of subsetting types here: df$target[which(df$snakes=='a'),]
Once you use $, the output is no longer a data.frame but a plain vector, so the two-argument form of [ (row index, comma, column index) is no longer valid. You are better off compacting it to:
sum(df[df$snakes=="a","target"])
[1] 23
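To make the point about $ concrete, here is a minimal sketch that rebuilds only the target and snakes columns of the toy data (written with data.frame() rather than read.table(), which is my choice for compactness) and shows three equivalent ways to get the group total. The names v1, v2 and v3 are just illustrative.

```r
# Only the two columns needed for this illustration
df <- data.frame(
  target = c(3, 3, 1, 1, 1, 6, 6, 6, 5, 3, 4, 1, 7, 6, 6, 6),
  snakes = c("a", "b", "c", "a", "a", "a", "b", "c",
             "c", "c", "b", "b", "c", "a", "a", "b"),
  stringsAsFactors = FALSE
)

# df$target is a plain vector, not a data.frame, so it takes one index only
is_df <- is.data.frame(df$target)

# Three equivalent ways to sum target within group "a"
v1 <- sum(df$target[df$snakes == "a"])        # vector with a single index
v2 <- sum(df[df$snakes == "a", "target"])     # data.frame with row and column index
v3 <- sum(subset(df, snakes == "a")$target)   # subset first, then extract the column

print(c(v1, v2, v3))
```

Any of the three forms avoids the "incorrect number of dimensions" error; which one to use is mostly a matter of taste.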
As for the model, you can fit a single one with snakes as a covariate, and then sum the predictions within the snakes groups:
lm(target~birds+wolfs+snakes+0,df)

Call:
lm(formula = target ~ birds + wolfs + snakes + 0, data = df)

Coefficients:
   birds     wolfs   snakesa   snakesb   snakesc
-0.08593  -0.23461   5.15458   5.09446   6.25448

tapply(predict(lm(target~birds+wolfs+snakes+0,df)),df$snakes,sum)
 a  b  c
23 20 22
And for the final output of your b variable:
tapply(predict(lm(target~birds+wolfs+snakes+0,df)),df$snakes,sum) - sum(df[df$snakes=="a","target"])
            a             b             c
 1.776357e-14 -3.000000e+00 -1.000000e+00
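But note that there is a small numerical discrepancy in the a value: it comes out as about 1.8e-14 rather than exactly 0, because of floating-point rounding.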
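If that tiny non-zero value is a nuisance, one option is to compare with a tolerance or round the result. The sketch below assumes the toy data from the question, rebuilt here with data.frame() so the block runs on its own; the names fit, diffs and a_is_zero are illustrative.

```r
# Toy data from the question, rebuilt so this block is self-contained
df <- data.frame(
  target = c(3, 3, 1, 1, 1, 6, 6, 6, 5, 3, 4, 1, 7, 6, 6, 6),
  birds  = c(9, 8, 2, 2, 8, 1, 7, 1, 9, 8, 2, 2, 6, 1, 3, 1),
  wolfs  = c(7, 4, 8, 3, 3, 2, 1, 5, 7, 7, 7, 3, 3, 1, 9, 1),
  snakes = c("a", "b", "c", "a", "a", "a", "b", "c",
             "c", "c", "b", "b", "c", "a", "a", "b"),
  stringsAsFactors = FALSE
)

# One model with snakes as a covariate and no global intercept
fit <- lm(target ~ birds + wolfs + snakes + 0, data = df)

# Group-wise prediction totals minus the target total of group "a"
diffs <- tapply(predict(fit), df$snakes, sum) - sum(df$target[df$snakes == "a"])

# Treat values within a small tolerance of zero as zero
a_is_zero <- abs(as.numeric(diffs["a"])) < 1e-8

# Or simply round away the floating-point noise before printing
print(round(diffs, 10))
```

The tolerance 1e-8 and the 10 decimal places are arbitrary choices for this example; anything well above machine precision and well below the scale of the data would do.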
Alternatively, and as a check, you can specify subsets of the data via the subset argument of lm:
sum(predict(lm(target~birds+wolfs,data=df,subset=snakes=="a")))
[1] 23
sum(predict(lm(target~birds+wolfs,data=df,subset=snakes=="b")))
[1] 20
sum(predict(lm(target~birds+wolfs,data=df,subset=snakes=="c")))
[1] 22
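For completeness, the per-group fits can also be wrapped back into a loop, which is closer to the original attempt. This is only a sketch: it assumes the subtraction of the group a total is intended exactly as written in the question, rebuilds the toy data with data.frame(), and uses sapply() instead of for, which is a stylistic choice of mine. The names groups and a_total are illustrative.

```r
# Toy data from the question, rebuilt so this block is self-contained
df <- data.frame(
  target = c(3, 3, 1, 1, 1, 6, 6, 6, 5, 3, 4, 1, 7, 6, 6, 6),
  birds  = c(9, 8, 2, 2, 8, 1, 7, 1, 9, 8, 2, 2, 6, 1, 3, 1),
  wolfs  = c(7, 4, 8, 3, 3, 2, 1, 5, 7, 7, 7, 3, 3, 1, 9, 1),
  snakes = c("a", "b", "c", "a", "a", "a", "b", "c",
             "c", "c", "b", "b", "c", "a", "a", "b"),
  stringsAsFactors = FALSE
)

groups  <- sort(unique(df$snakes))
a_total <- sum(df$target[df$snakes == "a"])   # single index on a vector

# Fit one regression per group and subtract the group "a" target total,
# mirroring the expression in the question's loop
b <- sapply(groups, function(g) {
  d   <- df[df$snakes == g, ]
  fit <- lm(target ~ birds + wolfs, data = d)
  sum(predict(fit)) - a_total
})

print(round(b, 10))
```

Because each per-group model includes an intercept, its fitted values sum to the group's target total, so the result agrees with the single-model approach up to floating-point noise.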