R - An error while looping a linear regression


I would like to run a loop that fits a regression for each category of one of the variables and produces a prediction for each regression, so that the sum of the target variable can be deducted from the sum of the predictions. Here is my toy data and code:

    df <- read.table(text = "target birds wolfs snakes
                          3        9         7 a
                          3        8         4 b
                          1        2         8 c
                          1        2         3 a
                          1        8         3 a
                          6        1         2 a
                          6        7         1 b
                          6        1         5 c
                          5        9         7 c
                          3        8         7 c
                          4        2         7 b
                          1        2         3 b
                          7        6         3 c
                          6        1         1 a
                          6        3         9 a
                          6        1         1 b ", header = TRUE)

I wrote the code below to carry out the calculation described above, but I got an error while running it.

Here is the code:

    b <- list()
    for(i in c("a","b",'c')){
      lmmodel <- lm(target ~ birds+wolfs, data = subset(df, snakes == i) )
      b[i] <- sum(predict(lmmodel,newdata=subset(df, snakes == i))) - sum(df$target[which(df$snakes=='a'),])
    }
    b <- as.numeric(b)
    b

I got this error:

    Error in df$target[which(df$snakes == "a"), ] :
      incorrect number of dimensions

How can I solve this issue?

The problem arises from a mixture of subsetting types here: df$target[which(df$snakes=='a'),]

Once you use $, the output is no longer a data.frame but a plain vector, so the two-parameter [ subsetting (row, column) is no longer valid. You are better off compacting it to:

    sum(df[df$snakes=="a","target"])
    [1] 23
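To make the point about mixed subsetting concrete, here is a minimal sketch. It rebuilds the toy data with data.frame(), with the "a" labels placed on the assumption that the unlabeled rows in the listing belong to category a (that assignment reproduces the sum of 23 shown above). The names err, v1, v2 and sum_a are illustrative only.

```r
# Toy data from the question; rows with a missing label are assumed to be "a"
df <- data.frame(
  target = c(3, 3, 1, 1, 1, 6, 6, 6, 5, 3, 4, 1, 7, 6, 6, 6),
  birds  = c(9, 8, 2, 2, 8, 1, 7, 1, 9, 8, 2, 2, 6, 1, 3, 1),
  wolfs  = c(7, 4, 8, 3, 3, 2, 1, 5, 7, 7, 7, 3, 3, 1, 9, 1),
  snakes = c("a", "b", "c", "a", "a", "a", "b", "c",
             "c", "c", "b", "b", "c", "a", "a", "b")
)

# df$target is a plain vector: it has no dim attribute, so it takes one index
has_no_dim <- is.null(dim(df$target))

# Reproduce the failing expression and capture its error message
err <- tryCatch(
  df$target[which(df$snakes == "a"), ],
  error = function(e) conditionMessage(e)
)

# Two valid alternatives: one index on the vector, or two on the data frame
v1 <- df$target[df$snakes == "a"]
v2 <- df[df$snakes == "a", "target"]

sum_a <- sum(v1)  # 23, the same value as sum(v2)
```

Either form works; what matters is using one index on a vector and two on a data frame, not a mixture of both.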

As for the model, you can fit a single one with snakes as a covariate, and then sum its predictions within the snakes groups:

    lm(target~birds+wolfs+snakes+0,df)

    Call:
    lm(formula = target ~ birds + wolfs + snakes + 0, data = df)

    Coefficients:
       birds     wolfs   snakesa   snakesb   snakesc
    -0.08593  -0.23461   5.15458   5.09446   6.25448

    tapply(predict(lm(target~birds+wolfs+snakes+0,df)),df$snakes,sum)
     a  b  c
    23 20 22

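And for the final output of your b variable:

    tapply(predict(lm(target~birds+wolfs+snakes+0,df)),df$snakes,sum) - sum(df[df$snakes=="a","target"])
                a             b             c
     1.776357e-14 -3.000000e+00 -1.000000e+00

But note there is a small numerical discrepancy in the value for a: it comes out as 1.776357e-14 rather than exactly 0, which is floating-point rounding error.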
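If the tiny nonzero value is a nuisance, one option is to clean it up before reporting or comparing. The sketch below is only a suggestion, using zapsmall() and all.equal() from base R. It rebuilds the toy data so that it runs on its own, with the unlabeled rows assumed to be category a. The names fit, b_raw and b_clean are illustrative, and the exact size of the rounding error may differ between platforms.

```r
# Toy data from the question; rows with a missing label are assumed to be "a"
df <- data.frame(
  target = c(3, 3, 1, 1, 1, 6, 6, 6, 5, 3, 4, 1, 7, 6, 6, 6),
  birds  = c(9, 8, 2, 2, 8, 1, 7, 1, 9, 8, 2, 2, 6, 1, 3, 1),
  wolfs  = c(7, 4, 8, 3, 3, 2, 1, 5, 7, 7, 7, 3, 3, 1, 9, 1),
  snakes = c("a", "b", "c", "a", "a", "a", "b", "c",
             "c", "c", "b", "b", "c", "a", "a", "b")
)

# One model with a separate intercept per snakes group
fit <- lm(target ~ birds + wolfs + snakes + 0, data = df)

# Group sums of the predictions minus the target sum for group "a"
b_raw <- tapply(predict(fit), df$snakes, sum) - sum(df$target[df$snakes == "a"])

# zapsmall() rounds away values that are negligible next to the largest one
b_clean <- zapsmall(b_raw)

# all.equal() compares with a tolerance, unlike ==
matches <- isTRUE(all.equal(as.numeric(b_raw), c(0, -3, -1)))
```

Comparing with all.equal() instead of == avoids a spurious mismatch caused by the rounding error.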

Alternatively, and as a check, you can specify subsets of the data via the subset argument of lm:

    sum(predict(lm(target~birds+wolfs,data=df,subset=snakes=="a")))
    [1] 23
    sum(predict(lm(target~birds+wolfs,data=df,subset=snakes=="b")))
    [1] 20
    sum(predict(lm(target~birds+wolfs,data=df,subset=snakes=="c")))
    [1] 22
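Putting the pieces together, the loop from the question might be repaired along these lines. This is a sketch and not the only possible fix. It keeps the per-group regressions and, like the original code, always subtracts the target sum for category a. The toy data is rebuilt with the unlabeled rows assumed to be category a, and the names groups, target_a, rows and fit are illustrative.

```r
# Toy data from the question; rows with a missing label are assumed to be "a"
df <- data.frame(
  target = c(3, 3, 1, 1, 1, 6, 6, 6, 5, 3, 4, 1, 7, 6, 6, 6),
  birds  = c(9, 8, 2, 2, 8, 1, 7, 1, 9, 8, 2, 2, 6, 1, 3, 1),
  wolfs  = c(7, 4, 8, 3, 3, 2, 1, 5, 7, 7, 7, 3, 3, 1, 9, 1),
  snakes = c("a", "b", "c", "a", "a", "a", "b", "c",
             "c", "c", "b", "b", "c", "a", "a", "b")
)

groups <- c("a", "b", "c")

# Single-index subsetting on the vector: this is the line that fixes the error
target_a <- sum(df$target[df$snakes == "a"])

b <- numeric(0)
for (g in groups) {
  rows <- df[df$snakes == g, ]                     # data for this category
  fit  <- lm(target ~ birds + wolfs, data = rows)  # separate regression per group
  # predict() without newdata returns the fitted values for the rows used
  b[g] <- sum(predict(fit)) - target_a
}
b
```

Up to floating-point rounding, the result agrees with the values from the single-model approach: about 0 for a, -3 for b and -1 for c. Storing the results in a named numeric vector also makes the as.numeric() conversion from the question unnecessary.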
