R - Iterated plotting from a list of lists of data frames
I'm trying to explore a large dataset, both with data frames and with charts. I'd like to analyze the distribution of each variable across different metrics (e.g., sum(x), sum(x*y)) and for different sub-populations. I have 4 sub-populations, 2 metrics, and many variables.
In order to accomplish that, I've made a list structure such as this:

    $variable1
    ...$metric1   <--- that's a df
    ...$metric2
    $variable2
    ...$metric1
    ...$metric2

Inside one of the data frames (e.g., list$variable1$metric1), I've calculated the distributions of the unique values of variable1 for each of the 4 population groups (represented in columns). It looks like this:

    $variable1$metric1
            unique_values med_all med_some_not_all med_at_least_some med_none
    1 (1) 12-17 years old      NA               NA                NA       NA
    2 (2) 18-25 years old   0.278            0.317             0.278    0.317
    3 (3) 26-34 years old   0.225            0.228             0.225    0.228
    4     (4) 35 or older   0.497            0.456             0.497    0.456

    $variable1$metric2
            unique_values med_all med_some_not_all med_at_least_some med_none
    1 (1) 12-17 years old      NA               NA                NA       NA
    2 (2) 18-25 years old   0.544            0.406             0.544    0.406
    3 (3) 26-34 years old   0.197            0.310             0.197    0.310
    4     (4) 35 or older   0.259            0.284             0.259    0.284

What I'm trying to figure out is a way to loop through the list of lists (probably melting the dfs in the process) and output a ton of bar charts. In this case, the natural plot format would be, for each data frame, a stacked bar chart with one stacked bar for each sub-population, grouped by the variable's unique values.
But I'm not familiar with iterated plotting and I've hit a dead end. How might I plot from this list structure? Alternately, is there a better structure in which I should be storing this information?
Here's a start:

    lst <- list(alpha = list(a = data.frame(matrix(1:4, 2)),
                             b = data.frame(matrix(6:11, 2))),
                beta  = list(c = data.frame(matrix(11:14, 2))))
    lst
    $alpha
    $alpha$a
      X1 X2
    1  1  3
    2  2  4

    $alpha$b
      X1 X2 X3
    1  6  8 10
    2  7  9 11


    $beta
    $beta$c
      X1 X2
    1 11 13
    2 12 14

    # We can subset by number or by name
    lst[['alpha']]
    $a
      X1 X2
    1  1  3
    2  2  4

    $b
      X1 X2 X3
    1  6  8 10
    2  7  9 11

    lst[[1]]
    $a
      X1 X2
    1  1  3
    2  2  4

    $b
      X1 X2 X3
    1  6  8 10
    2  7  9 11

    # The dollar sign naming convention reminds you that you are looking at a list.
    # Let's sum the columns of both data frames in the alpha list
    lapply(lst[['alpha']], colSums)
    $a
    X1 X2
     3  7

    $b
    X1 X2 X3
    13 17 21

Let's try to find the sum of each column of each data frame:

    lapply(lst, colSums)
    Error in FUN(X[[i]], ...) : 'x' must be an array of at least two dimensions

What happened? R is correctly refusing to run an array function on a list. The function colSums needs to be fed data frames, matrices, or other arrays with more than one dimension, but each element of lst is itself a list of data frames. We have to nest one lapply call inside another. The logic can get complicated:

    lapply(lst, function(x) lapply(x, colSums))
    $alpha
    $alpha$a
    X1 X2
     3  7

    $alpha$b
    X1 X2 X3
    13 17 21


    $beta
    $beta$c
    X1 X2
    23 27

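One way to avoid the nested call is to flatten the outer list by one level first. The sketch below is only an alternative, not part of the original answer; the names flat and sums are made up for illustration. It assumes every leaf of the structure is a data frame.

```r
# Same example list as above.
lst <- list(
  alpha = list(a = data.frame(matrix(1:4, 2)),
               b = data.frame(matrix(6:11, 2))),
  beta  = list(c = data.frame(matrix(11:14, 2)))
)

# Flatten one level only: the result is a plain list of data frames
# named "alpha.a", "alpha.b" and "beta.c".
flat <- unlist(lst, recursive = FALSE)

# A single lapply now reaches every data frame.
sums <- lapply(flat, colSums)
sums
```

The combined names keep track of where each data frame came from, which can be handy later for plot titles.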
We can use rbind to put data frames together:

    rbind(lst$alpha$a, lst$beta$c)
      X1 X2
    1  1  3
    2  2  4
    3 11 13
    4 12 14

Be sure not to do it the way you might be thinking (I've done it many times):

    do.call(rbind, lst)
          a      b
    alpha List,2 List,3
    beta  List,2 List,2

That isn't the result you're looking for. Also make sure the dimensions and column names are the same:

    do.call(rbind, lst[[1]])
    Error in rbind(deparse.level, ...) : numbers of columns of arguments do not match

R is refusing to combine data frames that have 2 columns in one (alpha$a) and 3 columns in the other (alpha$b).
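A quick check before binding can catch this mismatch early. This is a minimal sketch, not part of the original answer; cols and same_cols are illustrative names.

```r
# Same example list as above.
lst <- list(
  alpha = list(a = data.frame(matrix(1:4, 2)),
               b = data.frame(matrix(6:11, 2))),
  beta  = list(c = data.frame(matrix(11:14, 2)))
)

# Collect the column names of every data frame in the alpha list.
cols <- lapply(lst$alpha, names)

# TRUE only if every data frame has exactly the same column names
# as the first one; here alpha$b has an extra X3 column.
same_cols <- all(vapply(cols, identical, logical(1), cols[[1]]))
same_cols  # FALSE
```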
I changed lst (now lst2) so that alpha$b has 2 columns like the others, and combined them:

    bind1 <- lapply(lst2, function(x) do.call(rbind, x))
    bind1
    $alpha
        X1 X2
    a.1  1  3
    a.2  2  4
    b.1  6  9
    b.2  7 10
    b.3  8 11

    $beta
        X1 X2
    c.1 11 13
    c.2 12 14

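The definition of lst2 is not shown above. Judging from the printed output, alpha$b was probably rebuilt as a 3-row, 2-column matrix; the reconstruction below is an assumption made so the example can be run, not the original code.

```r
# Assumed definition of lst2: identical to lst except that b is built
# with matrix(6:11, 3), giving 3 rows and 2 columns.
lst2 <- list(
  alpha = list(a = data.frame(matrix(1:4, 2)),
               b = data.frame(matrix(6:11, 3))),
  beta  = list(c = data.frame(matrix(11:14, 2)))
)

# Bind the data frames inside each inner list, row-wise.
bind1 <- lapply(lst2, function(x) do.call(rbind, x))
bind1
```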
That combines the elements of each inner list. We can then combine the outer list to make one big data frame:

    do.call(rbind, bind1)
              X1 X2
    alpha.a.1  1  3
    alpha.a.2  2  4
    alpha.b.1  6  9
    alpha.b.2  7 10
    alpha.b.3  8 11
    beta.c.1  11 13
    beta.c.2  12 14

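To connect this back to the plotting question, here is a rough sketch of looping over a list of lists and drawing one stacked bar chart per data frame with base R's barplot. It is one possible approach, not the original answer's. The dists list reuses the values printed in the question; the function name plot_nested and the output file name are hypothetical. It assumes the first column of each data frame holds the unique values and the remaining columns hold the sub-populations.

```r
# Example data shaped like the question's structure (values copied from it).
age <- c("(1) 12-17 years old", "(2) 18-25 years old",
         "(3) 26-34 years old", "(4) 35 or older")
dists <- list(
  variable1 = list(
    metric1 = data.frame(unique_values     = age,
                         med_all           = c(NA, 0.278, 0.225, 0.497),
                         med_some_not_all  = c(NA, 0.317, 0.228, 0.456),
                         med_at_least_some = c(NA, 0.278, 0.225, 0.497),
                         med_none          = c(NA, 0.317, 0.228, 0.456)),
    metric2 = data.frame(unique_values     = age,
                         med_all           = c(NA, 0.544, 0.197, 0.259),
                         med_some_not_all  = c(NA, 0.406, 0.310, 0.284),
                         med_at_least_some = c(NA, 0.544, 0.197, 0.259),
                         med_none          = c(NA, 0.406, 0.310, 0.284))
  )
)

# Hypothetical helper: one page per data frame, written to a single PDF.
plot_nested <- function(x, file) {
  pdf(file)
  on.exit(dev.off())
  titles <- character(0)
  for (v in names(x)) {            # outer list: variables
    for (m in names(x[[v]])) {     # inner list: metrics
      df <- x[[v]][[m]]
      df <- df[complete.cases(df), ]   # drop rows that are entirely NA
      # Rows become stacked segments, columns become bars.
      heights <- as.matrix(df[-1])
      title <- paste(v, m, sep = " / ")
      barplot(heights,
              legend.text = as.character(df$unique_values),
              main = title)
      titles <- c(titles, title)
    }
  }
  invisible(titles)
}

out <- file.path(tempdir(), "distributions.pdf")
made <- plot_nested(dists, out)
```

Each page then shows one stacked bar per sub-population, split by the variable's unique values. Keeping the nested list works fine for this kind of loop; a single long data frame with variable and metric columns would be the usual alternative if a plotting package with faceting is preferred.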