R - Data Preparation for Meta-analysis Using metafor

February 15, 2015

In order to calculate effect sizes and run a meta-analysis of a dichotomous predictor of a continuous outcome (d or g), a data frame consisting of the means, SDs, and sample sizes for each study is required.
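For reference, this is why those three summaries per group are needed. The usual standardized mean difference and its small-sample correction can be written as below; the subscripts 1 and 2 are my own notation for the two groups (here male and female), and the correction factor shown is the common approximation rather than the exact form.

```latex
\[
d = \frac{\bar{x}_1 - \bar{x}_2}{s_p}, \qquad
s_p = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}}, \qquad
g \approx \left(1 - \frac{3}{4(n_1 + n_2) - 9}\right) d
\]
```

Note that a group with a single observation has no defined SD, so such a study cannot contribute a d or g value.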

I am trying to write code to create the required data frame from raw data. This would mean the process does not have to be completed manually for each study.

Example raw dataset:

    study <- c("andrew", "andrew", "andrew", "andrew", "peters", "peters", "peters", "jess", "jess", "jess")
    score = c(100, 308, 584, 241, 241, 111, 431, 123, 321, 411)
    sex = c(1, 1, 1, 2, 2, 1, 2, 2, 1, 1)
    data = cbind(score, sex, study)
    data
    #       score sex study
    #  [1,] "100" "1" "andrew"
    #  [2,] "308" "1" "andrew"
    #  [3,] "584" "1" "andrew"
    #  [4,] "241" "2" "andrew"
    #  [5,] "241" "2" "peters"
    #  [6,] "111" "1" "peters"
    #  [7,] "431" "2" "peters"
    #  [8,] "123" "2" "jess"
    #  [9,] "321" "1" "jess"
    # [10,] "411" "1" "jess"

How can I turn this into the following file for metafor, dividing the data by sex and study?

    study    meanmale  meanfemale  sdmale  sdfemale  nrowsmale  nrowsfemale
    andrew   x         x           x       x         x          x
    peters   x         x           x       x         x          x
    jess     x         x           x       x         x          x

I imagine using describeBy, statsBy, or split with sapply would work, but getting the required format is messy. The next aim is to introduce a year column, e.g.,

    study <- c("andrew", "andrew", "andrew", "andrew", "peters", "peters", "peters", "jess", "jess", "jess")
    score = c(100, 308, 584, 241, 241, 111, 431, 123, 321, 411)
    sex = c(1, 1, 1, 2, 2, 1, 2, 2, 1, 1)
    year = c(1992, 1992, 1992, 1992, 1988, 1988, 1988, 1977, 1977, 1977)
    data = cbind(study, year, score, sex)

to produce the following data.frame:

    study    year  meanmale  meanfemale  sdmale  sdfemale  nrowsmale  nrowsfemale
    andrew   1992  x         x           x       x         x          x
    peters   1988  x         x           x       x         x          x
    jess     1977  x         x           x       x         x          x

We can use the devel version of data.table, i.e. v1.9.5. Instructions to install the devel version are here.

We convert the 'data.frame' to a 'data.table' (setDT(data)), get the mean, sd, and .N (nrows) grouped by 'sex' and 'study', and use dcast (from data.table, which can take multiple value.var columns) to reshape from 'long' to 'wide' format.

    library(data.table) # v1.9.5+
    # Summarise score by sex and study, then spread the summaries into wide format
    dcast(setDT(data)[, list(mean = mean(score), sd = sd(score), nrows = .N),
                      .(sex, study)],
          study ~ c('male', 'female')[sex],
          value.var = c('mean', 'sd', 'nrows'))
    #     study female_mean male_mean female_sd   male_sd female_nrows male_nrows
    # 1: andrew         241  330.6667        NA 242.79484            1          3
    # 2:   jess         123  366.0000        NA  63.63961            1          2
    # 3: peters         336  111.0000  134.3503        NA            2          1

Edit

From @Arun's comments, dcast from data.table accepts multiple functions as well.

    # Aggregate and reshape in one step by passing a list of functions
    dcast(setDT(data), study ~ c('male', 'female')[sex],
          fun.aggregate = list(mean, sd, length), value.var = "score")
    #     study female_mean_score male_mean_score female_sd_score male_sd_score
    # 1: andrew               241        330.6667              NA     242.79484
    # 2:   jess               123        366.0000              NA      63.63961
    # 3: peters               336        111.0000        134.3503            NA
    #    female_length_score male_length_score
    # 1:                   1                 3
    # 2:                   1                 2
    # 3:                   2                 1

Or we can use reshape from base R after getting the mean, sd, and nrow using aggregate.

    # aggregate() returns a matrix column; do.call(data.frame, ...) flattens it
    d1 <- do.call(data.frame,
                  aggregate(score ~ ., transform(data, sex = c('male', 'female')[sex]),
                            FUN = function(x) c(mean = mean(x), sd = sd(x), nrows = length(x))))
    reshape(d1, idvar = 'study', timevar = 'sex', direction = 'wide')
    #    study score.mean.female score.sd.female score.nrows.female score.mean.male
    # 1 andrew               241              NA                  1        330.6667
    # 3   jess               123              NA                  1        366.0000
    # 5 peters               336        134.3503                  2        111.0000
    #   score.sd.male score.nrows.male
    # 1     242.79484                3
    # 3      63.63961                2
    # 5            NA                1

Data

data <- data.frame(score, sex, study)
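The question's second aim, carrying a year column through to the wide table, is not covered above. A minimal base R sketch, assuming year is constant within each study as in the question's example, is to add year to the grouping formula and to the reshape id. The names data2, d2, and wide are my own and purely illustrative.

```r
# Raw data from the question, including the year column
study <- c("andrew", "andrew", "andrew", "andrew", "peters", "peters",
           "peters", "jess", "jess", "jess")
score <- c(100, 308, 584, 241, 241, 111, 431, 123, 321, 411)
sex   <- c(1, 1, 1, 2, 2, 1, 2, 2, 1, 1)
year  <- c(1992, 1992, 1992, 1992, 1988, 1988, 1988, 1977, 1977, 1977)
data2 <- data.frame(study, year, score, sex)

# Label the groups (1 = male, 2 = female, as in the code above)
data2$sex <- c("male", "female")[data2$sex]

# Mean, sd and n of score within each study/year/sex cell
d2 <- do.call(data.frame,
              aggregate(score ~ study + year + sex, data2,
                        FUN = function(x) c(mean = mean(x), sd = sd(x),
                                            nrows = length(x))))

# One row per study, with year kept as part of the row id
wide <- reshape(d2, idvar = c("study", "year"), timevar = "sex",
                direction = "wide")
wide
```

Groups with a single observation still get NA for the sd, exactly as in the outputs above, so those studies would need more data before an effect size can be computed.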
