Sequentially create a date for each record by ID in an R data.table
I have a data.table with an id and an origination date, where each unique id represents one row. I have to use the variable 'count' (the interval between orig_date and close_date in months) to sequentially replicate orig_date into a date field, as shown below. The code I tried takes only the first value of 'count' (3 in this case) and uses it to sequentially replicate orig_date for every id. I have a different count for each id. How do I use the corresponding count for each unique id to replicate orig_date into a column called date?
test.data
   id count score  value  orig_date close_date
10748     3   750 450231 2015-03-01 2015-06-01
10845     4   680 590231 2015-01-01 2015-05-01
21758     7   760 650839 2014-11-01 2015-06-01

test.panel <- test.data[rep(sequence(nrow(test.data)), count)]
test.panel$date <- ymd(test.panel$orig_date) + months(1:test.panel$count)

Given below is the structure of the data.table I am trying to create:
   id count score  value  orig_date       date
10748     3   750 450231 2015-03-01 2015-03-01
10748     3   750 450231 2015-03-01 2015-04-01
10748     3   750 450231 2015-03-01 2015-05-01
10748     3   750 450231 2015-03-01 2015-06-01
10845     4   680 590231 2015-01-01 2015-01-01
10845     4   680 590231 2015-01-01 2015-02-01
10845     4   680 590231 2015-01-01 2015-03-01
10845     4   680 590231 2015-01-01 2015-04-01
10845     4   680 590231 2015-01-01 2015-05-01
21758     7   760 650839 2014-11-01 2014-11-01
21758     7   760 650839 2014-11-01 2014-12-01
21758     7   760 650839 2014-11-01 2015-01-01
21758     7   760 650839 2014-11-01 2015-02-01
...
This is simple with data.table. Recreating your sample data:
test.data <- read.table(
  text = "
id count score value orig_date close_date
10748 3 750 450231 2015-03-01 2015-06-01
10845 4 680 590231 2015-01-01 2015-05-01
21758 7 760 650839 2014-11-01 2015-06-01",
  header = TRUE,
  stringsAsFactors = FALSE,
  colClasses = c("integer", "integer", "integer", "integer", "Date", "Date")
)
str(test.data)

Now doing what you want in data.table:
library(data.table)
test.data <- data.table(test.data)

# One row per month from orig_date to close_date, grouped by the columns to keep
test.data[ , list(close_date = seq(orig_date, close_date, by = "month")),
           by = c("id", "count", "score", "value", "orig_date")]

       id count score  value  orig_date close_date
 1: 10748     3   750 450231 2015-03-01 2015-03-01
 2: 10748     3   750 450231 2015-03-01 2015-04-01
 3: 10748     3   750 450231 2015-03-01 2015-05-01
 4: 10748     3   750 450231 2015-03-01 2015-06-01
 5: 10845     4   680 590231 2015-01-01 2015-01-01
 6: 10845     4   680 590231 2015-01-01 2015-02-01
 7: 10845     4   680 590231 2015-01-01 2015-03-01
 8: 10845     4   680 590231 2015-01-01 2015-04-01
 9: 10845     4   680 590231 2015-01-01 2015-05-01
10: 21758     7   760 650839 2014-11-01 2014-11-01
11: 21758     7   760 650839 2014-11-01 2014-12-01
12: 21758     7   760 650839 2014-11-01 2015-01-01
13: 21758     7   760 650839 2014-11-01 2015-02-01
14: 21758     7   760 650839 2014-11-01 2015-03-01
15: 21758     7   760 650839 2014-11-01 2015-04-01
16: 21758     7   760 650839 2014-11-01 2015-05-01
17: 21758     7   760 650839 2014-11-01 2015-06-01
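If you would rather drive the expansion from 'count' and name the new column date, as the question asks, something along these lines may work. This is only a sketch: it assumes count is the number of whole months between orig_date and close_date, so each id gets count + 1 rows (the starting month plus one per counted month), which matches the desired output in the question. The sample table is rebuilt with only the columns the sketch needs.

```r
library(data.table)

# Reduced copy of the sample data (score, value and close_date omitted)
test.data <- data.table(
  id        = c(10748L, 10845L, 21758L),
  count     = c(3L, 4L, 7L),
  orig_date = as.Date(c("2015-03-01", "2015-01-01", "2014-11-01"))
)

# Inside each group the grouping columns are length-1 values, so seq() uses
# that id's own orig_date and count rather than the first row's.
# Assumption: count + 1 rows per id, starting at orig_date.
test.panel <- test.data[
  ,
  list(date = seq(orig_date, by = "month", length.out = count + 1L)),
  by = c("id", "count", "orig_date")
]
```

With the three sample ids this gives 4 + 5 + 8 = 17 rows, the same number as the close_date version above. Add score and value to by if you want them carried along.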