loops - How to repeat this statement in R, probably using apply()
It might seem a silly question, but how do I repeat this line 152 times without using a loop, since the latter is not efficient for larger data sets?
reviews = as.vector(t(mydata)[,1])
mydata is a data.frame, reviews is an array of characters, and [,1] selects the first row (the first column of the transpose).
The output should be a matrix or, in the worst case, a data.frame.
I tried this, but it did not work:
testing = apply(mydata, 1, function(x) {as.vector(t(mydata)[,x])})
Error in t(mydata)[, x] : subscript out of bounds
Thanks.
Edit: a quick data sample:
> reviews = as.vector(t(mydata)[,1])
> class(reviews)
[1] "character"
> length(reviews)
[1] 14
> reviews
 [1] "i involuntarily"
 [2] "i in transit"
 [3] "my initial flight"
 [4] "that still left"
 [5] "after disembarking"
 [6] "customs , proceed gate."
 [7] "i arrived"
 [8] "when boarding pass scanned"
 [9] "no reason given bump."
[10] "the ua gate staff"
[11] "i boarded air canada."
[12] "after arriving"
[13] "i spent 5 hours"
[14] NA
mydata is a data.frame:
> class(mydata)
[1] "data.frame"
> length(mydata[,1])
[1] 152
> mydata[,1]
[1] involuntarily... .
[2] first time... .
... ...
152 Levels: first time . ...
I have 30,000 of these, but I want to start small: 152 paragraphs, each split into individual sentences and put into a data.frame. Each row in the data.frame has 5-15 sentences.
I want to be able to access each row as an array, since I need to perform an action on each row of the data.frame.
Packages used: plyr, sentiment (downloaded here and installed manually).
Edit 2:
dput(mydata[1:6, 1:6])
structure(list(V1 = structure(c(70L, 41L, 94L, 114L, 47L, 49L), .Label = c(" air canada", "their service", "hours de-icing", "have flown ba", "my booking", "if video screen", "frankfurt flights", "and 150 lines of text data",
Here's a recommended way to ask this question, focusing on the fact that your actual data is too big, complicated, or private to share.
Question: how do I apply a function on each row of a data.frame?
My data:
# make up some data
s <- "lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua."
mydata <- as.data.frame(matrix(strsplit(s, '\\s')[[1]][1:18], nrow=3, ncol=6), stringsAsFactors=FALSE)
mydata
##      V1          V2         V3      V4         V5     V6
## 1 lorem         sit adipiscing      do incididunt     et
## 2 ipsum       amet,      elit, eiusmod         ut dolore
## 3 dolor consectetur        sed  tempor     labore  magna
If you have data that can be used directly then, as has been suggested multiple times in the comments, the use of dput is helpful:
mydata <- structure(list(V1 = c("lorem", "ipsum", "dolor"),
                         V2 = c("sit", "amet,", "consectetur"),
                         V3 = c("adipiscing", "elit,", "sed"),
                         V4 = c("do", "eiusmod", "tempor"),
                         V5 = c("incididunt", "ut", "labore"),
                         V6 = c("et", "dolore", "magna")),
                    .Names = c("V1", "V2", "V3", "V4", "V5", "V6"),
                    row.names = c(NA, -3L), class = "data.frame")
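As a rough illustration of why dput output is convenient to share, the sketch below round-trips a tiny data frame through its dput text. The names small, txt, and restored are made up for this example, and the use of capture.output with eval(parse(...)) is just one way to simulate someone pasting the text into a fresh session.

```r
# Hypothetical two-row data frame, for illustration only
small <- data.frame(V1 = c("a", "b"), V2 = c("c", "d"), stringsAsFactors = FALSE)

# Capture what dput() would print; this is the text you would paste into a question
txt <- paste(capture.output(dput(small)), collapse = "\n")

# Simulate a reader pasting that text into their own R session
restored <- eval(parse(text = txt))

# Compare the rebuilt object with the original
all.equal(small, restored)
```

Because the text rebuilds the object with its column types, readers do not have to guess whether a column was a factor or a character vector.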
In either order, state (i) what you are trying to do, and (ii) what you have tried and how it is not working.
My desired output:
Converting a row to a vector ... is confusing. A row is already a vector, so I don't know what you are trying to do. So, I'll come up with a short point: I want the words on each row in reverse alphabetical order, perhaps like this:
##       V1    V2         V3      V4     V5          V6
## 1    sit lorem incididunt      et     do  adipiscing
## 2     ut ipsum      elit, eiusmod dolore       amet,
## 3 tempor   sed      magna  labore  dolor consectetur
This is the time to show the code you've tried, the errors you've encountered, and/or how the unerring output is not what you intended.
Answer, generically:
There are several ways to do something to each row:
Use apply, though this breaks down if you have numeric and character intermingled. If you try this, you'll see that the output is the transpose of what you may think, in which case you'll need to wrap this (and some of the other *apply-based suggestions here) in t(...). It's a little confusing, but it's necessary here. Oh, and they'll be of class matrix, which can be converted to data.frame if needed.
ret <- apply(mydata, 1, function(r) { do_something(r) })
Use sapply or lapply on the row indices. Note that these will be returning lists or vectors of results, so you'll need to convert them to whatever format you need.
ret <- sapply(1:nrow(mydata), function(i) { do_something(mydata[i,]) })
# if you need to keep each row's results rather encapsulated, use one of the following:
ret <- sapply(1:nrow(mydata), function(i) { do_something(mydata[i,]) }, simplify=FALSE)
ret <- lapply(1:nrow(mydata), function(i) { do_something(mydata[i,]) })
Use foreach and iterators.
library(foreach)
library(iterators)
ret <- foreach(df=iter(mydata, by='row'), .combine=rbind) %do% {
  do_something(df) # one row of mydata at a time
}
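To make the two apply caveats above concrete, here is a minimal sketch on a made-up mixed-type data frame. The names df, id, and word are hypothetical and not part of the question's data. It relies on the documented behaviour that apply converts a data frame with as.matrix before iterating.

```r
# Hypothetical mixed-type data frame, for illustration only
df <- data.frame(id = c(1, 2, 3), word = c("a", "b", "c"), stringsAsFactors = FALSE)

# apply() first converts the data frame to a matrix, and a matrix holds one
# type only, so the numeric 'id' values reach the function as character
row_classes <- apply(df, 1, function(r) class(r))
row_classes

# When the function returns a vector longer than 1, each row's result
# becomes a COLUMN of the output, hence the need for t(...)
res <- apply(df, 1, function(r) rev(r))
dim(res)     # 2 3
dim(t(res))  # 3 2
```

If the numeric columns must stay numeric, the index-based sapply/lapply form is probably the safer choice, since mydata[i,] keeps each column's type.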
In the case of this (contrived) question, here are several ways to do it:
as.data.frame(t(apply(mydata, 1, function(r) sort(r, decreasing=TRUE))))
##       V1    V2         V3      V4     V5          V6
## 1    sit lorem incididunt      et     do  adipiscing
## 2     ut ipsum      elit, eiusmod dolore       amet,
## 3 tempor   sed      magna  labore  dolor consectetur

as.data.frame(t(sapply(1:nrow(mydata), function(i) sort(mydata[i,], decreasing=TRUE))))
## same output

library(foreach)
library(iterators)
## notice the use of as.character(...), perhaps still blasphemy to the
## structure of a data.frame
ret <- foreach(df=iter(mydata, by='row'), .combine=rbind) %do% {
  sort(as.character(df), decreasing=TRUE)
}
ret
##          [,1]     [,2]    [,3]         [,4]      [,5]     [,6]
## result.1 "sit"    "lorem" "incididunt" "et"      "do"     "adipiscing"
## result.2 "ut"     "ipsum" "elit,"      "eiusmod" "dolore" "amet,"
## result.3 "tempor" "sed"   "magna"      "labore"  "dolor"  "consectetur"
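Coming back to the original goal of getting at each row as a character vector, the following is a sketch under stated assumptions rather than a drop-in solution. It assumes the real columns are factors (the "Levels" line and the .Label entries in the question suggest so), and reviews_df, reviews_chr, and rows are made-up names standing in for the actual data.

```r
# Hypothetical stand-in for the asker's data: one review per row,
# one sentence per column, stored as factors
reviews_df <- data.frame(
  V1 = c("first sentence", "another start"),
  V2 = c("second sentence", NA),
  stringsAsFactors = TRUE
)

# Convert factor columns to character first, so that pulling out a row
# yields the text rather than the underlying integer codes
reviews_chr <- data.frame(lapply(reviews_df, as.character), stringsAsFactors = FALSE)

# One character vector per row, with the trailing NA padding dropped
rows <- lapply(seq_len(nrow(reviews_chr)), function(i) {
  r <- unlist(reviews_chr[i, ], use.names = FALSE)
  r[!is.na(r)]
})

rows[[1]]
rows[[2]]
```

Keeping the result as a list avoids forcing rows with different numbers of sentences into a rectangular shape. Any per-row function can then be run with lapply(rows, ...).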