How does aggregate work in Scala?
I know how the normal aggregate works in Scala and its use over fold. I tried a lot to understand how the code below works, but couldn't. Could someone help me by explaining how it works and why it gives me an output of (10,4)?
    val input = List(1, 2, 3, 4)
    val result = input.aggregate((0, 0))(
      (acc, value) => (acc._1 + value, acc._2 + 1),
      (acc1, acc2) => (acc1._1 + acc2._1, acc1._2 + acc2._2))
Could someone help me by explaining how it works and why it gives me the output (10,4)?
When using aggregate, you have to provide 3 parameters:
- the initial value used to accumulate the elements of a partition; it is usually the neutral element
- a function that, given a partition, accumulates the result within it
- a function that combines 2 partitions
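As a minimal sketch of how these three parameters line up with the call from the question, the arguments can be pulled out and named. The names `ThreeParameters`, `Acc`, `zero`, `seqOp` and `combOp` are chosen here for illustration and are not in the original code. Note also that `aggregate` on sequential collections is deprecated since Scala 2.13, so the compiler may emit a warning.

```scala
object ThreeParameters {
  type Acc = (Int, Int) // (running sum, running count)

  val input: List[Int] = List(1, 2, 3, 4)

  // 1. Initial value for each partition: the neutral element.
  val zero: Acc = (0, 0)

  // 2. Accumulates one element into the result of a partition.
  val seqOp: (Acc, Int) => Acc =
    (acc, value) => (acc._1 + value, acc._2 + 1)

  // 3. Combines the results of two partitions.
  val combOp: (Acc, Acc) => Acc =
    (acc1, acc2) => (acc1._1 + acc2._1, acc1._2 + acc2._2)

  // Same call as in the question, with the arguments named.
  val result: Acc = input.aggregate(zero)(seqOp, combOp)
}
```

Here `ThreeParameters.result` is the tuple (10,4), the same value the question reports.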
So in your case, the initial value for a partition is the tuple (0, 0).
Then the accumulator function you defined adds the current element you're traversing to the first element of the tuple and increments the second element of the tuple by one. In effect, it computes the sum of the elements in the partition and the number of elements.
The combiner function combines 2 tuples. As you defined it, it adds together the sums and the element counts of 2 partitions. It's not used in your case because you traverse the pipeline sequentially. You could call .par on the list to get a parallel implementation and see the combiner in action (note that it has to be an associative function).
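To see the combiner without a parallel collection, the partitions can be simulated by hand. This is only a sketch: the split into `List(1, 2)` and `List(3, 4)` is an assumption made for illustration (a real parallel collection decides its own split), and the object and value names are hypothetical. It also avoids `.par`, which since Scala 2.13 lives in the separate scala-parallel-collections module.

```scala
object CombinerByHand {
  type Acc = (Int, Int) // (sum, count)

  val seqOp: (Acc, Int) => Acc =
    (acc, value) => (acc._1 + value, acc._2 + 1)
  val combOp: (Acc, Acc) => Acc =
    (acc1, acc2) => (acc1._1 + acc2._1, acc1._2 + acc2._2)

  // Each hand-made partition is accumulated from the initial value (0, 0).
  val left: Acc = List(1, 2).foldLeft((0, 0))(seqOp)  // sum 3, count 2
  val right: Acc = List(3, 4).foldLeft((0, 0))(seqOp) // sum 7, count 2

  // The combiner merges the two partial results.
  val merged: Acc = combOp(left, right) // sum 10, count 4

  // Associativity check on three partial results: grouping must not matter.
  val a: Acc = (1, 1)
  val b: Acc = (2, 1)
  val c: Acc = (7, 2)
  val groupingIrrelevant: Boolean =
    combOp(combOp(a, b), c) == combOp(a, combOp(b, c))
}
```

Because the combiner is associative, it does not matter in which order the partial results are merged, which is what lets a parallel implementation merge them as they become available.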
Thus you get (10, 4) because 1+2+3+4=10 and there are 4 elements in the list (you did 4 additions).
You could add a print statement in the accumulator function (running on a sequential input) to see how it behaves:
    acc: (0,0) - value:1
    acc: (1,1) - value:2
    acc: (3,2) - value:3
    acc: (6,3) - value:4
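A trace like the one above can be reproduced with a sketch along these lines. The names `TraceDemo`, `tracedSeqOp` and `run` are hypothetical, and `foldLeft` is used here on the assumption that the input is sequential, so no combiner is involved.

```scala
object TraceDemo {
  val input: List[Int] = List(1, 2, 3, 4)

  // Same accumulator as in the question, printing its arguments before each step.
  def tracedSeqOp(acc: (Int, Int), value: Int): (Int, Int) = {
    println(s"acc: $acc - value:$value")
    (acc._1 + value, acc._2 + 1)
  }

  // Sequential traversal starting from the initial value (0, 0).
  def run(): (Int, Int) = input.foldLeft((0, 0))(tracedSeqOp)
}
```

Calling `TraceDemo.run()` prints one line per element and returns the final accumulator.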
"I knew how the normal aggregate works in Scala and its use over fold."
For a sequential input, aggregate is a foldLeft:
    def aggregate[B](z: =>B)(seqop: (B, A) => B, combop: (B, B) => B): B = foldLeft(z)(seqop)
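A small sketch can confirm this equivalence on the list from the question. The object and value names are hypothetical, and the `aggregate` call may produce a deprecation warning on Scala 2.13 and later.

```scala
object SequentialEquivalence {
  type Acc = (Int, Int) // (sum, count)

  val input: List[Int] = List(1, 2, 3, 4)

  val seqOp: (Acc, Int) => Acc =
    (acc, value) => (acc._1 + value, acc._2 + 1)
  val combOp: (Acc, Acc) => Acc =
    (acc1, acc2) => (acc1._1 + acc2._1, acc1._2 + acc2._2)

  // On a sequential List the combiner is accepted but never called.
  val viaAggregate: Acc = input.aggregate((0, 0))(seqOp, combOp)

  // The same traversal written directly as a foldLeft.
  val viaFoldLeft: Acc = input.foldLeft((0, 0))(seqOp)
}
```

Both values come out the same, which is why the combiner can be ignored when reasoning about the sequential case.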
For a parallel input, the list is split into chunks so that multiple threads can work separately. The accumulator function is run on each chunk, using the initial value. When 2 threads need to merge their results, the combine function is used:
    def aggregate[S](z: =>S)(seqop: (S, T) => S, combop: (S, S) => S): S = {
      tasksupport.executeAndWaitResult(new Aggregate(() => z, seqop, combop, splitter))
    }
This is the principle of the fork-join model, but it requires that your task can be parallelized well. That is the case here, because a thread does not need to know the result of another thread to do its job.
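The fork and join steps can be imitated with standard-library Futures. This is a rough sketch under stated assumptions, not how the parallel collections are implemented: `ForkJoinSketch`, `chunkedAggregate` and the `chunkSize` parameter are hypothetical, chunks are cut with `grouped`, and the global execution context stands in for the library's task support.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ForkJoinSketch {
  type Acc = (Int, Int) // (sum, count)

  val seqOp: (Acc, Int) => Acc =
    (acc, value) => (acc._1 + value, acc._2 + 1)
  val combOp: (Acc, Acc) => Acc =
    (acc1, acc2) => (acc1._1 + acc2._1, acc1._2 + acc2._2)

  // chunkSize must be positive; it only controls how the work is split.
  def chunkedAggregate(input: List[Int], chunkSize: Int): Acc = {
    // Fork: each chunk is accumulated in its own Future, starting from (0, 0).
    val partials: List[Future[Acc]] =
      input.grouped(chunkSize).toList.map { chunk =>
        Future(chunk.foldLeft((0, 0))(seqOp))
      }
    // Join: wait for every partial result, then merge them with the combiner.
    val results: List[Acc] = Await.result(Future.sequence(partials), 10.seconds)
    results.foldLeft((0, 0))(combOp)
  }
}
```

No chunk needs the result of another chunk, so `ForkJoinSketch.chunkedAggregate(List(1, 2, 3, 4), chunkSize)` gives the same sum and count for any positive chunk size.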