PySpark reduceByKey on multiple values


If I have key-value pairs like:

(k, (v1, v2)) (k, (v3, v4))

how can I sum the values so that the result is (k, (v1 + v3, v2 + v4))?

reduceByKey accepts any function that combines two values for the same key into one. Let's say a is the array of key-value pairs:

output = a.reduceByKey(lambda x, y: (x[0] + y[0], x[1] + y[1]))
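
To check the reducer without starting Spark, here is a minimal pure-Python sketch. The names add_pairs and local_reduce_by_key and the sample data are hypothetical, and local_reduce_by_key is only a local stand-in that imitates the grouping semantics of reduceByKey; it is not part of PySpark.

```python
from collections import defaultdict
from functools import reduce


def add_pairs(x, y):
    """Element-wise sum of two 2-tuples: (v1, v2) + (v3, v4)."""
    return (x[0] + y[0], x[1] + y[1])


def local_reduce_by_key(pairs, func):
    """Pure-Python stand-in for reduceByKey, for illustration only."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    # Fold each key's values pairwise with the reducer.
    return {key: reduce(func, values) for key, values in grouped.items()}


# Hypothetical sample data in the (k, (v1, v2)) shape from the question.
sample = [("a", (1, 2)), ("a", (3, 4)), ("b", (10, 20))]
result = local_reduce_by_key(sample, add_pairs)
print(result)  # {'a': (4, 6), 'b': (10, 20)}

# With a real RDD named a, the same reducer would be passed as:
#   output = a.reduceByKey(add_pairs)
```

The reducer returns a tuple of the same shape as its inputs, which matters because reduceByKey feeds intermediate results back into the function. The function should also be associative and commutative, since Spark may combine values in any order across partitions.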
