scala - Avoid nested RDDs in Spark without using an array -


I have a big problem!

I have an RDD[(Int, Vector)], where the Int is a sort of label.

For example:

(0, (a,b,c)); (0, (d,e,f)); (1, (g,h,i))

etc...

Now I need to use this RDD (I'll call it myrdd) like this:

myrdd.map { case (l, v) =>
  myrdd.map { case (l_, v_) =>
    compare(v, v_)
  }
}

Now, I know that nesting RDDs is impossible in Spark.

I could bypass the problem by collecting myrdd into an array, but I can't use an array: the data is too large to fit in memory.

How can I solve this problem without using an array?

Thanks in advance!

cartesian sounds like it should work:

myrdd.cartesian(myrdd).map {
  case ((_, v), (_, v_)) => compare(v, v_)
}
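For completeness, here is a minimal self-contained sketch of the cartesian approach. The compare function (squared Euclidean distance), the use of Array[Double] in place of your Vector type, and the local[*] master are all assumptions for illustration; substitute your own.

import org.apache.spark.{SparkConf, SparkContext}

object CartesianCompare {
  // Hypothetical comparison function; replace with your own logic.
  def compare(v: Array[Double], w: Array[Double]): Double =
    v.zip(w).map { case (a, b) => (a - b) * (a - b) }.sum

  def main(args: Array[String]): Unit = {
    // Local master for demonstration only.
    val sc = new SparkContext(
      new SparkConf().setAppName("cartesian-compare").setMaster("local[*]"))

    // (label, vector) pairs, as in the question.
    val myrdd = sc.parallelize(Seq(
      (0, Array(1.0, 2.0, 3.0)),
      (0, Array(4.0, 5.0, 6.0)),
      (1, Array(7.0, 8.0, 9.0))
    ))

    // cartesian yields every ordered pair of elements, so each vector
    // is compared against every other vector (and against itself).
    val scores = myrdd.cartesian(myrdd).map {
      case ((_, v), (_, w)) => compare(v, w)
    }

    scores.collect().foreach(println)
    sc.stop()
  }
}

Note that cartesian produces all n² ordered pairs, including each element paired with itself. If compare is symmetric, you can roughly halve the work by tagging elements with zipWithIndex and keeping only the pairs where the first index is less than the second.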
