Scala - Subtracting an RDD from another RDD doesn't work correctly
I want to subtract an RDD from another RDD. I looked in the documentation and found that subtract can do that. However, when I tested subtract, the final RDD remained the same and the values were not removed!
Is there another function for that? Or am I using subtract incorrectly?
Here is the code I used:
    // vertexRDD is populated elsewhere
    val vertexRDD: org.apache.spark.rdd.RDD[(VertexId, Array[Int])]
    val clusters = vertexRDD.takeSample(false, 3)
    val clustersRDD: RDD[(VertexId, Array[Int])] = sc.parallelize(clusters)
    // named finalRDD because `final` is a reserved word in Scala
    val finalRDD = vertexRDD.subtract(clustersRDD)
    finalRDD.collect().foreach(println(_))
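Performing set operations such as subtract on mutable types (Array in this example) is unsupported, or at least not recommended. Arrays are compared by reference rather than by content, so the sampled copies in clustersRDD will generally not match the originals in vertexRDD.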
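To make the equality issue concrete, here is a small plain-Scala sketch (no Spark needed). The object name ArrayEqualityDemo and the sample values are made up for illustration; the point is only that two arrays with the same contents are not equal, and that a tuple holding an array inherits this.

```scala
object ArrayEqualityDemo {
  // Two distinct arrays holding the same elements (illustrative values).
  val a: Array[Int] = Array(1, 2, 3)
  val b: Array[Int] = Array(1, 2, 3)

  def main(args: Array[String]): Unit = {
    // Array equality is reference equality, so this is false.
    println(a == b)                                // false
    // Element-wise comparison has to be requested explicitly.
    println(a.sameElements(b))                     // true

    // A tuple compares its elements with ==, so the array makes the pair unequal.
    println((1L, a) == (1L, b))                    // false

    // Immutable collections such as Vector compare by content.
    println(a.toVector == b.toVector)              // true
    println((1L, a.toVector) == (1L, b.toVector))  // true
  }
}
```

Since subtract has to decide whether an element of one RDD equals an element of the other, pairs that never compare equal are never removed.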
Try using an immutable type instead.
I believe WrappedArray is the relevant container for storing arrays in sets, but I'm not sure.
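One way to apply this, as a rough sketch rather than a definitive fix: convert the array values to Vector before sampling and subtracting, then convert back if arrays are needed afterwards. The helper name removeSampled, the local master setting, and the sample data are assumptions made up for this example. VertexId is declared here as an alias for Long to keep the sketch free of a GraphX dependency.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object SubtractWithImmutableValues {
  // Stand-in for GraphX's VertexId, which is an alias for Long.
  type VertexId = Long

  // Hypothetical helper: sample n elements and remove them from the RDD.
  def removeSampled(sc: SparkContext,
                    vertexRDD: RDD[(VertexId, Array[Int])],
                    n: Int): RDD[(VertexId, Array[Int])] = {
    // Vector compares by content, so equal pairs are recognised as equal.
    val comparable: RDD[(VertexId, Vector[Int])] = vertexRDD.mapValues(_.toVector)
    val clusters = comparable.takeSample(withReplacement = false, num = n)
    val clustersRDD = sc.parallelize(clusters.toSeq)
    // subtract now removes the sampled pairs; convert back to arrays afterwards.
    comparable.subtract(clustersRDD).mapValues(_.toArray)
  }

  def main(args: Array[String]): Unit = {
    // Local master chosen only so the sketch runs on a single machine.
    val conf = new SparkConf().setAppName("subtract-demo").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Made-up sample data with five distinct vertices.
      val vertexRDD: RDD[(VertexId, Array[Int])] = sc.parallelize(Seq(
        (1L, Array(1, 2)), (2L, Array(3, 4)), (3L, Array(5, 6)),
        (4L, Array(7, 8)), (5L, Array(9, 10))
      ))
      val remaining = removeSampled(sc, vertexRDD, 3)
      remaining.collect().foreach { case (id, values) =>
        println(s"$id -> ${values.mkString(",")}")
      }
    } finally {
      sc.stop()
    }
  }
}
```

If the vertex ids are unique, subtractByKey may be a simpler alternative, since it compares only the keys and never looks at the array values.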