Spark saving RDD[(Int, Array[Double])] to text file got strange result
I am trying to save the userFeatures of a MatrixFactorizationModel to a text file. According to the docs, it is an RDD of type RDD[(Int, Array[Double])]. I called
model.userFeatures.saveAsTextFile("feature")
However, the results I got look like this:
(1,[D@4b7707f1)
(5,[D@513e9aca)
(9,[D@7d09bcab)
(13,[D@31058458)
(17,[D@2a5df2a7)
(21,[D@5372efd7)
(25,[D@59d1c59a)
(29,[D@53ee5e25)
(33,[D@498f5a34)
(37,[D@4f9967eb)
(41,[D@5560afb)
(45,[D@2dc7f659)
(49,[D@b46fcc)
(53,[D@38098dd1)
(57,[D@77090fb5)
(61,[D@64769e18)
What I was expecting is something like:
(1, [1.1, 2.3, 0.4, ...])
(2, [0.1, 0.3, 0.4, ...])
...
So what's wrong?
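saveAsTextFile writes each element by calling its toString method. For an Array, toString does not print the contents; it returns only the JVM type tag and hash code (for example [D@4b7707f1, where [D means an array of double). You have two options if you want to stick with saveAsTextFile. The first is to turn the array into a string yourself:
.mapValues(x => /* turn the array data into a string */).saveAsTextFile(...)
The second is to use map to wrap the data in a custom object with a custom toString. In this case, converting to a List might also work, because a List's toString prints its elements:
.mapValues(_.toList).saveAsTextFile(...)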
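To make the options above concrete, here is a minimal sketch in plain Scala that runs without a Spark cluster. The names FeatureFormat, formatRow, and FeatureRow are hypothetical helpers made up for illustration; they are not part of Spark. The sketch only shows how the array's toString differs from an explicit mkString and from a List.

```scala
object FeatureFormat {
  // Hypothetical helper (not a Spark API): renders one (id, features) pair
  // in the "(1, [1.1, 2.3, 0.4])" shape the question asks for.
  def formatRow(id: Int, features: Array[Double]): String = {
    val rendered = features.mkString("[", ", ", "]")
    s"($id, $rendered)"
  }

  // Hypothetical wrapper illustrating the "custom object with a custom
  // toString" option: saveAsTextFile would call this toString per element.
  final case class FeatureRow(id: Int, features: Array[Double]) {
    override def toString: String = formatRow(id, features)
  }

  def main(args: Array[String]): Unit = {
    val features = Array(1.1, 2.3, 0.4)

    // Arrays use the default JVM toString: type tag plus hash code.
    println(features.toString.startsWith("[D@"))

    // Option 1: build the string explicitly.
    println(formatRow(1, features))

    // Option 2: wrap the data in an object with its own toString.
    println(FeatureRow(1, features))

    // Converting to a List also prints the elements, in List(...) form.
    println((1, features.toList))
  }
}
```

Assuming model is the MatrixFactorizationModel from the question, the helper could presumably be applied as model.userFeatures.map { case (id, v) => FeatureFormat.formatRow(id, v) }.saveAsTextFile("feature"). Note that the List variant writes lines such as (1,List(1.1, 2.3, 0.4)), so the explicit mkString route is the one that matches the expected output exactly.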