Spark saving RDD[(Int, Array[Double])] to text file got strange result
I am trying to save the userFeatures of a MatrixFactorizationModel to a text file. According to the docs, it is an RDD of type RDD[(Int, Array[Double])]. I just called
model.userFeatures.saveAsTextFile("feature")
However, the results I got look like:
(1,[D@4b7707f1)
(5,[D@513e9aca)
(9,[D@7d09bcab)
(13,[D@31058458)
(17,[D@2a5df2a7)
(21,[D@5372efd7)
(25,[D@59d1c59a)
(29,[D@53ee5e25)
(33,[D@498f5a34)
(37,[D@4f9967eb)
(41,[D@5560afb)
(45,[D@2dc7f659)
(49,[D@b46fcc)
(53,[D@38098dd1)
(57,[D@77090fb5)
(61,[D@64769e18)
What I was expecting is something like:
(1, [1.1, 2.3, 0.4, ...]) (2, [0.1, 0.3, 0.4, ...]) ...
So what's wrong?
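saveAsTextFile writes each element by calling its toString method. For an Array, toString does not print the contents; it returns merely the type tag and hash code (such as [D@4b7707f1). You have 2 options if you stick with saveAsTextFile: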
.mapValues(x => /* turn the array data into a string */).saveAsTextFile...
Or you can use map to wrap the data in a custom object with a custom toString, or, in this case, convert the array to a List, whose toString prints its elements:
.mapValues(_.toList).saveAsTextFile...
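To make the difference concrete, here is a minimal sketch in plain Scala (no Spark needed) comparing the three string forms discussed above. The object name FeatureFormatting and the helper formatFeatures are hypothetical names chosen for illustration, and the sample values are made up.

```scala
// Minimal sketch; FeatureFormatting and formatFeatures are hypothetical names.
object FeatureFormatting {
  // Render the array contents instead of relying on Array.toString.
  def formatFeatures(features: Array[Double]): String =
    features.mkString("[", ", ", "]")

  def main(args: Array[String]): Unit = {
    val features = Array(1.1, 2.3, 0.4) // made-up sample values

    // JVM arrays inherit Object.toString: the type tag "[D" plus a hash code,
    // so this prints something like [D@4b7707f1 (the hash varies per run).
    println(features.toString)

    // mkString joins the elements with the given prefix, separator, and suffix.
    println(formatFeatures(features)) // [1.1, 2.3, 0.4]

    // A List has a content-based toString.
    println(features.toList) // List(1.1, 2.3, 0.4)
  }
}
```

Assuming the model from the question, the same helper could be plugged into the first option as model.userFeatures.mapValues(FeatureFormatting.formatFeatures).saveAsTextFile("feature"), which should produce lines of the form (1,[1.1, 2.3, 0.4]).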