java - Spark NotSerializableException -


In my Spark code, I am attempting to create an IndexedRowMatrix from a CSV file. However, I get the following error:

Exception in thread "main" org.apache.spark.SparkException: Task not serializable ... Caused by: java.io.NotSerializableException: org.apache.spark.api.java.JavaSparkContext

Here is the code:

JavaSparkContext sc = new JavaSparkContext("local", "app",
        "/srv/spark", new String[]{"target/app.jar"});

JavaRDD<String> csv = sc.textFile("data/matrix.csv").cache();

JavaRDD<IndexedRow> entries = csv.zipWithIndex().map(
        new Function<scala.Tuple2<String, Long>, IndexedRow>() {

            private static final long serialVersionUID = 4795273163954440089L;

            @Override
            public IndexedRow call(Tuple2<String, Long> tuple)
                    throws Exception {
                String line = tuple._1;
                long index = tuple._2;
                // Parse one CSV line into a dense vector of doubles.
                String[] strings = line.split(",");
                double[] doubles = new double[strings.length];
                for (int i = 0; i < strings.length; i++) {
                    doubles[i] = Double.parseDouble(strings[i]);
                }
                Vector v = new DenseVector(doubles);
                return new IndexedRow(index, v);
            }
        });

I had the same issue, and it drove me around the twist. It comes down to a Java restriction around anonymous instances and serializability. The solution is to replace the anonymous instance of Function with a named static class that implements Serializable, and instantiate that instead. I declared a "functions library" outer class that contained static inner class definitions of all the functions I wanted to use, as sketched below.
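Here is a minimal sketch of that pattern, adapted to the parsing code from the question. The class names (MatrixFunctions, ParseLine) are my own illustrative choices, not anything from the original post:

import java.io.Serializable;

import org.apache.spark.api.java.function.Function;
import org.apache.spark.mllib.linalg.DenseVector;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.distributed.IndexedRow;

import scala.Tuple2;

public class MatrixFunctions implements Serializable {
    private static final long serialVersionUID = 1L;

    // A named static inner class is serializable on its own terms, so Spark
    // does not try to pull enclosing driver-side objects (such as the
    // JavaSparkContext) into the task closure.
    public static class ParseLine
            implements Function<Tuple2<String, Long>, IndexedRow> {

        private static final long serialVersionUID = 1L;

        @Override
        public IndexedRow call(Tuple2<String, Long> tuple) throws Exception {
            String[] strings = tuple._1.split(",");
            double[] doubles = new double[strings.length];
            for (int i = 0; i < strings.length; i++) {
                doubles[i] = Double.parseDouble(strings[i]);
            }
            Vector v = new DenseVector(doubles);
            return new IndexedRow(tuple._2, v);
        }
    }
}

The call site then instantiates the named class rather than an anonymous one:

JavaRDD<IndexedRow> entries = csv.zipWithIndex().map(new MatrixFunctions.ParseLine());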

Of course, if you write it in Scala you can keep everything in one file with much neater code, but I am not going into that in this instance.

