mapreduce - In Spark, does the filter function turn the data into tuples?


Just wondering, does filter turn the data into tuples? For example:

val fileslines = sc.textFile("file.txt")
val split_lines = fileslines.map(_.split(";"))
val filtereddata = split_lines.filter(x => x(4) == "blue")

// from here, if I wanted to map the data, would I use tuple format, i.e. x._3, or array format, i.e. x(3)?

val bluerecords = filtereddata.map(x => (x._1, x._2))

or

val bluerecords = filtereddata.map(x => (x(0), x(1)))

No. filter takes a predicate function and uses it such that any of the datapoints in the set that return false when passed through the predicate are not passed on to the resultant set. So, the data remains the same:

fileslines   // RDD[String]        (lines of the file)
split_lines  // RDD[Array[String]] (lines delimited by semicolon)
filtereddata // RDD[Array[String]] (lines delimited by semicolon whose 5th item is "blue")

So, to use filtereddata, you would access the data in each array using parentheses with the appropriate index, e.g. x(0), not a tuple accessor like x._1.
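A minimal sketch of the point above, using plain Scala collections instead of RDDs (the semantics of map and filter are the same in this respect; the file contents here are made up for illustration). Note that filter only drops elements, never changes their type, while map is what produces tuples:

```scala
object FilterDemo {
  def main(args: Array[String]): Unit = {
    // stand-in for the lines of "file.txt" (hypothetical data)
    val fileslines = Seq("a;b;c;d;blue", "e;f;g;h;red")

    val split_lines = fileslines.map(_.split(";"))             // Seq[Array[String]]
    val filtereddata = split_lines.filter(x => x(4) == "blue") // still Seq[Array[String]]

    // arrays are indexed with parentheses; map can then build tuples
    val bluerecords = filtereddata.map(x => (x(0), x(1)))      // Seq[(String, String)]

    println(bluerecords) // List((a,b))
  }
}
```

Only after the map producing `(x(0), x(1))` would tuple accessors like `._1` apply to the elements.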

