mapreduce - In Spark, does the filter function turn the data into tuples?
Just wondering, does filter turn the data into tuples? For example:
val fileslines = sc.textFile("file.txt")
val split_lines = fileslines.map(_.split(";"))
val filtereddata = split_lines.filter(x => x(4) == "blue")
// from here, if I wanted to map the data, would I use tuple format, i.e. x._3, or array format, i.e. x(3)?
val bluerecords = filtereddata.map(x => (x._1, x._2))
or
val bluerecords = filtereddata.map(x => (x(0), x(1)))
No, filter does not change the data's type.
It takes a predicate function, and any datapoints in the set that return false when passed through the predicate are not passed on to the resultant set. So the data remains the same:
fileslines   // RDD[String]        (lines of the file)
split_lines  // RDD[Array[String]] (lines split on semicolons)
filtereddata // RDD[Array[String]] (lines split on semicolons whose 5th item is "blue")
So when you use filtereddata, the elements are still arrays, and you access the data using parentheses with the appropriate index (e.g. x(0)), not tuple accessors like x._1.
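The same behaviour can be sketched with plain Scala collections instead of Spark RDDs (the semantics of map and filter are the same; the sample lines here are made up for illustration):

```scala
object FilterDemo {
  def main(args: Array[String]): Unit = {
    // Hypothetical semicolon-delimited lines; the 5th field is a colour.
    val fileslines = List("a;b;c;d;blue", "e;f;g;h;red")

    val split_lines = fileslines.map(_.split(";"))           // List[Array[String]]
    // filter keeps only elements whose 5th item is "blue";
    // the element type is still Array[String].
    val filtereddata = split_lines.filter(x => x(4) == "blue")

    // Elements are arrays, so index with parentheses:
    val bluerecords = filtereddata.map(x => (x(0), x(1)))
    println(bluerecords) // List((a,b))
  }
}
```

Attempting `x._1` on an element of filtereddata would not compile, since Array[String] has no tuple accessors; only after the final map does each element become a Tuple2.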