r - Read HDFS blocks in RStudio
I want to read an HDFS file in RStudio. It's not a single CSV file, which would be easy to do; it is split into blocks. I loaded the data from a database with Sqoop, so the data is divided into blocks. I have files like this:
/data/_SUCCESS
/data/part-m-00000
/data/part-m-00001
/data/part-m-00002
/data/part-m-00003
/data/part-m-00004
/data/part-m-00005
But I can't read all the files at once. With this command I can read only one at a time:
hdfs.data <- file.path(hdfs.root, "part-m-00001")
I have to change part-m-0000* every time, and using * doesn't work to read all the files ...
Are they text files? You should be able to load them in the same way as CSV files.
list_tables <- lapply(list.files(hdfs.root, full.names = TRUE), read.table)
library(data.table)
table_composite <- rbindlist(list_tables)
You should pass any read.table options as additional arguments to lapply.
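As a minimal sketch of passing read.table options through lapply, the example below builds a toy directory and reads it back. It rests on several assumptions not stated in the question: the part files are comma-delimited, have no header row, and are reachable through an ordinary file path. The directory name, the toy rows, and the column names id and name are hypothetical. It also filters on the part-m- prefix, because read.table stops with an error on an empty marker file such as _SUCCESS. Base R's do.call(rbind, ...) stands in for rbindlist here only so the sketch has no package dependency.

```r
# Hypothetical stand-in for the directory holding the Sqoop output,
# filled with toy data so the sketch can run anywhere.
hdfs.root <- file.path(tempdir(), "sqoop_parts")
dir.create(hdfs.root, showWarnings = FALSE)
writeLines(c("1,alice", "2,bob"), file.path(hdfs.root, "part-m-00000"))
writeLines("3,carol", file.path(hdfs.root, "part-m-00001"))
invisible(file.create(file.path(hdfs.root, "_SUCCESS")))  # empty marker file

# Keep only the part files; the empty marker file would make read.table() fail.
part_files <- list.files(hdfs.root, pattern = "^part-m-", full.names = TRUE)

# Arguments given after the function name are forwarded by lapply()
# to every read.table() call.
list_tables <- lapply(part_files, read.table,
                      sep = ",", header = FALSE,
                      col.names = c("id", "name"),  # assumed column names
                      stringsAsFactors = FALSE)

# Stack the per-file data frames into a single data frame.
table_composite <- do.call(rbind, list_tables)
print(table_composite)
```

With data.table installed, rbindlist(list_tables) can replace the do.call(rbind, ...) line and is usually faster on many large parts.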
Alternatively, you can read the full folder as one composite CSV file.
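One possible reading of this suggestion, sketched below under assumptions, is to concatenate the lines of every part file and parse them once as a single CSV. The answer does not say how it intended this to be done, so this is only an illustration. It assumes comma-delimited parts with no header rows that are reachable through an ordinary file path. The directory name, the toy rows, and the column names are hypothetical.

```r
# Hypothetical directory with toy part files so the sketch can run anywhere.
parts_dir <- file.path(tempdir(), "sqoop_composite")
dir.create(parts_dir, showWarnings = FALSE)
writeLines(c("1,alice", "2,bob"), file.path(parts_dir, "part-m-00000"))
writeLines("3,carol", file.path(parts_dir, "part-m-00001"))

# Collect the lines of all part files, in file-name order, into one vector.
part_files <- list.files(parts_dir, pattern = "^part-m-", full.names = TRUE)
all_lines <- unlist(lapply(part_files, readLines))

# Parse the combined lines as if they were one CSV file.
composite <- read.csv(text = all_lines, header = FALSE,
                      col.names = c("id", "name"),  # assumed column names
                      stringsAsFactors = FALSE)
print(composite)
```

This holds every line in memory before parsing, so it suits outputs of moderate size.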
Another option is to use the open-source package rhdfs.