Parsing CSV files to arrays from very large sources in Java -


I have a parser that works fine on smaller files of approx. 60,000 lines or less, but I need to parse a CSV file with 10 million lines. The method isn't working: it hangs for about 10 seconds every 100 thousand lines. I assume it's the split method. Is there a faster way to parse the data from CSV into a string array?

The code in question:

    String[][] events = new String[rows][columns];
    Scanner sc = new Scanner(csvFileName);
    int j = 0;
    while (sc.hasNext()) {
        events[j] = sc.nextLine().split(",");
        j++;
    }
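For reference, a common stdlib-only variant of the loop above uses a `BufferedReader` instead of `Scanner` and collects rows into a growable list rather than a preallocated array. This is a sketch under assumed conditions (the file name and contents here are made up for the demo), not code from the original post:

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class BufferedRead {
    public static void main(String[] args) throws Exception {
        // hypothetical input file, created only so the demo is runnable
        Path csv = Files.createTempFile("events", ".csv");
        Files.write(csv, List.of("1,a,b", "2,c,d"));

        List<String[]> events = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new FileReader(csv.toFile()))) {
            String line;
            while ((line = br.readLine()) != null) {
                // still split(",") here - fine for this demo,
                // unreliable for real CSV with quoted values
                events.add(line.split(","));
            }
        }
        System.out.println(events.size()); // prints 2
    }
}
```

The list avoids having to know the row count up front, and `BufferedReader.readLine()` carries less per-line overhead than `Scanner`, though neither fixes the CSV-correctness problem discussed in the answer below.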

Your code won't parse CSV files reliably. What if a value contains a ',' or a line separator? It's also slow.
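A quick demonstration of the reliability problem (this snippet is an illustration, not part of the original answer): a quoted CSV field may itself contain a comma, and a plain `split(",")` cuts right through it.

```java
public class SplitDemo {
    public static void main(String[] args) {
        // 3 logical CSV fields: id, a quoted name containing a comma, age
        String line = "id,\"surname, name\",age";
        String[] parts = line.split(",");
        // split(",") breaks inside the quoted value and yields 4 pieces
        System.out.println(parts.length); // prints 4, not 3
    }
}
```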

Get univocity-parsers to parse your files. It is about 3 times faster than Apache Commons CSV, has many more features, and I use it to process files with billions of rows.

To parse all rows into a list of String arrays:

    CsvParserSettings settings = new CsvParserSettings();
    // lots of options here, check the documentation

    CsvParser parser = new CsvParser(settings);

    List<String[]> allRows = parser.parseAll(new FileReader(new File("path/to/input.csv")));
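Note that `parseAll` keeps every row in memory, which may not be practical for a 10-million-line file. A sketch of row-by-row streaming with the same library, using its `beginParsing`/`parseNext` API (the file path is a placeholder; treat this as an illustration rather than code from the original answer):

```java
import com.univocity.parsers.csv.CsvParser;
import com.univocity.parsers.csv.CsvParserSettings;

import java.io.File;
import java.io.FileReader;

public class StreamingExample {
    public static void main(String[] args) throws Exception {
        CsvParserSettings settings = new CsvParserSettings();
        CsvParser parser = new CsvParser(settings);

        // beginParsing/parseNext return one row at a time, so memory
        // use stays flat no matter how large the input file is
        parser.beginParsing(new FileReader(new File("path/to/input.csv")));
        String[] row;
        while ((row = parser.parseNext()) != null) {
            // process the row here
        }
        parser.stopParsing();
    }
}
```

This keeps only the current row in memory instead of all 10 million at once.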

Disclosure: I am the author of this library. It's open-source and free (Apache 2.0 license).
