Parsing CSV files to arrays from very large sources in Java -
I have a parser that works fine on smaller files of approximately 60,000 lines or less, but I have to parse a CSV file with 10 million lines, and this method isn't working: it hangs for about 10 seconds every 100 thousand lines. I assume the split method is the problem. Is there a faster way to parse the data from a CSV into a string array?
The code in question:

String[][] events = new String[rows][columns];
Scanner sc = new Scanner(csvFilename);
int j = 0;
while (sc.hasNext()) {
    events[j] = sc.nextLine().split(",");
    j++;
}
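As an aside, before switching libraries: a minimal sketch of the same loop using BufferedReader and a growable ArrayList instead of Scanner and a pre-sized String[rows][columns] array. This avoids Scanner's regex-based scanning overhead and the need to know the row count up front, but it still uses the naive split(), so it inherits that method's quoting problems. The sample input here is made up and stands in for the real file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class NaiveCsvRead {
    // Reads comma-separated lines into a growable list instead of a
    // pre-sized two-dimensional array. Still splits naively on ',',
    // so quoted fields containing commas will be broken apart.
    static List<String[]> readAll(BufferedReader reader) throws IOException {
        List<String[]> rows = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            rows.add(line.split(","));
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // In-memory sample standing in for the real 10-million-line file.
        String csv = "a,b,c\n1,2,3\n";
        List<String[]> rows = readAll(new BufferedReader(new StringReader(csv)));
        System.out.println(rows.size() + " rows; first field of last row: " + rows.get(1)[0]);
    }
}
```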
Your code won't parse CSV files reliably. What happens if a value contains a ',' or a line separator? It is also slow.
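A quick illustration of the quoting problem (the values here are made up):

```java
public class SplitPitfall {
    public static void main(String[] args) {
        // Under CSV quoting rules (RFC 4180), the comma inside
        // "Smith, John" is part of the value, not a field separator.
        String line = "42,\"Smith, John\",US";
        String[] fields = line.split(",");
        // Naive split produces 4 fields instead of the intended 3,
        // and leaves the quote characters embedded in the data.
        System.out.println(fields.length); // prints 4
    }
}
```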
Get univocity-parsers to parse your files. It is 3 times faster than Apache Commons CSV, has many more features, and I use it to process files with billions of rows.
To parse all rows into a list of String arrays:

CsvParserSettings settings = new CsvParserSettings(); // lots of options here, check the documentation
CsvParser parser = new CsvParser(settings);
List<String[]> allRows = parser.parseAll(new FileReader(new File("path/to/input.csv")));

Note that parseAll() keeps every row in memory; for a 10-million-line file you may prefer to stream row by row with beginParsing() and parseNext() instead.
Disclosure: I am the author of this library. It's open-source and free (Apache v2.0 license).