java - How to process Multiline CSV Input File for Map Reduce Hadoop?


I have a CSV input data file containing several records. Each record spans a variable number of lines (1 line, 2 lines, 5 lines, or any number). The one thing that is certain is that each record has 24 fields separated by "::". Each record starts on a new line, but not every new line starts a new record.

The default record reader fails on this input because it treats every new line as a new record.

**How do I take care of input splits? Is it possible that, for a record of 3 lines, 1 line ends up in one block and the other 2 in another block?**

**How should I distinguish between records before they are provided as input to the map method?**

I believe this has to do with the InputFormat and RecordReader. Any suggestions are appreciated.

Here is some sample data:

review_id::text::business_id::full_address::schools::longitude::average_stars::date::user_id::open::categories::photo_url::city::review_count::name::neighborhoods::url::votes.cool::votes.funny::state::stars::latitude::type::votes.useful

nan::nan::nan::nan::nan::nan::3.5::nan::cxint2yc-tuygwkpekauew::nan::nan::nan::nan::8::jane a.::nan::http://www.yelp.com/user_details?userid=cxint2yc-tuygwkpekauew::2::1::nan::nan::nan::user::5

nan::nan::nan::nan::nan::nan::3.0::nan::ofaugrtkoumweujbod1mfw::nan::nan::nan::nan::4::amy b.::nan::http://www.yelp.com/user_details?userid=ofaugrtkoumweujbod1mfw::1::1::nan::nan::nan::user::6

fu7tcxnaodnbdlcyfhmmzg::pretty great! okay, place not vegan since have bunch of cheese , egg offerings, see offer plenty of vegan alternatives.

i sort of skeptical being here because prices pretty hefty, felt.
anyway, homemade hot sauce amazing. got eggs benedict dinner , j got omelet. both good. love homefries.. next time come here, want onion rings or fries. onion rings looked amazing.

lastly, food came relatively quickly.

not fan of service. tried seat @ edge facing stoves, without asking, asked booth. @ booth, server didn't refill waters didn't feel bad emphasizing on , on whether or not wanted $5-7 desserts. honestly, slice of pie $6.50? veggie galaxy, t r p p n !

but great food! (especially breaky!)::qw5gr8vw7msok4vroswdma::nan::nan::nan::nan::2011-11-12::z_waxc4rupkp3y12bh1beg::nan::nan::nan::nan::nan::nan::nan::nan::0::1::nan::4::nan::review::0

85tbs2rt5f6kqz5l7_jfrw::great place!

i have menu , outdoor seating keep coming back. food -- had breakfast both times friends had lunch items. great selection. we've been @ off-peak times no waiting , better service.

all in all, it's no dz akins it's worth trying!::-tphabjrkegxv4fr1ke4fq::nan::nan::nan::nan::2010-09-19::1izwxafxuhtnzkoupuob5q::nan::nan::nan::nan::nan::nan::nan::nan::0::0::nan::4::nan::review::0
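One possible way to tell records apart in data like the sample above is to count separators: keep appending physical lines to a buffer until it holds 23 occurrences of "::" (24 fields), then emit the buffer as one logical record. The following is a minimal plain-Java sketch of that idea only, not a Hadoop RecordReader. The class and method names (`MultilineRecordAssembler`, `readRecords`, `countSeparators`) are hypothetical, and it assumes that "::" never appears inside a field value and that the last field does not itself span multiple lines.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: groups physical lines into logical 24-field records.
class MultilineRecordAssembler {
    static final String SEP = "::";   // field separator stated in the question
    static final int FIELDS = 24;     // fields per record stated in the question

    // Counts non-overlapping occurrences of SEP in one physical line.
    static int countSeparators(String line) {
        int count = 0;
        int idx = 0;
        while ((idx = line.indexOf(SEP, idx)) != -1) {
            count++;
            idx += SEP.length();
        }
        return count;
    }

    // Reads lines and emits a record each time FIELDS - 1 separators were seen.
    // Assumption: "::" never occurs inside a field value.
    static List<String> readRecords(BufferedReader in) throws IOException {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int seps = 0;
        String line;
        while ((line = in.readLine()) != null) {
            if (current.length() == 0 && line.isEmpty()) {
                continue; // blank line between records, not part of any record
            }
            if (current.length() > 0) {
                current.append('\n'); // keep line breaks inside multiline fields
            }
            current.append(line);
            seps += countSeparators(line);
            if (seps >= FIELDS - 1) {
                records.add(current.toString());
                current.setLength(0);
                seps = 0;
            }
        }
        // An incomplete trailing record (fewer than 24 fields) is dropped here.
        return records;
    }

    // Convenience overload for in-memory text.
    static List<String> readRecords(String text) {
        try {
            return readRecords(new BufferedReader(new StringReader(text)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Tiny made-up input: one 24-field record whose second field spans two lines.
        StringBuilder sb = new StringBuilder("id1::first line\nsecond line");
        for (int i = 0; i < FIELDS - 2; i++) {
            sb.append(SEP).append("nan");
        }
        List<String> records = readRecords(sb.toString());
        System.out.println("records: " + records.size());
        System.out.println("fields in first: " + records.get(0).split(SEP, -1).length);
    }
}
```

In a Hadoop job, the same accumulation loop could sit inside a custom RecordReader's `nextKeyValue()`. This sketch does not deal with records that straddle a split boundary; one simple option, at the cost of parallelism within a file, is a custom InputFormat whose `isSplitable` returns false so that each file is read by a single reader.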

