java - How to process a multiline CSV input file for Hadoop MapReduce?
I have a CSV input data file containing several records. Each record spans a number of lines (1 line, 2 lines, 5 lines, or any number). One thing is certain: each record has 24 fields separated by "::". Every record starts on a new line, but not every new line starts a new record.
The default record reader fails on this input because not every new line is a new record.
**How do I take care of input splits? It is possible that a record of 3 lines has 1 line in one block and the other 2 lines in another block.
How should I distinguish between records before they are provided as input to the map method?**
I believe this has to do with the InputFormat and RecordReader. Any suggestions are appreciated.
Here is some sample data:
review_id::text::business_id::full_address::schools::longitude::average_stars::date::user_id::open::categories::photo_url::city::review_count::name::neighborhoods::url::votes.cool::votes.funny::state::stars::latitude::type::votes.useful
nan::nan::nan::nan::nan::nan::3.5::nan::cxint2yc-tuygwkpekauew::nan::nan::nan::nan::8::jane a.::nan::http://www.yelp.com/user_details?userid=cxint2yc-tuygwkpekauew::2::1::nan::nan::nan::user::5
nan::nan::nan::nan::nan::nan::3.0::nan::ofaugrtkoumweujbod1mfw::nan::nan::nan::nan::4::amy b.::nan::http://www.yelp.com/user_details?userid=ofaugrtkoumweujbod1mfw::1::1::nan::nan::nan::user::6
fu7tcxnaodnbdlcyfhmmzg::pretty great! okay, place not vegan since have bunch of cheese , egg offerings, see offer plenty of vegan alternatives.
i sort of skeptical being here because prices pretty hefty, felt.
anyway, homemade hot sauce amazing. got eggs benedict dinner , j got omelet. both good. love homefries.. next time come here, want onion rings or fries. onion rings looked amazing.
lastly, food came relatively quickly.
not fan of service. tried seat @ edge facing stoves, without asking, asked booth. @ booth, server didn't refill waters didn't feel bad emphasizing on , on whether or not wanted $5-7 desserts. honestly, slice of pie $6.50? veggie galaxy, t r p p n !
but great food! (especially breaky!)::qw5gr8vw7msok4vroswdma::nan::nan::nan::nan::2011-11-12::z_waxc4rupkp3y12bh1beg::nan::nan::nan::nan::nan::nan::nan::nan::0::1::nan::4::nan::review::0
85tbs2rt5f6kqz5l7_jfrw::great place!
i have menu , outdoor seating keep coming back. food -- had breakfast both times friends had lunch items. great selection. we've been @ off-peak times no waiting , better service.
all in all, it's no dz akins it's worth trying!::-tphabjrkegxv4fr1ke4fq::nan::nan::nan::nan::2010-09-19::1izwxafxuhtnzkoupuob5q::nan::nan::nan::nan::nan::nan::nan::nan::0::0::nan::4::nan::review::0
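To make the record-boundary rule concrete, here is a minimal plain-Java sketch of the grouping logic that a custom RecordReader would need, independent of any Hadoop classes. It relies on assumptions that the sample data suggests but the question does not state: the "::" sequence never occurs inside a field value, and the last field of a record never contains a line break, so a record is complete on the line where its 23rd delimiter appears. The class name `MultiLineRecordAssembler` and its methods are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: groups physical lines into logical records by counting "::". */
class MultiLineRecordAssembler {
    static final String DELIMITER = "::";

    /** Counts non-overlapping occurrences of DELIMITER in one line. */
    static int countDelimiters(String line) {
        int count = 0;
        int from = 0;
        while ((from = line.indexOf(DELIMITER, from)) != -1) {
            count++;
            from += DELIMITER.length();
        }
        return count;
    }

    /**
     * Joins consecutive lines until a record holds fieldsPerRecord fields,
     * that is, fieldsPerRecord - 1 delimiters. Assumes the delimiter never
     * appears inside a field and the last field has no embedded line break.
     */
    static List<String> assemble(List<String> lines, int fieldsPerRecord) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean started = false; // true once the current record has at least one line
        int seen = 0;            // delimiters seen so far in the current record
        for (String line : lines) {
            if (started) {
                current.append('\n'); // keep the original line break inside the field
            }
            current.append(line);
            started = true;
            seen += countDelimiters(line);
            if (seen >= fieldsPerRecord - 1) {
                records.add(current.toString());
                current.setLength(0);
                started = false;
                seen = 0;
            }
        }
        // Lines left over here form an incomplete record and are dropped.
        return records;
    }
}
```

For the data above, `MultiLineRecordAssembler.assemble(lines, 24)` would be the call, with the header row coming back as the first record. The sketch does not address input splits. One way to sidestep them, at the cost of parallelism within a single file, is to subclass `FileInputFormat` and override `isSplitable` to return false so that each file is read from start to finish by one reader.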