java - Extracting Table Data using JSoup
I'm trying to extract financial information from a table using jsoup. I've reviewed similar questions and can get their examples to work (here are two:
Using jsoup to extract HTML table contents).
I'm not sure why the code doesn't work on my URL.
Below are 3 different attempts. Any help would be appreciated.
    import java.io.IOException;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    String s = "http://financials.morningstar.com/valuation/price-ratio.html?t=axp&region=usa&culture=en-us";

    // Attempt 1: print the first two cells of every row that has more than 6 cells
    try {
        Document doc = Jsoup.connect("http://financials.morningstar.com/valuation/price-ratio.html?t=axp&region=usa&culture=en_us").get();
        for (Element table : doc.select("table#currentvaluationtable.r_table1.text2")) {
            for (Element row : table.select("tr")) {
                Elements tds = row.select("td");
                if (tds.size() > 6) {
                    System.out.println(tds.get(0).text() + ":" + tds.get(1).text());
                }
            }
        }
    } catch (IOException ex) {
        ex.printStackTrace();
    }

    // Attempt 2: print every cell of every row
    try {
        Document doc = Jsoup.connect(s).get();
        for (Element table : doc.select("table#currentvaluationtable.r_table1.text2")) {
            for (Element row : table.select("tr")) {
                Elements tds = row.select("td");
                for (int i = 0; i < tds.size(); i++) {
                    System.out.println(tds.get(i).text());
                }
            }
        }
    } catch (IOException ex) {
        ex.printStackTrace();
    }

    // Attempt 3: select the rows outside thead, then print each row's cells
    try {
        Document doc = Jsoup.connect(s).get();
        Elements tableElements = doc.select("table#currentvaluationtable.r_table1.text2");
        Elements tableRowElements = tableElements.select(":not(thead) tr");
        for (int i = 0; i < tableRowElements.size(); i++) {
            Element row = tableRowElements.get(i);
            System.out.println("row");
            Elements rowItems = row.select("td");
            for (int j = 0; j < rowItems.size(); j++) {
                System.out.println(rowItems.get(j).text());
            }
        }
    } catch (IOException ex) {
        ex.printStackTrace();
    }
Answer provided by psherno:
Try printing the document that jsoup was able to read from the page (use
System.out.println(doc);
). This tells me that the problem may be related to the fact that the HTML content you are looking for is dynamically added by JavaScript in the browser, which jsoup can't handle since it doesn't have JavaScript support. In that case you should use a more powerful tool such as a web driver (like Selenium).
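To make that diagnosis concrete, here is a minimal sketch of the kind of check the printed document lets you do by eye: does the HTML the server actually returned contain the table and any cells in it? The class name StaticHtmlCheck, the helper countCellsInTable, and both sample HTML strings are made up for illustration and are not taken from the real page. It uses only the standard library and a crude regex scan, so treat it as a rough heuristic rather than a parser. With jsoup on the classpath, checking whether doc.select("table#currentvaluationtable").isEmpty() answers the same question more reliably.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StaticHtmlCheck {

    /**
     * Counts the <td> tags between the opening <table> tag carrying the given
     * id and the next closing </table> tag. Returns -1 if no such <table> tag
     * appears in the HTML at all. Hypothetical helper: a crude regex scan,
     * not a real HTML parser (nested tables are not handled).
     */
    static int countCellsInTable(String html, String tableId) {
        Pattern open = Pattern.compile(
                "<table\\b[^>]*\\bid\\s*=\\s*[\"']" + Pattern.quote(tableId) + "[\"'][^>]*>",
                Pattern.CASE_INSENSITIVE);
        Matcher opening = open.matcher(html);
        if (!opening.find()) {
            return -1; // the table is not in the static HTML
        }
        int start = opening.end();

        // Find where the table ends; fall back to the end of the input.
        Matcher closing = Pattern.compile("</table\\s*>", Pattern.CASE_INSENSITIVE).matcher(html);
        int end = closing.find(start) ? closing.start() : html.length();

        // Count the cell tags inside the table body.
        Matcher cells = Pattern.compile("<td\\b", Pattern.CASE_INSENSITIVE)
                .matcher(html.substring(start, end));
        int count = 0;
        while (cells.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Assumed sample inputs, not the real page content.
        String withTable =
                "<table id='currentvaluationtable'><tr><td>label</td><td>value</td></tr></table>";
        String scriptOnly =
                "<div id='placeholder'></div><script>/* table is built later by JavaScript */</script>";

        System.out.println(countCellsInTable(withTable, "currentvaluationtable"));  // prints 2
        System.out.println(countCellsInTable(scriptOnly, "currentvaluationtable")); // prints -1
    }
}
```

If the check returns -1 (or 0) for the HTML that jsoup downloaded, while the browser shows a populated table, the rows are most likely being added by JavaScript after the page loads, which is the situation the answer describes.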