java - Extracting Table Data using JSoup


I'm trying to extract financial information from a table using jsoup. I've reviewed similar questions and can get their examples to work (here are two:

Using jsoup to extract data

Using jsoup to extract HTML table contents).

I'm not sure why the code doesn't work on my URL.

Below are 3 different attempts. Any help would be appreciated.

    // Imports needed by the three fragments below
    import java.io.IOException;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    String s = "http://financials.morningstar.com/valuation/price-ratio.html?t=axp&region=usa&culture=en-us";

    // Attempt 1: print the first two cells of every row with more than 6 cells
    try {
        Document doc = Jsoup.connect("http://financials.morningstar.com/valuation/price-ratio.html?t=axp&region=usa&culture=en_us").get();

        for (Element table : doc.select("table#currentvaluationtable.r_table1.text2")) {
            for (Element row : table.select("tr")) {
                Elements tds = row.select("td");
                if (tds.size() > 6) {
                    System.out.println(tds.get(0).text() + ":" + tds.get(1).text());
                }
            }
        }
    } catch (IOException ex) {
        ex.printStackTrace();
    }

    // Attempt 2: print every cell of every row
    try {
        Document doc = Jsoup.connect(s).get();

        for (Element table : doc.select("table#currentvaluationtable.r_table1.text2")) {
            for (Element row : table.select("tr")) {
                Elements tds = row.select("td");
                for (int i = 0; i < tds.size(); i++) {
                    System.out.println(tds.get(i).text());
                }
            }
        }
    } catch (IOException ex) {
        ex.printStackTrace();
    }

    // Attempt 3: select the rows outside thead, then print their cells
    try {
        Document doc = Jsoup.connect(s).get();

        Elements tableElements = doc.select("table#currentvaluationtable.r_table1.text2");
        Elements tableRowElements = tableElements.select(":not(thead) tr");

        for (int i = 0; i < tableRowElements.size(); i++) {
            Element row = tableRowElements.get(i);
            System.out.println("row");
            Elements rowItems = row.select("td");
            for (int j = 0; j < rowItems.size(); j++) {
                System.out.println(rowItems.get(j).text());
            }
        }
    } catch (IOException ex) {
        ex.printStackTrace();
    }

Answer provided by psherno:

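Print the document that jsoup was able to read from the page (use System.out.println(doc);) and check whether the table is actually in it. This tells me the problem may be related to the fact that the HTML content you are looking for is added dynamically by JavaScript in the browser, which jsoup can't reproduce since it doesn't have JavaScript support. In that case you should use a more powerful tool, such as a web driver (like Selenium).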
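
The check suggested above can also be done programmatically rather than by reading the printed document. The following is only a minimal sketch using the standard library: the class name StaticHtmlCheck, the helper hasElementWithId, and both HTML strings are made up for illustration and are not the real Morningstar markup. It shows the difference between the markup a server might send (an empty placeholder plus a script) and the markup a browser would hold after running that script.

```java
import java.util.regex.Pattern;

public class StaticHtmlCheck {

    // Returns true if the raw markup contains a quoted id attribute with exactly this value.
    // A regex keeps the sketch dependency-free; it is not a full HTML parser.
    public static boolean hasElementWithId(String html, String id) {
        Pattern p = Pattern.compile("\\bid\\s*=\\s*([\"'])" + Pattern.quote(id) + "\\1");
        return p.matcher(html).find();
    }

    public static void main(String[] args) {
        // Hypothetical markup as a server might send it: a placeholder and a script tag.
        String served = "<html><body><div id=\"placeholder\"></div>"
                + "<script src=\"load-table.js\"></script></body></html>";

        // Hypothetical markup after a browser has executed the script.
        String rendered = "<html><body><div id=\"placeholder\">"
                + "<table id=\"currentvaluationtable\"><tr><td>sample cell</td></tr></table>"
                + "</div></body></html>";

        System.out.println(hasElementWithId(served, "currentvaluationtable"));   // false
        System.out.println(hasElementWithId(rendered, "currentvaluationtable")); // true
    }
}
```

With jsoup itself, the equivalent test is whether doc.select("table#currentvaluationtable") returns an empty Elements collection. If it is empty for the fetched page, no selector will find the table, and a tool that executes JavaScript is needed.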

