xml - Java: reading a w3c Document from a different thread
I'm trying to read an XML document with multiple threads. The documents are large! The program works if I use a single thread, but if I use a worker thread I get errors. With a single thread it takes 30-40 seconds to load. With multiple threads it takes 5 seconds. The advantage is clear, but I can't get it to work. If anyone can shed some light on this I'd love to know.
Here is the code. It is not exactly what I have (I removed a lot of comments, try-catch blocks, and other stuff that doesn't relate to the problem).
import java.net.URL;
import javax.swing.JOptionPane;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.*;

private static DocumentBuilder getBuilder() {
    try {
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        return dbFactory.newDocumentBuilder();
    } catch (ParserConfigurationException ex) {
        String err = "error initializing api interface: \n" + ex.getMessage();
        System.out.println(err);
        JOptionPane.showMessageDialog(null, err, "loading error", JOptionPane.WARNING_MESSAGE);
        System.exit(1);
    }
    return null;
}

public void loadPage(String page) {
    URL url = new URL(page);
    Document doc = getBuilder().parse(url.openStream());
    Element root = doc.getDocumentElement();
    NodeList nodes = root.getChildNodes();
    for (int i = 0; i < nodes.getLength(); i++) {
        Node n = nodes.item(i);
        if (n instanceof Element) {
            // Move the node to a new document.
            Document newDoc = getBuilder().newDocument();
            Node nn = newDoc.adoptNode(n);
            newDoc.appendChild(nn);
            Element ele = (Element) nn;
            new Parser(ele).start(); // this causes errors...
            //new Parser(ele).run(); // this works, but isn't threaded.
            // The adopted node is gone from root, so refresh the list and reset the index.
            nodes = root.getChildNodes();
            i = 0;
        }
    }
}

public class Parser implements Runnable {
    private Element ele;

    public Parser(Element e) {
        ele = e;
    }

    public void start() {
        new Thread(this).start();
    }

    @Override
    public void run() {
        // Parse the document here.
        // The errors happen here if it's multi-threaded.
    }
}
Edit:
The errors are corrupted data. Upon further testing I've discovered the problem is worse than I thought. The function 'loadPage' above is itself called from a thread. The testing I had done at the point I posted this question was with 1 thread running. I have since found that if I run multiple page loads in parallel (even though they are different pages) I get errors even if I'm using only 1 thread per page...
My fear is that there is a static object being used somewhere that is causing state corruption and resulting in the corrupted data.
An example of the corrupted data is this.
// When reading this XML:
// <date>2015-6-1</date>
public String getText(Element ele) {
    String val = ele.getFirstChild().getNodeValue();
    // val is either an empty string or something like "13e13" or ".5d4"
    return val;
}
And by multiple page loads I mean the following:
public void loadAll(List<String> pages) {
    for (String page : pages) {
        new Loader(page).start();
    }
}

public class Loader implements Runnable {
    private String page;

    public Loader(String p) {
        page = p;
    }

    public void start() {
        new Thread(this).start();
    }

    @Override
    public void run() {
        loadPage(page);
    }
}
The DOM is not thread-safe. All access needs to be synchronized, even read-only access. That's because the DOM uses lazy evaluation, so apparent read requests can trigger internal updates. It's pretty horrible. You're better off using XOM or JDOM2 if that's possible.
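If you have to stay with the w3c DOM, here is a minimal sketch of what "synchronize all access" could look like. The class name SynchronizedDomRead and the helpers getText and readAll are made up for illustration and are not from the question's code. The sketch assumes it is acceptable to use the Document itself as the lock, and it uses a small inline XML string instead of a URL.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class SynchronizedDomRead {

    // Hypothetical helper: every read of the shared DOM takes the same lock,
    // so no two threads are ever inside the DOM at the same time.
    static String getText(Object lock, Element ele) {
        synchronized (lock) {
            return ele.getTextContent();
        }
    }

    // Parses the XML on the calling thread, then reads each child element's
    // text from worker threads, always under the lock.
    static List<String> readAll(String xml, int threads) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        // Collect the child elements while still single-threaded.
        List<Element> elements = new ArrayList<>();
        NodeList nodes = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node n = nodes.item(i);
            if (n instanceof Element) {
                elements.add((Element) n);
            }
        }

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (Element e : elements) {
                // The Document is the assumed lock object for all DOM access.
                Callable<String> task = () -> getText(doc, e);
                futures.add(pool.submit(task));
            }
            // Collect results in document order.
            List<String> out = new ArrayList<>();
            for (Future<String> f : futures) {
                out.add(f.get());
            }
            return out;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><date>2015-6-1</date><date>2015-6-2</date></root>";
        System.out.println(readAll(xml, 4));
    }
}
```

Note that the lock serializes the DOM reads themselves, so this only pays off if the workers do real work on the extracted values outside the synchronized block. If most of the time is spent walking the DOM, the lock removes the speedup.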