xml - Java: read a w3c.Document from a different thread


I'm trying to read an XML document from multiple threads. The documents are large! The program works if I use a single thread, but if I use a worker thread I get errors. A single thread takes 30-40 seconds to load. Multiple threads take 5 seconds. The advantage is clear, but I can't get it to work. If anyone can shed light on this I'd love to know.

Here is the code. It is not all I have (I removed a lot of comments, try-catch blocks, and stuff that doesn't relate to the problem).

        import org.w3c.dom.*;
        import java.net.URL;
        import javax.swing.JOptionPane;
        import javax.xml.parsers.*;

        private static DocumentBuilder getBuilder() {
          try {
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            return dbFactory.newDocumentBuilder();
          } catch (ParserConfigurationException ex) {
            String err = "error initializing api interface: \n" + ex.getMessage();
            System.out.println(err);
            JOptionPane.showMessageDialog(null, err, "loading error", JOptionPane.WARNING_MESSAGE);
            System.exit(1);
          }
          return null;
        }

        public void loadPage(String page) {
          URL url = new URL(page);
          Document doc = getBuilder().parse(url.openStream());
          Element root = doc.getDocumentElement();
          NodeList nodes = root.getChildNodes();
          for (int i = 0; i < nodes.getLength(); i++) {
            Node n = nodes.item(i);
            if (n instanceof Element) {
              // Move the node to a new document.
              Document newDoc = getBuilder().newDocument();
              Node nn = newDoc.adoptNode(n);
              newDoc.appendChild(nn);
              Element ele = (Element) nn;
              new Parser(ele).start(); // causes errors...
              //new Parser(ele).run(); // works, but isn't threaded.
              // Adopting the node removed it from root, so re-read the list and restart.
              nodes = root.getChildNodes();
              i = 0;
            }
          }
        }

        public class Parser implements Runnable {
          private Element ele;

          public Parser(Element e) {
            ele = e;
          }

          public void start() {
            new Thread(this).start();
          }

          @Override
          public void run() {
            // Parse the document here.
            // Errors happen here if it's multi-threaded.
          }
        }

Edit:

The errors are corrupted data. Upon further testing I've discovered the problem is worse than I thought. The function 'loadPage' above is called from a thread. The testing at the point I posted the question was done with only 1 thread running. I have since found that if I run multiple page loads in parallel (even though they are different pages) I get errors even if I'm using only 1 thread per page...

My fear is that there is a static object being used somewhere that is causing state corruption and resulting in corrupted data.

An example of the corrupted data is this.

     // When reading XML like
     // <date>2015-6-1</date>
      public String getText(Element ele) {
        String val = ele.getFirstChild().getNodeValue();
        return val;
      }
     // val is either an empty string or something like "13e13" or ".5d4"

And by multiple page loads I mean the following:

        public void loadAll(List<String> pages) {
          for (String page : pages) {
            new Loader(page).start();
          }
        }

        public class Loader implements Runnable {
          private String page;

          public Loader(String p) {
            page = p;
          }

          public void start() {
            new Thread(this).start();
          }

          @Override
          public void run() {
            loadPage(page);
          }
        }

The DOM is not thread-safe. All access needs to be synchronized, even read-only access. That's because the DOM does lazy evaluation, so apparent read requests can trigger internal updates. It's pretty horrible. You're better off using XOM or JDOM2 if that's possible.
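If you have to stay with the w3c DOM, one way to apply the advice above is to route every touch of the tree, reads included, through a single lock and hand the worker threads plain Strings. The following is only a minimal sketch under assumptions: the class name `SynchronizedDomAccess`, the `DOM_LOCK` object, and the `parse` and `readAll` helpers are made up for illustration and are not part of the question's code.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class SynchronizedDomAccess {
    // Hypothetical single lock guarding every access to the shared DOM tree.
    private static final Object DOM_LOCK = new Object();

    // Parse an XML string into a DOM document (done on one thread).
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception ex) {
            throw new IllegalStateException("Could not parse XML", ex);
        }
    }

    // Read an element's text while holding the lock; the caller gets a plain String.
    public static String getText(Element ele) {
        synchronized (DOM_LOCK) {
            Node first = ele.getFirstChild();
            return first == null ? "" : first.getNodeValue();
        }
    }

    // Read the text of every <tag> element using a pool of worker threads.
    public static List<String> readAll(Document doc, String tag, int threads) {
        List<Element> elements = new ArrayList<>();
        synchronized (DOM_LOCK) {
            // Even walking the NodeList counts as DOM access, so it is locked too.
            NodeList nodes = doc.getElementsByTagName(tag);
            for (int i = 0; i < nodes.getLength(); i++) {
                elements.add((Element) nodes.item(i));
            }
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (Element ele : elements) {
                Callable<String> task = () -> getText(ele);
                futures.add(pool.submit(task));
            }
            List<String> result = new ArrayList<>();
            for (Future<String> f : futures) {
                result.add(f.get()); // results come back in submission order
            }
            return result;
        } catch (Exception ex) {
            throw new IllegalStateException("Worker failed", ex);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<dates><date>2015-6-1</date><date>2015-6-2</date></dates>");
        System.out.println(readAll(doc, "date", 4));
    }
}
```

Note that the lock serializes the DOM reads themselves, so any speed-up has to come from work the threads do outside the lock, on the extracted Strings. If most of the time is spent walking the tree, this probably gains little over the single-threaded version.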

