java - Multi pattern string split
I have a text consisting of varying regex delimiters, each followed by some text. In this example, I have 3 regex delimiters (patterna, b, c), and the text looks like this:
|..stringmatchinga..|..text1..|..stringmatchingb..|..text2..|..stringmatchinga..|..text3..|..stringmatchingc..|..text4..|
I am looking for an efficient Java solution to extract this information as a list of triplets:
{patterna, stringmatchinga, text1}
{patternb, stringmatchingb, text2}
{patterna, stringmatchinga, text3}
{patternc, stringmatchingc, text4}
With this information, I know for each triplet which pattern was matched and which string matched it.
For the moment, I have the approach below, but I guess there is a far more efficient way using advanced regex features?
String pattern = "(?=(patterna|patternb|patternc))";
String[] tokens = input.split(pattern);
for (String token : tokens) {
    // if the start of token matches patterna ...
    // else if the start of token matches patternb ...
    // etc.
}
Remarks:
- The patterns are mutually exclusive.
- The string starts with at least one pattern.
You can use a loop, and inside its code block you can "eat" whatever is found at the beginning of the text. That way, the parsing at every iteration stays quite simple and is maintainable/expandable.
The simple rule is: eat what is found and process it.
Something like:
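String chunk;
while (text.length() > 0) {
    // eat(...) is assumed to remove and return whatever the pattern matches
    // at the start of text, or to return an empty string if nothing matches there
    chunk = eat(text, pattern1);
    if (chunk.length() > 0) {
        ...
        continue;
    }
    chunk = eat(text, pattern2);
    if (chunk.length() > 0) {
        ...
        continue;
    }
}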
For performance reasons, you should compile the regex patterns before entering the loop.
(Also consider using a parser generator such as ANTLR.)
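The "eat and process" loop could be sketched in runnable Java roughly as follows. This is only one possible reading of the idea, not the answerer's actual code: the class name MultiPatternSplitter, the Triplet record, and the three delimiter regexes (A\d+, B\d+, C\d+) are placeholders invented for illustration, and the record syntax assumes Java 16 or later. Instead of a mutating eat(...) helper, it uses Matcher.region plus Matcher.lookingAt to test whether a precompiled pattern matches exactly at the current position.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MultiPatternSplitter {

    /** Which pattern matched, the string that matched it, and the text that followed. */
    record Triplet(String patternName, String delimiter, String text) {}

    // Placeholder delimiter regexes (assumptions), compiled once before any parsing.
    private static final Map<String, Pattern> PATTERNS = new LinkedHashMap<>();
    static {
        PATTERNS.put("patterna", Pattern.compile("A\\d+"));
        PATTERNS.put("patternb", Pattern.compile("B\\d+"));
        PATTERNS.put("patternc", Pattern.compile("C\\d+"));
    }

    static List<Triplet> parse(String input) {
        // One reusable Matcher per pattern, so nothing is recompiled inside the loop.
        Map<String, Matcher> matchers = new LinkedHashMap<>();
        PATTERNS.forEach((name, p) -> matchers.put(name, p.matcher(input)));

        List<Triplet> result = new ArrayList<>();
        String currentName = null;      // pattern of the triplet being built
        String currentDelimiter = null; // string that matched that pattern
        int textStart = 0;              // where the text after that delimiter begins
        int pos = 0;

        while (pos < input.length()) {
            boolean ate = false;
            for (Map.Entry<String, Matcher> e : matchers.entrySet()) {
                Matcher m = e.getValue().region(pos, input.length());
                // lookingAt() succeeds only if the match starts exactly at pos;
                // empty matches are skipped so the loop always makes progress.
                if (m.lookingAt() && m.end() > pos) {
                    if (currentName != null) {
                        // A new delimiter closes the previous triplet.
                        result.add(new Triplet(currentName, currentDelimiter,
                                input.substring(textStart, pos)));
                    }
                    currentName = e.getKey();
                    currentDelimiter = m.group();
                    pos = m.end();
                    textStart = pos;
                    ate = true;
                    break;
                }
            }
            if (!ate) {
                pos++; // no delimiter starts here: this character belongs to the text
            }
        }
        // Close the last triplet. Text before the first delimiter is ignored,
        // since the input is stated to start with a delimiter.
        if (currentName != null) {
            result.add(new Triplet(currentName, currentDelimiter, input.substring(textStart)));
        }
        return result;
    }

    public static void main(String[] args) {
        for (Triplet t : parse("A1fooB22barA3bazC9qux")) {
            System.out.println("{" + t.patternName() + ", " + t.delimiter() + ", " + t.text() + "}");
        }
    }
}
```

With the placeholder patterns above, the sample input yields {patterna, A1, foo}, {patternb, B22, bar}, {patterna, A3, baz}, {patternc, C9, qux}. Trying every pattern at every text position is simple but not the fastest option. If that cost matters, a single combined alternation with one capturing group per pattern, scanned with Matcher.find(), is a common alternative.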