regex - Python: Using re module to find string, then print values under string
I am attempting to use the re module to search for a string in a large file. The file I am searching has the following format:
    220
    box 1, step 1
    c 15.1760586379 13.7666285127 4.1579861659
    f 13.7752750995 13.3845518556 4.1992254467
    f 15.1122807811 15.0753387163 3.8457966464
    h 15.5298304628 13.5873563855 5.1615910859
    h 15.6594416869 13.1246597008 3.3754112615
    5
    box 2, step 1
    c 15.1760586379 13.7666285127 4.1579861659
    f 13.7752750995 13.3845518556 4.1992254467
    f 15.1122807811 15.0753387163 3.8457966464
    h 15.5298304628 13.5873563855 5.1615910859
    h 15.6594416869 13.1246597008 3.3754112615
    240
    box 1, step 2
    c 12.6851133069 2.8636250164 1.1788963097
    f 11.7935769268 1.7912366066 1.3042188034
    f 13.7887138736 2.3739304018 0.4126088380
    h 12.1153838312 3.7024696077 0.7164304431
    h 13.0962656950 3.1549047758 2.1436863477
    c 12.6745394723 3.6338848332 15.1374252921
    f 11.8703828307 4.3473226569 16.0480492173
    f 12.2304604843 2.3709059503 14.9433964493
    h 12.6002811971 4.1968554204 14.1449118786
    h 13.7469256153 3.6086212350 15.5204655285
This format continues for box 1 and box 2 for ~30000 steps in total, for each box. I have code that uses the re module and searches the file based on the keyword "step". Unfortunately, it does not yield any results when I run it. I need the code to 1) search only box 1, 2) print/output the coordinates (preferably omitting the "c"s, "f"s and "h"s; just the coordinates) beginning after step 1 of the file, and 3) increment the "step" number by 48 and repeat 2). I also want to ignore lines such as "5" and "240" in the file I am searching; the code should make sure they are not included in the output. This is what I have so far (it does not work):
    import re
    shakes = open("mc_coordinates", "r")
    i = 1
    for line in shakes:
        if re.match("(.*)step i(.*)", line):
            print line
            i+=48
This is an example of what the code should do:
    step 1
    15.1760586379 13.7666285127 4.1579861659
    13.7752750995 13.3845518556 4.1992254467
    15.1122807811 15.0753387163 3.8457966464
    15.5298304628 13.5873563855 5.1615910859
    15.6594416869 13.1246597008 3.3754112615
    step 49
    12.6851133069 2.8636250164 1.1788963097
    11.7935769268 1.7912366066 1.3042188034
    13.7887138736 2.3739304018 0.4126088380
    12.1153838312 3.7024696077 0.7164304431
    13.0962656950 3.1549047758 2.1436863477
    12.6745394723 3.6338848332 15.1374252921
    11.8703828307 4.3473226569 16.0480492173
    12.2304604843 2.3709059503 14.9433964493
    12.6002811971 4.1968554204 14.1449118786
    13.7469256153 3.6086212350 15.5204655285
    step 97
    15.1760586379 13.7666285127 4.1579861659
    13.7752750995 13.3845518556 4.1992254467
    15.1122807811 15.0753387163 3.8457966464
    15.5298304628 13.5873563855 5.1615910859
    15.6594416869 13.1246597008 3.3754112615
It should be noted that this is a condensed version; typically there are ~250 lines of coordinates between "step" numbers. Any ideas or thoughts are appreciated. Thanks!!
A quick, although maybe not efficient, way is to parse the file line by line and add states.
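    # Untested code, but you get the idea.
    import re

    shakes = open("mc_coordinates", "r")
    i = 1
    output = False  # Are we in a block that should be output?
    for line in shakes:
        # The pattern must contain the value of i, not the literal letter "i".
        if re.match(r"\s*box 1, step %d\b" % i, line):  # the box 1 step we want
            print(line.rstrip())
            output = True
            i += 48
        elif re.match(r"\s*box \d+, step \d+", line):  # any other box or step
            output = False
        elif output and not line.strip().isdigit():  # skip bare lines such as "5" or "240"
            print(line.rstrip())  # or drop the first few chars to get rid of the c, f or h
    shakes.close()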
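The same state idea can also be wrapped in a small generator that strips the element labels for you. This is only a sketch under some assumptions: the header lines look exactly like "box N, step M" as in the sample, every coordinate line has one label followed by three numbers, and the names `extract_steps` and `HEADER` are made up for illustration.

```python
import re

# Assumed header layout, taken from the sample in the question: "box N, step M".
HEADER = re.compile(r"\s*box\s+(\d+),\s*step\s+(\d+)")


def extract_steps(lines, box=1, first_step=1, stride=48):
    """Yield (step, rows) for `box` at steps first_step, first_step + stride, ...

    `rows` holds the coordinate lines with the leading c/f/h label removed.
    """
    current = None  # step number of the block being collected, or None
    rows = []
    for line in lines:
        header = HEADER.match(line)
        if header:
            # A new header closes whatever block was being collected.
            if current is not None:
                yield current, rows
            current, rows = None, []
            b, s = int(header.group(1)), int(header.group(2))
            if b == box and s >= first_step and (s - first_step) % stride == 0:
                current = s
        elif current is not None:
            fields = line.split()
            # Keep only "label x y z" lines; bare lines such as "5" or "240"
            # have a single field and are dropped here.
            if len(fields) == 4:
                rows.append(" ".join(fields[1:]))
    if current is not None:
        yield current, rows


# Small in-memory demo; with the real data, pass the open file object instead,
# e.g. extract_steps(open("mc_coordinates")).
demo = [
    "220",
    "box 1, step 1",
    "c 15.1760586379 13.7666285127 4.1579861659",
    "5",
    "box 2, step 1",
    "c 15.1760586379 13.7666285127 4.1579861659",
]
for step, rows in extract_steps(demo):
    print("step %d" % step)
    for row in rows:
        print(row)
```

Checking the step number arithmetically, instead of keeping a counter that is bumped by 48, means a missing step in the file does not stall the search, and reading line by line keeps memory use flat even with ~30000 steps per box.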