bash - grep one-liner - extract two different lines from the same file


I have a file containing many lines like the following.

  == domain 1  score: 280.5 bits;  conditional e-value: 2.1e-87
                  tseeettctttgsg---bttssb-hhhhhhhhhhhhhhhhhhsss---b-hhhhhhhstttstgcgbb-hhhhhhhhhhhtebebttts---sscsesecttgcgscebeeseeeeeessbhhhhhhhhhhhsseeeeeectshhhhteesseesctscetss-eeeeeeeeeeeetteeeeeee-sbtttstbtteeeeesssssgggttsseeee cs
   pf00112.18   2 pesvdwrekkgavtpvkdqgscgscwafsavgalegrlaiktkkklvslseqelvdcskeenegcngglmenafeyikknggivtekdypykakekgkckkkkkkekvakikgygkvkenseealkkalakngpvsvaidaseedfqlyksgvyketecsktelnhavlivgygvengkkywivknswgtdwgekgyiriargknnecgieseavyp 218
                  p+svd+r+k+ +vtpvk+qg+cgscwafs+vgaleg+l+ kt +kl++ls q+lvdc + en+gc gg+m+naf+y++kn+gi++e+ ypy ++e ++c ++ + +  ak++gy++++e +e+alk+a+a++gpvsvaidas ++fq+y++gvy++++c++++lnhavl+vgyg ++g+k wi+knswg++wg+kgyi +ar+knn cgi++ a++p
       1au0:a   2 pdsvdyrkkg-yvtpvknqgqcgscwafssvgalegqlkkkt-gkllnlspqnlvdcvs-endgcgggymtnafqyvqknrgidsedaypyvgqe-escmynptgka-akcrgyreipegnekalkravarvgpvsvaidasltsfqfyskgvyydescnsdnlnhavlavgygiqkgnkhwiiknswgenwgnkgyilmarnknnacgianlasfp 213

I want to extract the line that starts with pf, and the associated line after it that starts with a digit.

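In this case, the line that starts with pf begins with 'pf00112.18', and the line that starts with a digit begins with '1au0:a'. These IDs change for the next domain, but the pf prefix is constant, and the associated ID always starts with a digit.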

Here is what I've tried with grep; I suspect there is a mistake in my one-liner. Any help is appreciated.

grep '^  pf \|      \d' infile.txt 

expected output:

pf00112.18   2 pesvdwrekkgavtpvkdqgscgscwafsavgalegrlaiktkkklvslseqelvdcskeenegcngglmenafeyikknggivtekdypykakekgkckkkkkkekvakikgygkvkenseealkkalakngpvsvaidaseedfqlyksgvyketecsktelnhavlivgygvengkkywivknswgtdwgekgyiriargknnecgieseavyp 218
1au0:a       2 pdsvdyrkkg-yvtpvknqgqcgscwafssvgalegqlkkkt-gkllnlspqnlvdcvs-endgcgggymtnafqyvqknrgidsedaypyvgqe-escmynptgka-akcrgyreipegnekalkravarvgpvsvaidasltsfqfyskgvyydescnsdnlnhavlavgygiqkgnkhwiiknswgenwgnkgyilmarnknnacgianlasfp 213

You can use the following grep expression:

grep '^[[:space:]]\+pf\|^[[:space:]]\+[[:digit:]]' input.txt 

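The first pattern, ^[[:space:]]\+pf, matches a line that has one or more whitespace characters at the start, followed by the term pf. The second pattern matches one or more whitespace characters at the start of the line, followed by a digit. Note that \+ and \| inside a basic regex are GNU grep extensions.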

This can be simplified to:

grep '^[[:space:]]\+\(pf\|[[:digit:]]\)' input.txt 

since both patterns begin with one or more whitespace characters at the start of the line.
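
As a quick sanity check, the two forms above can be run against the same input and compared. This is only a sketch: the sample below is a hypothetical, shortened version of the data in the question (the sequence lines are truncated), the variable names are made up for illustration, and it assumes GNU grep, since \+ and \| are extensions in basic regexes.

```shell
# Hypothetical, shortened sample; the real file has much longer sequence lines.
sample=$(mktemp)
cat > "$sample" <<'EOF'
  == domain 1  score: 280.5 bits;  conditional e-value: 2.1e-87
                  tseeettctttgsg cs
   pf00112.18   2 pesvdwrekkgavtp 218
                  p+svd+r+k+ +vtp
       1au0:a   2 pdsvdyrkkg-yvtp 213
EOF

# Form 1: two alternatives, each with its own anchor and leading whitespace.
two_patterns=$(grep '^[[:space:]]\+pf\|^[[:space:]]\+[[:digit:]]' "$sample")

# Form 2: shared prefix factored out, alternatives grouped with \( \).
grouped=$(grep '^[[:space:]]\+\(pf\|[[:digit:]]\)' "$sample")

# Both forms should select the same lines.
[ "$two_patterns" = "$grouped" ] && echo "same output"
printf '%s\n' "$grouped"

rm -f "$sample"
```

On this sample, both forms should select only the pf line and the line starting with a digit; the header, the cs line, and the match line are skipped because their first non-blank characters are neither pf nor a digit.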

Let me suggest using egrep (equivalently, grep -E) instead of plain grep, because extended POSIX regexes save you from escaping:

egrep '^[[:space:]]+(pf|[[:digit:]])' input.txt 
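
The extended-regex form can be tried without a file by feeding a here-document to grep. A minimal sketch, assuming a hypothetical shortened sample (truncated sequence lines) and using grep -E, which is the POSIX spelling of egrep; the variable name result is just for illustration.

```shell
# Feed a shortened, hypothetical sample straight into grep -E via a here-document.
result=$(grep -E '^[[:space:]]+(pf|[[:digit:]])' <<'EOF'
  == domain 1  score: 280.5 bits;  conditional e-value: 2.1e-87
                  tseeettctttgsg cs
   pf00112.18   2 pesvdwrekkgavtp 218
                  p+svd+r+k+ +vtp
       1au0:a   2 pdsvdyrkkg-yvtp 213
EOF
)

# Print the selected lines (leading whitespace is kept as in the input).
printf '%s\n' "$result"
```

Note that grep prints the matching lines unchanged, so the leading spaces remain; if they are unwanted, they would need to be stripped in a separate step.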
