python - Getting text without tags using BeautifulSoup?


I have been using BeautifulSoup to parse an HTML document and seem to have run into a problem. I found some text that I need to extract, but the text is plain: there are no tags or anything around it. I am not sure if I need to use a regex instead in order to do this, because I do not know if I can even grab the text with BeautifulSoup, considering that it is not contained in any tag.

<strike style="color: #777777">975</strike> 487 rp<div class="gs-container default-2-col"> 

I am trying to extract "487".

Thanks!

You can use the previous or next tag as an anchor to find the text. For example, find the <strike> element first, and then get the text node next to it:

from bs4 import BeautifulSoup

html = """<strike style="color: #777777">975</strike> 487 rp<div class="gs-container default-2-col">"""
# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(html, "html.parser")

# Find the <strike> element first, then get the text node right after it
strike = soup.find('strike', {'style': 'color: #777777'})
result = strike.find_next_sibling(string=True)

print(repr(str(result)))
# output: ' 487 rp'

# You can then do simple text manipulation or use a regex to clean up the result

Note that the code above is only for the sake of the demo; it does not accomplish the entire task.
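
The remaining cleanup step could look something like the sketch below. It assumes the text node is the string ' 487 rp' from the example, and that the value you want is the first run of digits in it. The helper name extract_number is made up for illustration and is not part of BeautifulSoup.

```python
import re


def extract_number(text):
    """Return the first run of digits in `text` as an int, or None if there is none."""
    match = re.search(r"\d+", text)
    return int(match.group()) if match else None


# Assumed input: the text node found next to the <strike> element
raw = " 487 rp"
print(extract_number(raw))
```

If the real page can contain values with thousands separators or decimals, the pattern would need to be adjusted; for a plain integer like the one in the question, raw.split()[0] would work as well.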

