python - Can't solve: TypeError: expected string or buffer
I'm trying to write code that scrapes numbers from HTML by finding the span tags and the numbers within them.
I keep getting the error "expected string or buffer".
I read a solution while searching through a different question, but when I try ''.join(some_list) I get this error:
"sequence item 0: expected string, Tag found"
I tried searching for that one too and saw solutions using .get instead of re.findall, but the error keeps appearing.
The code:
    import urllib
    from BeautifulSoup import *

    url = raw_input('Enter URL:')
    stri = urllib.urlopen(url).read()
    soup = BeautifulSoup(stri)

    # Retrieve all of the span tags
    spans = ''.join(soup('span'))
    numlist = list()
    for tag in spans:
        num = int(re.findall('[0-9]+', tag))
        numlist.append(num)
    print(numlist)

I saw several solutions for this type of error, but I can't seem to solve it.
What am I missing?
I added tag.text, and the error has changed to a different one. Now I'm getting: "[Errno 11004] getaddrinfo failed"
I looked at different posts but couldn't solve it, so I ran the code line by line to see where the problem is, and found that it appears when I run the fourth line of the original code:

    html = urllib.urlopen(url).read()

Please help?
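On the second error: "getaddrinfo failed" is raised while the host name in the URL is being looked up, before any HTML is downloaded, so it usually points to the URL that was typed in (a typo, stray whitespace) or to the network or proxy setup rather than to the parsing code. Below is a minimal sketch for checking that, assuming only the standard library. The helper names hostname_of and can_resolve and the example.com URL are made up for illustration and are not part of the original code.

```python
import socket

try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse      # Python 2, as used in the question


def hostname_of(url):
    """Return the host part of a URL, or None if the URL has none."""
    return urlparse(url.strip()).hostname


def can_resolve(hostname):
    """Return True if the operating system can resolve the host name."""
    try:
        socket.getaddrinfo(hostname, 80)
        return True
    except socket.gaierror:
        # socket.gaierror is the exception reported as "getaddrinfo failed"
        return False


url = "http://www.example.com/page.html"  # placeholder URL (assumption)
host = hostname_of(url)
if host is None:
    print("No host name found in the URL")
else:
    print(host)
```

Calling can_resolve(host) before urlopen shows whether the lookup itself is what fails. If it returns False for a URL that opens fine in a browser, the cause is more likely the network or proxy configuration than the script.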
tag is a Tag object, which contains lots of information; it is not a string. If you want the text inside the tag without the markup, use tag.text, e.g.:

    spans = ''.join(tag.text for tag in soup('span'))
    # `for tag in spans:` makes no sense after this because spans is a string

or

    spans = soup('span')
    for tag in spans:
        num = len(re.findall('[0-9]+', tag.text))  # note len, not int
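Note that len only counts the matches. If the goal is to collect the numbers themselves, as the question suggests, one option is to convert each match separately, because re.findall returns a list of strings and int() cannot take a list. A minimal sketch, assuming the span texts have already been pulled out with tag.text; the sample strings and the extract_numbers name are made up for illustration:

```python
import re


def extract_numbers(texts):
    """Collect every run of digits found in the given strings as integers."""
    numbers = []
    for text in texts:
        # re.findall returns a list of strings such as ['7', '13'],
        # so each match is converted on its own instead of calling int() on the list
        for match in re.findall('[0-9]+', text):
            numbers.append(int(match))
    return numbers


# Hypothetical stand-ins for the strings that tag.text would return
span_texts = ["Price: 42", "no digits here", "7 apples and 13 pears"]

numlist = extract_numbers(span_texts)
print(numlist)       # [42, 7, 13]
print(sum(numlist))  # 62
```

With the real page, the list passed in would be something like [tag.text for tag in soup('span')].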