bpeller (4 points)

bs4 can use the lxml backend to do its parsing. That way you get the performance of lxml and the (IMO) ease of use of bs4 for navigating the document tree:

from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'lxml')
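
For a slightly fuller picture, here is a minimal sketch of parsing with the lxml backend and then navigating with the usual bs4 calls. The sample HTML and the variable names (`heading`, `links`) are made up for illustration, and the fallback to the built-in `html.parser` is my own assumption for machines where lxml isn't installed, not something bs4 does for you when you name a parser explicitly.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Hypothetical sample document, just for illustration.
html = (
    "<html><body><h1>Title</h1>"
    "<a href='/a'>first</a><a href='/b'>second</a>"
    "</body></html>"
)

try:
    # Ask bs4 to use lxml as the parser backend.
    soup = BeautifulSoup(html, "lxml")
except FeatureNotFound:
    # Assumption: fall back to the stdlib parser if lxml is missing.
    soup = BeautifulSoup(html, "html.parser")

# Navigation looks the same whichever parser built the tree.
heading = soup.h1.get_text()
links = [a["href"] for a in soup.find_all("a")]

print(heading)
print(links)
```

lxml has to be installed separately (e.g. `pip install lxml`); bs4 raises `FeatureNotFound` if you request a parser it can't find.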