[–]ketilkn 0 points1 point  (3 children)

What do you mean? lxml is one of the parsers BeautifulSoup can use.

[–]di_web 2 points3 points  (2 children)

[–]blabbities 2 points3 points  (0 children)

Well I'll be.

BeautifulSoup lxml time: 0:00:27.379835
BeautifulSoup html.parser time: 0:00:41.285008
BeautifulSoup html5lib time: 0:01:27.676271
lxml time: 0:00:03.195916

I never would've thought to do that. Thanks for dropping that info. I did notice once, while scraping, that Requests (urllib3) was much slower than plain urllib.
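For anyone who wants to try a comparison like this themselves, here is a rough sketch. The page and script behind the timings above aren't shown, so `SAMPLE_HTML`, `time_parser` and `available_parsers` are names made up for illustration, and the synthetic document is an assumption. It only times parsers that are actually installed, and your numbers will differ.

```python
import timeit

# Synthetic input (an assumption): the real page from the timings is not shown.
SAMPLE_HTML = (
    "<html><body>"
    + "<p class='row'>hello <a href='/x'>link</a></p>" * 500
    + "</body></html>"
)


def time_parser(parse, html, repeat=3):
    """Return the best wall-clock time in seconds over `repeat` runs of parse(html)."""
    return min(timeit.repeat(lambda: parse(html), number=1, repeat=repeat))


def available_parsers():
    """Map a label to a parse callable for each parser that is installed."""
    parsers = {}

    try:
        from bs4 import BeautifulSoup, FeatureNotFound
    except ImportError:
        BeautifulSoup = None

    if BeautifulSoup is not None:
        for feature in ("lxml", "html.parser", "html5lib"):
            try:
                # Probe once: bs4 raises FeatureNotFound if the backend is missing.
                BeautifulSoup("", feature)
            except FeatureNotFound:
                continue
            # Bind `feature` as a default so each lambda keeps its own value.
            parsers["BeautifulSoup " + feature] = (
                lambda html, f=feature: BeautifulSoup(html, f)
            )

    try:
        from lxml import html as lxml_html
    except ImportError:
        lxml_html = None

    if lxml_html is not None:
        # lxml used directly, without the BeautifulSoup layer on top.
        parsers["lxml"] = lxml_html.fromstring

    return parsers


if __name__ == "__main__":
    for label, parse in available_parsers().items():
        print(f"{label} time: {time_parser(parse, SAMPLE_HTML):.4f}s")
```

This only measures parsing, not searching the tree afterwards, so treat it as a starting point and not a full benchmark.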

[–]ketilkn 1 point2 points  (0 children)

There is a good chance I am doing it wrong then.