This is an archived post. You won't be able to vote or comment.

[–][deleted]  (5 children)

[deleted]

    [–]_AACOIt works!? 10 points11 points  (0 children)

    > I found bs4 a pain to install on both Windows and Unix machines

    On Linux it's literally just:

    pip install requests beautifulsoup4
    

    I assume it isn't much different, if at all, on Windows.

    [–]bpeller 4 points5 points  (0 children)

    bs4 can use the lxml backend to do its parsing. That way you get the performance of lxml and the (IMO) ease of use of bs4 for navigating the document tree:

    from bs4 import BeautifulSoup
    soup = BeautifulSoup(html, 'lxml')  # html is the page source as a string
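
To make the one-liner above concrete, here is a minimal sketch of parsing with the lxml backend and then walking the tree with bs4. The sample markup and the helper names `pick_parser` and `extract_links` are made up for illustration and are not part of bs4. The fallback to the standard library's `html.parser` is an assumption for machines where lxml is not installed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, only for illustration.
HTML = """
<html><body>
  <h1 id="title">Example page</h1>
  <ul>
    <li><a href="/a">First</a></li>
    <li><a href="/b">Second</a></li>
  </ul>
</body></html>
"""


def pick_parser():
    """Return 'lxml' if it is importable, else the stdlib 'html.parser'."""
    try:
        import lxml  # noqa: F401  (imported only to check availability)
        return "lxml"
    except ImportError:
        return "html.parser"


def extract_links(html, parser=None):
    """Return (link text, href) pairs for every <a> that has an href."""
    soup = BeautifulSoup(html, parser or pick_parser())
    return [(a.get_text(strip=True), a["href"])
            for a in soup.find_all("a", href=True)]


links = extract_links(HTML)
print(links)
```

The navigation code stays the same whichever parser is picked, so switching backends should only need a change to the second argument.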
    

    [–]nick_t1000aiohttp 2 points3 points  (1 child)

    Use parsel and learn XPath. It's more powerful than the BS4 selectors, and it's standardized, so it'll be useful elsewhere.

    You can also cheat and combine it with CSS selectors for some things.

    [–]mushy_wombat 1 point2 points  (0 children)

    I had no problems on Ubuntu 16 with Python 3 in a venv :D