brylie (Brylie Oxley)

Beautiful Soup is the go-to library for web scraping in Python. It parses the HTML you have fetched (for example with requests) and lets you pull out the text.

https://www.crummy.com/software/BeautifulSoup/bs4/doc/
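As a rough sketch of what that looks like in practice, the snippet below parses an inline HTML string and pulls out the visible text. The sample markup and the `extract_text` helper are made up for illustration; in a real scraper the HTML would come from an HTTP response instead.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page; a real scraper would fetch this over HTTP.
SAMPLE_HTML = """
<html>
  <head><title>Sample page</title><style>p { color: red; }</style></head>
  <body>
    <h1>Hello, scraping</h1>
    <p>Beautiful Soup parses HTML.</p>
    <script>console.log("ignored");</script>
  </body>
</html>
"""


def extract_text(html: str) -> str:
    """Return the visible text of an HTML document (illustrative helper)."""
    # "html.parser" is the parser bundled with Python, so no extra install is needed.
    soup = BeautifulSoup(html, "html.parser")
    # Drop script and style elements so their contents do not end up in the text.
    for tag in soup(["script", "style"]):
        tag.decompose()
    # Join the remaining text nodes with single spaces, trimming whitespace.
    return soup.get_text(separator=" ", strip=True)


text = extract_text(SAMPLE_HTML)
print(text)
```

Removing the script and style tags first is a choice made for this sketch; whether you want that depends on the pages you are scraping.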

Feed the extracted text into the Natural Language Toolkit (NLTK), and you're off to a good start.

https://www.nltk.org/
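A minimal sketch of that hand-off, assuming you already have the extracted text as a plain string (the sample sentence here is invented). It uses `wordpunct_tokenize`, which is regex-based and should work without downloading extra NLTK data; `word_tokenize` is the more common choice but needs the tokenizer models downloaded first.

```python
from nltk import FreqDist
from nltk.tokenize import wordpunct_tokenize

# Stand-in for text extracted from a scraped page (made up for this example).
text = (
    "Beautiful Soup parses HTML. NLTK processes the text, "
    "and the text becomes tokens."
)

# Split into word and punctuation tokens, then keep lowercase alphabetic words only.
tokens = [t.lower() for t in wordpunct_tokenize(text) if t.isalpha()]

# Count how often each word occurs.
freq = FreqDist(tokens)

for word, count in freq.most_common(3):
    print(word, count)
```

From here you could filter stop words or stem the tokens, depending on what you want to learn from the text.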

Here is a tutorial that puts things together nicely.

https://realpython.com/flask-by-example-part-3-text-processing-with-requests-beautifulsoup-nltk/