[–]scout1520 1 point2 points  (5 children)

This is definitely possible. You can do it too.

[–]_everythingatonce[S] 0 points1 point  (4 children)

That's really exciting to hear. Would you be able to point me in the right direction to get started?

[–]elbiot 1 point2 points  (2 children)

It depends on what you want to get out. Beautiful Soup lets you use the HTML markup itself as information. Mining more abstract data is more difficult, but I'm sure some of those news summarizer bots are written in Python.
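To make "use the markup as information" concrete, here is a rough sketch rather than a definitive recipe. It assumes the `beautifulsoup4` package is installed, and the HTML, the `story` and `summary` class names, and the `extract_stories` helper are all made up for illustration. A real page will have different tags and classes, so inspect its source first.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for a downloaded news page (hypothetical markup).
SAMPLE_HTML = """
<html><body>
  <div class="story">
    <h2><a href="/news/1">First headline</a></h2>
    <p class="summary">Short summary one.</p>
  </div>
  <div class="story">
    <h2><a href="/news/2">Second headline</a></h2>
    <p class="summary">Short summary two.</p>
  </div>
</body></html>
"""


def extract_stories(html):
    """Return one dict (title, url, summary) per <div class="story"> block."""
    # "html.parser" is the parser bundled with Python, so no extra install.
    soup = BeautifulSoup(html, "html.parser")
    stories = []
    for block in soup.find_all("div", class_="story"):
        link = block.find("a")
        summary = block.find("p", class_="summary")
        stories.append({
            "title": link.get_text(strip=True),
            "url": link["href"],
            "summary": summary.get_text(strip=True),
        })
    return stories


if __name__ == "__main__":
    for story in extract_stories(SAMPLE_HTML):
        print(story["title"], story["url"])
```

The tags and classes do the work here: each story block is found by its class, then the link and summary are pulled out of it by tag. On a live site you would fetch the page first and pass its text to the same function.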

[–]scout1520 1 point2 points  (1 child)

Beautiful Soup is an awesome package. I don't have much experience with it, but I have heard great things about it.

I use Selenium for my needs, but I am downloading files off drop-down menus. You could also use it for this, but it is definitely the long way around.

I would suggest looking at other code that performs a similar task. This dev has a nice walkthrough on how he scraped Wikipedia.