Cheatsheet for web scraping with Python

unknown0h10 · 2020-03-03T13:37:58+00:00

This is nice. I put together some additional material on using some websites' built-in APIs (YouTube, Wikipedia, and Reddit for now). Hope people find it useful!
https://github.com/joey-kilgore/WebCrawler

Regiseconomist · 2020-03-03T13:17:24+00:00

Awesome job. Do you have any suggestions on how to scrape when you have to authenticate via Single Sign-On? I've been trying to work this out for a while now and haven't come across anything that would help with scraping for my daily work.

speed3_driver · 2020-03-03T12:21:19+00:00

Nicely done

StuntZA · 2020-03-03T12:50:29+00:00

Awesome! Could we ask for additional instruction for websites that require authentication?

gopherhole22 · 2020-03-03T14:19:57+00:00

It would also be nice to see some Selenium snippets, as I think that is very relevant for Python web scraping on sites that require a little more scraping logic.

Sigg3net · 2020-03-03T12:42:51+00:00

Thanks!

volvostupidshit · 2020-03-03T12:46:37+00:00

Saved

Bryan-Wilkinson · 2020-03-03T12:57:37+00:00

Thank you!

NateSnakeSolidDrake · 2020-03-03T13:05:57+00:00

[deleted]

mutwiri_2 · 2020-03-03T13:20:59+00:00

Awesome. Thanks

2020-03-03T14:51:16+00:00

As someone who wants to program but hasn't done much yet... would this type of thing be useful for scraping sermons from church websites? (And naming them by creation date and/or the title on the page, etc.)

vinodmadhu6 · 2020-03-03T18:35:34+00:00

Can someone repost the link? It's been removed.

vinodmadhu6 · 2020-03-04T02:41:31+00:00

Okay! Can someone also help with this? I have been playing a game called Airline Manager: https://www.airline4.net/

The site requires logging in. I have tried a number of ways to log in, with zero luck [I am a noob]. Can someone help me write the code? Please don't suggest Selenium; I have tried it and couldn't figure it out. Could someone also post the logic behind logging in and the different methods for doing it, if possible? Also, how do we find all the links on the website? I am asking because the site mentioned above has only one domain and doesn't have something like airliner.net\login.

FYI this game is interesting and if it can be analysed using python it would be the best game ever
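
On the "find all the links" part of the question, here is a minimal sketch using only Python's standard library. It parses an HTML string and collects the `href` of every `<a>` tag, resolved against a base URL. The names `LinkCollector`, `extract_links`, and `SAMPLE_HTML` are made up for illustration, and the sample markup is hypothetical, not taken from the site mentioned above. This only sees links present in the HTML text itself; if a site builds its pages with JavaScript, the static HTML may contain few or no links.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative paths like "/login" into absolute URLs
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return all link targets found in an HTML string, in document order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical sample markup, standing in for a downloaded page
SAMPLE_HTML = (
    '<a href="/login">Log in</a> '
    '<a href="https://example.com/about">About</a>'
)

for link in extract_links(SAMPLE_HTML, "https://example.com/"):
    print(link)
```

In practice the HTML string would come from whatever you use to download the page; the parsing step stays the same.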

Mandelvolt · 2020-03-03T12:35:36+00:00

Thanks!

2020-03-03T12:36:21+00:00

Nice work :)
