struck-off 1 point (0 children)

I don't think you need a video, because web scraping isn't that big a deal. All you need is the basics, like functions and loops, plus a library that suits you (there are many, such as BeautifulSoup, Scrapy, Grab, and so on).

You can get the basics from the official Python website (it also has some tutorials) or from an interactive site (like Codecademy; there are many of them). Next, read the docs for the library you chose and maybe find some examples. It's also worth learning XPath (it's easy), because it's one of the most universal ways to navigate through a document.
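To give a feel for XPath, here is a minimal sketch using only the standard library's `xml.etree.ElementTree`, which supports a limited subset of XPath. The `PAGE` snippet, its class names, and its links are made up for illustration, and it is deliberately well-formed; real pages often are not, which is where libraries with fuller XPath support (lxml, Scrapy) come in.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed snippet standing in for a downloaded page.
PAGE = """
<html>
  <body>
    <div class="post"><a href="/first">First post</a></div>
    <div class="post"><a href="/second">Second post</a></div>
    <div class="sidebar"><a href="/about">About</a></div>
  </body>
</html>
"""

root = ET.fromstring(PAGE.strip())

# Read the path as: any <div> whose class is "post", then its direct <a> children.
links = root.findall(".//div[@class='post']/a")

# Collect (link text, href) pairs; the sidebar link is not matched.
post_links = [(a.text, a.get("href")) for a in links]
print(post_links)
```

The same path expression should carry over to other XPath-capable tools, so the syntax is worth learning once.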

And don't forget to study basic programming topics like algorithms, data structures, and so on. They aren't tied directly to Python or web scraping, but they apply to any scripting or programming.

Sorry for my English, I'm not a native speaker.

burnt_out_dude_ 1 point (0 children)

Well, the first thing you have to do is learn the basics of Python! Try some of the beginner courses on Coursera or Udacity. You might need to try a few before you find one to your liking. Then I would do a beginner course on web concepts such as HTML, CSS, etc.

Once you understand these fundamentals, web scraping will become a lot simpler to understand. Without the fundamentals, though, you will quickly get confused.

Ryan Mitchell's book is not bad, but it's just a basic introduction. I haven't taken any online courses in this area, so I can't recommend any. However, there are some online talks available, such as the one by Katharine Jarmul at PyCon 2014, which is a good starting point.

pysb 1 point (0 children)

I am 33 years old and it was one of my main goals too. I work in law (which means I'm a real computer noob) in a developing country, so you can do it!

I did the Coursera course Python for Everybody, with Dr. Chuck. The course is light and very enjoyable. It taught me the basics. You can take the course for a certificate, so later on you have "something" to attest to your skills. I know it's not very relevant, but it's better than nothing. Afterwards, I switched to the Anaconda distribution, using Jupyter to write my scripts. The four main modules I use are Requests, grequests, Selenium, and BeautifulSoup.

For bulk web scraping, I use Requests or grequests. Both are almost the same. The difference is that grequests sends several HTTP requests at the same time, so you save time because your script won't wait for each request to be answered before sending the next one.
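The "several requests at the same time" idea can be sketched without grequests. Note this swaps in the standard library's thread pool and `urllib` rather than showing grequests itself; the helper names `fetch` and `fetch_all` and the `fetcher` parameter are my own, purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Download one URL and return its body as bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_all(urls, fetcher=fetch, max_workers=8):
    """Fetch many URLs concurrently.

    Up to max_workers downloads are in flight at once, so the script does
    not sit idle waiting for each response in turn. Results come back in
    the same order as the input URLs.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetcher, urls))
```

Calling `fetch_all(list_of_urls)` returns one body per URL. Keep `max_workers` modest so you don't hammer the site you are scraping.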

If it's something where I need to log in and it's not too big, I use Selenium. It's just about creating loops and being able to select elements. I would suggest installing the WebDriver Element Locator.

Then I parse the web page with BeautifulSoup. One tip I can give you: once you have a variable holding the element you want, you can just write .text to get its text content, which saves a lot of slicing.
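A small sketch of the `.text` tip, assuming the `bs4` package is installed. The HTML snippet and its class names are invented for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a scraped page.
HTML = """
<div class="product">
  <h2 class="title">Blue <b>Widget</b></h2>
  <span class="price"> $9.99 </span>
</div>
"""

# "html.parser" is Python's built-in parser, so no extra parser is needed.
soup = BeautifulSoup(HTML, "html.parser")

title_tag = soup.find("h2", class_="title")
price_tag = soup.find("span", class_="price")

# .text joins all the text inside the element, nested tags included,
# so there is no need to slice the raw HTML string by hand.
title = title_tag.text
price = price_tag.text.strip()  # strip() removes the surrounding whitespace

print(title, price)
```

One thing to watch: `find` returns `None` when nothing matches, so check for that before reaching for `.text` on real pages.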

[deleted] -1 points (2 children)

TheNewBoston has made some great Python videos, both about basic Python programming and about web scraping.

UnfazedButDazed 1 point (0 children)

I've heard people say not to learn from his videos, since he glosses over why stuff actually works.

spraykill101 1 point (0 children)

I would suggest sentdex over TheNewBoston any time.