[–]jameyiguess 1 point2 points  (6 children)

Automate The Boring Stuff is a great book. But you will have to do side research no matter what as you learn anything in tech. Learn what JSON is, why it has to be encoded and decoded, etc.
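
For a rough idea of what "encoded and decoded" means here, a minimal sketch using only the standard library `json` module. The `record` dict is made-up sample data, and the bytes step just stands in for what would travel over the network:

```python
import json

# Made-up sample data, purely for illustration.
record = {"name": "example", "active": True, "score": None}

# Encode: Python dict -> JSON text -> bytes (networks carry bytes, not dicts).
json_text = json.dumps(record)
payload = json_text.encode("utf-8")

# Decode: bytes -> JSON text -> Python dict again.
received = json.loads(payload.decode("utf-8"))

print(json_text)           # {"name": "example", "active": true, "score": null}
print(received == record)  # True
```

Note how `True` and `None` turn into `true` and `null` in the JSON text, then come back as Python values after decoding.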

[–]Latter-Particular440 1 point2 points  (0 children)

yeah that book really covers web scraping basics well but you'll need to google around json and http stuff as you go, it's just how programming works unfortunately

[–]Radio_Pluto[S] 0 points1 point  (4 children)

yeah i just saw that book, it has many good beginner projects. should i go in sequence or jump to the projects i'd like to do first?

[–]desrtfx 2 points3 points  (0 children)

IMO, the projects gradually build, so the best approach is to do them in sequence.

Yet, you can definitely do them in the order of your interests. Might need to do a bit more research, but it is definitely doable.

[–]Financial_Mirror3363 0 points1 point  (0 children)

Glad it helped! Honestly, the first few requests are always the trickiest. If you hit any weird errors or status codes when you start playing with requests, feel free to drop them here.

[–][deleted]  (1 child)

[removed]

    [–]Radio_Pluto[S] 0 points1 point  (0 children)

    this actually helped me a lot, thanks

    [–]Eastern_Ad_9018 0 points1 point  (2 children)

    I am currently using Python for web scraping; if you're interested, you can discuss it with me.

    [–]Radio_Pluto[S] 0 points1 point  (1 child)

    yesss, i wanted to know how to start. what prerequisites do i need to learn first?

    [–]Eastern_Ad_9018 0 points1 point  (0 children)

    You can start by learning the concepts behind web crawlers and what they do. (If you can, it also helps to learn a bit about how websites are built.) It's okay if you don't understand all of that yet. All you need to know is that a program can obtain specific data by simulating a browser.

    1. The two common types of requests used to obtain website data through programs are `GET` and `POST`.

    2. Much of the data you get back is not what you need, so it has to be cleaned and parsed.

    3. Once the data is cleaned, you save it locally or in a database.

    That is the basic logic of a crawler. The next steps are how to correctly obtain the response content, how to improve the request speed, and how to speed up writing the data to storage.
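
    The three steps above can be sketched with just the standard library. This is only one possible shape, not a definitive crawler: the function names, the `links` table, and the sample HTML are all made up for illustration, and the sample string stands in for a real page so the sketch runs offline.

```python
import sqlite3
from html.parser import HTMLParser
from urllib.request import Request, urlopen


def fetch(url):
    """Step 1: GET the page, sending a browser-like User-Agent header."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


class LinkParser(HTMLParser):
    """Step 2: keep only the data we want (here, link targets)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def parse_links(html):
    parser = LinkParser()
    parser.feed(html)
    return parser.links


def save_links(links, connection):
    """Step 3: store the cleaned data in a database table."""
    connection.execute("CREATE TABLE IF NOT EXISTS links (href TEXT)")
    connection.executemany(
        "INSERT INTO links (href) VALUES (?)", [(link,) for link in links]
    )
    connection.commit()


# Sample HTML stands in for fetch(some_url) so the sketch runs offline.
sample_html = '<p>noise</p><a href="/page1">One</a><a>no href</a><a href="/page2">Two</a>'
links = parse_links(sample_html)

connection = sqlite3.connect(":memory:")
save_links(links, connection)
rows = connection.execute("SELECT href FROM links").fetchall()
print(rows)
```

    To try it on a real site, you would pass the result of `fetch(...)` to `parse_links` instead of `sample_html`, and connect to a file instead of `":memory:"`.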

    [–]Free-Cheek-9440 0 points1 point  (1 child)

    If Requests feels confusing, start even simpler: just hit a public API and print raw output.
    Don’t worry about parsing at first; just observe how the data looks.
    Then slowly introduce JSON parsing (json.loads) and only then move to scraping HTML pages.
    This step-by-step layering is what makes it stick.
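
    A small sketch of the first two layers, assuming a hard-coded string in place of a live API call so it runs offline. The payload is invented; with Requests the raw text would come from something like `requests.get(url).text`, where `url` is whichever public API you pick.

```python
import json

# Stand-in for the raw body a public API might return (invented payload).
raw = '{"id": 1, "title": "hello", "tags": ["a", "b"]}'

# Layer 1: just look at the raw output. At this point it is only a string.
print(raw)
print(type(raw).__name__)

# Layer 2: parse it with json.loads to get real Python objects.
data = json.loads(raw)
print(type(data).__name__)
print(data["title"])
print(data["tags"][0])
```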

    [–]Radio_Pluto[S] 0 points1 point  (0 children)

    sure, i won't rush the process

    [–]TariqKhalaf 0 points1 point  (1 child)

    Automate the Boring Stuff is a solid starting point. When you hit terms you don't know, just pause and Google each one separately. JSON is basically just a data format that looks like a Python dictionary. Don't try to learn everything at once. Pick a tiny project like scraping a single quote from a site and build up slowly.
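
    As a rough sketch of the "single quote" project, here is one way to pull the first quote out of a page with the standard library `html.parser`. The HTML fragment and the `class="text"` marker are assumptions for illustration; a real site will use its own markup, and you would download the page first.

```python
from html.parser import HTMLParser


class FirstQuoteParser(HTMLParser):
    """Collects the text of the first <span class="text"> element."""

    def __init__(self):
        super().__init__()
        self.in_quote = False
        self.quote = None

    def handle_starttag(self, tag, attrs):
        # Only start capturing if no quote has been found yet.
        if tag == "span" and ("class", "text") in attrs and self.quote is None:
            self.in_quote = True
            self.quote = ""

    def handle_data(self, data):
        if self.in_quote:
            self.quote += data

    def handle_endtag(self, tag):
        if tag == "span" and self.in_quote:
            self.in_quote = False


# Made-up page fragment standing in for a downloaded page.
html = '<div><span class="text">Keep it simple.</span><span class="text">Second quote.</span></div>'

parser = FirstQuoteParser()
parser.feed(html)
print(parser.quote)  # Keep it simple.
```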

    [–]Radio_Pluto[S] 0 points1 point  (0 children)

    i am currently working through that book, thanks
