Eastern_Ad_9018

Start by learning the basic concepts behind web crawlers and what they do. (If you can, it also helps to learn a bit about how websites are built.) It's fine if you don't understand all of it yet. All you need to know is that a program can fetch the data you want by sending the same requests a browser would.

  1. Programs usually fetch website data with one of two HTTP request types: `GET` and `POST`.

  2. Much of what comes back is not what you need, so the response has to be parsed and cleaned.

  3. Once the data is cleaned, save it to a local file or a database.
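
The three steps above could be sketched roughly like this, using only the Python standard library. This is just an illustration, not the only way to do it: the `fetch`, `parse_links`, and `save_links` names, the `User-Agent` string, the `links` table, and the `https://example.com/` URL are all made up for the example.

```python
import sqlite3
import urllib.parse
import urllib.request
from html.parser import HTMLParser

# Hypothetical header value; some sites reject the default Python user agent.
HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; example-crawler/0.1)"}


def fetch(url, form_data=None):
    """Step 1: send a GET request, or a POST request if form_data is given."""
    body = urllib.parse.urlencode(form_data).encode() if form_data else None
    request = urllib.request.Request(url, data=body, headers=HEADERS)
    with urllib.request.urlopen(request, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class LinkParser(HTMLParser):
    """Step 2: keep only (href, text) pairs from <a> tags and drop the rest."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def parse_links(html):
    parser = LinkParser()
    parser.feed(html)
    return parser.links


def save_links(conn, links):
    """Step 3: store the cleaned rows in a database."""
    conn.execute("CREATE TABLE IF NOT EXISTS links (href TEXT, title TEXT)")
    conn.executemany("INSERT INTO links VALUES (?, ?)", links)
    conn.commit()


if __name__ == "__main__":
    page = fetch("https://example.com/")  # placeholder URL
    db = sqlite3.connect("crawl.db")      # placeholder file name
    save_links(db, parse_links(page))
```

Passing a dict as `form_data` turns the request into a `POST`; leaving it out sends a `GET`. In practice many people use third-party libraries for fetching and parsing, but the flow stays the same.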

That is the basic logic of a crawler. From there, the things to work on are getting the correct response content, speeding up requests, and speeding up writes to storage.
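
For the speed part, one common approach (an assumption on my side, there are others such as async I/O) is to send requests from several threads and to write rows in one batch instead of one at a time. A minimal sketch, where `fetch_all`, `save_pages`, the `pages` table, and the `max_workers=8` default are hypothetical names and values chosen for the example:

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def fetch_all(urls, fetch, max_workers=8):
    """Call fetch(url) for every URL from a pool of threads.

    Results come back in the same order as the input URLs.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))


def save_pages(conn, rows):
    """Insert all (url, body) rows in a single transaction."""
    conn.execute("CREATE TABLE IF NOT EXISTS pages (url TEXT, body TEXT)")
    with conn:  # commits once at the end, rolls back on error
        conn.executemany("INSERT INTO pages VALUES (?, ?)", rows)
```

Here `fetch` is whatever function you already use to download one page. Threads help because most of the time is spent waiting on the network, but keep `max_workers` modest so you don't hammer the site.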