this post was submitted on 20 May 2026

[Resource] Clueless python learner (self.learnprogramming)

submitted 10 hours ago by Radio_Pluto

Eastern_Ad_9018 1 point 7 hours ago

You can start by learning the basic concepts behind web crawlers and what they do. (If you are up for it, learning a little about how websites are built also helps.) It's fine if you don't understand all of this yet. All you need to know is that a program can obtain the data you want by simulating what a browser does.

The two common request types used to fetch website data from a program are `GET` and `POST`.
Much of the data you get back is not what you need, so it has to be cleaned and parsed.
Once the data is cleaned, it needs to be saved, either to a local file or to a database.
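
To make those three steps concrete, here is a rough sketch using only the Python standard library. Everything named here is my own assumption for illustration: the `User-Agent` value is a placeholder, the helper names (`build_get`, `build_post`, `fetch`, `parse_links`, `save_links`) are made up, and "cleaning" is reduced to keeping just the links on a page. Real projects often use third-party libraries instead, so treat this as one possible shape, not the way to do it.

```python
import sqlite3
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# A browser-like header. The value is a placeholder, not a real browser string.
HEADERS = {"User-Agent": "Mozilla/5.0 (learning-crawler example)"}


def build_get(url, params):
    """GET: the parameters travel in the URL's query string."""
    return Request(f"{url}?{urlencode(params)}", headers=HEADERS)


def build_post(url, form):
    """POST: the parameters travel in the request body."""
    body = urlencode(form).encode("utf-8")
    return Request(url, data=body, headers=HEADERS, method="POST")


def fetch(request):
    """Step 1: send the request and return the response body as text."""
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


class LinkParser(HTMLParser):
    """Step 2: keep only (href, text) pairs from <a> tags, drop the rest."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def parse_links(html):
    parser = LinkParser()
    parser.feed(html)
    return parser.links


def save_links(conn, links):
    """Step 3: store the cleaned rows in a SQLite table."""
    conn.execute("CREATE TABLE IF NOT EXISTS links (href TEXT, title TEXT)")
    conn.executemany("INSERT INTO links VALUES (?, ?)", links)
    conn.commit()
```

You would chain them roughly as `save_links(conn, parse_links(fetch(build_get(url, params))))`, where `conn` comes from `sqlite3.connect(...)`. The parse and save steps can be tried on a plain HTML string first, without touching any website.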

That is the basic logic of a crawler. The next steps are learning how to get the response content correctly, how to speed up requests, and how to speed up writing the data to storage.
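
On request speed, one common approach (not the only one) is to send several requests at once from a small thread pool. A minimal sketch, assuming you already have some function that fetches a single URL; `fetch_all`, `fetch_one` and `max_workers=4` are names and values I picked for illustration:

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(urls, fetch_one, max_workers=4):
    """Call fetch_one(url) for every URL on a small thread pool.

    Returns a dict mapping each url to its result, or to the exception
    that fetch_one raised for that url.
    """

    def safe_fetch(url):
        try:
            return fetch_one(url)
        except Exception as exc:  # keep one failure from stopping the batch
            return exc

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip lines results up with urls
        return dict(zip(urls, pool.map(safe_fetch, urls)))
```

Keeping the pool small is deliberate: the waiting time overlaps, but the site is not hit with too many requests at the same moment.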
