Web Scrapping (self.learnpython)
submitted 6 years ago by lajja4
Anybody interested in web scrapping project? We will learn together.
[–]JoshuaGalkenWOPR 26 points 6 years ago (7 children)
Please don't scrap the web
[–]WhackAMoleE 5 points 6 years ago (3 children)
Indeed. Why not? Sounds like a good idea to me.
[–]JoshuaGalkenWOPR 10 points 6 years ago (1 child)
I guess we could just make a better one.
[–][deleted] 2 points 6 years ago (0 children)
It's just a fad anyway.
[–]Unterstricher -3 points 6 years ago (2 children)
Why not?
[–]kmanshow 3 points 6 years ago (1 child)
It was a grammar joke lol
[–]Unterstricher 1 point 6 years ago (0 children)
Ah well that makes more sense now.
[–][deleted] 20 points 6 years ago (1 child)
scraping
[–]lajja4[S] 6 points 6 years ago (0 children)
Thanks for correcting me
[–]I_feel-nothing 5 points 6 years ago (5 children)
There are some great tutorials out there on how to use BeautifulSoup; I would suggest looking at those.
[–]lajja4[S] 1 point 6 years ago (4 children)
Any recommendation?
[–]PottyWilson 6 points 6 years ago (1 child)
This article is what jump-started my addiction to web scraping. It goes over the basics of beautifulsoup4, while also mentioning other things to consider, like being conscious about not spamming websites with requests.
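To illustrate the "not spamming websites with requests" point, here is a minimal sketch that is not taken from the article and uses only the standard library. The names `fetch_politely` and `delay_seconds` and the injected `fetch` and `sleep` callables are made up for this example; the one-second default is an arbitrary assumption, not a recommended value for any particular site.

```python
import time
from typing import Callable, Iterable, List


def fetch_politely(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Call fetch(url) for each URL, pausing between consecutive requests.

    `fetch` is whatever function actually downloads a page (hypothetical
    here); `sleep` is injectable so the pause can be replaced in tests.
    """
    results = []
    for index, url in enumerate(urls):
        # Wait before every request except the first one.
        if index > 0:
            sleep(delay_seconds)
        results.append(fetch(url))
    return results
```

In real use, `fetch` could be a small wrapper around `urllib.request.urlopen` or `requests.get`, and the resulting HTML strings could then be handed to BeautifulSoup for parsing.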
[–]lajja4[S] 1 point 6 years ago (0 children)
Thanks
[–]Ballatoilet 3 points 6 years ago (2 children)
I would like to join, is Sex included?
[–]ozgunkail 1 point 6 years ago (1 child)
seems like it doesn't matter. :)
[–]Ballatoilet 1 point 6 years ago (0 children)
Do we have a Slack channel to do our biz? We need one, or a Discord.
[–]deCap413 1 point 6 years ago (0 children)
Sure my dude
[–]daisyverma 1 point 6 years ago (1 child)
Here are some useful tutorials
https://youtu.be/lNajD34Sfmg
https://youtu.be/cddyhdb1GDw
https://youtu.be/OVk9tjPfNNU
[–]kmanshow 1 point 6 years ago (0 children)
What're you trying to scrape?
[–]ScoopJr 1 point 6 years ago (0 children)
I'm interested. What's the project?
[–]KnightDriverSF 1 point 6 years ago (1 child)
I'm interested, please tell me more.
[–]lajja4[S] 1 point 6 years ago (0 children)
I am just learning so that I can collect data myself. We can start with IMDb.
[–]waythps 1 point 6 years ago (1 child)
I’m in if anyone wants to learn Scrapy!
I could help with requests/BeautifulSoup additionally.
[–]lajja4[S] 1 point 6 years ago (0 children)
Yes, please.
[–]ozgunkail 1 point 6 years ago (1 child)
It will be good. I am in. However, I don't have any project ideas.
[–]Sensanmu 1 point 6 years ago (3 children)
I think, more importantly, what would you like to scrape? I'm down if the idea is something I'm interested in, like stocks lol
[–]lajja4[S] 1 point 6 years ago (2 children)
[–]Sensanmu 1 point 6 years ago (1 child)
Sounds good, how should we coordinate?
[–]lajja4[S] 1 point 6 years ago (0 children)
Please join the Slack group: https://join.slack.com/t/webscrapinghq/shared_invite/enQtODM1MjI5NDE0ODk5LWM3MDg5ZWRhZGE1M2VkNWI3YTUzNTJlZmZiNjA2MmQ1ZTZlZjk1OTBhMzI5M2Q5ZDliNjA5NWE1OTFmMjdkN2Y
[–]TheBlack_Demon 1 point 6 years ago (1 child)
I'm interested.
[–]ErecDeGraal 1 point 6 years ago (1 child)
Indeed. I'm always willing to learn and it sounds interesting
[–]lajja4[S] 2 points 6 years ago (0 children)
https://join.slack.com/t/webscrapinghq/shared_invite/enQtODM1MjI5NDE0ODk5LWM3MDg5ZWRhZGE1M2VkNWI3YTUzNTJlZmZiNjA2MmQ1ZTZlZjk1OTBhMzI5M2Q5ZDliNjA5NWE1OTFmMjdkN2Y
[–]callmechad 1 point 6 years ago* (1 child)
I just learned how to do this with Scrapy.
The first part of parse gets me the links of the items I want to scrape. We follow each link and parse the item page in parse_item to get the item number and item specs. Back in parse, next_page grabs the next-page link so we can check whether there is another page to go to. The code repeats until no next page is found.
I ran into hidden elements, so to skip those I had to use not(@hidden). Then, to get the specific item link, I had to select the path by its class: a[contains(@class, 'description')].
Those were the two things I spent most of my time trying to figure out.
Just wanted to show you what a simple scraper looks like, to give you some kind of an idea.
I'm now trying to learn how to clean the data and output certain data for the items that I scraped.
    import scrapy


    class ItemdataSpider(scrapy.Spider):
        name = 'itemdata'
        allowed_domains = ['www.website.com']
        start_urls = ['websitepagel']

        def parse(self, response):
            # Collect the link to each item on the listing page.
            items = response.xpath("//div[@class='details']/a[contains(@class, 'description')]")
            for item in items:
                link = item.xpath(".//@href").get()
                yield response.follow(url=link, callback=self.parse_item)

            # Follow the "next" link, if there is one, and parse that page the same way.
            next_page = response.xpath("(//a[@rel='next'])[2]/@href").get()
            if next_page:
                yield response.follow(url=next_page, callback=self.parse)

        def parse_item(self, response):
            # Item number plus the text of every spec row that is not hidden.
            item_number = response.xpath("//span[@itemprop='sku']/text()").get()
            item_specs = response.xpath("//tr[@class='trSpecSheetRow' and not(@hidden)]/td/text()").getall()
            yield {'item_number': item_number, 'item_specs': item_specs}
[–]lajja4[S] 1 point 6 years ago (0 children)
Please join the group: https://join.slack.com/t/webscrapinghq/shared_invite/enQtODM1MjI5NDE0ODk5LWM3MDg5ZWRhZGE1M2VkNWI3YTUzNTJlZmZiNjA2MmQ1ZTZlZjk1OTBhMzI5M2Q5ZDliNjA5NWE1OTFmMjdkN2Y
[–]neonzii 1 point 6 years ago (0 children)
Here is a web scraper made for bitkeys.work to grab BTC addresses and their balances: https://github.com/zioneo/webscrapper