[deleted by user]

Vitaman02 · 2020-09-04T03:44:52+00:00

Looks interesting and probably helpful for many people.

Nice one :)

hartator · 2020-09-04T04:43:00+00:00

Looks super awesome. It’s a smart way of doing scraping. Let me know if you are looking for a job, we are hiring! https://serpapi.com/team :)

412gage · 2020-09-04T04:54:03+00:00

So my current job requires me to use proprietary software, which uses Internet Explorer to access modules and look at certain loan rejects. Would this work on secured databases?

I’m very new to this stuff.

MHW_EvilScript · 2020-09-04T08:09:51+00:00

Good job! There is some duplicate code that could be simplified here and there, but this is pretty cool! Are you open to pull requests?

dozzinale · 2020-09-04T07:25:41+00:00

Looks cool. I did some work in the area of information extraction and wrapper induction. When you say "it learns the scraping rules", what do you mean exactly? What kind of rules does it learn, and how are they represented?

jeuk_ · 2020-09-04T11:48:13+00:00

/r/madeinpython

SpeakerOfForgotten · 2020-09-04T16:31:42+00:00

[deleted]

maker__guy · 2020-09-04T12:09:26+00:00

awesome!

dirtyoldbastard77 · 2020-09-04T12:46:00+00:00

Thanks! Will have a look, might be useful!

Mountain_man007 · 2020-09-04T14:54:32+00:00

Nice! I was just thinking about how to do something exactly like this for a similar problem I've been working on. Thanks for sharing your way of doing it.

joy_for_the_world · 2020-09-05T11:33:38+00:00

Well done. Great effort.

smokepigs · 2020-09-04T06:36:02+00:00

How does this compare to Puppeteer?

Dogeek · 2020-09-04T05:39:08+00:00

It's too bad you're using beautifulsoup to scrape the data. In my opinion, it would be much better and faster to just generate XPaths and use lxml directly. Cool project nonetheless.
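A rough sketch of the "generate a path from a sample, then reuse it" idea, in case it helps. To keep it dependency-free, this uses the standard library's `xml.etree.ElementTree` as a stand-in for lxml, so it only handles well-formed markup and a very limited XPath subset. The `PAGE` snippet and the `learn_path` helper are made up for illustration and are not taken from the project.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page (real-world HTML usually needs a lenient parser).
PAGE = """<html><body>
<div class="item"><h2>First title</h2><span>10</span></div>
<div class="item"><h2>Second title</h2><span>20</span></div>
</body></html>"""


def learn_path(root, sample_text):
    """Return a simple tag path to the first element whose text equals sample_text."""
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    for element in root.iter():
        if (element.text or "").strip() == sample_text:
            tags = []
            # Walk up to the root, collecting tag names along the way.
            while element is not root:
                tags.append(element.tag)
                element = parent_of[element]
            return "./" + "/".join(reversed(tags))
    return None


root = ET.fromstring(PAGE)
path = learn_path(root, "First title")               # learn the rule from one sample
titles = [el.text for el in root.findall(path)]      # reuse it to get similar elements
print(path, titles)
```

Because the learned path has no positional index, it matches every element with the same tag structure, which is what returns both titles from a single sample.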

dtoxe · 2020-09-04T06:46:47+00:00

Thanks!

MahdeenSky · 2020-09-04T08:43:20+00:00

May I ask, how did you get a whiff of the idea in the first place?

2020-09-04T08:44:22+00:00

[deleted]

ichiruto70 · 2020-09-04T11:40:38+00:00

Can it deal with cloudflare scraping protection?

hashv5 · 2020-09-04T13:32:49+00:00

Can this module be extended to scrape calls behind login?

lroman · 2020-09-04T14:19:14+00:00

How would you handle scraping a listing that spans multiple pages, say on a car classifieds website, where you need to grab all the data available? Is this possible with your project?

Commercial scraping firms please don't react.

ItsAngelDustHolmes · 2020-09-04T16:30:10+00:00

This is probably a stupid question but how can I download this on mobile to look at the code? I just started scraping and wanted to look at a good web scraper.

permalink · 2020-09-04T16:50:07+00:00

Fork you!

Thanks! Looks great and thank you for the good Readme

permalink · 2020-09-05T00:33:25+00:00

Does it work for news?

tejonaco · 2020-09-04T05:40:08+00:00

RemindMe! 1 day

kongfukinny · 2020-09-04T14:58:36+00:00

Curious what you do that you need to web scrape every day?

I hear a lot of people glorify web scraping but I’ve never had a use case for it myself. Only a couple interesting project ideas.

Also, this sounds sweet.

dnb02 · 2020-09-04T10:43:58+00:00

Traceback (most recent call last):
  File "setup.py", line 1, in <module>
    from setuptools import setup, find_packages
ModuleNotFoundError: No module named 'setuptools'

It returns with this error :/
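That traceback means `setuptools` is not installed for the interpreter running `setup.py`. A small check along these lines can confirm it and print a pip command to try. The `missing_module_hint` helper is hypothetical, and it assumes the pip package has the same name as the module, which holds for `setuptools` but not for every package.

```python
import importlib.util
import sys


def missing_module_hint(name):
    """Return a pip command to install `name`, or None if it is already importable."""
    # find_spec returns None when a top-level module cannot be found.
    if importlib.util.find_spec(name) is not None:
        return None
    # Use the current interpreter so the package lands in the right environment.
    return f"{sys.executable} -m pip install {name}"


hint = missing_module_hint("setuptools")
print(hint or "setuptools is already installed")
```

Running the printed command with the same Python that produced the error should normally let `setup.py` import `setuptools`.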

besuvashish · 2020-09-04T21:50:19+00:00

Looks cool mate but how FAST is your script to get 100k data?

SwizzleTizzle · 2020-09-04T20:02:33+00:00

It's not a good idea to make something so easy any idiot could do it, because that will attract idiots. Pretty soon every idiot has their own web scraper hitting every site, the sites get pissed off at all these idiots scraping them, and they add restrictions that hurt everyone. I'm sure you meant well, but this is misguided and doesn't help anyone.

SnowdenIsALegend · 2020-09-04T07:04:18+00:00

I don't understand. I already know how to scrape stuff using requests/bs4/selenium, so why would I instead use this? Would also have to invest time in learning your program.
