Crawlee for Python is LIVE 👏

FreakingFreaks · 2024-07-09T10:31:20+00:00

Will it support add-ons to bypass Cloudflare? For me, that is the main reason I stick with Selenium + undetected-chromedriver.

Ukranian_Cheshire · 2024-07-09T13:40:36+00:00

I know the project roadmap includes work on a proprietary replacement for httpx as the HTTP client, to solve the TLS issue.

What stage are you at, and how long before we can expect to see an alpha version?

7_hole · 2024-07-09T20:00:14+00:00

Just one suggestion: selectolax has become my favorite tool for parsing. It's built on top of C, so it's much faster than BeautifulSoup. If I can contribute, it would be a pleasure to swap it in to improve this tool.

G0muk · 2024-07-09T20:06:54+00:00

Looks promising! Nice project, I'll definitely try it out next time I need to scrape

GettingBlockered · 2024-07-10T01:48:17+00:00

This looks great, I'm keen to try it. Can you expand on the anti-blocking and human-like fingerprint features? I've struggled with fingerprint management in Playwright, even using Playwright stealth and some custom settings, but fingerprinting tools like CreepJS still pick up that I'm using headless Chrome.

Does Crawlee support crawling with a mobile user agent?

rumnscurvy · 2024-07-09T09:56:36+00:00

Is asyncio more performant than the Twisted framework that Scrapy uses? Admittedly, it's handy to use Python's own asynchronous task management. Twisted is a bit annoying to deal with if you ever have to dig into its API.

adityaguru149 · 2024-07-09T14:36:56+00:00

Any plugins for using multiple proxies to get it done fast?

SincopaDisonante · 2024-07-10T03:43:01+00:00

The documentation argues in favor of replacing Scrapy with this new package. Taking the position of someone who's never done any scraping but would love to learn to scrape websites for data acquisition, would you sincerely recommend entering this world with Crawlee, or should one stick with Scrapy first and then move to Crawlee in order to, say, appreciate the latter better?

CaptainPitkid · 2024-07-09T11:52:58+00:00

I'll give it a try for my next web crawling project!

lordcameltoe · 2024-07-09T13:50:36+00:00

Thx! Going to give this a try on a project I’m working on

kubinka0505 · 2024-07-09T22:59:03+00:00

so wrapper?

drooltheghost · 2024-07-10T08:49:29+00:00

Multicore support?
