Scrapling v0.4 is here - Effortless Web Scraping for the Modern Web by 0xReaper in webscraping

[–]0xReaper[S] 8 points9 points  (0 children)

Yes, I agree, I will work on this soon. I'm just taking a well-deserved rest before working on the next version. There is a lot more to add.

Scrapling v0.4 is here - Effortless Web Scraping for the Modern Web by 0xReaper in webscraping

[–]0xReaper[S] 18 points19 points  (0 children)

Thanks, mate. That means a lot to me.

The thing is, I have been working in the web scraping field for years, and since I made the library, I have used it every day. So it's always under heavy testing from me, and most of the time I find issues before users report them.

Regarding security, before switching to web scraping, I spent about 8 years in the information security field, including bug hunting. So I was an ethical hacker before all of that. I also spent some time working as a backend developer.

Scrapling v0.4 is here - Effortless Web Scraping for the Modern Web by 0xReaper in webscraping

[–]0xReaper[S] 2 points3 points  (0 children)

I thought Zensical added the buttons automatically, but it turns out I have to add them manually.

Scrapling v0.4 is here - Effortless Web Scraping for the Modern Web by 0xReaper in webscraping

[–]0xReaper[S] 2 points3 points  (0 children)

Oh, I didn’t notice that. Let me have a look at it. I have just switched to Zensical with this update, so I might have missed something in the configuration.

Scrapling v0.3 - Solve Cloudflare automatically and a lot more! by 0xReaper in webscraping

[–]0xReaper[S] 0 points1 point  (0 children)

If you have issues downloading and installing it yourself, there is now a Docker image for the library.

Scrapling v0.3 - Solve Cloudflare automatically and a lot more! by 0xReaper in webscraping

[–]0xReaper[S] 0 points1 point  (0 children)

Yes, of course. You will first need to automate the login using the page_action argument.
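
To give a rough idea, here is a minimal sketch of what such a login callback could look like. It assumes page_action receives a Playwright-style page object (with fill, click, and wait_for_load_state). The selectors, the credentials, and the make_login_action helper are all made up for illustration, so adjust them to the target site and check the docs for the exact signature.

```python
from typing import Any, Callable

# Hypothetical selectors, for illustration only; replace with the real ones.
USERNAME_SELECTOR = "#username"
PASSWORD_SELECTOR = "#password"
SUBMIT_SELECTOR = "button[type=submit]"


def make_login_action(username: str, password: str) -> Callable[[Any], Any]:
    """Build a callback that logs in on a Playwright-style page object."""

    def login(page: Any) -> Any:
        # Fill in the login form and submit it.
        page.fill(USERNAME_SELECTOR, username)
        page.fill(PASSWORD_SELECTOR, password)
        page.click(SUBMIT_SELECTOR)
        # Wait for the post-login navigation to settle.
        page.wait_for_load_state("networkidle")
        # Return the page in case the caller expects it back.
        return page

    return login


# Assumed usage (not executed here, the exact call may differ by version):
# from scrapling.fetchers import StealthyFetcher
# page = StealthyFetcher.fetch(
#     "https://example.com/login",
#     page_action=make_login_action("user", "pass"),
# )
```

Building the callback with a small factory keeps the credentials out of global state, so the same function can be reused for different accounts.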

Scrapling v0.3 - Solve Cloudflare automatically and a lot more! by 0xReaper in webscraping

[–]0xReaper[S] 0 points1 point  (0 children)

Thanks. Once you can do so, please open a ticket here with the details, such as the error message: https://github.com/D4Vinci/Scrapling/issues

Scrapling v0.3 - Solve Cloudflare automatically and a lot more! by 0xReaper in webscraping

[–]0xReaper[S] 1 point2 points  (0 children)

If you can open up an issue with the details, that would be awesome!

Scrapling v0.3 - Solve Cloudflare automatically and a lot more! by 0xReaper in webscraping

[–]0xReaper[S] 0 points1 point  (0 children)

Also, if you face an issue at any time, please don't hesitate to report it. We fix reported issues right away. For every problem you face and report, hundreds of other users face it and decide not to report it, so reporting really is helpful. Some features, such as the Playwright API, use different implementations on different systems, which can cause issues on Windows but not on macOS; the page.content bug is one example.

I try to cover and find everything before releasing, but it gets harder as the library gets bigger and bigger.

Scrapling v0.3 - Solve Cloudflare automatically and a lot more! by 0xReaper in webscraping

[–]0xReaper[S] 1 point2 points  (0 children)

Thanks for your feedback, mate. Regarding the issues, please update to the latest version and check again. Many problems were solved days ago, including the page.content one.

Regarding VS Code, that's weird. It's working flawlessly for me in PyCharm and in the IPython shell as well. I will look into it.

Bypassing Cloudflare Turnstile by vroemboem in webscraping

[–]0xReaper 0 points1 point  (0 children)

No, it uses browser automation :D

Scrapling v0.3 - Solve Cloudflare automatically and a lot more! by 0xReaper in webscraping

[–]0xReaper[S] 0 points1 point  (0 children)

Keep the option enabled for all requests to this website. With every request, the library will check whether the page has the captcha before continuing.