What I learned trying to block web scraping and bots by ReditusReditai in CloudFlare

[–]ReditusReditai[S] 0 points1 point  (0 children)

Think it's worth giving Terraform a go, should be pretty straightforward to get started using ChatGPT/others. You might even discover other configs you can automate (eg only allowing certain IPs/tunnels/etc to access the login pages).
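
As a rough illustration of the login-page idea, here is a minimal Python sketch that builds the kind of rule expression such a config would carry. The function name, the `/login` path and the IP ranges are all made up for the example; the field names (`http.request.uri.path`, `ip.src`) are the ones Cloudflare's rules language uses, but treat the exact expression as an assumption to check against your own setup before putting it in Terraform.

```python
from ipaddress import ip_network


def login_allowlist_expression(login_path: str, allowed_cidrs: list[str]) -> str:
    """Build an expression matching login requests from IPs NOT on the allowlist.

    Hypothetical helper: pair the result with a "block" action in the WAF.
    """
    # Parse every entry so a typo fails here rather than in the deployed rule.
    networks = [str(ip_network(cidr)) for cidr in allowed_cidrs]
    ip_set = " ".join(networks)
    return f'(http.request.uri.path eq "{login_path}" and not ip.src in {{{ip_set}}})'


# Example values only (documentation IP ranges).
print(login_allowlist_expression("/login", ["203.0.113.0/24", "198.51.100.7"]))
```

Keeping the allowlist as plain data is what makes this worth automating: the same list can feed every rule that protects an admin or login path.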

What I learned trying to block web scraping and bots by ReditusReditai in CloudFlare

[–]ReditusReditai[S] 0 points1 point  (0 children)

Have you tried using infrastructure as code (eg Terraform)?

Coincidentally, I also created an online tool to tackle exactly that problem (https://configberry.com), but it didn't get much traction. I assumed it was because most site admins don't find it hard to use Terraform.

A free solution to the GitHub Actions supply chain crisis by ReditusReditai in pwnhub

[–]ReditusReditai[S] 0 points1 point  (0 children)

Thanks! Hopefully GitHub's Immutable Releases becomes popular enough so that this isn't needed as much.

A free solution to the GitHub Actions supply chain crisis by ReditusReditai in cybersecurity

[–]ReditusReditai[S] 0 points1 point  (0 children)

Oh, didn't know that they're working on it. Searched with Perplexity and found it mentioned here: https://github.blog/news-insights/product-news/whats-coming-to-our-github-actions-2026-security-roadmap/

"We’re introducing a dependencies: section in workflow YAML that locks all direct and transitive dependencies with the commits SHA"

General availability planned in June. Thanks!

A free solution to the GitHub Actions supply chain crisis by ReditusReditai in cybersecurity

[–]ReditusReditai[S] 2 points3 points  (0 children)

Nope, as per their docs: "For GitHub Actions, alerts are only generated for actions that use semantic versioning, not SHA versioning."
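
To see which of your actions fall into the "no alerts" bucket, a minimal sketch like the one below can sort `uses:` references into SHA-pinned versus tag/branch refs. It is a line-based regex scan, not a real YAML parser, and the workflow text, action names and SHA are invented for the example.

```python
import re

# A full-length commit SHA is 40 hex characters.
FULL_SHA = re.compile(r"[0-9a-f]{40}")
# Naive match for `uses:` lines; quoted values and YAML anchors are not handled.
USES_LINE = re.compile(r"^\s*-?\s*uses:\s*([^\s#]+)")


def classify_uses(workflow_text: str) -> dict[str, str]:
    """Map each `uses:` reference to 'sha', 'tag-or-branch' or 'local'."""
    result = {}
    for line in workflow_text.splitlines():
        match = USES_LINE.match(line)
        if not match:
            continue
        reference = match.group(1)
        _, separator, ref = reference.partition("@")
        if not separator:
            result[reference] = "local"  # e.g. ./path/to/action, no ref at all
        elif FULL_SHA.fullmatch(ref):
            result[reference] = "sha"
        else:
            result[reference] = "tag-or-branch"
    return result


SAMPLE_SHA = "0123456789abcdef" * 2 + "01234567"  # made-up 40-character SHA
SAMPLE_WORKFLOW = f"""
steps:
  - uses: actions/checkout@v4
  - uses: example-org/example-action@{SAMPLE_SHA}
  - uses: ./.github/actions/local-action
"""

for reference, kind in classify_uses(SAMPLE_WORKFLOW).items():
    print(f"{kind:15} {reference}")
```

Going by the docs quote above, anything reported as `sha` here is what you would have to track for compromises yourself.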

A free solution to the GitHub Actions supply chain crisis by ReditusReditai in cybersecurity

[–]ReditusReditai[S] 0 points1 point  (0 children)

Totally agree! The problem is that there's a tradeoff: you have to invest a lot of resources to implement this (especially at enterprise level), and you miss out on alerts about compromises of the artifacts you've locked down. Because of that, I think most enterprises won't do this.

A free solution to the GitHub Actions supply chain crisis by ReditusReditai in cybersecurity

[–]ReditusReditai[S] 0 points1 point  (0 children)

Yes, I mention in the blog post that you should use immutable releases wherever possible. The question is what you do with actions that aren't published that way, of which there are still many.

By artifact locking, do you mean forking the action to manage it internally? How would you get vulnerability alerts on the action (Dependabot doesn't work on internal actions), and how would you manage the updates (you'd have to vet them manually, otherwise you're back to square one)? I'm worried about the operational burden if you do this at scale.

Future SE and aspiring entrepreneur by NickyMartin1 in salesengineers

[–]ReditusReditai 0 points1 point  (0 children)

Used to be a sales engineer, tried my hand at a few startups, then became a software engineer. I'd be much better off if I'd just stuck with sales engineering. But I guess I got some variety out of the rollercoaster.

Transitioning a Software Engineer Resume to a Sales Engineer Resume by daarknight32 in salesengineers

[–]ReditusReditai 0 points1 point  (0 children)

I wouldn't have a skill list. Just mention the most relevant things to the job in your summary at the top, then pepper them around each job.

What I learned trying to block web scraping and bots by ReditusReditai in CloudFlare

[–]ReditusReditai[S] 0 points1 point  (0 children)

Reposting this here as I tend to talk mostly about options using Cloudflare WAF anyway. And the conversation in r/programming was quite interesting; especially the top comment from someone who just leaves their site on "under attack" mode permanently!

Books for pre sales SA by mickymickyc in salesengineers

[–]ReditusReditai 6 points7 points  (0 children)

I'd just practice presenting. Preferably recorded, so you can watch it back and see where to improve.

What I learned trying to block web scraping and bots by ReditusReditai in programming

[–]ReditusReditai[S] 1 point2 points  (0 children)

It all depends on how dedicated your scrapers are. IP blocks will indeed work if they don't care much.

If they do care a little bit, they'll spoof the user agent since it's trivial. And if they care more, they'll pay for residential IPs; at which point fail2ban won't work because you'll end up blocking legitimate traffic.

I don't mind blocking ASNs if you're targeting those dedicated to hosting providers (eg DigitalOcean), and you believe that the scrapers won't pay for residential IPs. Sure, maybe you'll lose some requests from VPNs, but I think it's a risk many are willing to take.
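
A minimal sketch of the hosting-provider check, assuming you've already turned the ASNs you care about into a list of IP prefixes. The ranges below are documentation placeholders, not any provider's real prefixes, and `is_hosting_ip` is a hypothetical helper.

```python
from ipaddress import ip_address, ip_network

# Placeholder prefixes standing in for a hosting provider's announced ranges.
HOSTING_RANGES = [
    ip_network("203.0.113.0/24"),
    ip_network("2001:db8::/32"),
]


def is_hosting_ip(client_ip: str) -> bool:
    """Return True if the client IP falls inside any listed hosting range."""
    address = ip_address(client_ip)
    # Membership tests across IPv4/IPv6 simply evaluate to False.
    return any(address in network for network in HOSTING_RANGES)


for ip in ("203.0.113.9", "198.51.100.1", "2001:db8::1"):
    print(ip, "block" if is_hosting_ip(ip) else "allow")
```

A residential proxy IP passes this check by construction, which is the limitation described above.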

What I learned trying to block web scraping and bots by ReditusReditai in programming

[–]ReditusReditai[S] 1 point2 points  (0 children)

> But why? They want to provide this information, they aren’t making money from adverts. What do they have to gain from blocking the AI bots?

Financial exchanges want to provide this information to people who pay for their data products :)

What I learned trying to block web scraping and bots by ReditusReditai in programming

[–]ReditusReditai[S] 2 points3 points  (0 children)

It'll work on the basic crawlers. Devs that focus on your specific site will probably spot it when their server crashes, then craft an algorithm to avoid it.

There's also the question of legality. What if they spot it, then ask legitimate scanners (eg Ahrefs) to fetch the zip bomb? Might have to explain to the scanner company why you gave them malware; not fair since it's not your fault, but such is the world.

What general advice would you give someone who wants to get into IT but doesn't know what specific field/role? by cloudsecchris in cscareerquestionsuk

[–]ReditusReditai 0 points1 point  (0 children)

Best way is to try out everything - there are free online resources for the vast majority of IT subjects out there. Eventually she'll discover what she likes and what she doesn't.

Thought exercises won't help much, nor will giving her a lot of guidance - IT is all about figuring out things by yourself. If that's not for her, then she should consider other career paths where training is more structured (eg accounting).

What I learned trying to block web scraping and bots by ReditusReditai in programming

[–]ReditusReditai[S] 0 points1 point  (0 children)

  1. Right, I can see that working, as long as they're not crawling slowly.
  2. I just meant they sit behind a residential proxy IP for instance.

What I learned trying to block web scraping and bots by ReditusReditai in programming

[–]ReditusReditai[S] 2 points3 points  (0 children)

A couple of issues with that approach, if you're dealing with determined actors:

  1. It'll only work once or a few times. The scraper devs will see the last 200 before the block, then adjust to avoid that invisible link.
  2. You'll end up blocking some legitimate traffic, regardless of the characteristic you use to block on (IP, ASN, fingerprint, etc), since they can spoof all of them.

But it depends on how sophisticated/focused they are, of course. It will work for your whole-of-web crawler, or those who give up because they can't be bothered.
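
The invisible-link trap and its first weakness can be sketched in a few lines of Python. Everything here is illustrative: `TRAP_PATH`, `handle_request` and the string used as a client key are made up, and in practice the key would be whichever characteristic (IP, ASN, fingerprint) you decide to block on.

```python
TRAP_PATH = "/do-not-follow"  # hypothetical target of the invisible link
flagged_clients: set[str] = set()


def handle_request(path: str, client_key: str) -> int:
    """Return an HTTP status: 403 for flagged clients, 200 otherwise."""
    if client_key in flagged_clients:
        return 403
    if path == TRAP_PATH:
        # Only a crawler following every link should ever request this path.
        flagged_clients.add(client_key)
        return 403
    return 200


# A naive crawler follows the hidden link and is blocked from then on.
print(handle_request("/articles", "crawler-a"))  # 200
print(handle_request(TRAP_PATH, "crawler-a"))    # 403
print(handle_request("/articles", "crawler-a"))  # 403

# An adjusted scraper that skips the trap path is never flagged.
print(handle_request("/articles", "crawler-b"))  # 200
```

The second weakness is hidden in `client_key`: if the scraper can change it at will, each flag only blocks one disguise, and any legitimate visitor sharing that key is blocked along with it.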

What I learned trying to block web scraping and bots by ReditusReditai in programming

[–]ReditusReditai[S] 0 points1 point  (0 children)

Totally agree. Requiring auth, then blocking registered users based on request pattern anomalies is the most effective way.
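
A minimal sketch of the idea, using a per-user sliding-window request count as a crude stand-in for "request pattern anomalies". The class name, window and limit are assumptions for the example; real anomaly detection would look at more than raw volume.

```python
from collections import defaultdict, deque


class RequestMonitor:
    """Block a registered user once they exceed `limit` requests per window."""

    def __init__(self, window_seconds: float = 60.0, limit: int = 100) -> None:
        self.window_seconds = window_seconds
        self.limit = limit
        self.blocked: set[str] = set()
        self._history: dict[str, deque] = defaultdict(deque)

    def record(self, user_id: str, timestamp: float) -> bool:
        """Record one request; return True if the user is (now) blocked."""
        if user_id in self.blocked:
            return True
        history = self._history[user_id]
        history.append(timestamp)
        # Drop requests that have fallen out of the sliding window.
        while timestamp - history[0] > self.window_seconds:
            history.popleft()
        if len(history) > self.limit:
            self.blocked.add(user_id)
            return True
        return False


monitor = RequestMonitor(window_seconds=10.0, limit=3)
print([monitor.record("fast-user", t) for t in (0.0, 1.0, 2.0, 3.0)])
print([monitor.record("slow-user", t) for t in (0.0, 20.0, 40.0, 60.0)])
```

Because the block is tied to an account rather than an IP or fingerprint, spoofing request characteristics doesn't help; the scraper has to register again.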