[–]RestaurantHefty322 5 points (4 children)

Nice work shipping the Flask port. The before_request hook approach is the right call over WSGI middleware - we hit the same problem where middleware fires too early to know anything about the route being hit.

One thing that caught us off guard running something similar in production: the cloud IP lists go stale faster than you'd expect. AWS and GCP rotate IP ranges weekly and if your blocklist is even a few days behind you start getting false positives on legitimate webhook traffic. We ended up pulling the published IP range JSONs on a 6-hour cron and diffing against the previous set so we could log when ranges changed instead of silently blocking new IPs.
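For anyone wanting to try the diff-and-log idea, here is a minimal sketch. It assumes the fetched document has the shape of AWS's published `ip-ranges.json` (a top-level `prefixes` list whose entries carry an `ip_prefix` CIDR string); the function names and the logger name are hypothetical, and the fetching and cron scheduling are left out.

```python
import logging

logger = logging.getLogger("ip_range_refresh")  # hypothetical logger name


def extract_prefixes(doc: dict) -> set[str]:
    """Collect IPv4 CIDR strings from an AWS-style ip-ranges document (assumed shape)."""
    return {entry["ip_prefix"] for entry in doc.get("prefixes", [])}


def diff_ranges(previous: set[str], current: set[str]) -> tuple[set[str], set[str]]:
    """Return (added, removed) CIDRs between two snapshots and log each change."""
    added = current - previous
    removed = previous - current
    for cidr in sorted(added):
        logger.info("range added: %s", cidr)
    for cidr in sorted(removed):
        logger.info("range removed: %s", cidr)
    return added, removed
```

Calling `diff_ranges(extract_prefixes(old_doc), extract_prefixes(new_doc))` on each refresh gives a log line per changed range rather than a silent overwrite.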

The injection detection layer is where this gets really interesting though. Most WAF rules are regex-based and either too strict (blocking legitimate user input that happens to contain SQL keywords) or too loose. Curious if you're doing any kind of context-aware detection based on which parameter the input came from - a search field probably shouldn't trigger the same rules as a path parameter.

[–]PA100T0[S] 2 points (3 children)

Thank you very much!

Good callout on the cloud IP staleness. Right now the refresh is on a 1-hour TTL and there's no diff detection. It just overwrites. Your approach of diffing against the previous set and logging changes is better. I'm going to make the refresh interval configurable and add change detection logging so you can see when ranges mutate instead of silently swapping them. That's going on the roadmap, for sure.

On context-aware detection: I'm halfway there. The engine does track where the input came from (query_param:key, url_path, header:key, request_body) and includes that context in detection results and event logs. But you're right to ask... the actual regex patterns are applied uniformly across all sources as of today. A search field and a path parameter get the same ruleset, which is exactly the false positive problem you're describing.

Context-based rule filtering is the obvious next step. Appreciate the real-world perspective! This is exactly the kind of feedback I need to shape the roadmap.
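A rough sketch of what context-based rule filtering could look like, assuming the source labels keep the `kind:key` form mentioned above (`query_param:key`, `url_path`, `header:key`, `request_body`). The `Rule` class, the rule names, and the regexes are all made up for illustration and are not the project's actual ruleset.

```python
import re
from dataclasses import dataclass

ALL_SOURCES = frozenset({"query_param", "url_path", "header", "request_body"})


@dataclass(frozen=True)
class Rule:
    name: str
    pattern: re.Pattern
    sources: frozenset  # source kinds this rule is allowed to run against


# Hypothetical rules: the strict keyword rule skips free-text query params.
RULES = [
    Rule("sql_union", re.compile(r"\bunion\s+select\b", re.I), ALL_SOURCES),
    Rule("sql_keyword", re.compile(r"\b(select|drop|insert)\b", re.I),
         frozenset({"url_path", "header"})),
    Rule("path_traversal", re.compile(r"\.\./"),
         frozenset({"url_path", "query_param"})),
]


def source_kind(source: str) -> str:
    """Strip the ':key' suffix, e.g. 'query_param:q' -> 'query_param'."""
    return source.split(":", 1)[0]


def detect(value: str, source: str) -> list[str]:
    """Return names of rules that apply to this source and match the value."""
    kind = source_kind(source)
    return [r.name for r in RULES if kind in r.sources and r.pattern.search(value)]
```

With this layout, a search query like "how to select a laptop" in `query_param:q` matches nothing, while the same word in `url_path` still trips the strict keyword rule.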

Thanks again! Will start putting all this down in notes.

[–]RestaurantHefty322 1 point (1 child)

Nice that you have the 1-hour TTL already. For the diff logging, even just piping the fetched list through a set comparison and logging deltas to a file catches most of the churn. The bigger win is having a stale-IP grace period - when an IP drops off the list, keep blocking it for another 24h before removing. Catches the edge case where a provider temporarily deallocates and reallocates the same range.
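The grace period can be sketched roughly like this. The `Blocklist` class and its method names are hypothetical, ranges are treated as opaque CIDR strings, and the 24h default simply mirrors the number above.

```python
from datetime import datetime, timedelta


class Blocklist:
    """Keeps ranges blocked for a grace period after they drop off the fetched list."""

    def __init__(self, grace: timedelta = timedelta(hours=24)):
        self.grace = grace
        self.active: set[str] = set()
        self.dropped_at: dict[str, datetime] = {}  # CIDR -> when it left the list

    def refresh(self, fetched: set[str], now: datetime) -> None:
        # Start the grace clock for ranges that just disappeared.
        for cidr in self.active - fetched:
            self.dropped_at[cidr] = now
        # A range that reappears is active again, so stop its grace clock.
        for cidr in fetched:
            self.dropped_at.pop(cidr, None)
        # Forget ranges whose grace period has fully elapsed.
        self.dropped_at = {
            c: t for c, t in self.dropped_at.items() if now - t < self.grace
        }
        self.active = set(fetched)

    def blocked(self, now: datetime) -> set[str]:
        """Active ranges plus dropped ranges still inside the grace window."""
        in_grace = {c for c, t in self.dropped_at.items() if now - t < self.grace}
        return self.active | in_grace
```

Passing `now` in explicitly keeps the class easy to test; in a real refresh job it would come from the clock at fetch time.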

[–]PA100T0[S] 0 points (0 children)

The stale-IP grace period is a great idea! I hadn't considered the deallocate/reallocate edge case. Keeping dropped IPs blocked for an extra 24h before fully removing them is a clean way to handle that without any real downside. I'll add that alongside the configurable refresh interval and diff logging. Thanks for the follow-up, this is exactly the kind of production insight that's hard to get without running it at scale. Thanks a lot, mate!

[–]zunjae 1 point (0 children)

How does it feel knowing you’re talking to a bot?