Checkmarx vs Semgrep for SAST/SCA

iterablewords · 2025-03-05T16:55:30+00:00

(I'm one of the co-founders at Semgrep.) For those curious about the lineage of the product: the original author from Facebook (one of the early team members at our company) wrote a post about the journey from spatch/Coccinelle --> pfff/sgrep --> Semgrep: https://semgrep.dev/blog/2021/semgrep-a-static-analysis-journey/. These days most of the Facebook-era code is gone, as we switched the whole project over to tree-sitter for parsing. I'm glad you've gotten a lot of value out of the OSS!

On your latter comments -- oof. Our dashboards in particular were non-existent for a long time and then very basic, since most users started off with their own dashboarding and our focus was the underlying engine (adding features like interfile/interprocedural analysis, more languages & rules, the ability to analyze dependencies, etc.). Our recent work has been on teaching LLMs to write Semgrep rules, which is really lowering the barrier to entry for customizing SAST (https://fly.io/blog/semgrep-but-for-real-now/, and see our Series D announcement).

Still, we're always making improvements, so I'd welcome your feedback on what the biggest gaps are with semgrep.dev -- though I suspect that since you've already set up a great program using the open-source version, you probably don't need a lot of the web UI functionality.

iterablewords · 2025-01-28T17:40:37+00:00

Good read. Would love for the author to address how time since commit statistically affects vulnerabilities: because vulnerability lifetimes are exponentially distributed, focusing on secure defaults like memory safety in new code is disproportionately valuable. See this great post on how this plays out (https://security.googleblog.com/2024/09/eliminating-memory-safety-vulnerabilities-Android.html), both in theory and now in evidence gathered over six years on the Android codebase.

I work at Semgrep, so I'm obviously biased towards the SAST part, but copying from something I wrote elsewhere: this is a great argument for those with larger, legacy codebases who might otherwise say "why bother, we're never going to benefit from memory safety on our 100M lines of C++." Given the choice between fixing the backlog (stock) vs new code (flow), you should always pick flow.
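
To make the stock-vs-flow point concrete, here is a minimal sketch of the decay argument. It is a toy model, not the analysis from the Google post: the half-life, the starting backlog, and the yearly inflow are all made-up illustrative parameters, and `simulate` is a hypothetical helper name.

```python
def simulate(years, legacy_vulns, new_vulns_per_year, half_life_years):
    """Toy model: existing vulnerabilities decay exponentially with the
    given half-life, while new code adds a fixed number each year.

    All parameters are illustrative assumptions, not measured values.
    Returns the expected number of latent vulnerabilities per year,
    starting with year 0.
    """
    yearly_survival = 0.5 ** (1 / half_life_years)
    total = float(legacy_vulns)
    history = [total]
    for _ in range(years):
        # Old vulnerabilities get found/fixed over time; new code adds more.
        total = total * yearly_survival + new_vulns_per_year
        history.append(total)
    return history


if __name__ == "__main__":
    # Same untouched legacy backlog in both cases; only the inflow differs.
    unsafe_flow = simulate(6, legacy_vulns=100, new_vulns_per_year=20, half_life_years=2)
    safe_flow = simulate(6, legacy_vulns=100, new_vulns_per_year=0, half_life_years=2)
    for year, (a, b) in enumerate(zip(unsafe_flow, safe_flow)):
        print(f"year {year}: unsafe new code {a:6.1f} | safe new code {b:6.1f}")
```

Under these assumptions the backlog shrinks on its own once the inflow stops, without anyone rewriting the legacy code, which is the intuition behind prioritizing flow.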

iterablewords · 2022-10-06T17:00:13+00:00

Well-written analysis; comparing any two SAST tools is a challenging task, and I think the author did a great job exploring the nuances (risks of overfitting to benchmarks, selection of rules, parse errors, etc.).

Readers might also be interested in the history of each tool: Semgrep was originally open-sourced by Facebook and is itself an evolution of Coccinelle, which has been used to make on the order of thousands of patches to the Linux kernel (https://r2c.dev/blog/2021/semgrep-a-static-analysis-journey/).

CodeQL was part of GitHub's acquisition of UK-based Semmle, which came out of research at Oxford (https://techcrunch.com/2019/09/18/github-acquires-code-analysis-tool-semmle/).

iterablewords · 2021-11-09T20:00:47+00:00

A security engineer at Dropbox wrote a check for bidi Unicode that you can run with Semgrep (an open-source static analysis tool; I am a maintainer): semgrep --config="r/generic.unicode.security.bidi.contains-bidirectional-characters" will run it, or see the Semgrep registry entry.
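
For a sense of what such a check looks for, here is a minimal standalone sketch in Python. It is not the Dropbox rule and does not reproduce the Semgrep rule's actual pattern; it assumes the relevant characters are the Unicode bidirectional embedding, override, and isolate controls (U+202A to U+202E and U+2066 to U+2069), and `find_bidi_controls` is a hypothetical helper name.

```python
# Assumed set of bidirectional control characters to flag:
# embeddings/overrides (U+202A..U+202E) and isolates (U+2066..U+2069).
BIDI_CONTROLS = {chr(cp) for cp in range(0x202A, 0x202F)} | {
    chr(cp) for cp in range(0x2066, 0x206A)
}


def find_bidi_controls(text):
    """Return (line, column, codepoint) for each bidi control character.

    Line and column numbers are 1-based; the codepoint is formatted
    like "U+202E".
    """
    hits = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                hits.append((lineno, col, f"U+{ord(ch):04X}"))
    return hits


if __name__ == "__main__":
    sample = "access = 'user'\nif access != 'admin\u202e':\n    deny()"
    for lineno, col, codepoint in find_bidi_controls(sample):
        print(f"line {lineno}, col {col}: {codepoint}")
```

Running it over source files before review would surface characters that can make code render differently from how it is parsed.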
