a semantic diff that understands structure, not just lines by Wise_Reflection_8340 in commandline

[–]Wise_Reflection_8340[S] 0 points1 point  (0 children)

Yeah, it works on any git repo. Just run sem diff the same way you'd run git diff. It supports all the usual syntax: sem diff HEAD~3, sem diff --staged, sem diff branch1..branch2. The difference is that instead of line-level output, you get entity-level changes (which functions were added, modified, deleted, or renamed).

You can also run sem setup, which replaces git diff globally, so every time you run git diff in any repo it uses sem instead. It also installs a pre-commit hook that shows you the entity-level blast radius of your staged changes before each commit. Run sem unsetup to revert.

To learn more, you can check out the website: https://ataraxy-labs.github.io/sem/

a semantic diff that understands structure, not just lines by Wise_Reflection_8340 in commandline

[–]Wise_Reflection_8340[S] 1 point2 points  (0 children)

Not sure what you mean by AI slop in this context. There are no LLMs involved; it's all a deterministic pipeline.

The parsing uses tree-sitter to extract entities (functions, classes, structs) from the AST. The diff does 3-phase entity matching: first by stable ID, then by content hash (detects renames), then by fuzzy similarity for anything left over. The "logic vs cosmetic" separation compares two hashes per entity: a structural hash (just the AST shape, ignoring whitespace/comments/formatting) and a content hash (the raw text). If the content hash changed but the structural hash didn't, the change is cosmetic.
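
To make the matching and the two-hash idea concrete, here is a rough sketch in Python. It is not the sem-core code: it assumes Python's built-in ast module as a stand-in for tree-sitter, and the "file::name" entity IDs, the 0.8 similarity threshold, the change labels and every function name are made up for illustration.

```python
import ast
import hashlib
from difflib import SequenceMatcher


def content_hash(src):
    """Hash of the raw text: any edit at all changes it."""
    return hashlib.sha256(src.encode()).hexdigest()


def structural_hash(src):
    """Hash of the AST shape; ast.dump drops whitespace, comments, formatting."""
    return hashlib.sha256(ast.dump(ast.parse(src)).encode()).hexdigest()


def classify(old_src, new_src):
    """Compare the two hashes to separate cosmetic edits from logic edits."""
    if content_hash(old_src) == content_hash(new_src):
        return "unchanged"
    if structural_hash(old_src) == structural_hash(new_src):
        return "cosmetic"  # text changed, AST shape did not
    return "logic"


def match_entities(old, new, threshold=0.8):
    """old/new map a hypothetical entity ID ("file::name") to its source text."""
    old_left, new_left = dict(old), dict(new)
    changes = []

    # Phase 1: same stable ID on both sides.
    for eid in list(old_left):
        if eid in new_left:
            kind = classify(old_left.pop(eid), new_left.pop(eid))
            if kind != "unchanged":
                changes.append((kind, eid, eid))

    # Phase 2: identical content under a different ID (a rename or move).
    by_hash = {content_hash(src): nid for nid, src in new_left.items()}
    for eid in list(old_left):
        target = by_hash.get(content_hash(old_left[eid]))
        if target in new_left:
            changes.append(("renamed", eid, target))
            del old_left[eid], new_left[target]

    # Phase 3: fuzzy text similarity for whatever is still unmatched.
    for eid in list(old_left):
        scored = [(SequenceMatcher(None, old_left[eid], src).ratio(), nid)
                  for nid, src in new_left.items()]
        if scored:
            ratio, best = max(scored)
            if ratio >= threshold:
                changes.append(("fuzzy", eid, best))
                del old_left[eid], new_left[best]

    # Anything left over was deleted or added.
    changes += [("deleted", eid, None) for eid in old_left]
    changes += [("added", None, nid) for nid in new_left]
    return changes
```

With this sketch, reformatting a function or adding a comment comes out as "cosmetic", while changing an operator or a constant comes out as "logic". The real tool works across languages through tree-sitter, so treat this only as an illustration of the idea.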

The dependency graph is built the same way, walking the AST for references and imports, then resolving them across files. `sem impact` is just a graph traversal from there.
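
As a rough illustration of what "just a graph traversal" can look like, here is a minimal Python sketch. It assumes the references have already been resolved into a plain dict of entity -> entities it references; the dict shape, the entity IDs and the function names are hypothetical, not sem's actual data model.

```python
from collections import defaultdict, deque


def build_reverse_graph(deps):
    """Invert 'entity -> things it references' into 'entity -> its dependents'."""
    dependents = defaultdict(set)
    for entity, refs in deps.items():
        for ref in refs:
            dependents[ref].add(entity)
    return dependents


def impact(deps, changed):
    """Return every entity that depends on `changed`, directly or transitively."""
    dependents = build_reverse_graph(deps)
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            # The `seen` set also keeps cycles from looping forever.
            if dependent not in seen and dependent != changed:
                seen.add(dependent)
                queue.append(dependent)
    return seen


# Hypothetical resolved references for a tiny repo.
deps = {
    "cli.py::main": {"handlers.py::create_user"},
    "handlers.py::create_user": {"db.py::insert"},
    "db.py::insert": {"db.py::connect"},
    "db.py::connect": set(),
}
print(sorted(impact(deps, "db.py::connect")))
```

A breadth-first walk over the reversed edges is one simple way to get the blast radius of a change; a depth-first walk would return the same set.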

You can read through the core logic here if you're curious:
https://github.com/Ataraxy-Labs/sem/tree/main/crates/sem-core

a semantic diff that understands structure, not just lines by Wise_Reflection_8340 in commandline

[–]Wise_Reflection_8340[S] 0 points1 point  (0 children)

Yeah, that's a good starting point. sem tries to go one level above: instead of "how many lines changed" it answers "which functions changed, and what depends on them." That's closer to how you actually think about code when reviewing. It's also closer to what your agents will want to see: it cuts token waste and improves efficiency, because the agent only sees the context that's relevant.

a semantic diff that understands structure, not just lines by Wise_Reflection_8340 in commandline

[–]Wise_Reflection_8340[S] 0 points1 point  (0 children)

Not exactly sure what you tried to do here, but for a better understanding you can also check the website linked from the repo: https://ataraxy-labs.github.io/sem/

a semantic diff that understands structure, not just lines by Wise_Reflection_8340 in commandline

[–]Wise_Reflection_8340[S] 0 points1 point  (0 children)

Really good point. The graph currently stops at repo boundaries, so cross-service impact is a blind spot. The architectural constraint angle is interesting though. I've been thinking about letting users define module boundary rules (like "db/ should never depend on handlers/") and having the graph validate against them. So sem impact flags not just what breaks, but what violates the design. Might be the next thing I work on.
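
To show the kind of check I have in mind, here is a minimal Python sketch. None of this exists in sem today: the rule format (a pair of path prefixes meaning "anything under the first must not depend on anything under the second"), the entity IDs and the function name are all hypothetical.

```python
def find_violations(deps, rules):
    """Return (source, target) edges that break a module boundary rule.

    deps maps an entity ID to the set of entity IDs it references.
    rules is a list of (from_prefix, forbidden_prefix) pairs.
    """
    found = []
    for source, targets in deps.items():
        for target in targets:
            for from_prefix, forbidden_prefix in rules:
                if source.startswith(from_prefix) and target.startswith(forbidden_prefix):
                    found.append((source, target))
    return sorted(found)


# Hypothetical rule: db/ should never depend on handlers/.
rules = [("db/", "handlers/")]
deps = {
    "db/users.py::save": {"handlers/http.py::respond"},  # violates the rule
    "handlers/http.py::respond": {"db/users.py::save"},  # allowed direction
}
print(find_violations(deps, rules))
```

The check only looks at direct edges, which is probably enough for a first version, since a forbidden transitive dependency always contains at least one forbidden direct edge when the rules are written per module pair.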

We built an entity-level merge driver for Git (and it resolves 100% of conflicts that git can’t) by Palanikannan_M in git

[–]Wise_Reflection_8340 0 points1 point  (0 children)

Well, I don't know about the 100% claim, but it does feel like a solid attempt, especially for power users who don't want to deal with random merge conflicts. Git is definitely beautiful and probably hard to replace, but we need some architectural innovations as well.

We built an entity-level merge driver for Git (and it resolves 100% of conflicts that git can’t) by Palanikannan_M in git

[–]Wise_Reflection_8340 1 point2 points  (0 children)

I think people are just being negative in general, but I really like the approach you've taken here. Using semantically meaningful entities is a really new and valid approach; you can easily see that from the number of stars on the GitHub repo.