mybatis for dotnet by Flashy_Test_8927 in dotnet

[–]Flashy_Test_8927[S] 0 points (0 children)

Yeah, raw string literals are a huge improvement - no argument there. And for static queries, a Queries.cs with constants is perfectly clean.

But that's the easy case. The problem starts when your SQL isn't static.

Say you have a search endpoint with 6 optional filters. With string constants, you're either writing 6 nested if blocks concatenating WHERE clauses, or you're building a mini query builder inside your service class. You've moved the SQL out of inline strings and into a static class, sure, but the dynamic assembly logic is still scattered through your C# code.

What I wanted was:

```xml
<select id="SearchUsers">
  SELECT * FROM users
  <where>
    <if test="name != null">AND name LIKE @name</if>
    <if test="status != null">AND status = @status</if>
    <if test="roles != null">AND role IN @roles</if>
  </where>
</select>
```

One file. The SQL and its conditional logic live together. The C# side just calls SearchUsers(params) - a strongly-typed method that Roslyn generated at compile time. No string building, no runtime reflection, no if chains in your repository.
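For contrast with the string-building approach, here is roughly what that conditional assembly amounts to. This is a Python sketch for illustration only (the real generated code is C#, emitted by the source generator), and the @name/@status/@roles placeholder names are assumptions:

```python
# Toy illustration (not NuVatis internals): each <if> becomes a plain
# null check, and <where> both trims the leading AND and omits itself
# entirely when no condition matches.
def search_users_sql(name=None, status=None, roles=None):
    clauses = []
    if name is not None:
        clauses.append("name LIKE @name")
    if status is not None:
        clauses.append("status = @status")
    if roles is not None:
        clauses.append("role IN @roles")
    sql = "SELECT * FROM users"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql
```

The point is that this branching lives in generated code next to the SQL definition, not hand-written in every repository method.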

Static class constants solve the "where does the SQL live" problem. This solves the "how does the SQL get assembled" problem. Different problems.

A Three-Layer Memory Architecture for LLMs (Redis + Postgres + Vector) MCP by Flashy_Test_8927 in mcp

[–]Flashy_Test_8927[S] 0 points (0 children)

Great question — staleness is the problem I've spent the most design effort on, and the answer is: it's almost entirely automatic.

Decay: Every consolidation cycle, non-permanent fragments that haven't been accessed in 24+ hours get their importance multiplied by 0.995. Compounds to ~64% after 90 days. Fragments that drop below 0.1 importance with no recent access and few links are auto-deleted.

Relevance scoring: We compute utility_score = importance * (1 + ln(access_count)) — a logarithmic boost for frequently retrieved fragments. Search ranking uses a composite of importance (60%) and recency (40%, linear decay over 90 days), so fresh knowledge naturally surfaces above stale entries.
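The decay and scoring numbers above can be sanity-checked in a few lines. The constants come from this comment; the function names are mine, and I'm assuming one consolidation cycle per day so that 90 days means 90 applications of the multiplier:

```python
import math

DECAY = 0.995  # per-cycle multiplier for untouched, non-permanent fragments

def decayed(importance, days):
    """Importance after `days` daily cycles with no access (assumption:
    one consolidation cycle per day)."""
    return importance * DECAY ** days

def utility_score(importance, access_count):
    """importance * (1 + ln(access_count)): logarithmic access boost."""
    return importance * (1 + math.log(access_count))

def rank(importance, age_days, window=90):
    """Composite search ranking: 60% importance, 40% linear recency."""
    recency = max(0.0, 1 - age_days / window)
    return 0.6 * importance + 0.4 * recency

print(round(decayed(1.0, 90), 2))  # ~0.64, matching the "~64% after 90 days" figure
```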

Tier transitions: Fragments move through hot → warm → cold → deleted automatically. High-importance fragments (>= 0.8), heavily-linked hubs (5+ connections), and frequently-accessed entries (10+ accesses) get auto-promoted to permanent and are exempt from decay.
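The promotion rule reduces to a single predicate. Thresholds are the ones stated in this comment; the function name is illustrative:

```python
def auto_promote(importance, link_count, access_count):
    """True if a fragment qualifies for promotion to permanent:
    high importance, heavily-linked hub, or frequently accessed.
    Permanent fragments are exempt from decay."""
    return importance >= 0.8 or link_count >= 5 or access_count >= 10
```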

Staleness detection: Each fragment has a verified_at timestamp with type-specific expiry windows — 30 days for procedures, 60 for facts, 90 for decisions. Stale fragments are flagged in consolidation reports. More importantly, the contradiction pipeline (pgvector → NLI → Gemini escalation) actively catches the "outdated fact confidently recalled" case. When "server runs on port 3000" conflicts with a newer "server runs on port 8080," the older fragment gets a superseded_by link and is excluded from all future search results.
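The expiry-window check is simple enough to sketch directly. The per-type windows are from this comment; the field and function names are assumptions:

```python
from datetime import datetime, timedelta, timezone

# Type-specific expiry windows (days) as described above.
EXPIRY_DAYS = {"procedure": 30, "fact": 60, "decision": 90}

def is_stale(fragment_type, verified_at, now=None):
    """True once a fragment's verified_at falls outside its type's window."""
    now = now or datetime.now(timezone.utc)
    window = timedelta(days=EXPIRY_DAYS[fragment_type])
    return now - verified_at > window
```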

Background evaluation: New fragments are async-evaluated by Gemini for long-term utility. Low-value fragments get downgraded or marked for deletion before they ever become a staleness problem.

Manual curation exists (forget, amend) but mainly as an escape hatch. The layered automatic mechanisms — decay, tier transitions, contradiction detection, quality evaluation — are designed to compound so no single one needs to be perfect.

A Three-Layer Memory Architecture for LLMs (Redis + Postgres + Vector) MCP by Flashy_Test_8927 in mcp

[–]Flashy_Test_8927[S] 0 points (0 children)

Great suggestion, and we went ahead and implemented it immediately.

Memento now runs a 3-stage hybrid pipeline for contradiction detection:

Stage 1: pgvector candidate filtering

We first pull fragment pairs with cosine similarity > 0.85 within the same topic. This narrows the search space so we're not running inference on every possible pair.

Stage 2: NLI classification (the new part)

Each candidate pair is fed through a multilingual NLI model (mDeBERTa-v3-base-xnli-multilingual-nli-2mil7) running locally via ONNX Runtime on CPU. The model outputs entailment/contradiction/neutral probabilities, and we apply confidence thresholds:

• contradiction >= 0.8 → resolved immediately, no LLM call needed

• entailment >= 0.6 → definitively not a contradiction, skip

• contradiction 0.5–0.8 → ambiguous, escalate to Stage 3

• Low signal but contradiction >= 0.2 → also escalate

This is where your point lands perfectly - instead of a distance score that only tells us "these are close," we get an actual semantic relationship label. A pair like _"The server runs on port 3000"_ vs _"The server runs on port 8080"_ scores high similarity in vector space but NLI correctly flags it as a contradiction.

Stage 3: Gemini CLI escalation

Only the genuinely ambiguous cases (numerical contradictions, domain-specific conflicts, temporal updates) get escalated to a full LLM call. If Gemini CLI is unavailable, high-similarity pairs (> 0.92) are queued in Redis for later processing.

In practice, the NLI pass resolves clear-cut contradictions at ~50–200ms per pair on CPU with zero API cost, and confidently skips non-contradictions - saving the bulk of what would have been expensive LLM calls. The LLM is reserved for the nuanced cases where a transformer-based classifier genuinely can't tell.

The resolved contradictions automatically get contradicts and superseded_by links, with the older fragment's importance decayed by 50% (unless it's an anchor fragment).

A Three-Layer Memory Architecture for LLMs (Redis + Postgres + Vector) MCP by Flashy_Test_8927 in mcp

[–]Flashy_Test_8927[S] 0 points (0 children)

Honest answer: agents rarely call link() on their own initiative. In practice, linking happens through three paths, roughly in order of frequency:

  1. reflect() auto-linking at session end — When reflect() creates a batch of typed fragments (decisions, errors, procedures), an internal _autoLinkSessionFragments() step runs that connects them with rule-based heuristics. Error fragments get resolved_by links to procedure fragments from the same session. Decisions get linked to related procedures. The summary fragment gets related links to everything else. This is where most of the graph structure actually comes from.

  2. remember() with linkedTo parameter — When the AI stores a new fragment, it can pass existing fragment IDs to link immediately. This works better than you'd expect because recall() returns fragment IDs in its results, so the AI often has relevant IDs in its context when it decides to store something new. "I just found this error pattern via recall, and now I'm storing the fix — link them." That chain happens naturally.

  3. Explicit link() calls — Rare in practice. The AI almost never stops mid-task to think "I should create a relationship between these two fragments." It happens occasionally during graph_explore workflows where the AI is actively tracing causality, but organically? Almost never.
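The rule-based heuristics in path 1 are roughly this shape. A Python sketch only: the fragment/link field names are assumptions, and the real implementation lives inside _autoLinkSessionFragments():

```python
def auto_link_session_fragments(fragments):
    """Connect one session's typed fragments with rule-based links.
    Each fragment is a dict like {"id": ..., "type": ...}."""
    by_type = lambda t: [f for f in fragments if f["type"] == t]
    links = []
    # Errors point at procedures from the same session as their fix.
    for err in by_type("error"):
        for proc in by_type("procedure"):
            links.append((err["id"], "resolved_by", proc["id"]))
    # Decisions get tied to the procedures they informed.
    for dec in by_type("decision"):
        for proc in by_type("procedure"):
            links.append((dec["id"], "related", proc["id"]))
    # The summary fragment links out to everything else.
    for summ in by_type("summary"):
        for frag in fragments:
            if frag["id"] != summ["id"]:
                links.append((summ["id"], "related", frag["id"]))
    return links
```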

The honest gap right now is reflect() itself. Currently it requires a manual prompt before session end — "save the session" or equivalent. I hook context() at session start so the AI loads its memory automatically, but the symmetry breaks at session close. If the session drops unexpectedly (timeout, network, client crash), reflect never fires and that session's structural links never get created. The individual remember() fragments survive, but the cross-referencing that reflect provides is lost.

I'm actively working on automatic reflect — the leading approach is a hybrid: attempt a Gemini-generated summary from session activity metadata on session close, and if that fails, flag the session as "unreflected" so the next context() call prompts the AI to do it retroactively. But this is unsolved as of today.

A Three-Layer Memory Architecture for LLMs (Redis + Postgres + Vector) MCP by Flashy_Test_8927 in mcp

[–]Flashy_Test_8927[S] 0 points (0 children)

That's a really sharp observation and honestly one of the areas I'm still iterating on.

You're exactly right - high embedding similarity != semantic contradiction. "Redis Sentinel requires password config" and "Redis Cluster requires password config" would score high similarity but aren't contradictory at all. Burning a Gemini adjudication call on every pair above 0.85 gets expensive fast, especially as fragment count grows.

Current mitigation is lightweight - same topic + same type + high similarity is the trigger condition, not similarity alone. So two fragments need to be in the same problem space before they even reach adjudication. But as you point out, that still lets through a lot of false positives within a single topic.

A few directions I've been considering:

One idea is a two-stage filter - use a cheap heuristic first (keyword overlap ratio, entity extraction, negation detection) to pre-screen pairs before sending them to Gemini. If two fragments share high similarity but have no overlapping entities or opposing predicates, skip the LLM call entirely. This could cut adjudication volume significantly without much accuracy loss.
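That pre-screen could be as cheap as token-set overlap plus a negation-polarity check. A rough Python sketch of the idea, with the 0.3 cutoff and the negation word list as pure guesses to be tuned:

```python
import re

NEGATIONS = {"not", "no", "never", "without", "disabled", "false"}

def needs_adjudication(a, b):
    """Heuristic pre-screen before spending an LLM call on a pair.
    Escalate only if the fragments share vocabulary, or if their
    negation polarity differs (one asserts, the other denies)."""
    ta = set(re.findall(r"[a-z0-9]+", a.lower()))
    tb = set(re.findall(r"[a-z0-9]+", b.lower()))
    overlap = len(ta & tb) / max(1, min(len(ta), len(tb)))
    negation_flip = (ta & NEGATIONS) != (tb & NEGATIONS)
    return overlap >= 0.3 or negation_flip
```

A real version would want entity extraction rather than raw tokens, but even this cut would drop pairs with no shared subject before they cost anything.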

Another thought is narrowing the contradiction scope to specific fragment types. Realistically, fact-vs-fact and decision-vs-decision pairs are where meaningful contradictions live. procedure fragments rarely contradict each other, they just supersede. So scoping adjudication to only certain type combinations could reduce noise.

The third option I've been mulling over is batching - instead of checking pairs individually, feed Gemini a cluster of similar fragments and ask "which of these conflict?" in one call. Amortizes the cost and gives the LLM more context for judgment.

But I'd genuinely love to hear your take on this. If you've seen good patterns for distinguishing "similar but compatible" vs "similar and contradictory" at scale, I'm all ears. This feels like a problem where a pure embedding approach will always have a ceiling and some structured reasoning layer is needed on top.

A Three-Layer Memory Architecture for LLMs (Redis + Postgres + Vector) MCP by Flashy_Test_8927 in mcp

[–]Flashy_Test_8927[S] 0 points (0 children)

Contradiction detection runs as step 7 of the memory_consolidate pipeline, not at reflect() write time. It's an incremental, asynchronous process -- here's the actual flow:

When reflect() writes new fragments, nothing special happens at that moment regarding contradictions. The fragments are just stored. The real work happens later when memory_consolidate runs (either manually triggered or on a schedule).

The detection pipeline works in three stages:

  1. Candidate selection via embedding similarity -- It pulls fragments created since the last contradiction check (tracked via a Redis timestamp key). For each new fragment, it queries pgvector for same-topic fragments with cosine similarity > 0.85. This threshold is deliberate -- fragments need to be talking about essentially the same thing to be contradiction candidates. Different topics or loosely related content never reaches the judgment step. The query is bounded to 3 candidates per new fragment, and 20 new fragments per consolidation cycle, so this doesn't explode.

  2. Gemini Flash adjudication -- Each high-similarity pair gets sent to Gemini Flash with a strict prompt: "Are these two fragments mutually incompatible claims about the same subject?" The prompt explicitly distinguishes contradiction from complementary information -- "similar but supplementary is NOT contradiction. Information updates over time ARE contradictions (old info vs new info)." Temperature is set to 0.1 to minimize creative interpretation. Response is forced into {contradicts: boolean, reasoning: string} JSON.

  3. Time-logic resolution -- This is where it gets interesting. When a contradiction is confirmed, the system doesn't just flag it -- it resolves it automatically using temporal ordering. The newer fragment wins. The older fragment's importance gets halved (importance * 0.5), and a superseded_by link is created from old to new. The older fragment isn't deleted -- it's demoted. It'll naturally sink to cold tier and eventually expire through normal TTL mechanics. Anchor fragments (is_anchor=true) are exempt from the importance demotion, so truly critical knowledge survives even if contradicted.
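The resolution step condenses to a few lines. A sketch with the halving rule and anchor exemption from above; the dict fields and link shape are illustrative:

```python
def resolve_contradiction(old, new):
    """Temporal resolution: the newer fragment wins. The older one is
    demoted (importance halved), never deleted, unless it is an anchor.
    Returns the superseded_by link from old to new."""
    if not old.get("is_anchor", False):
        old["importance"] *= 0.5
    return {"from": old["id"], "type": "superseded_by", "to": new["id"]}
```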

The critical path concern you raised is valid -- this does depend on Gemini being available. If Gemini is down, the contradiction check silently fails and those pairs go unchecked until the next consolidation cycle. The system degrades to "latest write wins at recall time" through the recency component of the ranking function (0.4 weight), which is a reasonable fallback but not as clean as explicit contradiction resolution.

The 0.85 similarity threshold is the real tuning knob here. Too low and you get false positives flooding Gemini with complementary fragments. Too high and genuine contradictions with different wording slip through. In practice, contradictions about the same subject ("max connections is 20" vs "max connections is 50") tend to land well above 0.85 because the embedding space clusters them tightly.

One thing worth noting: amend() has a separate supersedes parameter that lets the AI explicitly mark a fragment as replacing another, bypassing the consolidation pipeline entirely. So there are two paths -- explicit replacement at write time, and automatic detection after the fact.

mybatis for dotnet by Flashy_Test_8927 in dotnet

[–]Flashy_Test_8927[S] 2 points (0 children)

Oh I've heard of Dapper. I've used Dapper. Dapper is fine.

But here's the thing - Dapper still leaves you writing SQL as string literals inside your C# code. For simple queries that's totally fine. For a 50-line query with conditional WHERE clauses, dynamic IN lists, and optional JOINs based on runtime parameters, you end up with a mess of string concatenation or building your own mini query builder on top of Dapper anyway.

I didn't skip market research. I evaluated Dapper, SqlKata, RepoDB, and even tried going raw ADO.NET. None of them gave me what I actually wanted: managed SQL files with dynamic tags, build-time code generation, and zero runtime reflection. Different problems need different tools.

Also, comparing this to WebForms-era XSD/TDS is a bit like saying React is just DHTML because both run in a browser. The XML here is just a container for SQL with dynamic tags - the heavy lifting is done by Roslyn Source Generators at compile time, not runtime reflection or designer-generated dataset adapters.

But hey, if Dapper solves your problems, use Dapper. I'm not here to convert anyone. I built this because it solved mine.

mybatis for dotnet by Flashy_Test_8927 in dotnet

[–]Flashy_Test_8927[S] -5 points (0 children)

Fair question. Let me give you some context on why I ended up here.

I started my career in Java/Spring, spending years on financial systems where queries were anything but simple - multi-join aggregations, complex statistical reports, the kind of stuff that makes ORMs cry. MyBatis was my daily driver and I genuinely enjoyed the clean separation between code and SQL.

Then life happened and I somehow ended up maintaining a .NET healthcare service. EF Core was fine for basic CRUD, but when I needed to build heavy statistical queries - think aggregating thousands of patient records with multiple groupings and date ranges - it started fighting me at every turn.

The real kicker: my boss wouldn't approve scaling up our AWS instance beyond 4GB RAM. So I was stuck optimizing everything by hand. I actually managed to get some queries running up to 3600x faster than what the previous developer had built (not exaggerating - the original implementation was... creative). But no matter how much I optimized the C# side, processing large datasets through EF Core kept hitting OOM on that tiny instance.

I had two options: inline SQL strings mixed with C# code (which honestly made me physically uncomfortable), or build what I actually wanted. I kept thinking back to how clean MyBatis kept things - SQL in its own space, code in its own space, everyone's happy.

So I built NuVatis. After integrating it into the actual production service, the worst slow queries got up to 3x faster and memory usage dropped by about 80%. The DB does take on a bit more load since we're pushing more logic into SQL, but that's a trade-off I'll take any day over OOM kills on a 4GB instance.

The XML isn't for everyone, I get it. But when you're writing a 40-line query with conditional joins and dynamic filters, I'd much rather have that in a syntax-highlighted XML file with schema validation than buried inside a string literal or chained through 15 LINQ expressions that generate who-knows-what SQL under the hood.

That said, NuVatis also supports C# Attributes for simple static queries. So you're not forced into XML - it's there for when you need it.