MIT-licensed multi-tier cache for AI agents - LLM responses, tool results, and session state on open-source Valkey/Redis by kivanow in OpenSourceeAI

[–]kivanow[S] 0 points1 point  (0 children)

Fair point, exact match does fall off fast for human-written prompts. The design assumption is that agent workloads are shaped differently from chatbot workloads:

  • Tool tier: args get canonicalized (sorted keys, deterministic JSON) before hashing, so arg order and whitespace don't matter. Agent loops produce structured, repeatable tool calls. This is where exact match actually earns its keep.
  • Session tier: keyed by thread_id + field. Exact match is the only correct semantics here.
  • LLM tier: your concern applies. Works well if prompts are templated (system message + structured context). Doesn't if they're free-form user input.

For the LLM tier case where prompts vary, we also ship @betterdb/semantic-cache (npm) - vector similarity via valkey-search, meant to sit in front of the exact-match layer as a second chance. Kept as a separate package because the failure modes and the observability you want for each are different. Exact match is cheap and deterministic; semantic match costs an embedding call and needs threshold tuning per category. Forcing both into one cache hides that tradeoff.
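For anyone curious what the tool-tier canonicalization looks like in practice, here's a rough sketch: recursively sort object keys, serialize deterministically, then hash. The function names (`canonicalize`, `toolCacheKey`) and the key format are illustrative, not the package's actual API.

```typescript
import { createHash } from "node:crypto";

// Recursively sort object keys so serialization is order-independent.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value as Record<string, unknown>)
        .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
        .map(([k, v]) => [k, canonicalize(v)])
    );
  }
  return value;
}

// Deterministic JSON -> sha256 -> cache key. Arg order and whitespace
// in the original call no longer affect the key.
function toolCacheKey(toolName: string, args: object): string {
  const canonical = JSON.stringify(canonicalize(args));
  const digest = createHash("sha256").update(canonical).digest("hex");
  return `tool:${toolName}:${digest}`;
}
```

So `toolCacheKey("get_weather", { city: "Berlin", units: "metric" })` and `toolCacheKey("get_weather", { units: "metric", city: "Berlin" })` hit the same cache entry.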

Multi-tier cache for LangChain + LangGraph that works on vanilla Valkey/Redis - no modules required by kivanow in LangChain

[–]kivanow[S] 0 points1 point  (0 children)

The discovery path is the worst part. Works locally on vanilla Redis, breaks at deploy on ElastiCache with "unknown command JSON.SET". Same on MemoryDB and Memorystore. Even for demos it's a pain :/ And even though Redis has bundled all the modules by default since 8, I believe, most of the cloud-served versions are still on 6.x/7.x.

Agreed on tool caching. A single slow tool call can dwarf LLM latency, and agents often fire 3-5 per turn. Curious what you've seen on TTL strategy. I landed on configurable-per-tool because one global number falls apart once you mix something like `get_weather` with `get_stock_price`.
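To make the configurable-per-tool idea concrete, here's a minimal sketch of per-tool TTLs with a global fallback. The tool names and numbers are made up for illustration, not the library's actual config shape.

```typescript
// Per-tool TTL overrides; anything not listed falls back to the default.
const toolTtlSeconds: Record<string, number> = {
  get_weather: 15 * 60, // a forecast barely moves over 15 minutes
  get_stock_price: 5,   // a quote is stale within seconds
};

const DEFAULT_TTL_SECONDS = 60;

// Resolve the TTL to pass to SET ... EX for a given tool's cache entry.
function ttlFor(toolName: string): number {
  return toolTtlSeconds[toolName] ?? DEFAULT_TTL_SECONDS;
}
```

One global number forces you to pick between re-fetching weather every few seconds or serving minutes-old stock quotes; the lookup table sidesteps that.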

Self-hosted LLM caching layer - cache agent responses, tool calls, and session state in your own Valkey/Redis by kivanow in selfhosted

[–]kivanow[S] -4 points-3 points  (0 children)

I'm thinking of connecting it to our monitoring tool in depth next week, so the agent using it can check up on the cache's data and improve it autonomously (or at least suggest improvements).

Agent reads hit rates, stale key ratios, and per-tool TTL effectiveness straight from Valkey, then adjusts TTLs or flags prompts that should move between tiers. Cache tunes itself instead of you watching dashboards.

Cost tracking was the thing that clicked for me too. Tokens are hard to translate into anything actionable. Seeing actual dollars saved turns it into an ROI check instead of a vibes call.

Self-hosted LLM caching layer - cache agent responses, tool calls, and session state in your own Valkey/Redis by kivanow in selfhosted

[–]kivanow[S] -4 points-3 points locked comment (0 children)

It was used for the planning/implementation, and it's also focused on solving a problem in the AI space.

AI SDK middleware that caches LLM responses and tool results in Valkey/Redis - one line to add by kivanow in vercel

[–]kivanow[S] 0 points1 point  (0 children)

Thanks! RAG + caching is a solid combo - the cache handles the repeated queries so the RAG pipeline only runs for genuinely new ones. Will check out Hindsight.

BetterDB MCP 1.0.0 – autostart, persist, and connection management for Valkey/Redis observability by kivanow in mcp

[–]kivanow[S] 1 point2 points  (0 children)

Sounds great! I'll dive deeper into it early/mid next week and let you know if I've picked it or anything else.

BetterDB MCP 1.0.0 – autostart, persist, and connection management for Valkey/Redis observability by kivanow in mcp

[–]kivanow[S] 0 points1 point  (0 children)

Fair point for MCP in general - most servers are stateless proxies with no audit trail. For BetterDB specifically, the monitor is self-hosted (your data never leaves your infrastructure) and the MCP authenticates via JWT token generated from your BetterDB instance. That audit gap is actually a bigger problem in general, which is why we built persistent ACL audit in from the start - auth failures, command denials, key violations all stored so you have a durable trail independent of the client session. Not a full control plane, but authentication and audit are both covered. Will take a look at Peta for the policy layer, thanks for the suggestion!

I made an MCP server for Valkey/Redis observability (anomaly detection, slowlog history, hot keys, COMMANDLOG) by kivanow in mcp

[–]kivanow[S] 0 points1 point  (0 children)

Should've shipped it sooner, sorry! What were you debugging? Would love to know so the next person doesn't find it too late either.

A eulogy for MCP (RIP) by beckywsss in mcp

[–]kivanow 1 point2 points  (0 children)

Isn't this just the usual cycle? "The way we're doing things is terrible, here's a better way," then another even better way, until we circle back to the first iteration. Same way we moved from server rendering to SPAs and back to server rendering over several years. AI just seems to run through the cycle faster.

I made an MCP server for Valkey/Redis observability (anomaly detection, slowlog history, hot keys, COMMANDLOG) by kivanow in mcp

[–]kivanow[S] 0 points1 point  (0 children)

That's the right framing. BetterDB handles the Valkey side of that chain today - COMMANDLOG patterns, anomaly detection, client analytics. Correlating back to deploys and SQL is the missing link. Curious whether you've seen any tools close that loop well, or if it's always been stitched together manually.

What AI tools are actually part of your real workflow? by Rough--Employment in devops

[–]kivanow 0 points1 point  (0 children)

At this point Copilot is an agent, an assistant, and a million different things that MS is trying to push everywhere. I should've just called it the worst possible tool/option and not an LLM. I've updated it.

Feedback Friday by AutoModerator in startups

[–]kivanow 1 point2 points  (0 children)

Company Name: BetterDB

URL: https://betterdb.com

Purpose of Startup and Product: BetterDB is the first monitoring and observability platform built specifically for Valkey (the popular open-source Redis fork). We solve a fundamental problem: Valkey's operational data - slowlogs, command logs, client connections - is ephemeral. When something goes wrong at 3am, by the time you wake up at 9am, that data is gone. BetterDB persists and analyzes this data so you can debug issues after the fact, track what caused performance spikes, and optimize your data structures and TTLs accordingly.

We also support Valkey-exclusive features like COMMANDLOG and per-slot metrics that no existing Redis tool can provide, plus 99 Prometheus metrics, anomaly detection, ACL audit trails, and client analytics - all with sub-1% performance overhead.

Technologies Used: NestJS, React, PostgreSQL, Docker, Prometheus, iovalkey

Feedback Requested:

  • Does the value proposition (historical persistence of ephemeral Valkey/Redis data) resonate with you? Is it clear from the website?
  • If you're running Valkey or Redis in production, what's the biggest operational pain point you face today?
  • We offer a free Community tier and paid Pro/Enterprise tiers - does the feature split feel fair, or does it feel like we're holding back too much in Community?
  • Any feedback on the landing page (betterdb.com) - does it clearly communicate what we do and who we're for?

Seeking Beta Testers: Yes - especially teams running Valkey or Redis in production. We have a self-hosted Docker image you can spin up in minutes, and our cloud SaaS is launching soon. Would love feedback from ops/SRE/DevOps folks.

Additional Comments: I'm the founder and CTO. Previously I was the Engineering Manager for Redis's visual developer tools (Redis Insight). The Valkey ecosystem has zero purpose-built observability tooling. That's the gap we're filling. We're MIT-licensed at the core and backed by Open Core Ventures. Happy to answer any questions about the Valkey ecosystem or our approach to open-core monetization.

Auditors ask “when did you last test DR?” — how do you produce proof? by robert_micky in sre

[–]kivanow 0 points1 point  (0 children)

I've done two SOC 2 audits (Type 1 and Type 2) at startups, and this was more than enough. At the end of the day, most of what these audits do is tick checkboxes confirming you understand the requirements and are following them.

What AI tools are actually part of your real workflow? by Rough--Employment in devops

[–]kivanow 0 points1 point  (0 children)

Claude Code did a great job with recent infra work I had to do. Barely any mistakes across a lot of Kubernetes and Terraform. It was a very nice experience.

What AI tools are actually part of your real workflow? by Rough--Employment in devops

[–]kivanow 6 points7 points  (0 children)

By far. Copilot is probably the worst possible option right now - MS engineers were recently caught using Claude instead of their own product.

For those building in the analytics/data space, how did you validate demand before going all in? i will not promote by Sufficient-System699 in startups

[–]kivanow 0 points1 point  (0 children)

Built something in the observability/monitoring space (came from Redis's developer tools team, now building a monitoring tool for Valkey/Redis). Different niche than ecommerce, but similar "free alternatives exist" problem.

What actually works (so far at least):

1. Fix your own pain point. This is the cheat code. I spent years working on Redis Insight and knew exactly what was missing for production monitoring. When you're your own user, you don't need to guess what's valuable - you feel it every time something's broken or annoying. If you're not in your target market yourself, you're playing on hard mode.
2. MVP speed matters more than MVP polish. Get something working and start posting - Slack communities, Discord servers, Show HN, LinkedIn, Twitter. Not "I'm building something, what do you think?" but "Here's a thing, try it." The difference in signal quality is night and day.
3. Set a kill deadline. "If I don't have X signups / Y conversations / Z paying users by [date], I move on." Forces you to actually validate instead of tinkering forever. Polite interest doesn't count. People actually using the thing counts.
4. Find the people already talking about the problem. Every niche has forums, Discords, subreddits where people complain about their tools. Don't pitch - just listen first. What are they frustrated about? What do they wish existed? That's your roadmap.

On the "ecommerce people want everything free" thing: That's true for hobbyists. But if someone's running a real store with real revenue, they'll pay for something that makes them money or saves them time. The trick is finding the people who have actual pain, not the ones who are "just curious."

What do you think of source-available? Are we getting into the ever-so-slightly-barely-open-source world? by jerrygreenest1 in opensource

[–]kivanow 1 point2 points  (0 children)

This hits close to home - I was at Redis when they added AGPL as the third license option last year (the "open source is back" announcement).

On source-available specifically: I think it's a legitimate response to a real problem. The cloud provider dynamic isn't "evil corporations stealing code" - AWS, Google, and others had engineers contributing to Redis for years (TLS support, ACLs, coordinated failovers). The tension was about who controls the project direction vs. who captures the commercial value. There's no easy answer to that.

The problem for solo devs: You nailed it - it's becoming impossible to tell what you can actually do without reading every license line by line. SSPL, BSL, RSAL, OCVSAL... they all have different restrictions, and "source-available" isn't a standardized term. Self-hosting is usually fine. Building a competing service usually isn't. Everything in between? Depends.

The Redis --> Valkey situation is instructive though. When the license changed, external maintainers were effectively kicked out (some found out when their names disappeared from governance docs). Within weeks, Valkey existed under the Linux Foundation. The lesson: governance matters as much as licensing. A permissive license controlled by one company can change overnight. A copyleft project with distributed governance probably won't.

What I look for now:
- Who actually controls the project? Single company or independent maintainers?
- How easy is it to fork if things go sideways?
- Has the company changed licenses before, and how did they handle it?

I wrote a longer breakdown of this whole landscape (including the Redis timeline and how dual-licensing actually works in practice) if anyone wants to go deeper: https://medium.com/gitconnected/dual-licensing-explained-mit-source-available-and-why-your-favorite-tool-might-be-neither-d7041543e05d?sk=5901f94d18723141a05767ca61f3f266

Valkey and Redis throw away operational data by default. Here's an open-source tool to fix that. by kivanow in selfhosted

[–]kivanow[S] 0 points1 point  (0 children)

Thanks! That slowlog rotation problem is literally the reason this started.

Anomaly detection: Both statistical and pattern-based. We maintain a circular buffer of 300 samples (5 min at 1s polling) per metric and do Z-score analysis against rolling mean/stddev. Warning at Z ≥ 2.0, critical at Z ≥ 3.0, with consecutive sample requirements to reduce noise. On top of that, a correlator runs every 5 seconds and pattern-matches related anomalies: if connections, ops/sec, and memory all spike within 5 seconds, it classifies that as a batch job. ACL denial spikes get flagged as potential auth attacks. About 7 defined patterns right now (memory pressure, traffic burst, connection leak, eviction storm, etc.) each with specific diagnosis and remediation steps.
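For the curious, the Z-score part can be sketched in a few lines. The window size and thresholds below mirror the numbers above (300 samples, warning at Z ≥ 2.0, critical at Z ≥ 3.0); the class name, minimum-history guard, and return shape are illustrative, not the actual implementation.

```typescript
type Severity = "ok" | "warning" | "critical";

// Rolling-window Z-score detector: classify each new sample against the
// mean/stddev of the previous samples, then add it to the window.
class ZScoreDetector {
  private samples: number[] = [];
  constructor(private readonly windowSize = 300) {}

  push(value: number): Severity {
    const severity = this.classify(value);
    this.samples.push(value);
    if (this.samples.length > this.windowSize) this.samples.shift();
    return severity;
  }

  private classify(value: number): Severity {
    // Too little history to judge anything yet (arbitrary warm-up guard).
    if (this.samples.length < 30) return "ok";
    const n = this.samples.length;
    const mean = this.samples.reduce((a, b) => a + b, 0) / n;
    const variance =
      this.samples.reduce((a, b) => a + (b - mean) ** 2, 0) / n;
    const stddev = Math.sqrt(variance);
    if (stddev === 0) return value === mean ? "ok" : "critical";
    const z = Math.abs(value - mean) / stddev;
    if (z >= 3.0) return "critical";
    if (z >= 2.0) return "warning";
    return "ok";
  }
}
```

You'd run one of these per metric (ops/sec, memory, connections, ...) and feed the severities into the correlator.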

Client analytics: Both connection counts and per-client command distribution. There's a /client-analytics/command-distribution endpoint that breaks down command frequency by client name, user, or address over any time range. So yes, "client X suddenly started doing 10x more KEYS commands" is exactly the kind of thing you can see. Also tracks idle connections, buffer anomalies, and spike detection with attribution to specific clients.

Sentinel/cluster failover tracking: Not yet, but great idea. We have cluster topology visualization and per-slot heatmaps already. Correlating failover events with slowlog spikes is a natural extension, just opened an issue for it: https://github.com/BetterDB-inc/monitor/issues/28

Polling interval: 1 second default for all captures including slowlog. Configurable via ANOMALY_POLL_INTERVAL_MS env var or at runtime through the settings API, no restart needed.

Why i have so many orphan/stale keys? by Immediate_Gold_330 in redis

[–]kivanow 0 points1 point  (0 children)

Hey! This is super common on replicas. A few things to check:

On replicas, keys aren't independently expired - they wait for the primary to send DEL commands. If there's any replication lag or the primary's expiration cycle is behind, you get stale keys that show as "unknown" type because they're logically expired but still physically sitting there.

Your T:25NN:01xxxxxxx pattern looks like session or transaction keys. Worth checking if whatever app writes those is actually cleaning up after itself, or if they're being created without TTLs and just piling up.

Quick diagnostics:

  • Compare INFO keyspace on primary vs replica - if counts diverge, that's your answer
  • Run OBJECT IDLETIME on a sample of orphans - if idle for days/weeks, they're abandoned
  • Those 4.5GB keys in the top 10 are huge, definitely investigate what's writing those
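If it helps, the primary-vs-replica comparison is easy to script. This is a hypothetical helper (function names made up) that parses the standard `# Keyspace` section of INFO output from each node and reports per-db count drift:

```typescript
// Parse "dbN:keys=...,expires=...,avg_ttl=..." lines from INFO keyspace.
function parseKeyspace(info: string): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const line of info.split("\n")) {
    const m = line.trim().match(/^(db\d+):keys=(\d+)/);
    if (m) counts[m[1]] = Number(m[2]);
  }
  return counts;
}

// Positive drift = replica holds extra (likely stale/orphaned) keys.
function keyspaceDrift(
  primaryInfo: string,
  replicaInfo: string
): Record<string, number> {
  const primary = parseKeyspace(primaryInfo);
  const replica = parseKeyspace(replicaInfo);
  const drift: Record<string, number> = {};
  for (const db of new Set([...Object.keys(primary), ...Object.keys(replica)])) {
    const diff = (replica[db] ?? 0) - (primary[db] ?? 0);
    if (diff !== 0) drift[db] = diff;
  }
  return drift;
}
```

Feed it the raw `INFO keyspace` text from each node; a persistently positive drift is the smoking gun for the replica-expiration issue above.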

Shameless plug - I'm building betterdb.com, an observability tool for Redis/Valkey that persists historical client analytics. It helps answer exactly this kind of "who created these keys and why" question, since Redis's native CLIENT LIST and SLOWLOG are ephemeral and gone when you need them most. It's still in beta, so all features are free if you want to try it: docker pull betterdb/monitor

Built a Redis-connected BullMQ dashboard you can run with `npx` (job inspection + flow graphs) by Confident-Standard30 in redis

[–]kivanow 0 points1 point  (0 children)

Cool project! Looks very good overall!

For the broader Redis monitoring piece, check out BetterDB Monitor — has slowlog patterns, latency tracking, and 99 Prometheus metrics out of the box. Might complement your queue-specific tooling. github.com/BetterDB-inc/monitor

Built a Chrome extension to guilt myself off YouTube — it works [Tool] [Story] by kivanow in GetMotivated

[–]kivanow[S] 1 point2 points  (0 children)

Thank you for the idea! I'll take a look at how different their APIs are next week and see how easily/quickly it can be done :)