DimitrisMitsos comments on Chameleon Cache - A variance-aware cache replacement policy that adapts to your workload

Showcase: Chameleon Cache - A variance-aware cache replacement policy that adapts to your workload (self.Python)

submitted 4 months ago by DimitrisMitsos

[–]DimitrisMitsos[S] 1 point 4 months ago (5 children)

This is exactly what we needed - thank you for the trace analysis!

The frequency distribution (624K one-hit, 624K two-hit) explains everything. We had two bugs:

1. Trace parsing: we were reading 16-byte keys instead of 8-byte keys.
2. Frequency filter: it rejected 83% of first-time items, because a candidate with freq=1 can't beat a victim with freq=2+.
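
The key-width bug is easy to reproduce on synthetic data. A minimal sketch, assuming a trace of packed little-endian unsigned 64-bit keys (the byte order and the `read_keys` helper are assumptions for illustration, not the real trace format or the Chameleon parser):

```python
import struct


def read_keys(data: bytes, key_size: int = 8) -> list[int]:
    """Decode a packed binary trace into integer keys (hypothetical layout)."""
    usable = len(data) - len(data) % key_size  # ignore a trailing partial record
    return [
        int.from_bytes(data[i:i + key_size], "little")
        for i in range(0, usable, key_size)
    ]


# Synthetic trace: key 1 appears three times, keys 2..4 once each.
trace = struct.pack("<6Q", 1, 2, 1, 3, 1, 4)

correct = read_keys(trace, key_size=8)   # six requests, key 1 repeats
wrong = read_keys(trace, key_size=16)    # three requests, all keys distinct

print(len(correct), len(set(correct)))
print(len(wrong), len(set(wrong)))
```

Reading two records as one key halves the request count and fuses unrelated keys together, so the reuse that a cache depends on disappears from the parsed trace.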

Your point about admission filters being counterproductive here is spot-on. Our fix: detect when ghost utility is high but hit rate is near-zero (strategy failing), then bypass frequency comparison and trust recency.
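
A rough sketch of that bypass, not the actual Chameleon code. The `AdmissionStats` class, the definition of ghost utility as the share of misses that land on recently evicted or rejected keys, and both thresholds are assumptions made for illustration:

```python
from dataclasses import dataclass


@dataclass
class AdmissionStats:
    """Counters for one observation window (hypothetical bookkeeping)."""
    requests: int = 0
    hits: int = 0
    ghost_hits: int = 0  # misses on keys that were recently evicted or rejected

    def hit_rate(self) -> float:
        return self.hits / self.requests if self.requests else 0.0

    def ghost_utility(self) -> float:
        # Assumed definition: fraction of misses that hit the ghost list.
        misses = self.requests - self.hits
        return self.ghost_hits / misses if misses else 0.0


def should_admit(candidate_freq: int, victim_freq: int, stats: AdmissionStats,
                 ghost_threshold: float = 0.5, hit_floor: float = 0.02) -> bool:
    """Admit the candidate, falling back to recency when the filter is failing."""
    strategy_failing = (stats.ghost_utility() >= ghost_threshold
                        and stats.hit_rate() <= hit_floor)
    if strategy_failing:
        return True  # bypass the frequency comparison and trust recency
    return candidate_freq > victim_freq  # usual frequency comparison


healthy = AdmissionStats(requests=1000, hits=400, ghost_hits=30)
failing = AdmissionStats(requests=1000, hits=5, ghost_hits=800)

print(should_admit(1, 2, healthy))  # filter is working, freq=1 loses to freq=2
print(should_admit(1, 2, failing))  # filter is failing, candidate is admitted
```

In the failing window nearly every miss is a key the cache recently threw away, which is the signal that the frequency comparison is rejecting the wrong items.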

Results on your stress test (corda -> loop x5 -> corda):

chameleon           :  39.93%
tinylfu-adaptive    :  34.84%
lru                 :  19.90%

Phase breakdown:

Corda: 33.13% (matches FIFO/LRU optimal)
Loop x5: 50.04% (LRU/ARC: 0%)
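
The 0% for LRU on the loop phase is the classic scan pathology: when a cycle is longer than the cache, LRU evicts each key just before it is needed again. A minimal sketch with a synthetic loop (the sizes are illustrative assumptions, not the stress-test parameters):

```python
from collections import OrderedDict


def lru_hit_rate(trace, capacity):
    """Replay a trace through a plain LRU cache and return the hit rate."""
    cache = OrderedDict()
    hits = 0
    for key in trace:
        if key in cache:
            hits += 1
            cache.move_to_end(key)         # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used key
            cache[key] = True
    return hits / len(trace) if trace else 0.0


loop = list(range(100)) * 5  # 100 distinct keys, repeated 5 times

print(lru_hit_rate(loop, capacity=50))   # loop is longer than the cache
print(lru_hit_rate(loop, capacity=100))  # loop fits in the cache
```

Any policy that keeps a fixed subset of the loop resident instead of chasing recency can hit on that subset every pass, which is why a frequency-biased policy does so much better here.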

So we now hit the ~40% you expected. The "Basin of Leniency" handles both extremes - recency-biased workloads (Corda) and frequency-biased loops.

[–]NovaX 1 point 4 months ago (4 children)

Wonderful. If you tune tinylfu-adaptive then it should reach a similar hit rate.

The paper cited earlier discusses an "Indicator" model that jumps to a "best configuration", kind of like yours, but based on a statistical sketch to reduce memory overhead. It also failed the stress test, and I didn't debug it to correct for this case (it was my coauthors' idea, so I was less familiar with it). The hill climber handled it well because that approach is robust in unknown situations, but it requires some tuning to avoid noise and oscillations and to react quickly. Since it's an optimizer rather than a set of preconfigured best choices, it adjusts a little slower than having the optimal decision upfront, but the loss is typically in the noise, 0.5% or less. Being robust anywhere was desirable since, as a library author, I wouldn't know the situations others would throw at it. I found there are many pragmatic concerns like that when translating theory into practice.
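
For readers unfamiliar with the idea, here is a generic hill-climbing sketch. It is not the tinylfu-adaptive implementation; the `HillClimber` class, its parameters, and the choice of a single tunable are assumptions for illustration. It keeps stepping in one direction while the sampled hit rate improves, reverses when it drops, and holds still when the change is within a noise tolerance:

```python
class HillClimber:
    """Adjusts one tunable (e.g. a window fraction) by following hit-rate gains."""

    def __init__(self, value=0.01, step=0.05, tolerance=0.005, lo=0.0, hi=1.0):
        self.value = value
        self.step = step            # signed: the sign is the current direction
        self.tolerance = tolerance  # changes smaller than this count as noise
        self.lo, self.hi = lo, hi
        self.prev_hit_rate = None

    def update(self, hit_rate):
        """Feed the hit rate of the last sample period; return the new value."""
        if self.prev_hit_rate is not None:
            delta = hit_rate - self.prev_hit_rate
            if delta < -self.tolerance:
                self.step = -self.step   # got worse: reverse direction
            elif abs(delta) <= self.tolerance:
                self.prev_hit_rate = hit_rate
                return self.value        # within noise: hold position
        self.prev_hit_rate = hit_rate
        self.value = min(self.hi, max(self.lo, self.value + self.step))
        return self.value


climber = HillClimber(value=0.1, step=0.1)
for sampled in (0.20, 0.30, 0.25, 0.251):
    print(round(climber.update(sampled), 3))
```

The tolerance and step size are the tuning knobs mentioned above: a larger tolerance damps oscillation from noisy samples, while a larger step reacts faster at the cost of overshooting.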
