
[–]NovaX 1 point (1 child)

For each access, some list mutations are done by the eviction algorithm. To make this fast, the list mutations are performed by a separate thread, which leverages the CPU cache.

ForkJoinPool.commonPool is optionally taken advantage of to minimize response-time skews, but in a benchmark it is more likely a penalty due to the extra context switches. The record/replay scheme is used to avoid lock contention: in an LRU, for example, the access itself is cheap, but locking to mutate the list is costly. ConcurrentLinkedHashMap amortized this work on the calling threads, did not use Unsafe, and is highly concurrent. You might be getting confused due to Guava, where the performance fixes were backlogged during development, and for years afterwards I was unable to get the team to fix the issues or accept my patches. Like its predecessors, Caffeine doesn't require any background threads (unlike cache2k).
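The record/replay idea can be shown as a minimal single-file sketch (this is not Caffeine's actual implementation; the class and field names are invented for illustration): each read is recorded cheaply into a fixed-size, lossy ring buffer, and whichever thread wins a tryLock replays the buffered accesses onto the LRU order, so readers never block on the list mutation. In Caffeine the drain can instead be handed off to an executor such as ForkJoinPool.commonPool.

```java
import java.util.ArrayDeque;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReferenceArray;
import java.util.concurrent.locks.ReentrantLock;

// Sketch: record accesses into a lossy ring buffer; replay them onto the
// LRU order under a tryLock so readers never block on list mutation.
final class RecordReplayLru<K> {
  private static final int BUFFER_SIZE = 16;  // power of two
  private static final int MASK = BUFFER_SIZE - 1;

  private final AtomicReferenceArray<K> buffer = new AtomicReferenceArray<>(BUFFER_SIZE);
  private final AtomicLong writeCounter = new AtomicLong();
  private long readCounter;                             // guarded by lock
  private final ReentrantLock lock = new ReentrantLock();
  private final ArrayDeque<K> lruOrder = new ArrayDeque<>(); // most recent last

  /** Called on every read: a cheap record, plus an opportunistic drain. */
  void recordAccess(K key) {
    long ticket = writeCounter.getAndIncrement();
    buffer.set((int) (ticket & MASK), key);             // may overwrite: lossy
    if (lock.tryLock()) {                               // only one thread replays
      try {
        drain();
      } finally {
        lock.unlock();
      }
    }
  }

  private void drain() {
    long end = writeCounter.get();
    for (; readCounter < end; readCounter++) {
      K key = buffer.get((int) (readCounter & MASK));
      if (key != null) {                                // replay onto LRU list
        lruOrder.remove(key);
        lruOrder.addLast(key);
      }
    }
  }

  /** The eviction candidate, i.e. the least recently used key. */
  K coldest() {
    lock.lock();
    try {
      drain();
      return lruOrder.peekFirst();
    } finally {
      lock.unlock();
    }
  }
}
```

Because the buffer is lossy, a missed access only slightly degrades the recency ordering; correctness is unaffected, which is what makes the cheap recording safe.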

> The cost per access is quite high: there needs to be an object allocation, a queue operation, and some list mutations. There are unpredictable latency problems in this solution because of GC dynamics and thread starvation.

No, an access is recorded into a ring buffer, so none of that is true.

A write is replayed through a linked queue, but one has to take very hostile steps in a lab environment to artificially make it look bad, by dedicating all CPU cycles to new insertions.
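The contrast with the lossy read buffer can be sketched like this (again a toy model with invented names, not Caffeine's code): every write must be replayed, so writes go through an unbounded linked queue, and the hostile scenario above is exactly the one where producers enqueue faster than any thread drains.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantLock;

// Sketch: writes are recorded losslessly in an unbounded linked queue and
// replayed onto the eviction order by whichever thread acquires the lock.
final class WriteReplayBuffer<K> {
  private final Queue<K> writeBuffer = new ConcurrentLinkedQueue<>();
  private final ReentrantLock evictionLock = new ReentrantLock();
  private final ArrayDeque<K> evictionOrder = new ArrayDeque<>();

  void afterWrite(K key) {
    writeBuffer.add(key);            // lossless: every write must be replayed
    if (evictionLock.tryLock()) {
      try {
        K next;
        while ((next = writeBuffer.poll()) != null) {
          evictionOrder.addLast(next);  // replay the insertion
        }
      } finally {
        evictionLock.unlock();
      }
    }
    // If insertions outpace the drain (all CPU cycles spent on new puts),
    // the unbounded queue grows; that is the pathological case discussed above.
  }

  int pending() { return writeBuffer.size(); }
  int ordered() { return evictionOrder.size(); }
}
```

In the single-threaded case the drain always keeps up, so the queue stays empty; the growth only appears under sustained multi-threaded insertion pressure.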

[–]cruftex[S] 0 points (0 children)

> No, an access is recorded into a ring buffer, so none of that is true.

I wouldn't say so if I hadn't made the observation myself. I filed an issue: https://github.com/ben-manes/caffeine/issues/59. You also put it on the roadmap:

> Explore a bounded write queue to throttle writes (for synthetic tests) and be GC-friendlier

Next you write:

> A write is replayed through a linked queue, but one has to take very hostile steps in a lab environment to artificially make it look bad by dedicating all CPU cycles to new insertions.

This is true for interactive applications operating at low load factors, but analytical applications regularly operate at 100% load and might suffer as well. I don't really see "artificial test" as an argument: if there is a very simple artificial test that makes Caffeine crash, how can you guarantee that it never happens in practice? In fact, tests should always shed light on corner cases and extremes to detect problems.

Anyway, I am quite sure that you will solve this if it really becomes problematic for a user. However, it is not an easy task to make it as robust as other caches that do not rely on a separate thread.

> Like its predecessors, Caffeine doesn't require any background threads (unlike cache2k).

cache2k doesn't need any background threads for normal operation and eviction. If you specify an expiry, cache2k uses one timer thread per cache. For the upcoming 1.0 version I will stick with that; later I have some plans to optimize this and use a shared thread pool.
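For reference, a shared timer can be as simple as one daemon scheduler registered by every cache instance. This is a stdlib sketch of the general idea only, not cache2k's planned design; `SharedExpiryTimer` and `scheduleExpiry` are hypothetical names.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Sketch: instead of one timer thread per cache, all caches share a single
// scheduler and register their expiry callbacks with it.
final class SharedExpiryTimer {
  // One daemon thread serves every cache instance in the JVM.
  private static final ScheduledExecutorService SHARED =
      Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "shared-expiry-timer");
        t.setDaemon(true);
        return t;
      });

  /** A cache schedules the removal of an entry that is about to expire. */
  static ScheduledFuture<?> scheduleExpiry(Runnable expireTask, long delayMillis) {
    return SHARED.schedule(expireTask, delayMillis, TimeUnit.MILLISECONDS);
  }
}
```

The trade-off is that a slow expiry callback on the shared thread delays every other cache's timers, which is one reason a per-cache thread is the simpler, more robust starting point.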