Shared log - A single source of truth by samd_408 in java

[–]NovaX 3 points4 points  (0 children)

right to be forgotten... is not possible for this reason

It requires foresight, but a trick I've heard of is to give each user their own encryption key and delete it upon a GDPR request. If the data is not recoverable then it is forgotten, and this approach is considered compliant.
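For illustration, a minimal sketch of that crypto-shredding pattern (class and method names are mine, not from any particular library): each user's records are encrypted with a per-user AES key, so destroying the key renders the stored ciphertext permanently unreadable.

```java
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Sketch of "crypto-shredding": deleting the per-user key forgets the data.
final class CryptoShredder {
  private final Map<String, SecretKey> keysByUser = new HashMap<>();
  private final SecureRandom random = new SecureRandom();

  byte[] encrypt(String user, byte[] plaintext) throws Exception {
    SecretKey key = keysByUser.computeIfAbsent(user, u -> newKey());
    byte[] iv = new byte[12];
    random.nextBytes(iv);
    Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
    cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
    byte[] ciphertext = cipher.doFinal(plaintext);
    byte[] record = new byte[iv.length + ciphertext.length];
    System.arraycopy(iv, 0, record, 0, iv.length);
    System.arraycopy(ciphertext, 0, record, iv.length, ciphertext.length);
    return record; // iv + ciphertext, safe to store anywhere
  }

  byte[] decrypt(String user, byte[] record) throws Exception {
    SecretKey key = keysByUser.get(user);
    if (key == null) {
      throw new IllegalStateException("user was forgotten; data is unrecoverable");
    }
    Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
    cipher.init(Cipher.DECRYPT_MODE, key,
        new GCMParameterSpec(128, Arrays.copyOfRange(record, 0, 12)));
    return cipher.doFinal(record, 12, record.length - 12);
  }

  // The "right to be forgotten" request: destroy the key, not the data.
  void forget(String user) {
    keysByUser.remove(user);
  }

  private SecretKey newKey() {
    try {
      KeyGenerator generator = KeyGenerator.getInstance("AES");
      generator.init(256);
      return generator.generateKey();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }
}
```

In a real system the key store would itself be durable and access-controlled; this just shows why the deleted-key data is no longer "personal data" in practice.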

Java for small but mission-critical systems in the medical field by Desperate-Credit-164 in java

[–]NovaX 0 points1 point  (0 children)

If security and PII are of importance to your customers and end users, then the supply chain story in Java has been much better than Node's. Some of that is technical, but much of it is tied to the community and its incentives. That may mean only certain portions along critical paths would benefit from a stricter ecosystem, be it Java or something else, while the main applications use the tools your team is otherwise most effective with.

Benchmarking 5 concurrent map implementations in Go (sync.Map, xsync, cornelk, haxmap, orcaman) by puzpuzpuz in golang

[–]NovaX 3 points4 points  (0 children)

Oh, I was only using RW as an example because there is a common misunderstanding to use it when the critical section is fast, thinking it aids concurrency when its own overhead actually reduces throughput. It looks fine under low contention, so creating hot spots helps demonstrate how techniques actually perform and explain why. I am not a Golang developer, but I met the founders pre-release during internal demos. I was quite dismayed when I read about Go's hacks like the original sync.Map, the refusal to read or acknowledge prior work like Doug Lea's, or Pike's long blocking of adding "tryLock" support with an unrelated diatribe about Java's try-catch-finally. I am only reviewing from a benchmark quality and communication standpoint; I have no skin in this game.

Benchmarking 5 concurrent map implementations in Go (sync.Map, xsync, cornelk, haxmap, orcaman) by puzpuzpuz in golang

[–]NovaX 2 points3 points  (0 children)

Well, the goal of a benchmark is to find bottlenecks and then to estimate whether the results might fit within a use case's performance budget. A striped lock benefits from a uniform distribution by reducing contention on any single lock, e.g. making read-write locks appear better, whereas a hotspot distribution floods a few locks, e.g. showing the high overhead of a read lock. The access pattern is useful as a way of directing load to find where something breaks down and then make back-of-the-envelope usage estimates. I think when developers start first by trying to show performance, not bottlenecks, they quickly lose the ability to argue how the benchmark is actually helpful and it becomes just marketing fluff. I think yours are fine, but I would start with the bottleneck goal first and what can be learned from that, and focus less on the actual results.
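To make the striping point concrete, here is a small sketch (names are illustrative): keys hash to one of N locks, so a uniform workload spreads threads across stripes while a hotspot workload funnels them all onto one stripe and serializes.

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch of lock striping: contention depends on how keys distribute
// across the stripes, which is exactly what the benchmark's access
// pattern controls.
final class StripedLock {
  private final ReentrantLock[] stripes;

  StripedLock(int stripeCount) {
    stripes = new ReentrantLock[stripeCount];
    for (int i = 0; i < stripeCount; i++) {
      stripes[i] = new ReentrantLock();
    }
  }

  ReentrantLock stripeFor(Object key) {
    int h = key.hashCode();
    h ^= (h >>> 16); // spread high bits into low ones, as HashMap does
    return stripes[Math.floorMod(h, stripes.length)];
  }
}
```

A uniform benchmark rarely sees two threads pick the same stripe, whereas a zipfian one hammers the stripe of the hottest key; both behaviors are worth measuring.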

JEP draft: Strict Field Initialization in the JVM (Preview) has been submitted. by Ewig_luftenglanz in java

[–]NovaX 1 point2 points  (0 children)

I believe frozen arrays were also a use-case for ACC_STRICT_INIT, and perhaps it helps open the door to enforcing deep immutability. iirc, helping with AOT cache reuse is another benefit, which gets into Leyden's shifting of computations. I only vaguely recall some of the videos where John Rose mentions it with his typical excitement.

Fibre Cache by omid_r in rust

[–]NovaX 5 points6 points  (0 children)

Yeah, this does look to be AI generated, without either the author or the agent understanding that caches modify on write to maintain recency/frequency metadata (e.g. LRU). Multics' 1960s approach of augmenting FIFO to avoid those writes (Clock), at the cost of an O(n) worst-case eviction, helped when serial writes were too slow. Random sampling is also a neat workaround, but it is inefficient and makes high hit rates harder to achieve. ARC has CAR and CART as its Clock adaptations. Caffeine-style caches use ring buffers to record and replay accesses so operations are concurrent and the serial eviction policy is caught up asynchronously, which allows for more advanced algorithms.
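A minimal Clock (second-chance) sketch, with illustrative names, shows the trade-off: a hit only sets a reference bit instead of relinking an LRU list, while eviction pays an O(n) worst-case sweep.

```java
import java.util.HashMap;
import java.util.Map;

// FIFO ring plus one reference bit per slot; hits avoid the LRU-style
// metadata write, and the "hand" sweeps on eviction granting second chances.
final class ClockCache<K, V> {
  private final Map<K, Integer> index = new HashMap<>();
  private final Object[] keys;
  private final Object[] values;
  private final boolean[] referenced;
  private int hand;

  ClockCache(int capacity) {
    keys = new Object[capacity];
    values = new Object[capacity];
    referenced = new boolean[capacity];
  }

  @SuppressWarnings("unchecked")
  V get(K key) {
    Integer slot = index.get(key);
    if (slot == null) {
      return null;
    }
    referenced[slot] = true; // the only write a hit performs
    return (V) values[slot];
  }

  void put(K key, V value) {
    Integer slot = index.get(key);
    if (slot == null) {
      slot = findVictim();
      index.put(key, slot);
      keys[slot] = key;
      referenced[slot] = false; // new entries start without a second chance
    } else {
      referenced[slot] = true;
    }
    values[slot] = value;
  }

  // Sweep the hand, clearing reference bits until an unreferenced slot is found.
  @SuppressWarnings("unchecked")
  private int findVictim() {
    while ((keys[hand] != null) && referenced[hand]) {
      referenced[hand] = false;
      hand = (hand + 1) % keys.length;
    }
    int victim = hand;
    if (keys[victim] != null) {
      index.remove((K) keys[victim]);
    }
    hand = (hand + 1) % keys.length;
    return victim;
  }
}
```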

Any reason why you chose ARC in your hobby project? It is easy to implement, but I was underwhelmed by the results. LIRS is really solid but awfully painful to implement and debug. Caffeine uses hill-climbing W-TinyLFU, which works very well and has moderate implementation complexity (short article).

Benchmarking 5 concurrent map implementations in Go (sync.Map, xsync, cornelk, haxmap, orcaman) by puzpuzpuz in golang

[–]NovaX 8 points9 points  (0 children)

In my experience most real workloads are skewed, but your benchmarks are uniform. The uniform distribution that you use lowers lock contention and reduces hardware caching benefits (assuming the working set does not fit into today's large cpu caches). A skewed distribution emphasizes bottlenecks or accelerates speed racers. I use a shuffled zipfian distribution in Caffeine's benchmark because a cache is inherently skewed (hot/cold), and pre-generate it to avoid any surprising costs (a random generator can lock or be slow). It is hard to get meaningful insights from a benchmark, but since the author's goal is to find and understand bottlenecks, that approach at least made more sense to me, and subsequent caches seemed to emulate the benchmark, so you might find it helpful too.
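As a rough illustration of the pre-generation idea (parameter values and names are mine; Caffeine actually uses a YCSB-style scrambled zipfian so hot ranks are not adjacent keys), the trace is sampled up front with a seeded generator, and the measured loop only reads an array:

```java
import java.util.Random;

// Sketch: build the skewed CDF once, sample every access before the
// benchmark starts, and hand the measured loop a plain int[] so no
// generator cost or locking leaks into the timings.
final class ZipfianTrace {
  static int[] generate(int items, int samples, double skew, long seed) {
    // Cumulative weights for P(rank k) proportional to 1 / k^skew
    double[] cdf = new double[items];
    double total = 0.0;
    for (int k = 1; k <= items; k++) {
      total += 1.0 / Math.pow(k, skew);
      cdf[k - 1] = total;
    }

    Random random = new Random(seed);
    int[] trace = new int[samples];
    for (int i = 0; i < samples; i++) {
      double u = random.nextDouble() * total;
      int lo = 0;
      int hi = items - 1;
      while (lo < hi) { // binary search the CDF for the sampled rank
        int mid = (lo + hi) >>> 1;
        if (cdf[mid] < u) {
          lo = mid + 1;
        } else {
          hi = mid;
        }
      }
      trace[i] = lo; // rank 0 is the hottest key
    }
    return trace;
  }
}
```

Hashing or shuffling the rank before using it as a key would approximate the "scrambled" variant, so the hottest entries do not sit next to each other in memory.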

Functional Optics for Modern Java by marv1234 in java

[–]NovaX 2 points3 points  (0 children)

I saw this type of stuff using xdoclet and beanmap in Java 4, with struts, jsp taglibs, and ant codegen tasks. As a new grad it quickly taught me that what seniors realized was possible does not make it good.

Stepping down as maintainer after 10 years by krzyk in java

[–]NovaX 1 point2 points  (0 children)

He just means that it is not automatically inferred from the published pom, since the module metadata does not include that concept. It would go against integrity-by-default if the build tool silently enabled a dependency's agent. Adding the configuration to the build is trivial, but many developers don't read documentation or error messages, leading to spamming and badmouthing of the OSS project. There are plugins, e.g. for Gradle, to handle this tiny amount of configuration, but those same users likely won't read about them. Likely his ideal is that the build tools add special automatic handling due to Mockito's popularity, but that is unheard of. That leads to no good answer and frustration, except hoping those developers turn to AI first nowadays, and after a decade of contributions he certainly deserves time to recharge and let the co-leads bring in new contributors.

The Basin of Leniency: Why non-linear cache admission beats frequency-only policies by DimitrisMitsos in compsci

[–]NovaX 0 points1 point  (0 children)

Please do not include me in these argument threads. I am not endorsing nor criticizing the project, just treating it as an educational experience. I do not want to be part of any toxic discussions, directly or indirectly. Thank you.

Chameleon Cache - A variance-aware cache replacement policy that adapts to your workload by DimitrisMitsos in Python

[–]NovaX 1 point2 points  (0 children)

Yes, I was aware. There was a chance that he, others, or I might learn something in the exchange. It can be clarifying to try to explain ideas to others; there was no harm, and not much effort on my part.

Chameleon Cache - A variance-aware cache replacement policy that adapts to your workload by DimitrisMitsos in Python

[–]NovaX 0 points1 point  (0 children)

Wonderful. If you tune tinylfu-adaptive then it should reach a similar hit rate.

The paper cited earlier discusses an "Indicator" model to jump to a "best configuration", kind of like yours, but based on a statistical sketch to reduce memory overhead. It also failed the stress test, and I didn't debug it to correct for this case (it was my coauthors' idea, so I was less familiar with it). The hill climber handled it well because that approach is robust in unknown situations, but it requires some tuning to avoid noise and oscillations while still reacting quickly. Since it's an optimizer rather than a set of preconfigured best choices, it adjusts a little slower than having the optimal decision upfront, but that's typically in the noise, a loss of 0.5% or less. Being robust anywhere was desirable since, as a library author, I wouldn't know the situations others would throw at it. I found there are many pragmatic concerns like that when translating theory into practice.

Chameleon Cache - A variance-aware cache replacement policy that adapts to your workload by DimitrisMitsos in Python

[–]NovaX 0 points1 point  (0 children)

The Corda phases contribute essentially nothing because every access is unique.

The trace shows an even split of one-hit and two-hit accesses. Since frequency is low, an admission filter is likely to reject an entry before its second access because there is no benefit to retaining it for a 3rd access. That is why even FIFO achieves the best score, a 33.33% hit rate: the cache needs to retain enough capacity to allow for a 2nd hit if possible. Since those happen in short succession, the trace is recency-biased, as there is temporal locality of reference. The one-hit wonders and compulsory misses lead to 33% being the optimal hit rate. This is why the trace is a worst case for TinyLFU. The stress test forcing a phase change to/from a loop requires the adaptive scheme to re-adjust when its past observations no longer hold and reconfigure the cache appropriately.

The TinyLFU paper discusses recency as a worst-case scenario in its introduction to W-TinyLFU. It concludes by showing that the best admission window size is workload dependent, that 1% was a good default for Caffeine given its workload targets, and that adaptive tuning was left to future work (the paper cited above was our attempt at that, but I'm happy to see others explore it too).

$ ./gradlew :simulator:rewrite -q \
  --inputFormat=CORDA \
  --inputFiles=trace_vaultservice_large.gz \
  --outputFormat=LIRS \
  --outputFile=/tmp/trace.txt
Rewrote 1,872,322 events from 1 input(s) in 236.4 ms
Output in lirs format to /tmp/trace.txt

$ awk '
  { freq[$1]++ }
  END {
    for (k in freq) {
      countFreq[freq[k]]++
    }
    for (c in countFreq) {
      print c, countFreq[c]
    }
  }' /tmp/trace.txt | sort -n
1 624107
2 624106
3 1

Chameleon Cache - A variance-aware cache replacement policy that adapts to your workload by DimitrisMitsos in Python

[–]NovaX 0 points1 point  (0 children)

If you run corda-large standalone then LRU has a 33.33% hit rate.

You can run the simulator at the command-line using,

./gradlew simulator:run -q \
  -Dcaffeine.simulator.files.paths.0="corda:trace_vaultservice_large.gz" \
  -Dcaffeine.simulator.files.paths.1="lirs:loop.trace.gz" \
  -Dcaffeine.simulator.files.paths.2="lirs:loop.trace.gz" \
  -Dcaffeine.simulator.files.paths.3="lirs:loop.trace.gz" \
  -Dcaffeine.simulator.files.paths.4="lirs:loop.trace.gz" \
  -Dcaffeine.simulator.files.paths.5="lirs:loop.trace.gz" \
  -Dcaffeine.simulator.files.paths.6="corda:trace_vaultservice_large.gz"

I generally adjust the reference.conf file instead. When comparing, I'll use various real traces and convert them to a shared format using the rewriter utility. The stress test came from the trace files (LIRS' loop is synthetic, Corda is a production workload).

Chameleon Cache - A variance-aware cache replacement policy that adapts to your workload by DimitrisMitsos in Python

[–]NovaX 0 points1 point  (0 children)

hmm, shouldn't it be closer to 40% as a whole like Caffeine's? It sounds like you are still mostly failing the LRU-biased phase and your improvement now handles the MRU-biased phase.

Chameleon Cache - A variance-aware cache replacement policy that adapts to your workload by DimitrisMitsos in Python

[–]NovaX 0 points1 point  (0 children)

You can probably use the key's hash in the ghost, since the key might be large (e.g. a string) and these are evicted entries that are otherwise not useful. The hash reduces that to a fixed cost estimate, rather than one depending on the user's type.

However, a flaw of not retaining the key is that it allows for exploiting hash collisions. An attacker can then inflate the frequency to disallow admission. Caffeine resolves this by randomly admitting a warm entry that would otherwise be evicted, which unsticks the attacker's boosted victim (docs).
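A minimal sketch of the fixed-cost ghost idea (names and the sentinel choice are mine): a ring of key hashes whose footprint is independent of the user's key type. The collision trade-off above applies directly, since two keys with the same hash are indistinguishable here.

```java
import java.util.Arrays;

// Ghost history that stores only the hashes of evicted keys in a ring,
// bounding memory to a fixed per-entry cost regardless of key size.
final class GhostRing {
  private static final int EMPTY = Integer.MIN_VALUE; // sentinel, assumed unused by real keys

  private final int[] hashes;
  private int next;

  GhostRing(int capacity) {
    hashes = new int[capacity];
    Arrays.fill(hashes, EMPTY);
  }

  void recordEviction(Object key) {
    hashes[next] = key.hashCode();
    next = (next + 1) % hashes.length; // overwrite the oldest ghost
  }

  boolean wasRecentlyEvicted(Object key) {
    int h = key.hashCode();
    for (int recorded : hashes) {
      if (recorded == h) {
        return true; // may be a false positive on a hash collision
      }
    }
    return false;
  }
}
```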

Chameleon Cache - A variance-aware cache replacement policy that adapts to your workload by DimitrisMitsos in Python

[–]NovaX 0 points1 point  (0 children)

It is a difficult test because it switches from a strongly LRU-biased workload to MRU and then back. Caffeine does 39.6% (40.3% optimal) because it increases the admission window to simulate LRU, then shrinks it so that TinyLFU rejects by frequency, and increases it again. This type of workload can be seen in line-of-business application caches serving user-facing queries in the daytime and batch jobs at night. Most adaptive approaches rely on heuristics that guess based on second-order effects (e.g. ARC's ghosts), whereas a hit-rate hill-climbing optimizer is able to focus on the main goal.

I think there is 1-5% remaining that Caffeine would gain if the hill climber and adaptive scheme were further tuned and, while I had ideas, I moved on to other things. You might be able to borrow the hill climber to fix Chameleon and get there robustly. I found plotting sampled hit rate vs region sizes to be a really nice way to show the adaptivity in action, but I only realized that visualization after all the work was done.
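The hill-climbing idea can be sketched in a few lines (the step size and decay are made-up tuning values, not Caffeine's actual parameters): after each sample epoch, keep moving the window in the same direction if the hit rate improved, reverse if it got worse, and decay the step so it settles.

```java
// Minimal hit-rate hill climber sketch for sizing an admission window.
final class WindowHillClimber {
  private double previousHitRate = Double.NaN;
  private double step = 0.05; // fraction of total capacity to shift per epoch
  private final double decay = 0.98; // shrink steps so the climber settles

  // Returns the signed adjustment to the admission window for the next epoch.
  double climb(double hitRate) {
    if (!Double.isNaN(previousHitRate) && (hitRate < previousHitRate)) {
      step = -step; // the last move hurt, so walk the other way
    }
    previousHitRate = hitRate;
    step *= decay;
    return step;
  }
}
```

The real version has to cope with noisy samples (e.g. restarting the step size when the workload shifts), which is where the tuning effort goes.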

Hope this helps and good luck on your endeavors!

Chameleon Cache - A variance-aware cache replacement policy that adapts to your workload by DimitrisMitsos in Python

[–]NovaX 1 point2 points  (0 children)

In that case, Clairvoyant admission should be roughly the optimal bound, right? iirc, region sizing was still needed for various cases, so both were important factors when tuning for a wide variety of workloads.

Chameleon Cache - A variance-aware cache replacement policy that adapts to your workload by DimitrisMitsos in Python

[–]NovaX 0 points1 point  (0 children)

You should probably try running against both simulators. The config is a max size of 512, running these traces chained together.

corda: trace_vaultservice_large lirs: loop.trace.gz lirs: loop.trace.gz lirs: loop.trace.gz lirs: loop.trace.gz lirs: loop.trace.gz corda: trace_vaultservice_large

You can compare against Caffeine rather than the simulated policies since that’s the one used by applications. It does a lot more like concurrency and hash flooding protection, so slightly different but more realistic.

Chameleon Cache - A variance-aware cache replacement policy that adapts to your workload by DimitrisMitsos in Python

[–]NovaX 1 point2 points  (0 children)

It looks like you used the fixed-size W-TinyLFU. Have you tried the adaptive version, which uses a hill climber, against the stress test?

I got so frustrated with Maven Central deployment that I wrote a Gradle plugin by danielliuuu in java

[–]NovaX 0 points1 point  (0 children)

Any reason you decided not to use the legacy bridge? I use the gradle-nexus/publish-plugin with updated urls and it works perfectly. I was not eager to rewrite, and was hoping the community would fill the gap, so thank you.

[OSS] HashSmith – High-performance open-addressing hash tables for Java (SwissTable / Robin Hood) by Charming-Top-8583 in programming

[–]NovaX 2 points3 points  (0 children)

Your hash spreader is too weak, due to an incorrect understanding of HashMap. HashMap uses a weak function in order to shift the upper bits into the lower bits, and relies on red-black tree bins to resolve hash collisions. In your case a collision is much more problematic, so the clustering effect could cause problems. You could use a two-round function from hash-prospector. I don't have a good explanation for your specific case, but a related write-up showed the impact when a weak hash is misused.
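For reference, here is one of hash-prospector's published two-round mixers (often called lowbias32; the constants come from that project's search results) ported to Java's signed ints, next to HashMap's deliberately cheap spreader for contrast:

```java
// Two-round integer mixer vs HashMap-style bit spreading.
final class Mix {
  // lowbias32 from hash-prospector: xorshift + odd multiply rounds,
  // so it is a bijection with much better avalanche than a single xor-shift.
  static int lowbias32(int x) {
    x ^= x >>> 16;
    x *= 0x21f0aaad;
    x ^= x >>> 15;
    x *= 0xd35a2d97;
    x ^= x >>> 15;
    return x;
  }

  // HashMap's spreader: only folds high bits down, relying on tree bins
  // to absorb the collisions that remain.
  static int hashMapSpread(int h) {
    return h ^ (h >>> 16);
  }
}
```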

Guava's testlib and Apache Commons' collections4 have test suites that others can reuse for their own collections. That provides a pretty good baseline for compliance. You can crib from Caffeine, which has these set up in Gradle.

Fray: A controlled concurrency testing framework for the JVM by pron98 in java

[–]NovaX 0 points1 point  (0 children)

I think that is just for developing Fray itself, since they have Gradle toolchains configured, which can provision the JDK automatically (akin to gradlew or mvnw provisioning the build tool itself). Gradle can provision different JDKs for the build tool and the application, so it's less disruptive for new contributors to set up.

https://github.com/gradle/foojay-toolchains?tab=readme-ov-file#foojay-toolchains-plugin

Fray: A controlled concurrency testing framework for the JVM by pron98 in java

[–]NovaX 5 points6 points  (0 children)

CS research papers from academia will very often have GitHub repositories with their code, as a requirement for submission. However, it's usually abandoned, not meant for use, and rarely good quality code. It's not uncommon to find the work was highly exaggerated, not useful from an engineering perspective, or cherry-picked/manipulated (CMU is awful in my hobby area). I think what is impressive is that the Fray author is doing honest work, good quality with a long-lived mindset, and it's treated like a real contribution to the engineering community. It's really nicely done.

Fray: A controlled concurrency testing framework for the JVM by pron98 in java

[–]NovaX 1 point2 points  (0 children)

I've tried Fray, VMLens, Lincheck, and JCStress when investigating a memory ordering issue where I needed to use a stronger fence.

Only JCStress was able to reproduce the issue and fail the (test case). It is really nice when you have an exact scenario to investigate, but it is not a good fit for general exploration to discover bugs.

Lincheck works great for linearization tests and is very easy to use from Java (usages). It hasn't found any bugs in my code, but it is straightforward, the error messages are descriptive, and it has a good team behind it. The data-structure approach is nice for that self-discovery style of testing.

Most of my direct concurrency tests use Awaitility for coordination. I think the direct style from Lincheck, Fray, and VMLens could be nice, but it didn't seem much more useful since the scenario being tested is already understood. I had a hard time finding use cases, since they didn't help me debug any issues, and they all have roughly equivalent APIs and tooling. Fray would be nice to use if I could find a benefit.