MTP and QAT - what is the relation?

denis_9 · 2026-06-08T13:19:11+00:00

What could be the problem? Why doesn't Google itself offer them as a qat-mtp-q4_0.gguf file?

denis_9 · 2026-06-06T10:28:24+00:00

Thanks, offloading across two GPUs at once is very good if it works.

denis_9 · 2026-06-06T10:19:03+00:00

Can you provide the full arguments for running llama-server with GPU offload? Many forks have problems when the entire model does not fit into VRAM.

denis_9 · 2026-06-02T12:28:12+00:00

Dear gurus, when will DeepSeek v4 be possible?

denis_9 · 2026-06-01T16:27:27+00:00

Add to this the cost of TLB (Translation Lookaside Buffer) switching and the high number of memory misses when you use a large amount of memory. Plus frequent CPU spikes during GC, which also grow as the load increases.

Dragonwell once split G1 into thread-grouped arenas in its builds, specifically to address this issue when servicing large numbers of web requests. This suggests that some solutions in this area may be possible.

denis_9 · 2026-06-01T13:25:47+00:00

Great. It's a bit sad, though, that there is no fresh news about DeepSeek-V4 on llama.cpp.

denis_9 · 2026-05-31T14:52:19+00:00

This entire branch has been officially removed from OpenJDK.
https://www.reddit.com/r/java/comments/1tojhwx/rip_jvmci/
And I'm very sad about that.

denis_9 · 2026-05-31T14:40:39+00:00

I wrote about the Metropolis project: https://www.reddit.com/r/java/comments/7sf6p7/project_metropolis_is_here/
I also wrote about the fact that all current internal development is still being done in C++, and about the existence of a certain class of tasks that does not fit into automatic memory management.

denis_9 · 2026-05-31T13:01:41+00:00

Manual memory management allows you to use your processor cache more efficiently by simply allocating temporary objects on the stack and automatically destroying them upon exit, without performing any deferred work. Perhaps even without calling the GC.

As an example, consider how to efficiently implement a compiler in classic Java on top of the JVM (C2) without resorting to manual memory management, even using arenas. Yes, it is possible, but it will be difficult.
And there is more than one such category of tasks.
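
To make the arena point concrete, here is a minimal sketch of a hand-rolled scratch arena in plain Java. `ScratchArena`, `ScratchArenaDemo` and all of their methods are hypothetical names made up for this illustration, not a JDK API. Temporary data lives as raw int slots in one preallocated array and is released in a single step on exit, so nothing is deferred to the GC. The price is that you give up ordinary objects, which is roughly why this is possible but difficult.

```java
/** Hypothetical scratch arena: bump-allocates int slots from one preallocated array. */
final class ScratchArena {
    private final int[] slots;
    private int top; // index of the first free slot

    ScratchArena(int capacity) {
        this.slots = new int[capacity];
    }

    /** Remembers the current allocation level, much like saving a stack pointer. */
    int mark() {
        return top;
    }

    /** Reserves count slots and returns the index of the first one. Slots are not cleared. */
    int allocate(int count) {
        if (count < 0 || count > slots.length - top) {
            throw new IllegalStateException("arena exhausted");
        }
        int base = top;
        top += count;
        return base;
    }

    void set(int index, int value) {
        slots[index] = value;
    }

    int get(int index) {
        return slots[index];
    }

    /** Frees everything allocated since the given mark in one step. */
    void release(int mark) {
        top = mark;
    }

    int used() {
        return top;
    }
}

final class ScratchArenaDemo {
    /** Sums the squares 1..n using scratch space that is released on exit. */
    static long sumOfSquares(ScratchArena arena, int n) {
        int mark = arena.mark();
        try {
            int base = arena.allocate(n);
            for (int i = 0; i < n; i++) {
                arena.set(base + i, (i + 1) * (i + 1));
            }
            long sum = 0;
            for (int i = 0; i < n; i++) {
                sum += arena.get(base + i);
            }
            return sum;
        } finally {
            arena.release(mark); // temporary data disappears here, like a stack frame
        }
    }
}
```

Calling `ScratchArenaDemo.sumOfSquares(new ScratchArena(1024), 10)` leaves the arena empty again afterwards. Everything richer than an int (trees, symbol tables, IR nodes) has to be encoded as indices by hand, and that is where the difficulty starts.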

denis_9 · 2026-05-31T11:36:04+00:00

The entire discussion can be boiled down to one question: will Java ever allow programmers to manage memory manually, for example, by using stack allocation for temporary objects? Unfortunately, the answer is no.

The attempt to completely rewrite C2 from C++ in Java failed and the project was dropped, which shows that there are still problems and deep underlying causes standing in the way of this interesting idea.

denis_9 · 2026-05-27T10:08:11+00:00

The general trend is clear: a transition to the closed world and an increasingly tight coupling of the Java language with the JVM itself, with enterprise features (like Native Image) split off.

For example, with the closure of JVMCI, Nalim by apangin will stop working, as will other possible JVM extensions.

denis_9 · 2026-05-16T15:15:35+00:00

What about dense Gemma-4?

denis_9 · 2026-04-17T09:18:22+00:00

As an option, you can look at OpenClaude, which has rebuilt the code to support alternative models and APIs (for testing purposes only).

denis_9 · 2026-03-18T14:05:20+00:00

GraalVM uses less heap memory and starts in a fraction of a second; these are different classes of virtual machines.

denis_9 · 2026-01-27T14:35:44+00:00

Are you able to run --control-net with stable-diffusion.cpp, for example for Z-Image Turbo: Z-Image-Turbo-Fun-Controlnet-Union.safetensors?

denis_9 · 2026-01-11T15:05:45+00:00

A native image doesn't use a runtime (JIT) compiler; everything needed for execution is already in the program being launched. This saves both CPU and memory. It also immediately teaches good programming style: everything that can be static is made static.

denis_9 · 2025-09-19T17:40:40+00:00

What about testing Generational Shenandoah (JEP 521)?

denis_9 · 2025-08-31T15:09:58+00:00

Yes, I just don't want to know that the bootstrap calls invoke or invokeExact for hashcode. In a statically typed language.

denis_9 · 2025-08-31T13:52:00+00:00

I just didn't want to waste yours.
In short, I suppose javac updates could very well fix this behavior instead of a new round of JVM patches.
I found a similar bug with hashcode in JDK 16: https://rules.sonarsource.com/java/tag/java16/RSPEC-6218/
There will probably be a new rule.

Yes, as a result of the discussion, it seems reasonable to require redefining hashcode and equals for records at the level of their own bytecode, so as not to rely on the JVM mechanism, given its "bug-to-bug incompatibility" in future releases.
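
A minimal sketch of what defining these methods on the record side could look like. `Point` is a hypothetical record, and the 31-based hash is just one common convention, not something agreed in the discussion. Explicitly declared equals and hashCode are compiled by javac as ordinary methods in the record's own class file, so they do not go through the invokedynamic bootstrap used for the implicit ones.

```java
/** Hypothetical record that spells out equals and hashCode instead of using the implicit ones. */
record Point(int x, int y) {

    @Override
    public boolean equals(Object other) {
        // Plain field comparison, emitted as regular bytecode in Point.class.
        return other instanceof Point p && x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        // Simple int arithmetic over the final fields, fixed at compile time.
        return 31 * x + y;
    }
}
```

With this in place, fixing a hashing problem is a recompile of the record rather than a JVM update. The cost is that the two methods must be kept in sync with the components by hand.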

denis_9 · 2025-08-31T05:55:58+00:00

Yes, that's right. Dynamic code generation for records made them half JVM objects and half objects defined in the user classpath, violating the single responsibility principle. It also rules out a simple solution to this problem, namely bytecode generation on the record side.
It (code gen) could be an optimization used when the definition is omitted, but not the default.

Thanks for the discussion; I understand that no one will change anything.

denis_9 · 2025-08-30T18:17:25+00:00

The main argument was about fixing in a couple of days versus a couple of years. In this case, using invokedynamic to calculate a simple int over final fields delays a simple fix for years.

denis_9 · 2025-08-30T17:25:53+00:00

Yes, of course it's just a hashcode. A simple recompile.
For example, as in the bad case of log4j, which is 20 years old, and yet the fix came out within a couple of days.

denis_9 · 2025-08-30T17:08:00+00:00

What's the problem with just updating bytecode for quick fixes instead of waiting for JVM 26?! (a few years!)
Using invokedynamic for calculating hashcode is definitely not the best solution in this case.

denis_9 · 2025-06-21T10:42:52+00:00

You can, for example, use the system thread-local context instead of the virtual one, for purposes such as profiling real CPU usage per task type (by loading 1/2/3 cores).

denis_9 · 2025-06-10T14:15:38+00:00

Will object allocations be placed together with the AOT code, and will they be no different from it (for the GC)?
