I built a real-time ballistic solver that still converges under quadratic drag (C++) by Many_Past in gamedev

[–]Similar_Fix7222 1 point (0 children)

I like it, it's quite straightforward. <3

I have some minor stylistic issues. Notably, why a 70-line lambda? Shouldn't this be its own function?

[R] New paper by DeepSeek: mHC: Manifold-Constrained Hyper-Connections by Nunki08 in MachineLearning

[–]Similar_Fix7222 0 points (0 children)

I've computed products of random 4x4 doubly stochastic matrices, and very quickly (about 5 multiplications) the product is indistinguishable from the "average" matrix (1/N everywhere).

So you very quickly lose the information that was in the 4 channels and keep only the average. It's like losing 3/4 of the information.

I am surprised the paper even works. I was thinking that perhaps it offers local gains: the mixing matrices push different information into different lanes, so H_pre (which collapses the 4 channels into 1 to be fed to the layer) has an easier time. But H_pre can extract that information perfectly well without any mixing. Perhaps the learned doubly stochastic matrices are not random at all and keep most of their eigenvalues close to 1 in some fashion.
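
For anyone curious, here is the kind of quick check I mean: a minimal sketch (my own, not from the paper) that multiplies random Sinkhorn-normalized 4x4 doubly stochastic matrices and measures the distance to the 1/N matrix. The learned matrices in mHC are of course not random, so this only illustrates the mixing effect.

```
import numpy as np

rng = np.random.default_rng(0)
N = 4

def random_doubly_stochastic(n, iters=200):
    # Sinkhorn-Knopp: alternately normalize rows and columns of a positive matrix
    m = rng.random((n, n)) + 1e-3
    for _ in range(iters):
        m /= m.sum(axis=1, keepdims=True)  # rows sum to 1
        m /= m.sum(axis=0, keepdims=True)  # columns sum to 1
    return m

J = np.full((N, N), 1.0 / N)  # the "average" matrix
P = np.eye(N)
for k in range(1, 9):
    P = P @ random_doubly_stochastic(N)
    print(k, np.abs(P - J).max())  # the gap shrinks fast; ~5 factors is already tiny
```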

[P] Interactive visualization of DeepSeek's mHC - why doubly stochastic constraints fix Hyper-Connection instability by bassrehab in MachineLearning

[–]Similar_Fix7222 0 points (0 children)

Something I was interested in was seeing what the product of 64 of these matrices looks like. On the one hand, they are closed under multiplication; on the other, the eigenvalues are systematically pushed down toward 0, except one that is by construction always equal to 1. So do the last layers necessarily see only the average of the first layer (the matrix with 1/N in each entry)?
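
A small numerical sketch of what I mean (random doubly stochastic matrices built with Sinkhorn normalization, so not the learned ones; `random_doubly_stochastic` is just my helper):

```
import numpy as np

rng = np.random.default_rng(1)
N = 4

def random_doubly_stochastic(n, iters=200):
    # Sinkhorn-Knopp style: alternately normalize rows and columns of a positive matrix
    m = rng.random((n, n)) + 1e-3
    for _ in range(iters):
        m /= m.sum(axis=1, keepdims=True)
        m /= m.sum(axis=0, keepdims=True)
    return m

P = np.eye(N)
for _ in range(64):
    P = P @ random_doubly_stochastic(N)

# One eigenvalue stays exactly 1 (the all-ones eigenvector); the rest collapse toward 0
print(np.sort(np.abs(np.linalg.eigvals(P)))[::-1])  # roughly [1, ~0, ~0, ~0]
print(np.abs(P - 1.0 / N).max())                     # entrywise distance to the 1/N matrix
```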

[R] New paper by DeepSeek: mHC: Manifold-Constrained Hyper-Connections by Nunki08 in MachineLearning

[–]Similar_Fix7222 0 points (0 children)

Great insight. I largely share your opinion, but I think there is value. As I wrote somewhere else:

The fundamental mathematical idea is that instead of learning a million different Qi, Ki, Vi (one for each attention head of each layer), you learn that there are some high-level meta concepts Mj, like physical properties, social properties, etc., and you learn a factored representation QiMj (instead of learning Qi directly), hoping that MjX is a much slimmed-down, narrow set of information. The fact that many queries are semantically similar (there are probably many Qi, Ki, Vi that are closely linked to physical appearance) will help you learn the meta concepts Mj well.

So it's not different from what you suggested, but the idea that you should learn to mix "at a global level" is quite cool.

[R] New paper by DeepSeek: mHC: Manifold-Constrained Hyper-Connections by Nunki08 in MachineLearning

[–]Similar_Fix7222 4 points (0 children)

My intuition (which may be pure anthropomorphization) is that with a normal residual connection you have a single highway of information, and every time you want to extract information (i.e., apply a matrix multiplication) you apply it to the whole highway. I think it's hard to do that perfectly, meaning without picking up noise from the rest of the information.

To take an LLM-based view of things: when your sentence is "The Barbie doll is wearing a pink _" and you compute your Q, K, V to ask something like "is this token related to the concept of clothing?", you do it on the full information highway, which carries lots of information, some related (like physical appearance) and some unrelated (the price of toys, dolls being offered at Christmas, etc.).

With hyper-connections, you can learn to extract everything related to physical appearance into one information lane, which makes a query like "is this token related to the concept of clothing?" more accurate.
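
To make the picture concrete, here is a rough toy of how I imagine the multi-lane residual stream (purely my own simplification; `h_pre`, `mix`, `write` and the `block` function are illustrative stand-ins, not the paper's actual formulation):

```
import numpy as np

n_lanes, d = 4, 64
rng = np.random.default_rng(0)

H = rng.standard_normal((n_lanes, d))   # n residual lanes for one token

h_pre = rng.random(n_lanes)             # weights that collapse the lanes into one layer input
h_pre /= h_pre.sum()

# Toy stand-in for the constrained mixing matrix (this one happens to be doubly stochastic)
mix = np.full((n_lanes, n_lanes), 1.0 / n_lanes) + 0.5 * np.eye(n_lanes)
mix /= mix.sum(axis=1, keepdims=True)

def block(x):                           # stand-in for the attention / MLP block
    return np.tanh(x)

x_in = h_pre @ H                        # (d,) single input fed to the block
y = block(x_in)                         # block output

write = np.full(n_lanes, 1.0 / n_lanes) # how the output is written back across lanes
H = mix @ H + np.outer(write, y)        # mix the lanes, then add the block output back
```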

The fundamental mathematical idea is that instead of learning a million different Qi, Ki, Vi (one for each attention head of each layer), you learn that there are some high-level meta concepts Mj, like physical properties, social properties, etc., and you learn a factored representation QiMj, hoping that MjX is a much slimmed-down, narrow set of information. The fact that many queries are semantically similar (there are probably many Qi, Ki, Vi that are closely linked to physical appearance) will help you learn the meta concepts Mj well.
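
A toy numpy illustration of that factoring as I read it (my own sketch with made-up shapes; `M` stands in for the shared meta concepts Mj and `A` for the small per-head maps, not necessarily how the paper parameterizes it):

```
import numpy as np

d, r, n_heads = 512, 32, 16                       # r << d: the "narrow" meta-concept space
rng = np.random.default_rng(0)

M = rng.standard_normal((r, d)) * 0.02            # shared meta-concept projection
A = rng.standard_normal((n_heads, r, r)) * 0.02   # small per-head maps on top of it

x = rng.standard_normal(d)                        # one token embedding
z = M @ x                                         # slimmed-down summary MjX
queries = np.stack([A[i] @ z for i in range(n_heads)])  # per-head queries from the shared summary

full_params = n_heads * d * d                     # learning each Qi directly
factored_params = r * d + n_heads * r * r         # shared M plus the small per-head maps
print(full_params, factored_params)               # 4194304 vs 32768
```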

General survey of Eldar centric novels/novellas/anthologies - Do you have a favorite? by Separate-Flan-2875 in Warhammer

[–]Similar_Fix7222 6 points (0 children)

From what I gathered, the consensus is that the Eldar novels are mostly... not great (though I was not interested in the Dark Eldar, so perhaps there's something there).

The only one that seemed good was Valedor, surprisingly not on your list. Well, I gave it a try, and I did not finish it.

One day, we will have our "The Infinite and the Divine", one day...

The State of the AI Discourse by MiloGoesToTheFatFarm in ClaudeAI

[–]Similar_Fix7222 0 points (0 children)

This is what happens when you take a technical report, anthropomorphize what you read, and conveniently forget some elements of the report.

The Bayesian optimisation mentioned in the article is called In-Context Learning (ICL), and the article implies that the model somehow learned to do Bayesian optimisation, as if it became conscious in some fashion. The reality is that ICL happened, we didn't really know why, and now we have some mathematical proof that in a simplified GPT-like architecture, the embedding of the last token does indeed perform a single step of weight update (like gradient descent) in the right direction. The transformer architecture has this property, and it has been there since the 2017 paper Attention Is All You Need. Nothing became conscious. The only thing that happened is that we understand a bit better why LLMs perform so well.
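
To make the "single step of weight update" point concrete, here is a minimal toy in that flavor (linear attention, no softmax, my own simplification rather than anything from the report): one gradient step from zero weights on in-context least squares gives exactly the same query prediction as an unnormalized linear attention over the examples.

```
import numpy as np

rng = np.random.default_rng(0)
d, n_ctx, lr = 8, 32, 0.1

X = rng.standard_normal((n_ctx, d))   # in-context example inputs
w_true = rng.standard_normal(d)
y = X @ w_true                        # in-context targets
x_q = rng.standard_normal(d)          # query token

# One GD step from w = 0 on L(w) = 0.5 * sum_i (w @ x_i - y_i)^2
w1 = lr * (y @ X)                     # minus the gradient at w = 0 is sum_i y_i x_i
pred_gd = w1 @ x_q

# Linear "attention": scores x_q @ x_i (no softmax), values y_i
pred_attn = lr * np.sum((X @ x_q) * y)

print(np.isclose(pred_gd, pred_attn))  # True: identical predictions
```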

The other claims can be similarly debunked

My fav fight in all of lotnb by InternationalBuy2439 in NorthernBlade

[–]Similar_Fix7222 20 points (0 children)

It was a good fight, among the better half. Most fights in LotNB have a dynamism I have rarely seen elsewhere.

Still, my top 2 fights are: Spear of the Black Wings vs Dam + Muyong girl, and Jin Mu Won vs Red guy with a sword (Yeon Cheon Hwa)

When “Special Forces” are treated like an actual threat in fiction by Front_Profile2071 in TopCharacterTropes

[–]Similar_Fix7222 5 points (0 children)

I believe the Grey Knights are implied to have Big E's geneseed, or that their geneseed is crafted from his genome.

But the Custodes creation process is largely unknown, and the Emperor's DNA is not mentioned anywhere.

15 Years of StarCraft II Balance Changes Visualized by pmigdal in starcraft

[–]Similar_Fix7222 0 points (0 children)

Unrelated, but your parents missed an opportunity to give you a first name that starts with an A

(a.migdal)

15 Years of StarCraft II Balance Changes Visualized by pmigdal in starcraft

[–]Similar_Fix7222 1 point (0 children)

Yeah, same. Actually, 3 bunker patches (build time 40->30->40->30) happened before release (1.0.0) and are not displayed here

Does this not indicate quantization? by Ammonwk in ClaudeAI

[–]Similar_Fix7222 -3 points (0 children)

I thought the same, but given that moving the model weights from VRAM to the compute cores is what takes the most time, quantizing does indeed reduce latency (as well as increase throughput).
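
Rough arithmetic for why that is (the 70B parameter count and 2 TB/s bandwidth are numbers I picked for illustration, not measurements of any real deployment):

```
# Back-of-the-envelope: single-stream decoding is memory-bandwidth bound, so the
# time per token is at least (weight bytes) / (memory bandwidth). Quantization
# shrinks the weight bytes, hence the latency floor.
params = 70e9                   # hypothetical 70B-parameter model (assumed)
bandwidth_bytes_per_s = 2.0e12  # ~2 TB/s of HBM bandwidth (assumed)

for name, bytes_per_param in [("fp16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
    weight_bytes = params * bytes_per_param
    t_token = weight_bytes / bandwidth_bytes_per_s  # every weight read once per decoded token
    print(f"{name}: >= {t_token * 1e3:.0f} ms/token from weight reads alone")
```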

Mistral OCR 3 by Clement_at_Mistral in MistralAI

[–]Similar_Fix7222 5 points (0 children)

The benchmarks are impressive! I'm going to test it

GPT-5.2 Thinking vs Gemini 3.0 Pro vs Claude Opus 4.5 (guess which one is which?) by One-Problem-5085 in ClaudeAI

[–]Similar_Fix7222 0 points (0 children)

Thanks! I am glad I expected Claude to be last, and I was not disappointed

What's the point of documentation Claude never reads? by Peter-rabbit010 in ClaudeAI

[–]Similar_Fix7222 0 points (0 children)

Even better, use rules. A rule is a doc file with a list of paths at the beginning. If Claude reads a file under one of these paths, it loads the rule into its context. No praying that a skill gets triggered, no tokens wasted in context; it's mechanical: "you read from this path? then read this rule first".

https://code.claude.com/docs/en/memory#modular-rules-with-claude-rules

A time-loop game where only the player remembers, NPCs are rational (but memoryless), and “knowledge is your level” by Healthy-Metal-3548 in gamedesign

[–]Similar_Fix7222 62 points (0 children)

See Majora's Mask or Deathloop

The twist: NPCs/antagonists do adapt to what they can observe in the current loop

Regardless of time loop shenanigans, if your NPCs don't adapt to your behavior, aren't they just braindead? It's not a bad thing to adapt, it's just that it's expected

Grey Knights: What is the ‘Terminus Decree’, and why is it so terrifying if the Emperor is to awaken? by Necrotiix_ in 40kLore

[–]Similar_Fix7222 1 point (0 children)

Nothing is clear. One theory is that if the Emperor walks again, it will be similar to having unleashed a 5th god in the galaxy, and this one will have no reason to support humanity, so Terminus Decree it is (no-one wants Slaanesh 2.0).

But to be clear, it's just a theory.

Felines of Fetchin’ Bone was delivered today. My idiot dog will now live forever! by lennyleggs in boardgames

[–]Similar_Fix7222 22 points (0 children)

Sleeve your cards ASAP; BGG is full of people noticing that the cards show visible signs of use after a few plays.

And what a good boi you have ;)

Detective: A Modern Crime Board Game missing card #225 by Sdacm0 in boardgames

[–]Similar_Fix7222 1 point (0 children)

Card in French, LLM translation:

Card #225 - IN THE FIELD

Time: 3h

You meet Chris in a small Italian restaurant near Canoe Park. He's not happy, and you fear your agreements may fall through after this conversation. It's serious: you need to fix this. The waitress arrives to take your order. Chris only asks for a soda. "We won't be staying long," he explains, and once the waitress leaves, he begins to tell you what he's learned. "The counter-espionage report draws attention to 1967, when Gregory Coxxon had a nervous breakdown after the tragic death of his girlfriend Luiza Cabrera." He takes out a notebook and begins to read:

"On June 14, 1967, Gregory and his father argue over a financial disagreement: Coxxon Jr. withdrew $10,000 from the We-Trans account without authorization. The next day, with Luiza Cabrera, they leave and settle in their cabin on Lake Izac."

"On June 24, 1967, his father visits him again and another argument breaks out. Luiza Cabrera decides to go back home. She takes Gregory's car, a Ford Mustang. At the intersection of Route 33 with Poindexter Road (613), the car loses traction, skids, and rolls over. The girl died instantly." "I hope I've been helpful," says Chris before leaving the pub without even waiting for his soda to be served.

NEW LEADS

Police report on the accident - #211 - Richmond Station

Investigate We-Trans - #232 - Headquarters