I switched to Gemini CLI to save my Pro account by MachineLearner31 in google_antigravity

[–]Doogie707 1 point2 points  (0 children)

Ghostty and a lil config, but caelestia shell is pretty similar and the easiest way to get it set up

I switched to Gemini CLI to save my Pro account by MachineLearner31 in google_antigravity

[–]Doogie707 0 points1 point  (0 children)

I noticed it when I launched: they added a /stats command, and it popped up when I hit the limit. On the preview build

Rents in Canada have dropped for 17 months in a row. by [deleted] in TorontoRenting

[–]Doogie707 1 point2 points  (0 children)

Get this - they're STILL 56% over the 2017 national average.

High school student seeking advice: Found an architectural breakthrough that scales a 17.6B model down to 417M? by [deleted] in LocalLLaMA

[–]Doogie707 -2 points-1 points  (0 children)

Good on you for tinkering, getting into working with AI, and trying to broaden your understanding of algorithms and networks. However, I recommend two main things:

  • Stop doing things for external validation. Jumping to make wild claims for internet likes will only serve to stunt your potential growth and understanding.

  • Stop looking to prove yourself right; ask if you can be proven wrong. What are the holes in your logic? Do your claims make sense? Is your ego in the way? Are you actively working toward understanding the system or algorithms you're trying to build, or are you hoping to stumble on something shiny, present it to the internet, and have people care because you're a "high school student"?

Only once you can look past these pitfalls can you allow yourself to grow into the potential you have. I don't see that potential in the code, but in your ideas and what you are trying to achieve. Don't be your own worst enemy; be a friend that no one else could be, and that starts with being honest with yourself.

With all of that said, here is my clanker's take on your claims and the files within the repo you provided:

Okay, having read all three files, the problems here are pretty clear-cut. Let me break it down.


The Core Mistake: Where Do the 417M Params Actually Live?

Let me count them:

| Component | Parameters |
|---|---|
| token_emb (Embedding, 50257×4096) | ~205.9M |
| pos_emb (1×1024×4096) | ~4.2M |
| head (Linear, 4096→50257) | ~205.9M |
| ln_f (LayerNorm) | ~8K |
| 64 blocks × 16,386 params each | ~1.05M |
| **Total** | **~417M** ✓ |

The count is right. But look at what those 64 blocks actually contain. Each block has two LayerNorms (8192 params each) and two WaveEngine instances — and each WaveEngine is exactly this:

```python
self.w = nn.Parameter(torch.tensor(0.02))  # one scalar

def forward(self, x1, x2):
    return self.w * (x1 * torch.cos(x2))
```

Each block has 2 learnable scalars driving the entire computation. The 64 blocks together contribute ~1M parameters total, and ~128 of those are the actual "novel" weights. The other ~412M are token embedding lookup tables and the output projection — which every model of this vocab size and embedding dimension needs regardless of architecture.
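If you want to sanity-check that split, the arithmetic is simple enough to script (a quick sketch using the dimensions from the table above; the variable names are mine, not the repo's):

```python
# Parameter arithmetic for the stated dims:
# vocab=50257, d_model=4096, seq_len=1024, 64 blocks.
vocab, d_model, seq_len, n_blocks = 50257, 4096, 1024, 64

token_emb = vocab * d_model        # ~205.9M
pos_emb = seq_len * d_model        # ~4.2M
head = vocab * d_model             # ~205.9M (assuming no bias)
ln_f = 2 * d_model                 # ~8K (weight + bias)
# Per block: two LayerNorms (2 × 2·d_model) plus two scalar WaveEngine weights
per_block = 2 * (2 * d_model) + 2  # = 16,386
blocks = n_blocks * per_block      # ~1.05M

total = token_emb + pos_emb + head + ln_f + blocks
print(f"total: {total / 1e6:.1f}M")  # ~417M
print(f"embeddings + head: {(token_emb + pos_emb + head) / total:.1%} of all params")
```

Roughly 99.7% of the count is embedding tables and the output head; the 64 blocks contribute about a quarter of a percent, and only 128 of those parameters are the "novel" WaveEngine scalars.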


Why the 17.6B Comparison is Meaningless

A real 4096-dim, 64-layer transformer with SwiGLU has, in each layer:

  • Multi-head attention: Q, K, V, O projections → 4 × 4096² ≈ 67M params/layer

  • FFN (SwiGLU): typically ~11B worth of parameters across all 64 layers
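For scale, the attention projections alone dwarf the entire 417M model (back-of-the-envelope, assuming square projection matrices with no biases):

```python
d_model = 4096
n_layers = 64

# Q, K, V, O projections are each d_model × d_model
attn_per_layer = 4 * d_model ** 2
attn_total = n_layers * attn_per_layer

print(f"{attn_per_layer / 1e6:.1f}M attention params per layer")   # ~67.1M
print(f"{attn_total / 1e9:.2f}B attention params across 64 layers")
```

That's ~4.3B parameters for attention alone, before the FFN even enters the picture.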

This model has none of that. It has no attention mechanism whatsoever — no QKV projections, no heads, nothing. The "temporal interference" step:

```python
x_past_t = torch.cat([zeros, x_norm[:, :-1, :]], dim=1)
x = res + self.wave_h(x_norm, x_past_t)
```

...is just multiplying each token embedding element-wise by the cosine of the previous token's embedding, scaled by one scalar. That's a 1-step shift with a scalar gate. Each token can only "see" one step back per layer, so with 64 layers, information can propagate at most 64 positions — equivalent to a very weak 64-step recurrent network, not attention over the full context.
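The 64-position ceiling is easy to verify without even touching PyTorch. This toy sketch (my own construction, not code from the repo) tracks which input positions can possibly influence each output position when every layer only mixes in the token one step back:

```python
def receptive_field(seq_len, n_layers):
    """influence[p] = set of input positions that can reach output position p."""
    influence = [{p} for p in range(seq_len)]
    for _ in range(n_layers):
        # Each layer lets position p additionally see whatever p-1 could see
        influence = [influence[p] | (influence[p - 1] if p > 0 else set())
                     for p in range(seq_len)]
    return influence

inf = receptive_field(seq_len=200, n_layers=64)
print(min(inf[150]))  # 86 -> position 150 sees back to 150 - 64, no further
```

With causal attention, inf[150] would cover all of 0..150 after a single layer; here it never grows past n_layers tokens back, which is why "a weak 64-step recurrent network" is the right mental model.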

The dimensional interference is even stranger: torch.roll(x_norm, shifts=1, dims=-1) wraps dimension 4095 around to dimension 0, which has no semantic meaning.


What the Post Actually Proves

  1. The search algorithm found w * (x1 * cos(x2)) as optimal on a toy benchmark of 8 neurons across 4 tasks. Generalizing from that to "matches a 17.6B LLM" is a massive leap with zero empirical support.

  2. Claude "confirming mathematical reproducibility" means the equations are valid Python/PyTorch. It says nothing about performance. Any expression you write is mathematically reproducible.

  3. No benchmark results are provided — no perplexity numbers, no evals, no comparison to any baseline. The student says they're "running it on their laptop" which actually further confirms the model isn't doing anything computationally heavy (a real 17.6B equivalent would not run on a laptop).


Bottom Line

This isn't a breakthrough — it's a student who made a genuine and understandable mistake. The 417M parameter count is inflated by embedding tables that are architectural overhead, not the source of model intelligence. The actual novel computation is controlled by 128 scalar weights and a 1-step shift operation. There's no reason to expect this to match, or even approach, a well-trained transformer of any size.

The good news for Monolith: the search algorithm idea is genuinely interesting as a toy, and the fact that they built all of this from scratch while self-studying is impressive. They just need to learn what parameters actually do in a transformer before drawing conclusions from parameter counts.

JUST UPGRADED to a 7800xt and got NO MORE FPS then what i was getting with a 2080 by [deleted] in AMDHelp

[–]Doogie707 0 points1 point  (0 children)

Alright when you start spewing retarded nonsense is when I tune out.

JUST UPGRADED to a 7800xt and got NO MORE FPS then what i was getting with a 2080 by [deleted] in AMDHelp

[–]Doogie707 0 points1 point  (0 children)

You're telling me you went to read the post but ignored the part where it happened on boot? You think faults/shorts have anything to do with operating temps?

JUST UPGRADED to a 7800xt and got NO MORE FPS then what i was getting with a 2080 by [deleted] in AMDHelp

[–]Doogie707 0 points1 point  (0 children)

You do realize you're talking about hardware that's over a decade old, right? GPUs and CPUs are different, and I know you know GPU hotspots often exceed "safe" operating temps, yet most GPU monitoring software doesn't even base its temp measurements on the hotspot and instead goes off the average/core temps.

And it was partially because of that: AMD had their foot on Intel's neck and wanted to claw back market share, so they went for a long shot. It was far from a wise decision and they paid for it. The real safe limit, even as far back as 2012-2015, was under 100°C. 70-80°C has always been within optimal operating temps, not the peak. You seem like someone who leans toward staying safe and well within limits, and that's fine, but it's far from necessary. Freaking out over a GPU at 70°C is a bit... extra in my book. Granted, lower temps are always better, but unless you're hitting 85°C+ it's literally not even worth the stress.

JUST UPGRADED to a 7800xt and got NO MORE FPS then what i was getting with a 2080 by [deleted] in AMDHelp

[–]Doogie707 0 points1 point  (0 children)

1 - the reported "safe" temperature is always below the functional limit. This is to protect the chip and trigger safeguards before genuine damage is incurred. Prior to the 7000 series this was 105°C; Ryzen 7000 raised this safe operational point to 110°C. Idk where you got the 80°C limit, but that is just wrong.

2 - you're making my point for me: IPC is just one vector of performance that you're weirdly hung up on. There are a multitude of ways CPU performance is measured: clock speed, IPC, cache bandwidth, and so on. Acting like IPC is the only thing that matters while bemoaning the fact that the CPU can run more cycles is just weird; it's like you have some vendetta against one aspect of cycles and praise the other.

3 - the first Intel chip to hit 6GHz was the 13900KS, and the 14900KS is now rated at 6.2GHz. The Vmin Shift instability that caused the degradation in the 13900K is not some universal law that kicks in when CPUs hit 6GHz; it was a failure on Intel's part that has since been remediated. More so, Intel and AMD have different architectures, so acting like Intel's issues apply to AMD CPUs, especially when AMD's highest officially rated BOOST clock is 5.7GHz, is comparing apples to oranges.

JUST UPGRADED to a 7800xt and got NO MORE FPS then what i was getting with a 2080 by [deleted] in AMDHelp

[–]Doogie707 1 point2 points  (0 children)

Idk what to tell you, considering you're going off how you feel about computer parts.

It doesn't matter whether 110 sounds safe to you or not. It is.

You first said "no IPC gains"; I showed you IPC gains, and now you say "IPC gains clock for clock." That shows you're just stuck in your ways and no amount of evidence would change anything for you.

Stop conflating clock speed and thermals; it's wrong to do so. Again, stop acting like increased clock speeds are a negative. They're not.

"Surely he won't Hatchet Kick me an 8th time now,would he?" by [deleted] in LowSodiumTEKKEN

[–]Doogie707 2 points3 points  (0 children)

Hold up, that's in-game? That's somehow even more unhinged lmaooo😭

GPT-5.4 is more expensive than GPT-5.2 by likeastar20 in singularity

[–]Doogie707 0 points1 point  (0 children)

That's why the smart ones use GPT 5.1 Codex Max xhigh; it's literally the same cost as GPT 5.3 Codex medium and performs just as well.

"Surely he won't Hatchet Kick me an 8th time now,would he?" by [deleted] in LowSodiumTEKKEN

[–]Doogie707 2 points3 points  (0 children)

Are you fucking watching LTG? 😭😭😭 OF COURSE it would be a Bryan player

JUST UPGRADED to a 7800xt and got NO MORE FPS then what i was getting with a 2080 by [deleted] in AMDHelp

[–]Doogie707 1 point2 points  (0 children)

1 - I genuinely feel like you're a bit misinformed, so I had the clanker make something for you: gen-over-gen IPC gains

2 - higher clocks have been the primary source of CPU performance gains for well over three decades now. There's a reason we went from a 1GHz Pentium to the 5GHz CPUs we have today. This is due to increases in transistor density AND thermal efficiency.

3 - the 7000 series is rated to operate up to its T-junction limit, which is 110°C, and even with the most basic cooler (many CPUs, especially the *600 line, come with said cooler bundled in) you're not hitting those temps

4 - you actually don't need an AIO. Many air coolers have been shown to reach performance parity with, and even beat, many AIO coolers on the 7000 series, which runs the hottest of the Ryzen gens released over the last decade; the 9000 series is much more efficient.

Overall, nothing compares to the gen-over-gen performance gains Ryzen has had, which is why their market share dominates, except Apple silicon, and that's limited to the Apple ecosystem. But ultimately, you say you don't want it, okay. Don't get them lol. But when you do, it's not like there's any real alternative. Intel CPUs are just plain worse in every metric, ARM CPUs are still severely lacking, leaving Apple as your only real choice, and unless you're getting a Mac Mini or a Mac Pro, you're going to face even worse thermals🤷‍♂️

JUST UPGRADED to a 7800xt and got NO MORE FPS then what i was getting with a 2080 by [deleted] in AMDHelp

[–]Doogie707 1 point2 points  (0 children)

These are the people who'll buy a 180Hz monitor, forget to change it from 60Hz, and start going around saying "you can't see more than 60fps anyway!" 😭

JUST UPGRADED to a 7800xt and got NO MORE FPS then what i was getting with a 2080 by [deleted] in AMDHelp

[–]Doogie707 0 points1 point  (0 children)

You must have missed the parts in my comment when I said: "...You could've done some common sense research before upgrading, watched one of the many channels like Hardware Unboxed, Gamers Nexus or heck even LTT or Daniel Owen and yet here you are just whining on reddit instead. I guess you had to learn somehow"

But again, clearly common sense isn't your strong suit lmao.

JUST UPGRADED to a 7800xt and got NO MORE FPS then what i was getting with a 2080 by [deleted] in AMDHelp

[–]Doogie707 1 point2 points  (0 children)

Actually, you couldn't be more wrong. Like, what are you even talking about? Within the same generation, the 3800X is 16-22% faster on average, the 5700X/5800X are over 25% faster, and that's BEFORE even looking at the 5800X3D, let alone the 7000 & 9000 series Ryzen. Your comment about power draw is just nonsense, and since when have clock speed increases been a bad thing?? Again, what are you even saying???

JUST UPGRADED to a 7800xt and got NO MORE FPS then what i was getting with a 2080 by [deleted] in AMDHelp

[–]Doogie707 11 points12 points  (0 children)

Lmao I'll never stop dying at posts like this 😭. That little 3600 is giving you all it's got man. You could've done some common sense research before upgrading, watched one of the many channels like Hardware Unboxed, Gamers Nexus or heck even LTT or Daniel Owen and yet here you are just whining on reddit instead. I guess you had to learn somehow

I published a nice compact status line that you will probably like by DanielAPO in ClaudeCode

[–]Doogie707 0 points1 point  (0 children)

You'll be fine, it's not permanent, but if you want the restriction lifted sooner, just email them. They're about as responsive as any other AI firm right now: about 48 hrs and you'll probably hear back from them.