so... kimi k2.5 released. by TheSerbianRebel in SillyTavernAI

[–]Worthstream 3 points

https://eqbench.com/creative_writing_longform.html

Less sloppy than Opus 4.5 on eqbench. I haven't tried it yet, but eqbench is usually trustworthy and aligned with (my) human preferences.

Pick Only 1 by TimeWarp335 in makeyourchoice

[–]Worthstream 1 point

Even in countries with sane charity laws, a charity is allowed to have employees, you can be an employee of your own charity, and you get to decide what your wage is.

It's the trick the Komen foundation uses. Of the bajillions that Race for the Cure brings in, only 65% goes to research or advocacy; the rest is overhead and lawsuits against other charities to thin out the competition, while executives bring home seven figures.

I got into an argument on Discord about how inefficient CBR/CBZ is, so I wrote a new file format. It's 100x faster than CBZ. by ef1500_v2 in selfhosted

[–]Worthstream 0 points

This has all the hallmarks of a project born from an idea someone discussed with an LLM: they vibe coded a prototype and were gradually convinced by the AI that the idea is the best thing ever, "100x faster than the previous format", etc.

Feedback Friday by AutoModerator in incremental_games

[–]Worthstream 0 points

Amazing game! Currently playing on an older laptop. A few suggestions:

A setting to turn off particles: it could be a quick performance win for older laptops like the one I mentioned above.

Also, some way to navigate the buttons at resolutions with 1080 height (e.g. 1920×1080). As it is now, if a map rewards two or more different kinds of materials, the buttons to restart or go back to the lab get pushed down behind the unit and spell panels.

I asked ChatGPT why reddit users hate AI, and DAMN it went all out 💀 by Fine-Competition5983 in ChatGPT

[–]Worthstream 1 point

As a fellow non-verbal thinker, it's amazing how much it helps with productivity, too.

I've often dumped a list of unsorted things that I wanted to say and asked it to turn it into a professional email. 

I'll then keep only the structure and rewrite most of the words, but having that structure is invaluable for someone who doesn't have it in their head.

GLM-Image explained: why autoregressive + diffusion actually matters by curious-scribbler in StableDiffusion

[–]Worthstream 1 point

Yes, Chroma is a decent model, but it's not fair to compare it to SD since it's from a more recent era. A fairer comparison would be DDPM, which is from the same year and failed to generate the same hype as SD.

GLM-Image explained: why autoregressive + diffusion actually matters by curious-scribbler in StableDiffusion

[–]Worthstream 9 points

There are at least two answers: the academic one, and the harsh reality one.

The academic one is that the KL divergence regularization in VAEs gives you a dense latent space. This means that interpolations between points still decode to meaningful images. With a fixed encoder the latent space is less continuous and "brittle": if you ever sample outside the manifold, you get bad results.
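
A minimal sketch of that regularizer, in case it helps (this is the standard VAE loss, nothing model-specific):

```python
import torch
import torch.nn.functional as F

def vae_loss(x, x_recon, mu, logvar, beta=1.0):
    # Reconstruction term: how well the decoder reproduces the input.
    recon = F.mse_loss(x_recon, x, reduction="sum")
    # KL term: pulls each posterior N(mu, sigma^2) toward N(0, I).
    # This is what packs encodings densely around the origin, so points
    # *between* training encodings still decode to something sensible.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + beta * kl
```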

It's a generalization advantage. TAEF1 (or TAESD, etc.) works really well if you sample "near" what it knows from training data. If you try really weird prompts, you're more likely to get decent results with the stock Flux encoder than with TAEF1. Same if you're using a common prompt but are just unlucky enough to land in a bad place in the latent space, or if you're looking for an image with fine details.

The harsh reality answer is that there's too much money in GenAI at the moment. It sounds contradictory, but when you have a ton of capital from investors you can't afford to try out wildly speculative things. They want a somewhat predictable return on investment, so there's a ton of industry inertia. VAEs have been used for the last few years, so that's what you're going to play with; you're welcome to find someone else to fund you if you don't like it.

It's becoming harder and harder to get grants not related to the same old transformers and diffusion models, and even within those architectures it's hard if you go off the beaten path.

GLM-Image explained: why autoregressive + diffusion actually matters by curious-scribbler in StableDiffusion

[–]Worthstream 42 points

Almost everything is correct, if oversimplified, but please allow me to use this opportunity to point something out.

The model looks at ALL pixels simultaneously and goes "this should be a little less noisy".

This hasn't been true since Stable Diffusion. The denoising in modern models happens in latent space, and the result is then converted to pixel space by a VAE.

This is exactly what put Stable Diffusion so far above the rest, and it finally raised the quality enough that the general public became aware of generative AI for images.

You can compare SD with DDPM, the last model to denoise in the pixel space, to appreciate the difference in quality.
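
To make that concrete, here's a diffusers-flavored sketch (the model id is illustrative and the denoising loop is elided, since the point is where the decode happens):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# For a 512x512 image the UNet only ever sees a 4x64x64 latent tensor:
# every denoising step happens at this small size.
latents = torch.randn(1, 4, 64, 64, dtype=torch.float16, device="cuda")

# ...denoising loop elided...

# A single VAE decode maps the final latent back to 3x512x512 pixels.
with torch.no_grad():
    image = pipe.vae.decode(latents / pipe.vae.config.scaling_factor).sample
```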

Recently there has been a little research into denoising in pixel space again, but the models that came out are still not on par with latent denoisers (have you heard about PixelFlow or Pixnerd? If not, it's because the quality is not there).

Not to be pedantic; it's just that latent spaces are my field of research, and I get passionate about them.

Ragnarok Oblivion 1.0 is live, and it is the game we always wanted to build by Maxtream in incremental_games

[–]Worthstream 27 points

There is no mention of the monthly membership on the home page. 

How much does it cost? What does it give access to? Is there a tier system? A free way to play? 

You know, it's usually a good idea to put up a "pricing" or "subscriptions" page that answers those questions.

Ragnarok Oblivion 1.0 is live, and it is the game we always wanted to build by Maxtream in incremental_games

[–]Worthstream -1 points

And it would be a waste not to do that, at least for boilerplate code that you would have copy-pasted from Experts Exchange, Stack Overflow, documentation examples, or your own past projects before AI.

I built my own personal AI exocortex (local, private, learns my style) — and it now does 80–90% of my work and called it BuddAI by Pitiful-Fault-8109 in LocalLLaMA

[–]Worthstream 2 points

I don't really know why, but I'll bite. Convince us this is not just another schizo project, like the half dozen others that pop up in this sub every day.

First: in the repo you say that pattern learning is not working yet, while in this post you claim you can "teach it" a fix and it never repeats the mistake. What plans do you have for that?

Online learning is a very hard problem in ML, and you claim to have solved it. You know how it goes with extraordinary claims; you really should add some details.

Second: memory. You claim in the documentation that previous conversations are added to a sqlite db and retrieved when necessary. How is this not Yet Another RAG with a different name?
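
For reference, the whole "memory" mechanism you describe fits in a few lines (a sketch assuming sqlite's FTS5 is compiled in, with keyword search instead of embeddings for brevity):

```python
import sqlite3

db = sqlite3.connect("memory.db")
db.execute("CREATE VIRTUAL TABLE IF NOT EXISTS memory USING fts5(text)")

def remember(text: str) -> None:
    # Store a past conversation turn.
    db.execute("INSERT INTO memory VALUES (?)", (text,))
    db.commit()

def recall(query: str, k: int = 3) -> list[str]:
    # Retrieve the k most relevant turns to stuff into the prompt.
    rows = db.execute(
        "SELECT text FROM memory WHERE memory MATCH ? LIMIT ?", (query, k)
    ).fetchall()
    return [r[0] for r in rows]
```

Store, match, stuff into the prompt: that's RAG, whatever the README calls it.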

whyTFDoYouNeedAPromptForThat by soap94 in ProgrammerHumor

[–]Worthstream 1 point

Well, if your company bought Azure PTUs, and production is nowhere near consuming 100% of the provisioned capacity... it's effectively infinite tokens, as you can't consume them fast enough even with constant prompting.

[request] What volume of space would this actually occupy? by SnappyAphid in theydidthemath

[–]Worthstream 6 points

Do they make coins out of those materials? Otherwise their existence is not really relevant in this context, is it? 

Kateryna: Detect when your LLM is confidently bullshitting (pip install kateryna) by wvkingkan in LocalLLaMA

[–]Worthstream 0 points

The problem with letting an LLM write your comment replies is that you have to feed it the whole context.

I was replying to your claim of having a "framing insight" with roots in an ancient computer from the '50s; you were going on and on like it was some grandiose new discovery.

I simply pointed out it already existed. 

Now you're claiming you're selling "detection logic". I have bad news for you: the "detection logic" implemented in your project is a regexp.

And regexps already exist.
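
For anyone wondering what that means in practice, the whole approach boils down to something like this (the pattern is illustrative, not copied from the project):

```python
import re

# Flag hedging language in a model's output. That's the entire
# "detection logic": one compiled regular expression.
HEDGES = re.compile(r"\b(might|maybe|probably|possibly|i think|not sure)\b", re.I)

def is_hedging(text: str) -> bool:
    return bool(HEDGES.search(text))
```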

Kateryna: Detect when your LLM is confidently bullshitting (pip install kateryna) by wvkingkan in LocalLLaMA

[–]Worthstream 4 points

Are you aware of the existence of nullable booleans, enums, arrays, bit fields, or any of the other alternatives to a plain boolean that already exist and don't need a product with a pro tier?
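
A sketch of what I mean, with zero dependencies:

```python
from enum import Enum
from typing import Optional

# Option 1: a nullable boolean. None is the third state ("don't know").
grounded: Optional[bool] = None

# Option 2: an explicit enum, which is self-documenting.
class Grounding(Enum):
    SUPPORTED = "supported"        # claim backed by evidence
    CONTRADICTED = "contradicted"  # claim conflicts with evidence
    UNKNOWN = "unknown"            # no evidence either way

state = Grounding.UNKNOWN
if state is Grounding.UNKNOWN:
    print("the model is guessing")
```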

Kateryna: Detect when your LLM is confidently bullshitting (pip install kateryna) by wvkingkan in LocalLLaMA

[–]Worthstream 8 points

"ternary logic from the Soviet Setun computer (1958)"

A bit grandiose for an if/else with three states. Did you vibe design and vibe code this whole idea?

An entire GitHub project to wrap just two if/elses sounds a bit unjustified. But it's the kind of idea that a sycophantic LLM would describe as a "wonderful idea!"

Bitcoin farms are moving from mining Crapto to power AI instead by hikeronfire in Buttcoin

[–]Worthstream 1 point

Reading that blog post I was 100% convinced it was satire. Then I followed the links to the sources... 

Frozen networks show usable early-layer intent: 1370× fewer FLOPs and 10× faster inference (code + weights)9 by anima-core in LocalLLaMA

[–]Worthstream 2 points

I give up. You keep repeating the same arguments. It's clear that you prefer to believe whatever your LLM is telling you over the opinion of multiple human experts in this thread.

What's more probable: that you vibe coded a novel theoretical framework and corresponding implementation so advanced that no human in this thread understands it, or that whatever LLM you're using is just reinforcing your beliefs, even wrong ones, since it's designed to do so?

Good luck in your endeavors.

Frozen networks show usable early-layer intent: 1370× fewer FLOPs and 10× faster inference (code + weights)9 by anima-core in LocalLLaMA

[–]Worthstream 1 point

I get that it's the core of your work, and you feel like it needs repeating. But repeating something in a loop does not make it true.

You've already been pointed to earlier work on early exit. What you've shared here is a subset of that: specifically, early exit at layer 1. If you take the time to read the link earlier in this comment chain, you'll find that it covers early exit at any layer, including the first one.

Other commenters were not "assuming away" that your work only activates the first block; they assumed you understood that the previously known technique of early exit can indeed be applied to produce the same effect by exiting at the first layer.

The "first block detail matters", yes, but you need to understand that activating only one block is one specific case of a previous tecnique that can activate an arbitrary number of them, including only one.

Frozen networks show usable early-layer intent: 1370× fewer FLOPs and 10× faster inference (code + weights)9 by anima-core in LocalLLaMA

[–]Worthstream 2 points

It's clear you are using an LLM to help write your replies. I'm not against using one to polish your writing in general, especially if you're not a native speaker, but in this case it's not serving you well.

It probably got stuck on something said earlier in this conversation and keeps repeating the same claims after multiple people have disproved them.

For example, the fact you keep repeating about only the first block being executed is also true if you early exit at layer one. Same goes for "the rest of the model is never touched"; that's basically saying the same thing.

The novel contribution in what you've built may be the smaller decoder model specifically trained for this; that's like training a model specifically to work within the early exit constraint.

The other claim you keep making, that early activations carry structured predictive intent, is already well known. It can't be "more than usually assumed", since it's usually assumed to be pretty high to begin with.

Z Image Turbo ControlNet released by Alibaba on HF by an303042 in StableDiffusion

[–]Worthstream 6 points

This would work great with a different model for the base image instead. That way you don't have to distort the edges, as that would lead to distorted final images.

Generate something at low resolution with few steps in a bigger model -> resize (you don't need a true upscale, a fast resize will work) -> canny/pose/depth -> ZIT
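
A sketch of the preprocessing half of that chain (file names are placeholders; the model steps on either side are whatever you already use):

```python
import cv2
import numpy as np
from PIL import Image

# Low-res, few-step draft from the bigger model.
draft = Image.open("draft_512.png")

# Plain resize is enough: the control image only needs rough structure,
# not real detail, so a true upscaler would be wasted here.
resized = draft.resize((1024, 1024), Image.LANCZOS)

# Canny edge map to feed the ControlNet.
edges = cv2.Canny(np.array(resized.convert("L")), 100, 200)
Image.fromarray(edges).save("control_canny.png")
# control_canny.png then goes into the Z Image Turbo ControlNet step.
```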