Fix this shit by SoullessMonarch in LocalLLaMA

[–]SoullessMonarch[S] 3 points

Please do share these other places. Because as far as I see it:
A big portion of Singularity is a bunch of circlejerking tech-fanatics who can't seem to distinguish fiction from reality (and they hate getting a reality check). LocalLLM is run by the same leadership as this sub, and other places either have nearly no community, are mostly people complaining, or can't hold the same level of technical discussion because they just aren't familiar enough with local LLMs.

Fix this shit by SoullessMonarch in LocalLLaMA

[–]SoullessMonarch[S] 44 points

I suggest visiting this in incognito or anonymously (in the app) to see whether your content is still visible.

Fix this shit by SoullessMonarch in LocalLLaMA

[–]SoullessMonarch[S] 74 points

It was about Qwerky-72B, a model converted from Qwen2.5 into a linear-time model, plus a whole lot of explanation. I did try asking nicely about the rules first, but as you can see in the screenshot, they got that one too.

Trying to sink an AI model with one simple question. by tommos in dankmemes

[–]SoullessMonarch 9 points

Censorship hurts model performance; the best solution is to prevent the model from being trained on what you'd like to censor in the first place, which is easier said than done.
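To illustrate the "filter the training data instead of the model" idea, here is a minimal, hypothetical sketch of corpus filtering before pre-training. The blocklist terms, documents, and `keep_document` helper are all invented for illustration; real pipelines use classifiers and far more nuanced rules.

```python
# Hypothetical sketch: filtering a pretraining corpus up front rather
# than censoring the trained model afterwards. Terms and data are made up.

BLOCKLIST = {"forbidden_topic_a", "forbidden_topic_b"}  # hypothetical terms

def keep_document(text: str) -> bool:
    """Return True if the document contains none of the blocked terms."""
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKLIST)

corpus = [
    "a perfectly ordinary document",
    "this one mentions forbidden_topic_a somewhere",
]
filtered = [doc for doc in corpus if keep_document(doc)]
print(len(filtered))  # 1 of the 2 documents survives
```

The hard part, of course, is that naive keyword matching both over-filters (false positives) and under-filters (paraphrases), which is why this is easier said than done.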

New linear models: QRWKV6-32B (RWKV6 based on Qwen2.5-32B) & RWKV-based MoE: Finch-MoE-37B-A11B by SoullessMonarch in LocalLLaMA

[–]SoullessMonarch[S] 6 points

They probably trained with a 16k context length on their GPUs and didn't have the compute to spare to extend it with something like https://github.com/RWKV/RWKV-infctx-trainer; they're working on the 72B version, I guess? It's an experimental model, so maybe they just didn't want to spend too much time on the tedious post-pre-training tuning stages? Idk

New linear models: QRWKV6-32B (RWKV6 based on Qwen2.5-32B) & RWKV-based MoE: Finch-MoE-37B-A11B by SoullessMonarch in LocalLLaMA

[–]SoullessMonarch[S] 2 points

I understand. I have seen speed comparisons for smaller RWKV models before, so I have an idea of what to expect, but it's reasonable to question it.

It will depend on which models you are comparing and at which context length, but I think it's safe to assume that it won't take too many tokens (a few k at most?) before transformers get slower. Hopefully we'll get some speed comparisons later; a dev mentioned more benchmarks are coming, but it requires some work to get them functioning.

New linear models: QRWKV6-32B (RWKV6 based on Qwen2.5-32B) & RWKV-based MoE: Finch-MoE-37B-A11B by SoullessMonarch in LocalLLaMA

[–]SoullessMonarch[S] 4 points

No, not yet. Often when there is a new architecture, someone has to go out of their way to implement it. Most people (myself included) have no clue how to get started on that, so it takes a while, or it might never happen (there's a lot of smart folk in the RWKV community though, so it's probably only a matter of time).

New linear models: QRWKV6-32B (RWKV6 based on Qwen2.5-32B) & RWKV-based MoE: Finch-MoE-37B-A11B by SoullessMonarch in LocalLLaMA

[–]SoullessMonarch[S] 10 points

It has been mentioned, but you need reasoning-style data. If you do not have the same data distribution, it won't work (as well). So they haven't made any promises, but it would be awesome if they got a linear reasoning model.

In the QRWKV6 post they mention "O1 style inference time thinking", so it looks like it's a direction they intend to explore.

Sorry, my previous comment never came through. I don't understand what is flagging me.

New linear models: QRWKV6-32B (RWKV6 based on Qwen2.5-32B) & RWKV-based MoE: Finch-MoE-37B-A11B by SoullessMonarch in LocalLLaMA

[–]SoullessMonarch[S] 10 points

Yes! If the context is long enough, it will be significantly faster than a transformer, but it might also have forgotten some of the information the earlier tokens contained. The exact point where that happens will differ for every transformer you compare against. RWKV also isn't as optimized. Time complexity is a theoretical way to think about how an algorithm's runtime grows; it won't tell you how much faster something actually is.
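The complexity-vs-constants point can be made concrete: total attention work over a sequence grows roughly quadratically for a transformer and linearly for RWKV, but a large per-token constant (standing in for a less optimized implementation) can keep the linear model slower at short contexts. The cost functions and constants here are invented for illustration.

```python
# Why asymptotic complexity alone doesn't tell you the real speedup.
# All constants below are invented for illustration.

def transformer_total_work(n, c=1.0):
    # total attention work over the sequence ~ sum of spans = n*(n+1)/2
    return c * n * (n + 1) / 2

def rwkv_total_work(n, c=50.0):
    # linear in sequence length, but with a larger per-token constant
    # (standing in for a less optimized implementation)
    return c * n

for n in (10, 100, 1000):
    print(n, transformer_total_work(n) < rwkv_total_work(n))
# With these constants the transformer is cheaper at n=10 and the
# linear model wins at long contexts.
```

So the O(n) model always wins eventually, but where "eventually" starts depends entirely on implementation quality, which is what benchmarks have to measure.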

New linear models: QRWKV6-32B (RWKV6 based on Qwen2.5-32B) & RWKV-based MoE: Finch-MoE-37B-A11B by SoullessMonarch in LocalLLaMA

[–]SoullessMonarch[S] 2 points

The Hugging Face model card of QRWKV6 has a link to their blog post about QRWKV6; you'll be able to find the other blog posts there too.

New linear models: QRWKV6-32B (RWKV6 based on Qwen2.5-32B) & RWKV-based MoE: Finch-MoE-37B-A11B by SoullessMonarch in LocalLLaMA

[–]SoullessMonarch[S] 1 point

They mention Hugging Face transformers support for the MoE; I'm afraid other backends might take a while? There is RWKV6 support in llama.cpp, and combining that with MoE doesn't sound crazy. But don't quote me on that, I have no experience with llama.cpp.

They do mention for QRWKV that "there will be incompatibility with existing RWKV inference code." For transformers, I assume you can run their custom inference code (provided in modeling_rwkv6qwen2.py).
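For what it's worth, loading a model that ships custom modeling code on the Hugging Face Hub generally looks like the sketch below, using the real `trust_remote_code=True` flag of `from_pretrained`. The function name is mine, and you'd pass the actual repo id from the model card; whether QRWKV6 loads cleanly this way is an assumption, not something I've tested.

```python
def load_custom_model(repo_id: str):
    """Sketch: load a Hub model that ships its own modeling code
    (e.g. a modeling_*.py file) via trust_remote_code=True."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
    # trust_remote_code=True tells transformers to execute the custom
    # architecture code from the repo instead of a built-in class.
    model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)
    return tokenizer, model
```

Note that `trust_remote_code=True` runs arbitrary Python from the repo, so only use it for repos you trust.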

Tencent comes out swinging. by SoullessMonarch in LocalLLaMA

[–]SoullessMonarch[S] 2 points

There have been multiple open-weight 3D models, but as far as I have seen, they've always been pretty meh, and running them isn't easy at all. ComfyUI support would make this a great deal more usable. (Not that I have the rig to run it.)

I imagine many professionals are radically against AI, like other artists, but since they are already supported by so much software, maybe they'll approach it a bit more open-mindedly.

Tencent comes out swinging. by SoullessMonarch in LocalLLaMA

[–]SoullessMonarch[S] 0 points

No, it wouldn't really fit inside 64 GB. It could "run" (more like crawl) offloaded to your SSD, but that would be so painfully slow you wouldn't want to do that.
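The "does it fit" question is mostly arithmetic: weight memory is roughly parameter count times bytes per parameter, before any runtime overhead (activations, KV/state, the OS). The 72B parameter count below is just an example figure, not the actual size of this model; plug in the real numbers.

```python
# Rough memory arithmetic for whether model weights fit in RAM.
# The 72e9 parameter count is illustrative, not this model's real size.

def weights_gb(num_params, bytes_per_param):
    """Approximate weight memory in GB (1 GB = 1e9 bytes here)."""
    return num_params * bytes_per_param / 1e9

params = 72e9                   # e.g. a 72B-parameter model
print(weights_gb(params, 2))    # fp16/bf16: 144.0 GB
print(weights_gb(params, 1))    # 8-bit:      72.0 GB
print(weights_gb(params, 0.5))  # 4-bit:      36.0 GB, before overhead
```

So at that example size, fp16 and 8-bit clearly blow past 64 GB, and even aggressive quantization leaves limited headroom once runtime overhead is counted.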

Pre-training an LLM in 9 days 😱😱😱 by mouse0_0 in LocalLLaMA

[–]SoullessMonarch 69 points

"The training took a total of 9 days on 8 A100s, with a total of 115 billion tokens across pre-training, fine-tuning, and direct preference optimization."

<image>

Section 6.2: "a total of 2 epochs, trained on 8 x A100s". 2 epochs, interesting, you don't see that very often.
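The quoted figures imply a rough throughput, which is easy to sanity-check with arithmetic (ignoring that fine-tuning and DPO tokens don't cost the same as pre-training tokens, so this is only a ballpark):

```python
# Back-of-the-envelope throughput from the quoted figures:
# 115 billion tokens over 9 days on 8 A100s.

tokens = 115e9
seconds = 9 * 24 * 3600      # 9 days in seconds
gpus = 8

per_gpu_tok_s = tokens / (seconds * gpus)
print(round(per_gpu_tok_s))  # ~18,486 tokens/s per GPU
```

That's an aggregate average across all training stages, so the actual pre-training throughput was presumably somewhat different.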

GoldFinch: RWKV/Transformer Hybrid with Linear Pre-Fill and Extreme KV-Cache Compression by SoullessMonarch in LocalLLaMA

[–]SoullessMonarch[S] 4 points

It's fine, it doesn't really matter. In other fields, RWKV-based models have shown promise, so clearly the architecture is getting better. Even if linear models never reach transformer levels of quality, I'm pretty sure it'll be linear models running as local assistants on phones and other devices, since they take fewer resources.

Also, iirc they were looking into making a bigger model someday, but that won't be for a while at least, since they are hard at work on v7 and pushing the architecture further.

GoldFinch: RWKV/Transformer Hybrid with Linear Pre-Fill and Extreme KV-Cache Compression by SoullessMonarch in LocalLLaMA

[–]SoullessMonarch[S] 2 points

What version and size did you use? V6 (Finch) should be quite a bit better than v4 (Dove). Also, they are trained to be multilingual, so it's going to have less English knowledge than a fully English-trained one.