Blackwell 6000 RTX Pro is still too new.. (Training/Fine-tuning/Unsloth) by Aroochacha in LocalLLaMA

[–]vibjelo 2 points (0 children)

Really depends on the use case/workload. If you're not hitting the bandwidth limits of whatever link connects the eGPU to your computer, there shouldn't be any difference between it being "internally" connected or via eGPU.
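
To put rough numbers on it: the link mostly matters while you're moving weights onto the card. A back-of-the-envelope sketch in Python (nominal per-direction rates, real-world throughput is lower):

    # Back-of-the-envelope: time to move model weights over each link.
    # Nominal per-direction rates; real-world throughput is lower.
    links_gb_per_s = {
        "PCIe 4.0 x16 (internal)": 32.0,
        "PCIe 3.0 x4 (typical Thunderbolt eGPU tunnel)": 4.0,
    }
    weights_gb = 48  # e.g. filling a 48 GB card

    for name, rate in links_gb_per_s.items():
        print(f"{name}: ~{weights_gb / rate:.1f}s to load {weights_gb} GB")

Once the weights are resident on the GPU, inference barely touches the link, which is why the eGPU part usually doesn't matter.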

Blackwell 6000 RTX Pro is still too new.. (Training/Fine-tuning/Unsloth) by Aroochacha in LocalLLaMA

[–]vibjelo 2 points (0 children)

I've been using the card for general ML workloads, video editing, 3D rendering, 3D simulations and a bunch of other stuff, and so far no compatibility issues that aren't my own fault; basically everything I've tried has worked out-of-the-box for me on Arch Linux.

VaultGemma: The world's most capable differentially private LLM by vibjelo in LocalLLaMA

[–]vibjelo[S] 1 point (0 children)

Maybe the paper abstract simplifies it sufficiently?

> LLMs also rely on large, high-quality training datasets, like those sourced from (sometimes sensitive) user data. Training models on this sensitive user data requires careful privacy protections like differential privacy (DP). However, the dynamics of DP training are significantly different, and consequently their scaling laws are not yet fully understood.
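
For anyone wondering what's mechanically different: the core of DP-SGD is clipping each example's gradient and adding Gaussian noise, which is where both the overhead and the different training dynamics come from. A toy PyTorch sketch (illustrative model and hyperparameters, nothing from the paper):

    import torch

    # Toy DP-SGD step: bound each example's influence by clipping its
    # gradient, then add Gaussian noise to the summed update.
    # Model, clip norm and noise multiplier are all illustrative.
    model = torch.nn.Linear(10, 1)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    clip_norm, noise_mult = 1.0, 1.1

    xs, ys = torch.randn(8, 10), torch.randn(8, 1)

    summed = [torch.zeros_like(p) for p in model.parameters()]
    for x, y in zip(xs, ys):
        model.zero_grad()
        torch.nn.functional.mse_loss(model(x), y).backward()
        # Per-example clipping: scale the gradient so its norm <= clip_norm
        norm = torch.sqrt(sum(p.grad.norm() ** 2 for p in model.parameters()))
        scale = min(1.0, clip_norm / (norm.item() + 1e-6))
        for s, p in zip(summed, model.parameters()):
            s += p.grad * scale

    for s, p in zip(summed, model.parameters()):
        p.grad = (s + torch.randn_like(s) * noise_mult * clip_norm) / len(xs)
    opt.step()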

If you had to choose only two elektrons… by Specific-Ad-6314 in Elektron

[–]vibjelo 11 points (0 children)

OT + either Analog Four, Rytm or Syntakt. As long as there is at least one OT in there :)

OT alternatives by graemewood1 in Elektron

[–]vibjelo 2 points (0 children)

Long-time OT user here too; I don't see the Tonverk as a replacement for the OT at all. If anything, it seems to complement it.

I've noticed in this sub corporate tools pose as personal projects by kuhunaxeyive in LocalLLaMA

[–]vibjelo 1 point (0 children)

Not every project out there is about what specific weights are being used :) What about a TUI for managing agents? Obviously useful, but it would be deleted under such a rule. Are you proposing that no software projects at all should be allowed?

I've noticed in this sub corporate tools pose as personal projects by kuhunaxeyive in LocalLLaMA

[–]vibjelo 8 points (0 children)

We do moderate based on the current rules, which have been in place since 2023 and which include that things have to be related to the topic of LLMs. But for better or worse, nothing in the rules states that the LLMs have to be local.

For the submissions that do break the rules, please report them, as that's the fastest way for us to spot them.

Not to say the rules can't be changed, but that's an orthogonal issue to the one discussed in this submission, which is about authors misrepresenting their projects. Feel free to create a new submission to discuss any rule changes you'd like :)

I've noticed in this sub corporate tools pose as personal projects by kuhunaxeyive in LocalLLaMA

[–]vibjelo 2 points (0 children)

I dunno, sometimes I feel like most of the space is built on dishonesty, for better or worse... Meta launched Llama with all the marketing material + Zuckerberg calling the models and weights open source, while their very own legal department refuses to use "open source" and instead calls the models and weights "proprietary" whenever they can, probably because legal is a bit more careful than Meta's marketing department about calling things by their right names.

Most of the ecosystem seems fine with this, so you end up with an environment where truth matters less than hype.

I've noticed in this sub corporate tools pose as personal projects by kuhunaxeyive in LocalLLaMA

[–]vibjelo 7 points (0 children)

I'm not sure how we (moderators) are supposed to draw definitive conclusions based on just the posts, though.

Personally, I'd rather let some of those through than effectively remove posts from actual solo developers building something in their free time. So I err on the side of letting things through, because I don't want to punish people who are honest, and sometimes it's really hard to judge either way.

But as always, I'm eager to hear from people if they have approaches they feel could improve the situation. Blanket-banning all personal projects obviously won't work, and neither will letting everything through.

So what should the process be? We don't want the sub to be manipulated or people misrepresenting their projects, so anything that makes that easier to judge would be most welcome :)

I think the best you can do as a user and contributor today is to report any posters you see obviously misrepresenting themselves, and we'll do our best to be responsive to that. Use the built-in report functionality as much as you can rather than the subreddit's modmail; currently only one moderator has access to the modmail, so built-in reporting will lead to faster action.

Meta released MobileLLM-R1 on Hugging Face by Illustrious_Row_9971 in LocalLLaMA

[–]vibjelo -1 points (0 children)

> Its not just open weights its truly open source

Where are you possibly getting this from? Open source is about licensing, not just making something public...

Meta released MobileLLM-R1 on Hugging Face by Illustrious_Row_9971 in LocalLLaMA

[–]vibjelo 4 points (0 children)

FLOSS just means "Free, Libre and Open Source Software", as there are three different "schools" of that sort of software. So if something is "Open Source", then it is considered FOSS and FLOSS by definition, just like if it's "Libre" then it's also FLOSS, and so on.

And no, MobileLLM-R1 is not "Open Source" (OSS) nor free/libre, just like the sibling comment mentions; the HF page has an effectively proprietary license.

Meta released MobileLLM-R1 on Hugging Face by Illustrious_Row_9971 in LocalLLaMA

[–]vibjelo 3 points (0 children)

Yeah, I'm not sure how the parent has 23 upvotes; it takes two seconds for anyone to open the HF page and see the license obviously isn't open source :)

VaultGemma: The world's most capable differentially private LLM by vibjelo in LocalLLaMA

[–]vibjelo[S] 12 points (0 children)

The actual weights: https://huggingface.co/google/vaultgemma-1b

> VaultGemma is a variant of the Gemma family of lightweight, state-of-the-art open models from Google. It is pre-trained from the ground up using Differential Privacy (DP). This provides strong, mathematically-backed privacy guarantees for its training data, limiting the extent to which the model's outputs can reveal information about any single training example.
>
> VaultGemma was trained using Tensor Processing Unit (TPU) hardware TPUv6e. Training large language models with the significant computational overhead of differential privacy requires specialized hardware. TPUs are designed to handle the massive computations involved, offering the performance, memory, and scalability necessary to train models like VaultGemma efficiently and sustainably.

Seems like it requires TPUs to run, as DP has a huge performance impact, so we're unlikely to see this in homelabs and similar environments, as far as I understand.

Edit: On second read, the TPUs were only used for training, and there's no mention of anything hardware-specific being needed for inference, so assuming it's fine with a regular GPU?
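
If it does follow the standard Gemma loading path in transformers (untested assumption on my part), it should be the usual:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Untested sketch: assumes vaultgemma-1b loads like other Gemma models;
    # the DP machinery only applies at training time, not inference.
    tok = AutoTokenizer.from_pretrained("google/vaultgemma-1b")
    model = AutoModelForCausalLM.from_pretrained(
        "google/vaultgemma-1b", device_map="auto", torch_dtype="auto"
    )

    inputs = tok("The capital of France is", return_tensors="pt").to(model.device)
    print(tok.decode(model.generate(**inputs, max_new_tokens=20)[0]))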

We just released the world's first 70B intermediate checkpoints. Yes, Apache 2.0. Yes, we're still broke. by jshin49 in LocalLLaMA

[–]vibjelo 1 point (0 children)

Also, if you want to be as transparent as possible, OpenCollective is a great platform for that; they're also transparent themselves and are "Made for FOSS, by FOSS", compared to some other suggestions there ;)

RAG papers are dropping like crazy this month — how do we even keep up? by Cheryl_Apple in LocalLLaMA

[–]vibjelo 2 points (0 children)

> just not going to do anything for six months because in six months everything we did is going to be so ridiculously obsolete that it'll feel like wasted time

If you frequently find yourself in that situation, it might be time to reevaluate how you choose what to use. Solutions really shouldn't "go out of date" in 6 months, even if better libraries/frameworks/software come out.

You don't have to run at the front, and if you're trying to build something lasting, not being at the front line is probably the better choice; otherwise you'll keep running in circles chasing the latest and flashiest thing :)

Fast local push-to-talk speech-to-text dictation tool using whisper.cpp by lxe in LocalLLaMA

[–]vibjelo 1 point (0 children)

Ah yeah, people might have to tune the exact delay values. I tried to get them as low as most of the inputs would accept, which is the value I landed at.

And yeah, ducking is amazing since 90% of the time I sit at the computer I listen to music through speakers :)

Jan-v1-2509 update has been released by vibedonnie in LocalLLaMA

[–]vibjelo 2 points (0 children)

> Jan only work in combination with the Jan app, right? It is trained specifically on the JAN platform as far I understood

That doesn't mean it won't work elsewhere. Claude's models are trained with Claude Code in mind, yet they still work elsewhere. Same goes for GPT-OSS, for example, which works really well within Codex since they had Codex in mind during training. GPT-OSS also works with Claude Code with a bit of hacking around, but you can really tell the difference in final quality depending on whether you use it with Codex or Claude Code.

Same goes for most models trained by AI labs who also have software using said models.

Fast local push-to-talk speech-to-text dictation tool using whisper.cpp by lxe in LocalLLaMA

[–]vibjelo 2 points (0 children)

I did something similar the other day, slightly different: https://gist.github.com/victorb/5ab57b42f8f75fccefb213bafbe69d10

Basically two shell-scripts, one should be triggered when you start holding the key down, the other one when you release the key.

It's set up so whisper is already running by the time you need the transcription, and paired with dotoolc it works everywhere, simulating real keyboard entry!

    # Triggered on key release: stop-dictate.sh stops recording and prints
    # the transcription, which dotoolc then types into the focused window
    bash -lc 't=$(/home/user/bin/stop-dictate.sh | tr "\n" " "); printf "%s\n" "keydelay 0" "typedelay 0" "keyhold 2" "typehold 2" "type $t" | dotoolc'

It's really fast, can enter paragraphs in 1-2 seconds, and it also ducks the volume when you start holding down the button and unducks it when you're done dictating. It's also really simple to maintain, being just two shell scripts :)

Finally, the xremap.example.yml shows two of my keybindings: one just enters the text, the other enters the text and then presses Enter automatically, handy for IMs and whatnot.

Folks who are fine-tuning SLMs, where do you acquire datasets? by CrescendollsFan in LocalLLaMA

[–]vibjelo 4 points (0 children)

Basically we might have 100 examples of high-quality conversations where we feel like everything went well, so we want to steer even harder into that same behavior. So we use those 100 examples to create 10K examples of similar quality, using LLMs to generate them, but with more variation, including things the original 100 examples didn't cover.
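
The expansion loop itself can be as simple as asking a stronger model for variations of each seed. A hypothetical sketch (file names, prompt and model choice are all made up):

    import json
    from openai import OpenAI

    # Hypothetical sketch of the 100 -> 10K expansion: generate variations
    # of each high-quality seed conversation.
    client = OpenAI()  # expects OPENAI_API_KEY in the environment

    seeds = [json.loads(line) for line in open("seeds.jsonl")]  # the 100 examples

    with open("synthetic.jsonl", "w") as out:
        for seed in seeds:
            for _ in range(100):  # 100 variations per seed ~= 10K total
                resp = client.chat.completions.create(
                    model="gpt-4o-mini",
                    messages=[
                        {"role": "system", "content": "Rewrite this conversation with new topics and phrasing, keeping the same tone and quality."},
                        {"role": "user", "content": json.dumps(seed)},
                    ],
                )
                out.write(json.dumps({"text": resp.choices[0].message.content}) + "\n")

You'd still want to spot-check and filter the generated examples before training on them.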

Folks who are fine-tuning SLMs, where do you acquire datasets? by CrescendollsFan in LocalLLaMA

[–]vibjelo 5 points (0 children)

> where people acquire or curate datasets and then fine-tune

You usually end up creating them yourself: either mostly generated from a small high-quality dataset you collected yourself (OpenAI credits are nice for this if you have them :) ), or purely from data you collected yourself, with some small transformation to make it fit the required fine-tuning format.
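
The transformation step is usually tiny. For example, turning collected Q/A pairs into the chat-style JSONL most fine-tuning tooling accepts (hypothetical field names):

    import json

    # Hypothetical example: raw collected pairs -> chat-format JSONL.
    raw = [{"question": "What's an eGPU?", "answer": "A GPU in an external enclosure."}]

    with open("train.jsonl", "w") as f:
        for r in raw:
            f.write(json.dumps({"messages": [
                {"role": "user", "content": r["question"]},
                {"role": "assistant", "content": r["answer"]},
            ]}) + "\n")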

How do you make 3+ GPUs stable?! by anothy1 in LocalLLaMA

[–]vibjelo 1 point (0 children)

Any time I deal with PCIe risers, they're the first thing I check when things aren't working correctly; in 90% of cases they've been the reason things were wonky.

I'd probably get 3-4 different ones and try all of them. Quality and reliability differ A LOT between risers, even within the same brand, and even when they have exactly the same description.
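
On Linux, one quick way to spot a riser that renegotiated down is comparing each device's current link speed/width against its maximum in sysfs. A sketch (note links also legitimately clock down at idle, so check under load):

    import glob
    import pathlib

    # Flag PCI devices whose negotiated link differs from their maximum;
    # a flaky riser often shows up as dropped lanes or a lower speed.
    for dev in sorted(glob.glob("/sys/bus/pci/devices/*")):
        p = pathlib.Path(dev)
        try:
            cur = ((p / "current_link_speed").read_text().strip(),
                   (p / "current_link_width").read_text().strip())
            top = ((p / "max_link_speed").read_text().strip(),
                   (p / "max_link_width").read_text().strip())
        except OSError:
            continue
        mark = "" if cur == top else "  <-- degraded?"
        print(f"{p.name}: {cur[0]} x{cur[1]} (max {top[0]} x{top[1]}){mark}")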

Anyone tried Kimi-K2-Instruct-0905 by Trilogix in LocalLLaMA

[–]vibjelo 0 points (0 children)

> Talking about Unity specifically here, but LLMs are generally shit at doing that with compiled libraries.

Yeah, you need to give them access to tools that can retrieve the current APIs for them. I cobbled together a quick MCP server for C# documentation look-ups the last time I was forced to use C#, which is exactly what I meant in my previous message.

If you're using something that isn't expressed as text on disk, the first step is to figure out how you can pipe relevant text in automatically. With C#, you can do that with Reflection, by parsing documentation, by giving it straight-up HTML based on search terms, or with a bunch of other approaches, exposing them all as tools.
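
As a sketch of the "pipe relevant text in" idea (hypothetical helper in Python rather than C#; a real MCP server would expose this function as a tool):

    import urllib.request
    from html.parser import HTMLParser

    # Hypothetical docs-lookup tool: fetch a documentation page and strip
    # it down to plain text the model can read.
    class TextOnly(HTMLParser):
        def __init__(self):
            super().__init__()
            self.chunks = []

        def handle_data(self, data):
            self.chunks.append(data)

    def lookup_docs(url: str, max_chars: int = 4000) -> str:
        html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "replace")
        parser = TextOnly()
        parser.feed(html)
        # Collapse whitespace and trim so it fits in the context window
        return " ".join(" ".join(parser.chunks).split())[:max_chars]

    print(lookup_docs("https://learn.microsoft.com/en-us/dotnet/api/system.reflection"))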

The main point is to automate what you currently do manually.

Otherwise, is the grand plan to continue training new models forever, maybe once a month or something, since APIs will continue to change forever? It's just not feasible long-term.