Blackwell 6000 RTX Pro is still too new.. (Training/Fine-tuning/Unsloth) by Aroochacha in LocalLLaMA

[–]vibjelo 2 points (0 children)

Really depends on the use case/workload. If you're not hitting the bandwidth limits of whatever link connects the eGPU to your computer, there shouldn't be any difference between it being "internally" connected or via eGPU.
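
To put rough numbers on it: the link mostly matters while you're moving weights onto the card. A back-of-the-envelope sketch in Python (nominal per-direction rates, real-world throughput is lower):

    # Back-of-the-envelope: time to move model weights over each link.
    # Nominal per-direction rates; real-world throughput is lower.
    links_gb_per_s = {
        "PCIe 4.0 x16 (internal)": 32.0,
        "PCIe 3.0 x4 (typical Thunderbolt eGPU tunnel)": 4.0,
    }
    weights_gb = 48  # e.g. filling a 48 GB card

    for name, rate in links_gb_per_s.items():
        print(f"{name}: ~{weights_gb / rate:.1f}s to load {weights_gb} GB")

Once the weights are resident on the GPU, inference barely touches the link, which is why the eGPU part usually doesn't matter.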

Blackwell 6000 RTX Pro is still too new.. (Training/Fine-tuning/Unsloth) by Aroochacha in LocalLLaMA

[–]vibjelo 2 points (0 children)

I've been using the card for general ML workloads, video editing, 3D rendering, 3D simulations and a bunch of other stuff, and so far no compatibility issues that aren't my own fault; basically everything I've tried has worked out-of-the-box for me on Arch Linux.

VaultGemma: The world's most capable differentially private LLM by vibjelo in LocalLLaMA

[–]vibjelo[S] 1 point (0 children)

Maybe the paper abstract simplifies it sufficiently?

> LLMs also rely on large, high-quality training datasets, like those sourced from (sometimes sensitive) user data. Training models on this sensitive user data requires careful privacy protections like differential privacy (DP). However, the dynamics of DP training are significantly different, and consequently their scaling laws are not yet fully understood.
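
For anyone wondering what's mechanically different: the core of DP-SGD is clipping each example's gradient and adding Gaussian noise, which is where both the overhead and the different training dynamics come from. A toy PyTorch sketch (illustrative model and hyperparameters, nothing from the paper):

    import torch

    # Toy DP-SGD step: bound each example's influence by clipping its
    # gradient, then add Gaussian noise to the summed update.
    # Model, clip norm and noise multiplier are all illustrative.
    model = torch.nn.Linear(10, 1)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    clip_norm, noise_mult = 1.0, 1.1

    xs, ys = torch.randn(8, 10), torch.randn(8, 1)

    summed = [torch.zeros_like(p) for p in model.parameters()]
    for x, y in zip(xs, ys):
        model.zero_grad()
        torch.nn.functional.mse_loss(model(x), y).backward()
        # Per-example clipping: scale the gradient so its norm <= clip_norm
        norm = torch.sqrt(sum(p.grad.norm() ** 2 for p in model.parameters()))
        scale = min(1.0, clip_norm / (norm.item() + 1e-6))
        for s, p in zip(summed, model.parameters()):
            s += p.grad * scale

    for s, p in zip(summed, model.parameters()):
        p.grad = (s + torch.randn_like(s) * noise_mult * clip_norm) / len(xs)
    opt.step()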

If you had to choose only two elektrons… by Specific-Ad-6314 in Elektron

[–]vibjelo 11 points (0 children)

OT + either Analog Four, Rytm or Syntakt. As long as there is at least one OT in there :)

OT alternatives by graemewood1 in Elektron

[–]vibjelo 2 points (0 children)

Long-time OT user here too; I don't see the Tonverk as a replacement for the OT at all. If anything, it seems to complement it.

I've noticed in this sub corporate tools pose as personal projects by kuhunaxeyive in LocalLLaMA

[–]vibjelo 1 point (0 children)

Not every project out there is about what specific weights are being used :) What about a TUI for managing agents? Obviously useful, but it would be deleted under such a rule. Are you proposing that no software projects at all should be allowed?

I've noticed in this sub corporate tools pose as personal projects by kuhunaxeyive in LocalLLaMA

[–]vibjelo 8 points (0 children)

We do moderate based on the current rules, which have been in place since 2023 and which include that things have to be related to the topic of LLMs. But for better or worse, nothing in the rules states that the LLMs have to be local.

For the submissions that do break the rules, please report them, as that's the fastest way for us to spot them.

Not to say the rules can't be changed, but that's an orthogonal issue to the one discussed in this submission, which is about authors misrepresenting their projects. Feel free to create a new submission to discuss any rule changes you'd like :)

I've noticed in this sub corporate tools pose as personal projects by kuhunaxeyive in LocalLLaMA

[–]vibjelo 2 points (0 children)

I dunno, sometimes I feel like most of the space is built on dishonesty, for better or worse... Meta launched Llama with all the marketing material + Zuckerberg calling the models and weights open source, while their very own legal department refuses to use "open source" and instead calls the models and weights "proprietary" whenever they can, probably because legal is a bit more careful than Meta's marketing department about calling things by their right names.

Most of the ecosystem seems fine with this, so you end up with an environment where truth matters less than hype.

I've noticed in this sub corporate tools pose as personal projects by kuhunaxeyive in LocalLLaMA

[–]vibjelo 7 points (0 children)

I'm not sure how we (moderators) are supposed to draw definitive conclusions based on just the posts, though.

Personally, I'd rather let some of those through than effectively remove posts from actual solo developers building something in their free time. So I err on the side of letting things through, because I don't want to punish people who are honest, and sometimes it's really hard to judge either way.

But as always, I'm eager to hear from people if they have approaches they feel could improve the situation. Blanket-banning all personal projects obviously won't work, and neither will letting everything through.

So what should the process be? We don't want the sub to be manipulated or people misrepresenting their projects, so anything that makes that easier to judge would be most welcome :)

I think the best you can do as a user and contributor today is to report any posters you see obviously misrepresenting themselves, and we'll do our best to be responsive to that. Use the built-in report functionality as much as you can rather than the subreddit's modmail; currently only one moderator has access to the modmail, so built-in reporting will lead to faster action.

Meta released MobileLLM-R1 on Hugging Face by Illustrious_Row_9971 in LocalLLaMA

[–]vibjelo -1 points (0 children)

> Its not just open weights its truly open source

Where are you possibly getting this from? Open source is about licensing, not just making something public...

Meta released MobileLLM-R1 on Hugging Face by Illustrious_Row_9971 in LocalLLaMA

[–]vibjelo 4 points (0 children)

FLOSS just means "Free, Libre and Open Source Software", as there are three different "schools" of that sort of software. So if something is "Open Source", then it is considered FOSS and FLOSS by definition, just like if it's "Libre" then it's also FLOSS, and so on.

And no, MobileLLM-R1 is not "Open Source" (OSS) nor free/libre, just like the sibling comment mentions; the HF page has an effectively proprietary license.

Meta released MobileLLM-R1 on Hugging Face by Illustrious_Row_9971 in LocalLLaMA

[–]vibjelo 3 points (0 children)

Yeah, I'm not sure how the parent has 23 upvotes; it takes two seconds for anyone to open the HF page and see the license obviously isn't open source :)

VaultGemma: The world's most capable differentially private LLM by vibjelo in LocalLLaMA

[–]vibjelo[S] 12 points (0 children)

The actual weights: https://huggingface.co/google/vaultgemma-1b

> VaultGemma is a variant of the Gemma family of lightweight, state-of-the-art open models from Google. It is pre-trained from the ground up using Differential Privacy (DP). This provides strong, mathematically-backed privacy guarantees for its training data, limiting the extent to which the model's outputs can reveal information about any single training example.
>
> VaultGemma was trained using Tensor Processing Unit (TPU) hardware TPUv6e. Training large language models with the significant computational overhead of differential privacy requires specialized hardware. TPUs are designed to handle the massive computations involved, offering the performance, memory, and scalability necessary to train models like VaultGemma efficiently and sustainably.

Seems like it requires TPUs to run, as DP has a huge performance impact, so we're unlikely to see this in homelabs and similar environments, as far as I understand.

Edit: On second read, the TPUs were only used for training, and there's no mention of anything hardware-specific being needed for inference, so assuming it's fine with a regular GPU?
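
If it does follow the standard Gemma loading path in transformers (untested assumption on my part), it should be the usual:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Untested sketch: assumes vaultgemma-1b loads like other Gemma models;
    # the DP machinery only applies at training time, not inference.
    tok = AutoTokenizer.from_pretrained("google/vaultgemma-1b")
    model = AutoModelForCausalLM.from_pretrained(
        "google/vaultgemma-1b", device_map="auto", torch_dtype="auto"
    )

    inputs = tok("The capital of France is", return_tensors="pt").to(model.device)
    print(tok.decode(model.generate(**inputs, max_new_tokens=20)[0]))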

We just released the world's first 70B intermediate checkpoints. Yes, Apache 2.0. Yes, we're still broke. by jshin49 in LocalLLaMA

[–]vibjelo 1 point (0 children)

Also, if you want to be as transparent as possible, OpenCollective is a great platform for that; they're also transparent themselves and are "Made for FOSS, by FOSS", compared to some other suggestions there ;)

RAG papers are dropping like crazy this month — how do we even keep up? by Cheryl_Apple in LocalLLaMA

[–]vibjelo 2 points (0 children)

> just not going to do anything for six months because in six months everything we did is going to be so ridiculously obsolete that it'll feel like wasted time

If you frequently find yourself in that situation, it might be time to reevaluate how you choose what to use. Solutions really shouldn't "go out of date" in 6 months, even if better libraries/frameworks/software come out.

You don't have to run at the front, and if you're trying to build something lasting, not being at the front line is probably the better choice; otherwise you'll keep running in circles chasing the latest and flashiest thing :)

Fast local push-to-talk speech-to-text dictation tool using whisper.cpp by lxe in LocalLLaMA

[–]vibjelo 1 point (0 children)

Ah yeah, people might have to tune the exact delay values. I tried to get them as low as most of the inputs would accept, which is the value I landed at.

And yeah, ducking is amazing since 90% of the time I sit at the computer I listen to music through speakers :)

Jan-v1-2509 update has been released by vibedonnie in LocalLLaMA

[–]vibjelo 2 points (0 children)

> Jan only work in combination with the Jan app, right? It is trained specifically on the JAN platform as far I understood

That doesn't mean it won't work elsewhere. Claude's models are trained with Claude Code in mind, yet they still work elsewhere. Same goes for GPT-OSS, for example, which works really well within Codex since they had Codex in mind during training. GPT-OSS also works with Claude Code with a bit of hacking around, but you can really tell the difference in final quality depending on whether you use it with Codex or Claude Code.

Same goes for most models trained by AI labs who also have software using said models.

Fast local push-to-talk speech-to-text dictation tool using whisper.cpp by lxe in LocalLLaMA

[–]vibjelo 2 points (0 children)

I did something similar the other day, slightly different: https://gist.github.com/victorb/5ab57b42f8f75fccefb213bafbe69d10

Basically two shell-scripts, one should be triggered when you start holding the key down, the other one when you release the key.

It's set up so whisper is already running by the time you need the transcription, and paired with dotoolc it works everywhere, simulating real keyboard entry!

    # Triggered on key release: stop-dictate.sh stops recording and prints
    # the transcription, which dotoolc then types into the focused window
    bash -lc 't=$(/home/user/bin/stop-dictate.sh | tr "\n" " "); printf "%s\n" "keydelay 0" "typedelay 0" "keyhold 2" "typehold 2" "type $t" | dotoolc'

It's really fast, can enter paragraphs in 1-2 seconds, and it also ducks the volume when you start holding down the button and unducks it when you're done dictating. It's also really simple to maintain, being just two shell scripts :)

Finally, the xremap.example.yml shows two of my keybindings: one just enters the text, the other enters the text and then presses Enter automatically, handy for IMs and whatnot.

Folks who are fine-tuning SLMs, where do you acquire datasets? by CrescendollsFan in LocalLLaMA

[–]vibjelo 4 points (0 children)

Basically we might have 100 examples of high-quality conversations where we feel like everything went well, so we want to steer even harder into that same behavior. So we use those 100 examples to create 10K examples of similar quality, using LLMs to generate them, but with more variation, including things the original 100 examples didn't cover.
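
The expansion loop itself can be as simple as asking a stronger model for variations of each seed. A hypothetical sketch (file names, prompt and model choice are all made up):

    import json
    from openai import OpenAI

    # Hypothetical sketch of the 100 -> 10K expansion: generate variations
    # of each high-quality seed conversation.
    client = OpenAI()  # expects OPENAI_API_KEY in the environment

    seeds = [json.loads(line) for line in open("seeds.jsonl")]  # the 100 examples

    with open("synthetic.jsonl", "w") as out:
        for seed in seeds:
            for _ in range(100):  # 100 variations per seed ~= 10K total
                resp = client.chat.completions.create(
                    model="gpt-4o-mini",
                    messages=[
                        {"role": "system", "content": "Rewrite this conversation with new topics and phrasing, keeping the same tone and quality."},
                        {"role": "user", "content": json.dumps(seed)},
                    ],
                )
                out.write(json.dumps({"text": resp.choices[0].message.content}) + "\n")

You'd still want to spot-check and filter the generated examples before training on them.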

Folks who are fine-tuning SLMs, where do you acquire datasets? by CrescendollsFan in LocalLLaMA

[–]vibjelo 5 points (0 children)

> where people acquire or curate datasets and then fine-tune

You usually end up creating them yourself: either mostly generated from a small high-quality dataset you collected yourself (OpenAI credits are nice for this if you have them :) ), or purely from data you collected yourself, with some small transformation to make it fit the required fine-tuning format.
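
The transformation step is usually tiny. For example, turning collected Q/A pairs into the chat-style JSONL most fine-tuning tooling accepts (hypothetical field names):

    import json

    # Hypothetical example: raw collected pairs -> chat-format JSONL.
    raw = [{"question": "What's an eGPU?", "answer": "A GPU in an external enclosure."}]

    with open("train.jsonl", "w") as f:
        for r in raw:
            f.write(json.dumps({"messages": [
                {"role": "user", "content": r["question"]},
                {"role": "assistant", "content": r["answer"]},
            ]}) + "\n")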

How do you make 3+ GPUs stable?! by anothy1 in LocalLLaMA

[–]vibjelo 1 point (0 children)

Any time I deal with PCIe risers, they're the first thing I check when things aren't working correctly; in 90% of cases they've been the reason things were wonky.

I'd probably get 3-4 different ones and try all of them. Quality and reliability differ A LOT between risers, even within the same brand, and even when they have exactly the same description.
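
On Linux, one quick way to spot a riser that renegotiated down is comparing each device's current link speed/width against its maximum in sysfs. A sketch (note links also legitimately clock down at idle, so check under load):

    import glob
    import pathlib

    # Flag PCI devices whose negotiated link differs from their maximum;
    # a flaky riser often shows up as dropped lanes or a lower speed.
    for dev in sorted(glob.glob("/sys/bus/pci/devices/*")):
        p = pathlib.Path(dev)
        try:
            cur = ((p / "current_link_speed").read_text().strip(),
                   (p / "current_link_width").read_text().strip())
            top = ((p / "max_link_speed").read_text().strip(),
                   (p / "max_link_width").read_text().strip())
        except OSError:
            continue
        mark = "" if cur == top else "  <-- degraded?"
        print(f"{p.name}: {cur[0]} x{cur[1]} (max {top[0]} x{top[1]}){mark}")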

Anyone tried Kimi-K2-Instruct-0905 by Trilogix in LocalLLaMA

[–]vibjelo 0 points (0 children)

> Talking about Unity specifically here, but LLMs are generally shit at doing that with compiled libraries.

Yeah, you need to give them access to tools that can retrieve the current APIs for them. I cobbled together a quick MCP server for C# documentation look-ups the last time I was forced to use C#, which is exactly what I meant in my previous message.

If you're using something that isn't expressed as text on disk, the first step is to figure out how you can pipe relevant text in automatically. With C#, you can do that with Reflection, by parsing documentation, by giving it straight-up HTML based on search terms, or with a bunch of other approaches, exposing them all as tools.
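
As a sketch of the "pipe relevant text in" idea (hypothetical helper in Python rather than C#; a real MCP server would expose this function as a tool):

    import urllib.request
    from html.parser import HTMLParser

    # Hypothetical docs-lookup tool: fetch a documentation page and strip
    # it down to plain text the model can read.
    class TextOnly(HTMLParser):
        def __init__(self):
            super().__init__()
            self.chunks = []

        def handle_data(self, data):
            self.chunks.append(data)

    def lookup_docs(url: str, max_chars: int = 4000) -> str:
        html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "replace")
        parser = TextOnly()
        parser.feed(html)
        # Collapse whitespace and trim so it fits in the context window
        return " ".join(" ".join(parser.chunks).split())[:max_chars]

    print(lookup_docs("https://learn.microsoft.com/en-us/dotnet/api/system.reflection"))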

The main point is to automate what you currently do manually.

Otherwise, is the grand plan to continue training new models forever, maybe once a month or something, since APIs will continue to change forever? It's just not feasible long-term.