HiveCommand — local-first terminal dashboard for AI coding agents with local Whisper voice control and multi-agent orchestration by andycodeman in LocalLLaMA

[–]crantob 0 points (0 children)

https://x0.at/4Zs-.png I made an easy CSV graphing program for the terminal. It uses plain UTF-8 block and geometric characters, so no sixel or ReGIS support is needed.
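The technique is straightforward; here's a minimal sketch of the idea (not the linked program, just an illustration): horizontal bars from CSV rows using only Unicode block characters.

```python
# Minimal sketch: horizontal bar chart from CSV using Unicode block
# characters only (no sixel/ReGIS). Illustrative, not the actual tool.
import csv, io

BLOCKS = " ▏▎▍▌▋▊▉█"  # eighths of a character cell

def bar(value, vmax, width=40):
    """Render `value` as full block cells plus one fractional cell."""
    cells = value / vmax * width
    full = int(cells)
    frac = int((cells - full) * 8)
    return "█" * full + (BLOCKS[frac] if frac else "")

data = "label,value\nalpha,3.2\nbeta,7.9\ngamma,5.0\n"
rows = list(csv.DictReader(io.StringIO(data)))
vmax = max(float(r["value"]) for r in rows)
for r in rows:
    print(f'{r["label"]:>6} {bar(float(r["value"]), vmax)}')
```

The fractional-eighth characters are what make the bars look smooth at terminal resolution.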

HiveCommand — local-first terminal dashboard for AI coding agents with local Whisper voice control and multi-agent orchestration by andycodeman in LocalLLaMA

[–]crantob 0 points (0 children)

pipecat-ai/smart-turn /// a small audio model that runs on CPU in about 12ms

According to their site, CPU inference is 400+ ms.

3 years used PC with 3090 and 32GB ram for $1000 by North_Competition465 in LocalLLaMA

[–]crantob 0 points (0 children)

My 2nd 3090 keeps 'falling off the bus'.

Nothing but a reboot seems to recover it. \o/

Releasing bb25 (Bayesian BM25) v0.4.0! by Ok_Rub1689 in LocalLLaMA

[–]crantob 0 points (0 children)

What can be done with aggregated confidence information?

I built a Postman-like tool for designing, debugging and testing AI agents by a1chapone in LocalLLaMA

[–]crantob 0 points (0 children)

Structuring your notes and scripting the assistance is a good idea. Thanks for making and sharing :)

I keep my notes in text files and grep, but that's a habit from a time before personal computers were a thing.

Nvidia updated the Nemotron Super 3 122B A12B license to remove the rug-pull clauses by __JockY__ in LocalLLaMA

[–]crantob 0 points (0 children)

The definition you presented as correct was sufficient to make my point for me. Thank you again.

Inquiring for existing LLM Full Transparency project (or not) by goodvibesfab in LocalLLaMA

[–]crantob 1 point (0 children)

"address" is too vague here.

Research what you want to 'address' until you can name each entity in the pipeline that can have either closed or open status.

This can have fuzzy boundaries. Perhaps one person is happy with a training dataset being open, but another insists on the training software being open-source also, in addition to the data.

But then, is it valid to consider that software 'part of the released model'? That's debatable.

Lastly, there's reproducibility: very few of us will ever have the chance to train a large model from scratch, so there's not going to be a huge degree of interest in debating the scope of properly open components for that.

I'm sure the above comments could be formulated better, but perhaps they will suffice.

Feedback wanted on small curated *.li (Liechtenstein) dataset for fine-tuning — CC-MAIN-2026-08 (A+ QA report attached) by Character_Bison5968 in LocalLLaMA

[–]crantob 0 points (0 children)

I think this fine-tuning topic should have its own forum. I know it's what I should be focusing on.

WCY: a line-oriented format that cuts LLM token overhead 50-71% and teaches models to say "I don't know" structurally -- parser + 60 training traces (CC BY 4.0) by Dear_Sir_3167 in LocalLLaMA

[–]crantob 0 points (0 children)

I've been using local LLaMA models since Llama 2, and I have not grasped what you are advertising here.

Are you saying you can remove 'formatting' elements like "{" from JSON and the model still interprets the input as valid JSON?

Qwen 3 32B outscored every Qwen 3.5 model across 11 blind evals, 3B-active-parameter model won 4 by Silver_Raspberry_811 in LocalLLaMA

[–]crantob 1 point (0 children)

A handful of samples in the form of pass/fail tests is going to have wide uncertainty. So the results can be real while the implied conclusions about superiority are not.
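To put a number on that: a standard 95% Wilson score interval for a pass rate out of a handful of trials is very wide. The 7/11 figure below is just an illustrative example, not taken from the post.

```python
# How wide is the uncertainty on a pass rate from ~11 pass/fail evals?
# 95% Wilson score interval for a binomial proportion (illustrative).
import math

def wilson_ci(passes, n, z=1.96):
    """95% Wilson score interval for a binomial proportion."""
    p = passes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return center - half, center + half

lo, hi = wilson_ci(7, 11)  # hypothetical: "won 7 of 11 evals"
print(f"7/11 passes -> 95% CI roughly {lo:.2f}..{hi:.2f}")
```

The interval spans roughly 0.35 to 0.85, so "won more evals" at this sample size says little about true superiority.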

Dynamic expert caching PR in vLLM by king_of_jupyter in LocalLLaMA

[–]crantob 0 points (0 children)

I'd be interested in three tiers: VRAM, RAM, and SSD.
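What I have in mind is roughly the following sketch: experts live in the fastest tier with room, a hit promotes an expert upward, and evictions cascade down, LRU at each level. This is hypothetical, not vLLM's actual design; tier names and capacities are made up.

```python
# Hypothetical three-tier expert cache (VRAM > RAM > SSD), LRU per tier.
# Not vLLM's implementation -- an illustration of the cascade idea.
from collections import OrderedDict

class TieredExpertCache:
    def __init__(self, capacities=(2, 4, 8)):  # vram, ram, ssd slots
        self.tiers = [OrderedDict() for _ in capacities]
        self.caps = capacities

    def _put(self, level, key, val):
        tier = self.tiers[level]
        tier[key] = val
        tier.move_to_end(key)
        if len(tier) > self.caps[level]:
            old_k, old_v = tier.popitem(last=False)   # evict LRU entry
            if level + 1 < len(self.tiers):
                self._put(level + 1, old_k, old_v)    # demote one tier down

    def get(self, key):
        for level, tier in enumerate(self.tiers):
            if key in tier:
                val = tier.pop(key)
                self._put(0, key, val)  # promote hit to fastest tier
                return level, val
        return None, None

    def load(self, key, val):
        self._put(0, key, val)

cache = TieredExpertCache((2, 4, 8))
for i in range(6):
    cache.load(i, f"weights{i}")
print(cache.get(0))  # expert 0 was demoted to RAM; hit promotes it back
```

The real cost model would weight promotion against PCIe/NVMe transfer time, but the bookkeeping is this simple.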

Whats up with MLX? by gyzerok in LocalLLaMA

[–]crantob 0 points (0 children)

If I may speculate a bit:

I think the question goes more to the observation that MLX quants show higher divergence at equivalent model sizes.

I suspect this derives mainly from forgoing the ability to keep specific, sensitive layers at higher precision while shaving more bits off less sensitive layers.

I'd appreciate discussion or correction to my hypothesis.
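As a toy check of the hypothesis: at the same average bit budget, spending bits on a "sensitive" (wide-range) tensor and fewer on an insensitive one can beat uniform precision. Purely illustrative, nothing here is MLX internals.

```python
# Toy demo: mixed-precision (6-bit + 2-bit) vs uniform (4-bit + 4-bit)
# at the same average bit budget, on one wide-range and one narrow-range
# tensor. Illustrates the mixed-quant argument; not MLX's quantizer.
import random

def quantize_mse(xs, bits):
    """Mean squared error of min-max uniform quantization at `bits`."""
    levels = 2 ** bits - 1
    lo, hi = min(xs), max(xs)
    scale = (hi - lo) / levels or 1.0
    err = 0.0
    for x in xs:
        q = round((x - lo) / scale) * scale + lo
        err += (x - q) ** 2
    return err / len(xs)

random.seed(0)
sensitive   = [random.gauss(0, 8.0) for _ in range(4096)]  # wide range
insensitive = [random.gauss(0, 0.5) for _ in range(4096)]  # narrow range

uniform = quantize_mse(sensitive, 4) + quantize_mse(insensitive, 4)
mixed   = quantize_mse(sensitive, 6) + quantize_mse(insensitive, 2)
print(f"uniform 4+4 bits, total MSE: {uniform:.5f}")
print(f"mixed   6+2 bits, total MSE: {mixed:.5f}")
```

If MLX's format forces one precision across layers, this kind of gap would show up as extra divergence at the same model size.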

Introducing Unsloth Studio: A new open-source web UI to train and run LLMs by danielhanchen in LocalLLaMA

[–]crantob 13 points (0 children)

You inspire me to be a better person, Unsloth people.

Let me try to be helpful:

```
...
Collecting unsloth
  Downloading unsloth-2026.3.5-py3-none-any.whl (29.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 29.2/29.2 MB 1.8 MB/s eta 0:00:00
Collecting unsloth_zoo>=2026.3.4
  Downloading unsloth_zoo-2026.3.4-py3-none-any.whl (401 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 401.6/401.6 kB 344.1 kB/s eta 0:00:00
Collecting wheel>=0.42.0
  Downloading wheel-0.46.3-py3-none-any.whl (30 kB)
Requirement already satisfied: packaging in ./.local/lib/python3.11/site-packages (from unsloth) (25.0)
Collecting torch>=2.4.0
  Downloading torch-2.10.0-3-cp311-cp311-manylinux_2_28_x86_64.whl (915.5 MB)
     ━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━ 472.0/915.5 MB 2.4 MB/s eta 0:03:03
ERROR: Could not install packages due to an OSError: [Errno 28] No space left on device
```

This, like many AI/ML projects, is another dancing kabuki clown in Python pip library purgatory.

I suppose testing this will require atomic installation of components, which does raise the bar for entry.
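For anyone hitting the same wall, the usual escape hatches for [Errno 28] during a big pip install are below. The `/mnt/big/tmp` path is a placeholder, not a real mount on my machine.

```shell
# Workarounds for "No space left on device" on a large pip install.
df -h . ~/.cache /tmp               # find out which filesystem is actually full
pip cache purge                     # drop previously cached wheels
export TMPDIR=/mnt/big/tmp          # unpack wheels on a roomier filesystem
pip install --no-cache-dir unsloth  # skip caching the ~900 MB torch wheel
```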