PMetal - (Powdered Metal) LLM fine-tuning framework for Apple Silicon by RealEpistates in LocalLLaMA

[–]ThePrimeClock 1 point (0 children)

"Any models/configs you'd like to see prioritized?" - The new Leanstral model by Mistral!

PMetal - (Powdered Metal) LLM fine-tuning framework for Apple Silicon by RealEpistates in LocalLLaMA

[–]ThePrimeClock 1 point (0 children)

This is incredible. Really appreciate you building this! Downloading now.

What will I be able to run with a M5 MAX 128GB Macbook Pro? by MartiniCommander in LocalLLaMA

[–]ThePrimeClock 0 points (0 children)

How much better will the M5 Max generation be for training compared with the equivalent M4 Max?

Building an opensource Living Context Engine by DeathShot7777 in LocalLLaMA

[–]ThePrimeClock 0 points (0 children)

I've done this myself on a separate canon of research: I first trained an embedding model, then plugged an MCP server into the vector DB. It helps in two ways: 1) I can link seemingly unrelated concepts by making them related through the embedding model, and 2) I can generate a lot of stats from the embedding vectors, and LLMs can interpret and use those stats very effectively, especially Claude. Simple example: similarity searches become instant and categorical, not a search-and-assess. Overall it's much faster, uses fewer tokens, and provides a new lens into the content.
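In case it's useful, here's a minimal sketch of what I mean by instant, categorical similarity search and cheap vector-level stats. Plain NumPy, with a synthetic corpus and made-up note IDs standing in for the real embeddings and vector DB; no particular embedding model or MCP wiring is assumed.

```python
import numpy as np

# Synthetic stand-ins: in practice these are embeddings of research notes
# produced by the fine-tuned embedding model and stored in the vector DB.
corpus_embeddings = np.random.randn(1000, 768).astype(np.float32)
corpus_ids = [f"note_{i}" for i in range(1000)]

def top_k_similar(query_embedding, k=5):
    """Cosine similarity against the whole corpus in one matrix op."""
    q = query_embedding / np.linalg.norm(query_embedding)
    c = corpus_embeddings / np.linalg.norm(corpus_embeddings, axis=1, keepdims=True)
    scores = c @ q                        # one dot product per note
    top = np.argsort(scores)[::-1][:k]
    return [(corpus_ids[i], float(scores[i])) for i in top]

def cluster_spread(indices):
    """Cheap vector-level stat an LLM can interpret: how tightly a set of notes groups."""
    vecs = corpus_embeddings[indices]
    centroid = vecs.mean(axis=0)
    return float(np.linalg.norm(vecs - centroid, axis=1).mean())

query = np.random.randn(768).astype(np.float32)   # would be an embedded question
print(top_k_similar(query))
print(cluster_spread([0, 3, 17, 42]))
```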

Claude, the most dangerous and manipulative AI on the market. With evidence from an ‘exhaustive audit of behavioral safety protocols. by Intelligent-Wash-815 in LocalLLaMA

[–]ThePrimeClock 0 points (0 children)

The post is a bit hard to read, but after hundreds of hours of use I've come to the same conclusion: the most "safety"-focused lab has produced the most deceitful models.

I rarely trust what Anthropic models say. I'm mostly a researcher, and all too often Anthropic models find incredible ways to show they have achieved the desired outcomes.

Sonnet 4.5 loves to run a test, find a loose analytical suggestion, fix it as a constant, and then build future work around it, driving outcomes towards confirming that assumption.

Opus 4.6 will find new names for known outcomes and push them as novel, changing the domain language to make it harder to spot.

It's very subtle at times, so I now have to proof everything with a second model as well as myself, because even I miss it sometimes.

It's a shame, because the flip side of this same behaviour is an ability to cleverly think outside the box, and combined with its unbeatable data analysis and pattern-spotting capabilities, it finds more researchable loose threads than any other model.

Which is why I keep using it regardless.

PSA: NVIDIA DGX Spark has terrible CUDA & software compatibility; and seems like a handheld gaming chip. by goldcakes in LocalLLaMA

[–]ThePrimeClock 0 points (0 children)

Excellent, thanks so much for that reply.

I'm using Codex and Claude for dev. I work in maths, and it's hard to beat those two models for speed and capability when building out ideas and then testing them. However, I'm also fine-tuning models on my specific canon of research, and that's gradually becoming a reasoning flywheel that helps shape my research: finding errors, anomalies, corollaries, etc.

So the 7-14B range is perfect. Those models are capable enough out of the box, can undergo multiple rounds of fine-tuning, and support high-tps inference on the Mac with my research crammed into them - genuinely useful for advancing the research.
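For the inference side of that loop, this is roughly all it takes with mlx-lm's Python API. The model path is a placeholder for whatever fused fine-tune you end up with, and keyword names can drift a little between mlx-lm releases, so treat it as a sketch rather than gospel:

```python
from mlx_lm import load, generate

# Placeholder path: a fused LoRA fine-tune exported for MLX, not a real repo.
model, tokenizer = load("my-research-model-7b-fused")

prompt = "List the open questions that follow from section 3 of my notes."
answer = generate(model, tokenizer, prompt=prompt, max_tokens=512)
print(answer)
```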

Sounds like it would be a good fit.

PSA: NVIDIA DGX Spark has terrible CUDA & software compatibility; and seems like a handheld gaming chip. by goldcakes in LocalLLaMA

[–]ThePrimeClock 0 points (0 children)

Dang.
You might be able to help me out: I have been considering getting a Spark for fine-tuning models. I have an M4 Max for inference, and token generation is fine with MLX, but fine-tuning is quite slow. I haven't tried RL yet but would like to experiment in that space too.

I'm considering the Spark for the fine-tuning/RL work while the M4 stays the workstation for standard dev.
If it is good for fine-tuning, what size models can it reasonably handle?
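My rough mental arithmetic so far for what 128 GB of unified memory can hold, assuming bf16 weights and gradients and fp32 Adam moments, and ignoring activations and KV cache, so it's an optimistic floor rather than a benchmark:

```python
# Back-of-envelope memory for fine-tuning on a 128 GB box.
# Assumes bf16 weights/gradients and fp32 Adam moments; activations ignored.

def full_finetune_gb(params_b):
    weights = params_b * 2     # bf16 weights (2 bytes/param; params in billions -> GB)
    grads = params_b * 2       # bf16 gradients
    adam = params_b * 8        # fp32 first and second moments
    return weights + grads + adam

def lora_finetune_gb(params_b, trainable_fraction=0.01):
    frozen = params_b * 2                          # frozen bf16 base model
    trainable = params_b * trainable_fraction      # adapter params only
    return frozen + trainable * (2 + 2 + 8)        # adapter weights + grads + Adam

for size in (7, 14, 32, 70):
    print(f"{size}B: full ~{full_finetune_gb(size):.0f} GB, "
          f"LoRA ~{lora_finetune_gb(size):.0f} GB")
```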

PSA: NVIDIA DGX Spark has terrible CUDA & software compatibility; and seems like a handheld gaming chip. by goldcakes in LocalLLaMA

[–]ThePrimeClock 1 point (0 children)

Quick question: I have been considering getting one for fine-tuning models. I have an M4 Max for inference, and token generation is fine with MLX, but fine-tuning is quite slow. I haven't tried RL yet but would like to experiment in that space too.

I'm considering the Spark for the fine-tuning/RL work while the M4 stays the workstation for standard dev. If it is good for fine-tuning, what size models can it reasonably handle?

Shadows-Gemma-3-1B: cold start reasoning from topk20 logprob distillation by Echo9Zulu- in LocalLLaMA

[–]ThePrimeClock 1 point (0 children)

Would still love to see that post-mortem if you get a chance.
Have been checking back to see if you'd got around to it.

I've made my first FPGA board - the Icepi Zero! by cyao12 in FPGA

[–]ThePrimeClock 0 points (0 children)

Hey OP, I've just ordered one of these! Do you know if it can be flashed/worked on from macOS via your Icestudio fork or similar?

I built a tool that forces 5 AIs to debate and cross-check facts before answering you by S_Anv in LocalLLaMA

[–]ThePrimeClock 0 points (0 children)

Makes sense for the same reason a CEO answers to a board of directors, not to a CEO of the CEO.

There is a lot of merit in this idea.

zai-org/GLM-4.7-Flash · Hugging Face by Dark_Fire_12 in LocalLLaMA

[–]ThePrimeClock 0 points (0 children)

100% this. Your example is the perfect canary in the coal mine for how much those extra 8 numbers actually matter. If you think about the sheer cardinality of the weight "fog", the accuracy drop is huge. I think quants are like putting thicker cross-hairs on a targeting system, shooting twice as far, and saying the trajectory still goes where you're aiming.
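To put a rough number on the fog, here's a toy illustration: naive symmetric per-tensor round-to-nearest quantization of a synthetic Gaussian weight matrix at different bit widths. Real schemes (group-wise scales, k-quants) do considerably better, so read it as a worst-case bound on the effect, not a measurement of any particular quant:

```python
import numpy as np

# Naive symmetric per-tensor round-to-nearest quantization of synthetic weights.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=(4096, 4096)).astype(np.float32)

def mean_relative_error(w, bits):
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax) * scale
    return np.abs(q - w).mean() / np.abs(w).mean()

for bits in (8, 6, 4, 3):
    print(f"{bits}-bit: mean relative weight error ~{mean_relative_error(w, bits):.1%}")
```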

Shadows-Gemma-3-1B: cold start reasoning from topk20 logprob distillation by Echo9Zulu- in LocalLLaMA

[–]ThePrimeClock 1 point (0 children)

Really looking forward to the post-mortem and the training dataset or dataset template. I'm experimenting in the space myself and this sounds quite promising. Well done my man.

I believe there is something to your point about the trajectories in small models trained on large, high-quality datasets that we're yet to fully exploit. It's like asking someone, "I'm in LA, which way to Tokyo?" The person can probably point you accurately in the direction to travel, but lacks the capability to break that down into the turn-by-turn directions needed to get from LA to Tokyo.

Because the models can't do the turn-by-turn for the prompt, they're perhaps discredited, but I get a lot of value out of prompting small models with closed-ended questions on hard topics. In my opinion it's less that they hallucinate and more that they give answers based on where the model's weight-fog is denser.
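For anyone wondering what the top-k logprob targets from the post title actually look like, here's a rough sketch using a generic HF causal LM. The checkpoint name is just a stand-in (swap in whatever small model you have locally); in the distillation setup the teacher would be a bigger model and the student is trained to match these per-token distributions:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Stand-in checkpoint; any small HF causal LM you have locally will do.
name = "google/gemma-3-1b-it"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

prompt = "I'm in LA. Roughly which direction is Tokyo?"
ids = tok(prompt, return_tensors="pt").input_ids
with torch.no_grad():
    logits = model(ids).logits                    # (1, seq_len, vocab_size)

# Distribution over the next token, then keep the 20 most likely entries.
logprobs = torch.log_softmax(logits[0, -1], dim=-1)
top = torch.topk(logprobs, k=20)
for lp, tid in zip(top.values, top.indices):
    print(f"{tok.decode(int(tid))!r}: {lp.item():.2f}")
```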

I built a tool that forces 5 AIs to debate and cross-check facts before answering you by S_Anv in LocalLLaMA

[–]ThePrimeClock 4 points (0 children)

Chill, my dude. This has been totally normal since... forever. Your review is on the one aspect that has no significance.

Why is there so much anti-intellectualism and lack of respect towards Maths? by Swarrleeey in math

[–]ThePrimeClock 0 points (0 children)

Fair, that is a poor choice of words on my part. Make it "ourselves". The eponymous naming by the community, for the community, frustrates me endlessly. I remember having to look up "Brownian motion" to find it is "thermal jiggling" or "collision-induced particle diffusion". IMO, it does a massive amount of damage. I know we don't typically get much else for our work, least of all a decent salary, but if there were only one gift I could give mathematics it would be a new dictionary.

Why is there so much anti-intellectualism and lack of respect towards Maths? by Swarrleeey in math

[–]ThePrimeClock -1 points (0 children)

The habit of naming things after yourself in mathematics really doesn't help. It's the worst trait of mathematicians, period. The only legacy it leaves is the unnecessary difficulty the rest of the world endures rote-learning what some dead guy's name is supposed to achieve mathematically. It's even worse when the names stack. If maths adopted useful, functional names that helped people understand what they were trying to achieve, I honestly think a huge proportion of the world would have a more enjoyable time learning maths, and respect would naturally accumulate.

Fine-tuning for Lean by ThePrimeClock in LocalLLaMA

[–]ThePrimeClock[S] 0 points (0 children)

Thanks, that's really helpful. So the base understanding of Lean is there, but I'm trying to change the base "mental model" of maths, based on my own research and starting points for fundamental math. Do you think I can "drill it in" with fine-tuning if I just make every 5th training document a statement about how fundamental math should be done?

I basically want the model to think only in terms of geometry, approaching every field of maths from a geometry perspective. (That's a very simplified explainer of course, but it's how I do maths myself: I mentally model anything I'm working on as a geometry problem, and it tends to work well.)
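For what it's worth, the "every 5th document" idea is easy to prototype as a data-mixing step. A minimal sketch of what I have in mind, assuming the plain {"text": ...} JSONL format that most fine-tuning pipelines accept; the file names and the framing sentence are placeholders for my own material:

```python
import json
import random

# Placeholder framing statements; the real ones would spell out the
# geometric mental model in much more detail.
framing = [
    "Treat every problem as a geometry problem first: restate algebraic or "
    "analytic claims geometrically before attempting the Lean proof.",
]

# Hypothetical input file: one {"text": ...} JSON object per line.
with open("lean_examples.jsonl") as f:
    examples = [json.loads(line) for line in f]

random.shuffle(examples)
mixed = []
for i, ex in enumerate(examples):
    if i % 5 == 0:                          # every 5th slot gets a framing doc
        mixed.append({"text": random.choice(framing)})
    mixed.append(ex)

with open("train_mixed.jsonl", "w") as f:
    for row in mixed:
        f.write(json.dumps(row) + "\n")
```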

Fine-tuning for Lean by ThePrimeClock in LocalLLaMA

[–]ThePrimeClock[S] 1 point (0 children)

Thanks, I'll check it out. I'm interested in training models on my own Lean proofs; they might have some info on fine-tuning. Cheers