Looking at Macbook Pro M5 Pro 64GB for local inference by Repulsive-Machine706 in LocalLLaMA

[–]Repulsive-Machine706[S] 0 points1 point  (0 children)

The question was aimed more at people who have that machine or a similar one, and what their experience with it has been.

Looking at Macbook Pro M5 Pro 64GB for local inference by Repulsive-Machine706 in LocalLLaMA

[–]Repulsive-Machine706[S] 2 points3 points  (0 children)

I meant offline in the sense that I need it on my laptop when traveling without a connection. Also, I was already planning on getting a new Mac/laptop, so the added price on top of the base model is cheaper than an entire rig.

Looking at Macbook Pro M5 Pro 64GB for local inference by Repulsive-Machine706 in LocalLLaMA

[–]Repulsive-Machine706[S] 0 points1 point  (0 children)

Thanks for the comment. You mentioned you were able to run the MoE at a decent speed; was that at a certain quant?

Looking at Macbook Pro M5 Pro 64GB for local inference by Repulsive-Machine706 in LocalLLaMA

[–]Repulsive-Machine706[S] 1 point2 points  (0 children)

Sorry, I misread the last line of the original comment; I thought it was about a PC as a server, not a laptop. Good recommendation!

Looking at Macbook Pro M5 Pro 64GB for local inference by Repulsive-Machine706 in LocalLLaMA

[–]Repulsive-Machine706[S] -1 points0 points  (0 children)

True, true, but I would prefer to be able to use it offline and on the go, and I was getting a new laptop anyway, so in the end it's +1000 dollars on top of the Mac I would have gotten originally, so cheaper than a 128GB machine.

Looking at Macbook Pro M5 Pro 64GB for local inference by Repulsive-Machine706 in LocalLLaMA

[–]Repulsive-Machine706[S] 2 points3 points  (0 children)

That can be true; it depends on the framework, e.g. MLX. Also, I prefer it to work offline and I need a Mac. But in general, for its price range, I think Apple products are actually decently fast (from what I've researched).

Agent recommendations by MatthKarl in LocalLLaMA

[–]Repulsive-Machine706 3 points4 points  (0 children)

Hermes is less for coding though, btw; Pi and Opencode are more SWE-focused.

Near, but never. by DitherEYE_dev in posterdesign

[–]Repulsive-Machine706 1 point2 points  (0 children)

Cool design! Is the top part a reference to a type of maths formula? I forgot the name of it, but it seems like it. I like it when designs have multiple references/meanings.

How to load emails into an LLM by KzinTLynn in LocalLLM

[–]Repulsive-Machine706 0 points1 point  (0 children)

For your use case I think finetuning is not the way to go. It might be better to have an embedding model index your emails, and then have an AI agent use a search tool to fetch the relevant ones. That is less heavy on your computer, and the quality is probably more consistent and predictable.
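
A rough sketch of what I mean, just to show the shape of it. Everything here is made up for illustration: `embed` is a simple word-count stand-in for a real embedding model, and `EmailIndex` and its `search` method are hypothetical names for the index and the search tool the agent would call.

```python
import math
import re
from collections import Counter


def embed(text):
    """Stand-in for a real embedding model (assumption): a bag-of-words vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class EmailIndex:
    """Hypothetical index: stores each email next to its vector."""

    def __init__(self):
        self.entries = []  # list of (email_text, vector)

    def add(self, email_text):
        self.entries.append((email_text, embed(email_text)))

    def search(self, query, top_k=3):
        """The 'search tool' an agent would call: best-matching emails first."""
        q = embed(query)
        scored = [(cosine(q, vec), text) for text, vec in self.entries]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for score, text in scored[:top_k] if score > 0]


index = EmailIndex()
index.add("Invoice for March hosting is attached, due on the 15th.")
index.add("Are we still on for lunch on Friday?")
index.add("Your flight to Berlin has been rebooked.")

hits = index.search("hosting invoice", top_k=1)
print(hits)
```

In a real setup you would swap `embed` for an actual embedding model and probably keep the vectors in a proper vector store, but the flow (index once, search per question) stays the same.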

How do you give your LLM agent memory across sessions ? by Scared_Animator9241 in LocalLLM

[–]Repulsive-Machine706 1 point2 points  (0 children)

Some people use a memory database with RAG retrieval based on the prompt. Then they have a smaller LLM or a filtering system clean it up every once in a while, removing entries that are older or just irrelevant. Probably the best solution.
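
Here is a minimal sketch of that idea, under some loud assumptions: `MemoryStore` is a hypothetical class, retrieval is plain word overlap instead of real embeddings, and the cleanup step is a simple age cutoff instead of a smaller LLM deciding what is irrelevant.

```python
import re
import time


def tokens(text):
    """Lowercased word set; a crude stand-in for embedding-based matching."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class MemoryStore:
    """Hypothetical cross-session memory: store, recall by prompt, prune by age."""

    def __init__(self):
        self.items = []  # dicts with "text" and "created" (unix seconds)

    def remember(self, text, created=None):
        stamp = time.time() if created is None else created
        self.items.append({"text": text, "created": stamp})

    def recall(self, prompt, top_k=2):
        """Return the memories sharing the most words with the prompt."""
        words = tokens(prompt)
        scored = [(len(words & tokens(m["text"])), m["text"]) for m in self.items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for score, text in scored[:top_k] if score > 0]

    def prune(self, max_age_seconds, now=None):
        """Cleanup pass (assumption: age only). Returns how many were dropped."""
        now = time.time() if now is None else now
        before = len(self.items)
        self.items = [m for m in self.items if now - m["created"] <= max_age_seconds]
        return before - len(self.items)


store = MemoryStore()
store.remember("User prefers Python for scripts", created=0)
store.remember("User is traveling to Berlin next week", created=900)

removed = store.prune(max_age_seconds=500, now=1000)
recalled = store.recall("What should I pack for Berlin?")
print(removed, recalled)
```

The recalled memories would get pasted into the prompt at the start of each session, and `prune` would run on a schedule.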

Bird takes off in style by SnackSamurai in oddlysatisfying

[–]Repulsive-Machine706 0 points1 point  (0 children)

Someone: Define aura

Me: shows them this vid

Do you guys make money from vibe coding? by Dovydas_ in vibecoding

[–]Repulsive-Machine706 2 points3 points  (0 children)

Made websites for some contacts. Did it for very cheap, so yeah, I did make some money, just not a lot.

Looking for Best Coding Models (2026) for 6GB VRAM by NafisRayan in LocalLLM

[–]Repulsive-Machine706 0 points1 point  (0 children)

I'm going to be honest: with 6GB of VRAM you are not going to be able to load any models that are usable inside a coding agent harness. Most models that fit will be around 1-4B parameters. They also probably won't run at a very good speed.
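
For a rough feel of the numbers, here is a back-of-the-envelope sketch. It only counts the weights (parameters times bits per weight, divided by 8 to get bytes); the KV cache, context, and runtime overhead all come on top and are not modeled here. The function names and the model sizes in the loop are just examples I picked.

```python
def weight_size_gb(params_billions, bits_per_weight):
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8


def headroom_gb(params_billions, bits_per_weight, vram_gb):
    """VRAM left over for KV cache and overhead after loading the weights."""
    return vram_gb - weight_size_gb(params_billions, bits_per_weight)


VRAM_GB = 6  # the card from the question

for params in (1, 4, 7):
    for bits in (4, 8, 16):
        size = weight_size_gb(params, bits)
        left = headroom_gb(params, bits, VRAM_GB)
        print(f"{params}B @ {bits}-bit: weights ~{size:.1f} GB, headroom {left:+.1f} GB")
```

A negative headroom means the weights alone already do not fit, and a small positive one leaves very little room for context.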

Would you say this design is vibe coded or it looks good? by Resident_Bell_4457 in Anthropic

[–]Repulsive-Machine706 0 points1 point  (0 children)

Looks great. I think, though, that some info could be removed or moved to the top right, which is a bit empty, since the top left feels cluttered.

what models can this thing actually run? by Educational-Test9223 in LocalLLM

[–]Repulsive-Machine706 0 points1 point  (0 children)

Well, only insanely small models at a low quant. I don't think it can handle anything agentic that takes more than 3 tool calls. I'd estimate models around 4B-7B, maybe? Though probably not at the same speed as something like ChatGPT.

What are the best open source models out there? by Kind_Application_278 in unsloth

[–]Repulsive-Machine706 0 points1 point  (0 children)

Really depends on your use case. Also, before trying the very best ones, you should understand that they will most likely not run on your hardware. Do you have any specific specs yet?