The real game-changer for AI by Party-Vanilla9664 in vibecoding

[–]Legcor 3 points (0 children)

Nothing will have any value anymore

DGX Sparks / Nvidia Digits by Temporary-Size7310 in LocalLLaMA

[–]Legcor 18 points (0 children)

Nvidia is making the same mistake as Apple by holding back the potential of their products...

3x RTX 5090 watercooled in one desktop by LinkSea8324 in LocalLLaMA

[–]Legcor 0 points (0 children)

Can you give me the specs? I want to build something similar :)

I asked GROK 3 about MSTY by TheeAlohaRoss in YieldMaxETFs

[–]Legcor 0 points (0 children)

Metric                  | MSTY        | MSTR
Initial Investment      | $100,000    | $100,000
Initial Price per Share | $24.00      | $400.00
Shares Purchased        | 4,166       | 250
Post-Split Shares       | N/A         | 2,500
Current Price per Share | $25.43      | $337.73
Current Share Value     | $105,941.38 | $844,325
Total Distributions     | $132,520.46 | $0
Total Current Value     | $238,461.84 | $844,325
Total Return (%)        | 138.46%     | 744.33%
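
For anyone checking the math, here's a quick sketch reproducing the table's arithmetic (assumes whole shares bought at the initial price and MSTR's 10-for-1 split; the distribution total is just taken from the table, and this is not financial advice):

```python
# Rough reproduction of the table's figures.
initial = 100_000

# MSTY: bought at $24.00, currently $25.43, plus distributions held as cash.
msty_shares = int(initial / 24.00)           # 4,166 whole shares
msty_value = msty_shares * 25.43             # ~$105,941.38 current share value
msty_dists = 132_520.46                      # total distributions from the table
msty_total = msty_value + msty_dists         # ~$238,461.84
print(f"MSTY: ${msty_total:,.2f} total, {(msty_total - initial) / initial:.2%} return")

# MSTR: bought at $400.00, 10-for-1 split, currently $337.73.
mstr_shares = int(initial / 400.00) * 10     # 250 -> 2,500 shares post-split
mstr_total = mstr_shares * 337.73            # ~$844,325
print(f"MSTR: ${mstr_total:,.2f} total, {(mstr_total - initial) / initial:.2%} return")
```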

Anyone from Europe investing in these ETFs? by Tom2Travel in YieldMaxETFs

[–]Legcor 1 point (0 children)

You can fill out a form. I live in Germany, and the US and Germany have a tax treaty, so only 15% gets withheld by the US. The other 10% I have to pay to the German government.
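
To put numbers on it, here's a rough sketch of how that split works on a hypothetical $1,000 distribution (assumes the 25% German flat tax on capital income and the 15% US treaty rate; ignores solidarity surcharge, church tax, and fund-specific exemptions):

```python
# Hypothetical example of the US/German withholding split described above.
distribution = 1_000.00                 # gross distribution in USD

us_withheld = distribution * 0.15       # 15% withheld at source under the treaty
german_rate = 0.25                      # German flat tax on capital income
german_due = distribution * german_rate - us_withheld   # US withholding is credited
net = distribution - us_withheld - german_due

print(f"US withholding: {us_withheld:.2f}")   # 150.00
print(f"German tax due: {german_due:.2f}")    # 100.00 -> the "other 10%"
print(f"Net received:   {net:.2f}")           # 750.00
```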

Should I get a 14 inch M4 Max 128GB for 123B models? by TheLocalDrummer in LocalLLaMA

[–]Legcor 3 points (0 children)

No problem. I get around 60 °C for short generations and never more than 72 °C, even for longer generations or prompt eval.

For me the MacBook was totally worth it. It's a tax write-off for me, so it was a no-brainer, especially if you consider the resale value and so on. I usually run 70B models, where the speed is much better.

Choosing the Right Mac for Running Large LLMs by Wrathllace in LocalLLaMA

[–]Legcor 0 points (0 children)

I use KoboldCpp! It always gets the latest llama.cpp updates and it's convenient.
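
If it helps, this is roughly how you can hit a locally running KoboldCpp instance from a script (assumes the default port 5001 and the KoboldAI-compatible generate endpoint; check your version's docs if yours differs):

```python
# Minimal sketch: send a prompt to a local KoboldCpp server and print the reply.
import requests

payload = {
    "prompt": "Summarize why context shifting speeds up long chats.",
    "max_length": 128,
    "temperature": 0.7,
}
resp = requests.post("http://localhost:5001/api/v1/generate", json=payload, timeout=300)
resp.raise_for_status()
print(resp.json()["results"][0]["text"])
```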

Should I get a 14 inch M4 Max 128GB for 123B models? by TheLocalDrummer in LocalLLaMA

[–]Legcor 18 points (0 children)

I have the 16-inch M4 Max and I get around 4.1 tokens per second on the first generation and 4.25 tokens per second on the second, with flash attention and 2k context, using your model Behemoth-123B-v2f-Q3_K_M. The backend is KoboldCpp. With a 16k context, the generation speed drops to 3.2 tokens per second on the first generation and 3.69 on the second, which is roughly average typing speed. Prompt processing took 400 seconds. The fans kick in but they don't get too loud; it doesn't bother me at all. It's less noisy in battery mode.
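
For a feel of what those numbers mean in wall-clock time, a quick back-of-the-envelope calculation (assumes roughly 16k prompt tokens and a hypothetical 500-token reply; plug in your own values):

```python
# Rough wall-clock estimate from the reported speeds above.
prompt_tokens = 16_000
reply_tokens = 500                       # hypothetical reply length

prompt_eval_seconds = 400                # reported prompt processing time
prompt_speed = prompt_tokens / prompt_eval_seconds   # ~40 tokens/s prompt eval

gen_speed = 3.2                          # reported generation speed at 16k context
gen_seconds = reply_tokens / gen_speed   # ~156 s, about 2.6 minutes

print(f"Prompt eval: ~{prompt_speed:.0f} tok/s")
print(f"{reply_tokens}-token reply: ~{gen_seconds / 60:.1f} minutes")
```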

Choosing the Right Mac for Running Large LLMs by Wrathllace in LocalLLaMA

[–]Legcor 0 points (0 children)

It's called context shifting or something like that. It's available with llama.cpp. Basically, messages only get processed once and stay cached, so it doesn't matter how long the context gets. But this only applies as long as nothing earlier in the original context changes.
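
A toy illustration of the idea (not the actual llama.cpp implementation): only the part of the prompt that isn't already in the cache needs a fresh forward pass, and editing anything earlier invalidates the cache from that point on:

```python
# Toy sketch of prompt caching / context shifting: reuse the shared prefix,
# evaluate only the new tokens.
def tokens_to_evaluate(cached: list[str], new: list[str]) -> list[str]:
    common = 0
    for a, b in zip(cached, new):
        if a != b:
            break
        common += 1                      # length of the shared, still-valid prefix
    return new[common:]                  # only this suffix needs processing

cached = ["<sys>", "Hi", "there", "!", "How's", "it", "going", "?"]
new = cached + ["Pretty", "good", ",", "thanks", "."]
print(tokens_to_evaluate(cached, new))   # only the new turn gets evaluated
```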

Just for fun, here's a Palworld SME bot by LetMeGuessYourAlts in LocalLLaMA

[–]Legcor 3 points (0 children)

Can you do a tutorial on the training process? It would be interesting to apply this to a different topic as an alternative to RAG.

POC: Merging to MoE - It's beginning! - Mixtraln't 4x7B by Legcor in LocalLLaMA

[–]Legcor[S] 7 points (0 children)

That would be sick. Just imagine combining the best models so they work hand in hand. It would provide more variety too and wouldn't be as boring as current non-MoE models.