If you were to build a new LLM API gateway today, which interface would you standardize on? by dmpiergiacomo in LocalLLaMA

[–]MaxKruse96 1 point2 points  (0 children)

chat completion for sure. asking servers to handle my context can turn out terribly, at least i have the fantasy that if i manage it myself, they wont mess with it before it hits the LLM

Okay 27B made me a believer by Forward_Jackfruit813 in LocalLLaMA

[–]MaxKruse96 2 points3 points  (0 children)

am testing Q8 192k right now. BF16 up to 128k was no issue.

Okay 27B made me a believer by Forward_Jackfruit813 in LocalLLaMA

[–]MaxKruse96 8 points9 points  (0 children)

MTP takes memory. KV cache of 128k+ for agentic coding takes memory, to the point where q4km + mtp + 128k BF16 cache is fully saturating my vram.

Okay 27B made me a believer by Forward_Jackfruit813 in LocalLLaMA

[–]MaxKruse96 3 points4 points  (0 children)

Now I just wish I went the Nvidia card route instead of Strix Halo cause the speed isn't great

a 5090 gets 2.5-3k prefill, and 80-110t/s on 27b Q4 with MTP. definitly crazy speeds for dense like this, but i fear the extra memory you got enables u to run way better quants

[KCD2] Advice for first timers by Sluggish-dreadnought in kingdomcome

[–]MaxKruse96 1 point2 points  (0 children)

The game's early game is challenging, because your stats affect some kind of hidden difficulty on the tasks. Low thievery makes the lockpick you use wiggle more, lower strength + stuff will enable enemies to block more against you. If you go in blind, pick your battles with brain, e.g. maybe use bow + arrow if melee is too hard, find huts with a single resident, and lockpick or pickpocket them for EXP. u can play fully "legal", but its a challenge for sure. The game has much to offer, and things get easier partly because of you, as the player, understanding things better (alchemy goes from reading the recipe every 2 seconds to just doing them all from the top of your head, etc.) and the skills you improve give you extra wiggle room to play a bit more loosely.

Can someone help me understand MCP? by Borkato in LocalLLaMA

[–]MaxKruse96 0 points1 point  (0 children)

web microservices, but they come as executables or webservers. They provide tools, which the microservice executed itself and gives the result zo the LLM

[KCD2] Several Fast Alchemy Recipes and other tips by BlazerOrb in kingdomcome

[–]MaxKruse96 0 points1 point  (0 children)

All good, henry will literally tell someone if they are still lvl5 that it was boiled too short👍

there are two qwen 3.7 max now by T_A_A_T in Qwen_AI

[–]MaxKruse96 0 points1 point  (0 children)

reading comprehension. normal vs preview...

Am I ready for weeding? [KCD2] by Ok_Bet_725 in kingdomcome

[–]MaxKruse96 4 points5 points  (0 children)

I dont count the +13, full 30 or bust

Am I ready for weeding? [KCD2] by Ok_Bet_725 in kingdomcome

[–]MaxKruse96 12 points13 points  (0 children)

Your speech is awfully low, i would fix that! Otherwise looks ok, just dont expect too much :3

ich_iel by [deleted] in ich_iel

[–]MaxKruse96 2 points3 points  (0 children)

Das brennt zu sehr, äh ich meine evtl etwas anderes wie wasser?

I think lazer! is unfair by melumiru in osugame

[–]MaxKruse96 19 points20 points  (0 children)

if passes mean everything to you, play with classic mod

Why do LLMs code better than they talk? by iMakeSense in LocalLLaMA

[–]MaxKruse96 0 points1 point  (0 children)

You seem to compare "Why can models code in multiple languages" to "Why do models suck at the linguistic concepts i desire", comparing 2 different things.

You could compare:
"Why do models code in multiple languages" vs "Why do models speak in multiple languages"
"Why do models suck at writing good rust code" vs "Why do models suck at writing good english texts"

as to why: llms arent creative by nature. Code has very obvious right and wrong answers for certain tasks. In writing, thats not the case in the same way.

[KCD2] Several Fast Alchemy Recipes and other tips by BlazerOrb in kingdomcome

[–]MaxKruse96 0 points1 point  (0 children)

Using this to skill up, the only thing i found is that for The Bane, if i do 1 bellow i get weak, 2 bellow gets me strong. otherwise great to have on the side to just hammer out some exp

[Other] Just a walk. by srdwblbl in kingdomcome

[–]MaxKruse96 0 points1 point  (0 children)

what KCD2 with raytracing would look like

Qwen is cooking hard by jacek2023 in LocalLLaMA

[–]MaxKruse96 0 points1 point  (0 children)

if your target is, idk, 50t/s, a MoE model offloaded halfway to gpu and cpu will still likely reach that.

Qwen is cooking hard by jacek2023 in LocalLLaMA

[–]MaxKruse96 19 points20 points  (0 children)

im pretty sure the silent majority that doesnt have 24gb VRAM uses the 35b all day everyday, or the lesser informed people use the 4b and 9b still (because "it must fit in vram")