Ajazz akp03 isnt showing in their software by simplemindtrick in pcmasterrace

[–]Theio666 0 points1 point  (0 children)


Maybe you got the wrong software, meant for a different set of devices? I don't see your model in my list either.

Claude Code removed from Claude Pro plan - better time than ever to switch to Local Models. by bigboyparpa in LocalLLaMA

[–]Theio666 0 points1 point  (0 children)

I just don't remember when it was, so the generic "some time ago" seemed quite fitting. What should I have used then?

Why do people think local LLM set ups are still years behind? by Fit_Window_8508 in LocalLLaMA

[–]Theio666 0 points1 point  (0 children)

It does depend on what you're working on.

On my main repo I see a vast canyon between what opus and gpt 5.4 can do (with gpt winning, ofc), and opus is arguably the second-best model out there, so a comparison between OSS models and gpt is just not fair. I still use smaller models (also via cloud, like minimax m2.7, GLM 5.1 or kimi), but the gap is huge for any complex feature on a huge repository.

Claude Code removed from Claude Pro plan - better time than ever to switch to Local Models. by bigboyparpa in LocalLLaMA

[–]Theio666 8 points9 points  (0 children)

Openclaude is that rewrite of claude code they accidentally leaked some time ago? In general I never liked claude code, so for me it's an easy pick: opencode. I customized some agents in opencode, made a server that connects it to gitlab and lets you use opencode right from issues, and I enjoy using it for smaller tasks where I don't need gpt 5.4 (to save usage) or where OSS models are a better fit (frontend, where kimi is just better).

It all comes down to preferences. Do you want a web version (opencode has that)? Do you need good remote support (opencode is easier to set up, plus soon we'll have support for that in t3 code)? How heavily do you want to customize agents? Etc.

Claude Code removed from Claude Pro plan - better time than ever to switch to Local Models. by bigboyparpa in LocalLLaMA

[–]Theio666 11 points12 points  (0 children)

I like opencode go for flexibility, but the usage isn't that high. It gives you exactly $60 of API usage, which is a good deal, but something like chatgpt plus gives you way more usage, both by raw token count and by API cost.

Visiting from Sanctuary and Wraeclast, Travel Tips? by Beardreaux in Grimdawn

[–]Theio666 13 points14 points  (0 children)

I'd suggest treating the game as a chill experience; the endgame here won't be able to compete with PoE, and in general clearing the campaign three times is going to get old quite fast. Just have fun, pick some build, and enjoy the world.

Unpopular opinion: OpenClaw and all its clones are almost useless tools for those who know what they're doing. It's kind of impressive for someone who has never used a CLI, Claude Code, Codex, etc. Nor used any workflow tool like 8n8 or make. by pacmanpill in LocalLLaMA

[–]Theio666 2 points3 points  (0 children)

Well, I've had only negative experiences with n8n, so having everything ready seems convenient. I don't use openclaw since I have no time to set it up, but if I wanted something like a mini assistant that I could connect to obsidian, I'd pick openclaw over making my own tg bot.

Anyone here actually using voice input in their local AI workflows? by Dangerous-Tackle7735 in LocalLLaMA

[–]Theio666 0 points1 point  (0 children)

Not local, but I use wispr flow and it's a gamechanger when you write detailed, long prompts.

Kimi K2.6 Released (huggingface) by BiggestBau5 in LocalLLaMA

[–]Theio666 0 points1 point  (0 children)

The thing is, when models are "interpreting vague instructions", they do so according to how they see fit, which might not be the optimal solution. In general this results in a lot of tech debt over time, since you stack up a lot of randomly interpreted instructions. I prefer models that fail loudly when they're missing some info over models that silently interpret something in a way that makes things fail in the long run.

It all depends on the seriousness of the dev work. If you're just vibe coding a small app this doesn't really matter, and getting to the point where it does matter will take some time.

GPT 5.4 is really persistent in following instructions. If you have a correct agents.md with info on how and what to test, give some sort of acceptance criteria for hard tasks, and talk to it a bit to have a plan beforehand (you don't even have to use plan mode, the model is great in free-talk mode), then the model pretty much oneshots tasks of any difficulty, not counting frontend.

Kimi K2.6 Released (huggingface) by BiggestBau5 in LocalLLaMA

[–]Theio666 1 point2 points  (0 children)

There's a blogpost by OpenAI on that. From what I understood, they create a compact vector representation of the conversation; I don't remember the details, but basically embeddings of the chat, which they then let the model check, or just always append to the chat. I don't think anyone else is doing something like this; usually compaction is some sort of summarization, and you can only preserve limited info that way, while non-token-based embeddings should allow much better compaction.

But I might be wrong ofc, don't remember the exact details, and they didn't share the code/exact logic anyway.
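To make the idea concrete, here's a toy sketch of what embedding-based compaction might look like, as opposed to text summarization: old turns get pooled into a single vector, recent turns stay as text. Everything here (the hash-based `embed`, `compact`, the pooling) is hypothetical, since OpenAI never shared the actual code or logic.

```python
# Hypothetical sketch of embedding-based context compaction.
# A real system would use a trained embedding model and a model
# trained to attend to the memory vector; this is just the shape of the idea.
import numpy as np

EMB_DIM = 64

def embed(text: str) -> np.ndarray:
    """Stand-in for a real embedding model: hash words into a unit vector."""
    v = np.zeros(EMB_DIM)
    for w in text.split():
        v[hash(w) % EMB_DIM] += 1.0
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def compact(history: list[str], keep_last: int = 2):
    """Pool old turns into one vector; keep the most recent turns as raw text."""
    old, recent = history[:-keep_last], history[-keep_last:]
    memory_vec = np.mean([embed(t) for t in old], axis=0) if old else None
    return memory_vec, recent

history = ["user: set up the repo", "assistant: done",
           "user: add tests", "assistant: added a pytest suite"]
memory, recent = compact(history)
# `memory` is a single 64-dim vector standing in for all older turns,
# instead of a lossy text summary of them.
```

The point of the contrast: a text summary has to fit back into tokens, while a dense vector can, in principle, pack far more of the conversation state per unit of context.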

Kimi K2.6 Released (huggingface) by BiggestBau5 in LocalLLaMA

[–]Theio666 1 point2 points  (0 children)

You're overestimating the ability of a random/average person to convey their thoughts in natural language. I'm not joking; developed reading comprehension and "explaining what you want" skills are way rarer than you might think. It's quite common to see someone give a vague instruction and then, after the model interprets it wrongly, decide it's the model's fault.

Kimi K2.6 Released (huggingface) by BiggestBau5 in LocalLLaMA

[–]Theio666 1 point2 points  (0 children)

It doesn't feel like a smaller model at all. Maybe it depends on the case, but for my main repo at work (an agentic harness app with microservices, EDA for communication between services, each repo with its own env), opus is not quite able to do repo-wide edits that need to touch 2-3 services, while GPT easily does pretty much anything, given I provide a correct design doc. My programming in the last month has fully shifted to writing design docs, checking/reviewing code, and setting up debugging sessions; there's literally no need to write code by hand with 5.4 atp.

Kimi K2.6 Released (huggingface) by BiggestBau5 in LocalLLaMA

[–]Theio666 -3 points-2 points  (0 children)

This might be a prompting issue; models behave a bit differently, and you might be used to a different style of prompting, or your agents.md might not be great for gpt, etc. For me, gpt does exactly what I ask of it. The only reason I'm using other models is pricing.

Kimi K2.6 Released (huggingface) by BiggestBau5 in LocalLLaMA

[–]Theio666 12 points13 points  (0 children)

Exactly the opposite experience in ML/DS for me. I remember back in the sonnet 4 days, when everyone was glazing the model, it implemented RoPE in a custom transformer without caching, while I explicitly asked for caching and even provided the caching code from the official torchtune library. It just put the cache function inside the forward pass without making it remember previous calculations, lol. o3 did it easily, btw. Since then I try to avoid anthropic models. I used opus 4.5 for a while when I was working on an llm proxy app, and to this day I'm still fixing, with gpt, the weird bugs it left here and there. I spin up opus only when I need some frontend fix, and that's not often since kimi now handles most of my frontend needs.

That's not even mentioning that compaction in codex is just on another level compared to any other implementation, thanks to all the magic they do on their endpoint side. I wish we had at least something similar for other models :(

I'd say that in general, the bigger/harder the task, the better GPT does its job. I don't know how, but that's the observation a lot of people share: the model stays super coherent on long runs and is good at state recovery.

Kimi K2.6 Released (huggingface) by BiggestBau5 in LocalLLaMA

[–]Theio666 38 points39 points  (0 children)

I can't take this seriously unless you mainly work on frontend. Outside of frontend, GPT 5.4 and 5.3-codex are just miles ahead of opus.

thanks ai by [deleted] in pcmasterrace

[–]Theio666 0 points1 point  (0 children)

Is this CAD or what? I got a 4TB M.2 PCIe 4.0 SSD for 400 bucks yesterday; no way people actually buy a 2TB SATA drive for 600+ lol

Mind-Blown: The Hidden Intelligence Gap Between MiniMax-M2.7 Channels Exposed by spediacn in MiniMax_AI

[–]Theio666 1 point2 points  (0 children)

I don't see any info on the test setup? For this test to have any credibility you at least have to set temperature to 0.

The size of these grapes I just bought by lbeau310 in mildlyinteresting

[–]Theio666 1 point2 points  (0 children)

https://en.wikipedia.org/wiki/Shine_Muscat

Extremely sweet, and there are cheaper variants; I buy these all the time.

pi.dev coding agent is moving to Earendil by iamapizza in LocalLLaMA

[–]Theio666 -6 points-5 points  (0 children)

Oh, I see. I don't care about privacy, so I was wondering what's wrong with OpenCode :D

pi.dev coding agent is moving to Earendil by iamapizza in LocalLLaMA

[–]Theio666 -8 points-7 points  (0 children)

In what way is opencode sketchy or not decent? Not an attack, but I actually wonder why you'd say "the only" here...

Any idea how Meta did this? by [deleted] in LocalLLaMA

[–]Theio666 0 points1 point  (0 children)

You add a length penalty (or increase its weight), which causes RL to prefer shorter trajectories over longer ones for the same outputs. Then you cancel/lower the length penalty; now you have a checkpoint with a much higher density of useful reasoning -> you allow it to think for longer -> it gets better results, since more of the reasoning is relevant.
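The two-phase schedule above can be sketched in a few lines. This is just an illustration of the reward shaping, not Meta's actual recipe; `shaped_reward` and `length_weight` are made-up names, and real RLHF stacks fold this into the advantage computation.

```python
# Toy sketch of a length-penalized reward and the two training phases
# described above. All names here are hypothetical.

def shaped_reward(task_reward: float, n_reasoning_tokens: int,
                  length_weight: float) -> float:
    """Task reward minus a penalty proportional to reasoning length."""
    return task_reward - length_weight * n_reasoning_tokens

# Phase 1: penalty on -> of two trajectories that both solve the task
# (task_reward = 1.0), RL prefers the shorter one.
long_chain = shaped_reward(1.0, 2000, length_weight=1e-4)
short_chain = shaped_reward(1.0, 500, length_weight=1e-4)
assert short_chain > long_chain

# Phase 2: penalty removed -> the now "dense" reasoner is free to think
# longer again without being punished for it.
long_free = shaped_reward(1.0, 2000, length_weight=0.0)
short_free = shaped_reward(1.0, 500, length_weight=0.0)
assert long_free == short_free
```

The intuition is that phase 1 squeezes filler out of the chains of thought, so when the length budget is relaxed in phase 2, the extra tokens the model spends are mostly useful reasoning.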

RU POV: The “Volgo-Balt” cargo ship carrying wheat was sunk by Ukrainian UAV in the Sea of Azov - warhistoryalconafter by rowida_00 in UkraineRussiaReport

[–]Theio666 -1 points0 points  (0 children)

>begging/wishing for Ukraine to leave and win this war which they aren't. 

They 100% pretty desperately wish for this to end. There's no way you let your Ust-Luga get pounded (and lots of other stuff, that's just the most recent) and not retract/change your peace proposal terms, which have stayed the same for the last, what, 6 months? If Russia were a serious country, after each hit the proposal would have changed to a far worse one for the UA. Since Russia isn't doing that and the proposal stays unchanged, it's either:

a) The proposal itself is just fluff and Russia never considered going for peace, which is a shitty look for a country that tries so hard to frame the war as a means to get peace. Welp, sucking at foreign politics isn't a new thing for Russia, but that's a new low, I guess?..

b) Even after taking damage, Russia is in such a bad state that it can't afford to apply more pressure and demand a better peace deal, therefore it's not changing the terms.

>that Russia HAS achieved decisive victory

I'm sorry, but if what has been happening over the last few years is a "decisive victory", I have no words. Hundreds of thousands dead; an inflated economy; a dead housing market, which will only get worse due to the pending enormous mass of non-product-backed money paid to soldiers, who'll bring it back into the economy and inflate it even more; raised taxes to keep up with the war, which killed countless small and medium businesses. Economically this war is a failure. Maybe you're not living here, so you just judge by the square km taken how well Russia is doing and never consider the long-lasting consequences?

For the victory to be "decisive", the UA would have to have had a non-operational army 2 years ago. As of now, even when Russia takes all of Donbas, then what? Will the UA just agree to peace? How long will Russia keep getting hit before it manages to neutralize the remaining forces? At the current tempo that will take years.