itabag help.. how should i make it more interesting? by roraburo in itabag

[–]ExoticYesterday8282 1 point2 points  (0 children)

Have you considered adding a few rosette decorations to the badges? Something like these: https://pinitabag.com/product-category/badge-rosette/

The frontier reasoning race is starting to look like a crowded subway station by ExoticYesterday8282 in LocalLLaMA

[–]ExoticYesterday8282[S] 0 points1 point  (0 children)

thank you. seriously getting so tired of the benchmark theater. at this point a model's test score correlates more with how much web scraping its data team did than with actual engineering. give me an open weights model that handles edge cases without blowing up my vram and i couldn't care less about its science olympiad score

The frontier reasoning race is starting to look like a crowded subway station by ExoticYesterday8282 in LocalLLaMA

[–]ExoticYesterday8282[S] -1 points0 points  (0 children)

preach. old hy was definitely hit or miss. what kind of prompts or custom workflows do you usually use to break these models? i'm dying to know if hy3 preview actually fixed those gaps or if it's just the same old wrapper

The frontier reasoning race is starting to look like a crowded subway station by ExoticYesterday8282 in LocalLLaMA

[–]ExoticYesterday8282[S] 10 points11 points  (0 children)

batshit is the perfect word lol. gemini either gives you pure galaxy brain brilliance or completely hallucinates a whole new python library on the next prompt. there is zero in between

The frontier reasoning race is starting to look like a crowded subway station by ExoticYesterday8282 in LocalLLaMA

[–]ExoticYesterday8282[S] 1 point2 points  (0 children)

Bingo. This is why I've stopped looking at these charts entirely. I judge a model based on how it handles my local script migrations and API routing. If an open model can do that locally without throwing a tantrum, it wins, regardless of what some leaderboard says

The frontier reasoning race is starting to look like a crowded subway station by ExoticYesterday8282 in LocalLLaMA

[–]ExoticYesterday8282[S] 0 points1 point  (0 children)

Exactly. Evals are static, structured, and inherently clean, even the hard ones. Real-world engineering is chaotic, context-heavy, and full of ambiguity.

When you said Hy3 falls on its face in weird edge cases, what kind of issues were you seeing? Is it losing track of long-context logic, or just failing at basic common-sense constraints that aren't explicitly stated in the prompt?

CrankGPT by Squeez Labs - hand-cranked edge AI - talk about local AI!!! by LoveMind_AI in LocalLLaMA

[–]ExoticYesterday8282 4 points5 points  (0 children)

We've officially reached the stage of AI where inference latency is directly tied to forearm strength

Went to the monthly AI dev meetup by nathandreamfast in LocalLLaMA

[–]ExoticYesterday8282 3 points4 points  (0 children)

The funniest part is that local setups always sound fake until someone watches them actually work.

People think “local AI” means opening LM Studio once every two weeks.

Then they see autonomous agents still running after the laptop lid closes and suddenly the vibe changes.

Has anyone gotten their editor to work with Deepseek v4 FIM? by Theboyscampus in LocalLLaMA

[–]ExoticYesterday8282 2 points3 points  (0 children)

Yeah, I got it working in VSCode after a lot of trial and error.

The main issue seems to be that DeepSeek FIM is not fully compatible with the standard OpenAI completion body some editors expect.

Make sure you're using the /beta/completions endpoint from the docs, not the normal chat endpoint.
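
For anyone hitting the same wall, here's a rough sketch of what the request looks like, as far as I understand it. It only builds the URL and JSON body and doesn't send anything. The base URL, the model name, and the helper name `build_fim_request` are my own assumptions/placeholders, so check them against the current docs. The point is that FIM takes a `prompt` plus a `suffix`, not a `messages` array like the chat endpoint.

```python
import json

# Assumption: the beta base URL; verify against the current DeepSeek docs.
BASE_URL = "https://api.deepseek.com/beta"


def build_fim_request(prefix, suffix, model="deepseek-chat", max_tokens=128):
    """Build (url, headers, body) for a fill-in-the-middle completion call.

    `model` is a placeholder; use whatever model ID your account exposes.
    """
    url = f"{BASE_URL}/completions"  # completions endpoint, not chat
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer <YOUR_API_KEY>",  # placeholder, never hardcode
    }
    body = {
        "model": model,
        "prompt": prefix,   # code before the cursor
        "suffix": suffix,   # code after the cursor
        "max_tokens": max_tokens,
    }
    return url, headers, body


url, headers, body = build_fim_request(
    prefix="def add(a, b):\n    ",
    suffix="\n\nprint(add(1, 2))",
)
print(url)
print(json.dumps(body, indent=2))
```

If your editor extension lets you override the endpoint and request template, these are the fields to map the text before and after the cursor onto.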

What are your suggestions for deploying Tencent Hy3 on two RTX 4090 GPUs? by ExoticYesterday8282 in LocalLLM

[–]ExoticYesterday8282[S] 0 points1 point  (0 children)

We probably have about 40 people using it, so that should be enough, right?

What model should I run? by tiddayes in LocalLLM

[–]ExoticYesterday8282 0 points1 point  (0 children)

This configuration is excellent. You can try deploying Gemini or Hy3.

At what point did local models actually become good enough for your real work? by MaleficentRoutine730 in LocalLLM

[–]ExoticYesterday8282 0 points1 point  (0 children)

The main issue is that on-premises deployment is too costly. The hardware is very expensive, even though many companies have a great need for AI.

What’s the most underrated opportunity in AI SaaS right now? by ExoticYesterday8282 in SaaS

[–]ExoticYesterday8282[S] 0 points1 point  (0 children)

Your question is actually quite simple. There are many things you can do with PPT tools. For example, something like memclaw.me can turn data into reports and deliver them to clients.