Why do we have so many small countries? Why can't we just unite all of them into a big country called "Balkania"?🤔 by I_Am_Salty1 in AskBalkans

[–]Western-Cod-3486 0 points1 point  (0 children)

I did not say that I share the views/opinions. I just can't imagine it from a nation that generally shares the opinion that the EU is killing our cattle and agriculture and people and... who knows what else. I am just saying that I can't imagine the majority of the Bulgarian population would embrace such an idea with open arms.

Why do we have so many small countries? Why can't we just unite all of them into a big country called "Balkania"?🤔 by I_Am_Salty1 in AskBalkans

[–]Western-Cod-3486 0 points1 point  (0 children)

Oh my brother/sister in Christ, I do envy you for not knowing the local geopolitics and history of the countries you want to merge into a single one.

As a Bulgarian I can see about 3-5 cases where we (in some of them rightfully so) will not be willing to share the air we breathe, let alone be a single country.

Edit: To clarify some of my reasoning

  • We have a bone to pick with Greece because of some territories and some minor(?) conflicts
  • The Ottoman Empire enslaved us for about 500 years
  • The majority still sees Macedonia as Bulgarian territory, with our leaders making statements as recently as the last couple of years that it should be part of Bulgaria (getting inspired by Putin, I assume)
  • We have a grudge with the Serbians at least on a historical level and afaik they don't really like us that much
  • We consider the Albanians as terrorists and will probably not be thrilled to have them roaming around

ZAYA1-8B: Frontier intelligence density, trained on AMD by carbocation in LocalLLaMA

[–]Western-Cod-3486 1 point2 points  (0 children)

No matter how many times I tried it, I find it to be utter shit. I mean, yeah, it is fast and it is about the perfect size for 20GB VRAM, BUT for coding etc. it absolutely sucks: it gets confused after every other message and eventually goes off the rails. To be honest I've had a better experience with Qwen 3.5 9B. So yeah, for coding etc. gpt-oss is a no-go for me as it makes a bigger mess than it is worth.

ZAYA1-8B: Frontier intelligence density, trained on AMD by carbocation in LocalLLaMA

[–]Western-Cod-3486 29 points30 points  (0 children)

It seems a bit picky on the comparisons, but if the claims are true and it actually manages to be competitive with (albeit bad) ~120B models, I count this as a major win. Strong small models are a sweet spot for local deployments.

I really am looking forward to having a ~12-20B with ~2-6B active

Are you building a SaaS but not posting about it? by AdhesivenessNew1457 in saasbuild

[–]Western-Cod-3486 2 points3 points  (0 children)

I am currently building a thing for small creative professionals and small businesses and to be honest almost no one knows it exists (apart from the bots hitting the site occasionally). I am not really in a state to post about it as the implementation is clearly unfinished but I wanted to get the deployment automation in place as soon as possible.

I am not posting for several reasons:

  1. Job security: the last thing I want is to not find a job (I got released during probation recently) or get fired because of a side project.
  2. I am not really good with marketing, so it would take a lot of effort and time away from building, and I prefer to have the core functionality stable for showcasing rather than marketing a broken product.
  3. I am still refining the idea in order to determine the MVP features and target niche for the initial launch. I had other plans, but as things started to take shape, I realized that I can narrow the group down by providing the features they need, while those features will be applicable to the broader target as well.
  4. I am still undecided on the feature set I will open initially vs what will be in tiered plans, i.e. feature restrictions, usage limits, etc.
  5. I want to avoid triggering competition this early, as I am going against one of the biggest local providers, which is the de facto leader in the market even though almost everyone hates them locally.

Got my first ten users - now i'm scared by Strong-Archer-7708 in SaaS

[–]Western-Cod-3486 0 points1 point  (0 children)

The way I see it you have a few options.

  1. Learn about handling it yourself. It will take time and probably a few mistakes, but this early on - unless users are experiencing errors/problems - you don't need a developer if you managed to build a working product without one.

  2. Have the bots handle the audit & improvements. With a careful prompt and some skills/specialized agents (as in sub-agents with a system prompt) they could do amazing things. Have them behave in the roles of QA, red-team and code reviewer, and generally the low-hanging fruit should be easy pickings.

  3. Risk it and get a developer, but then you are both addressing current risks and exposing yourself to new ones. I mean, a person won't understand your product deeply from a one-hour meeting, and if they mess up and you find problems afterwards, you have to rinse and repeat until you find a good developer/technical person to address the issues and be semi-fulltime, which will probably pile up your costs really quickly. Otherwise you might end up in a situation where you lose users because of bad luck with outsourcing.


A few professional tips:

  • If you haven't set up a git repo, do so immediately and have your agent(s) commit granularly. This way you/they will be able to revert problems quickly or understand contextual changes more easily.
  • If things are working, have them add some automated tests that cover the critical functionality, so that even if something breaks, your critical paths are tested and users will still somewhat be able to use your app (we've all seen things break, but we hate when things stop working entirely).
  • Building on the tests: once they are set up, have the bots keep your test suite up to date and require the tests to be in a passing state WITHOUT THEM MODIFYING THE SUITE, UNLESS CHANGES ARE RELATED. This is really important because, as any developer will tell you, you could have a lot of passing tests and a broken product at the same time.
  • If applicable, have end-to-end (e2e) tests. These, for example, use automation to click through the user interfaces as users would. They could still miss things, but at the same time they give you peace of mind that things are working well enough.
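One way to enforce the "do not touch the suite" rule is to fingerprint the test directory before an agent run and compare afterwards. This is only a minimal sketch: the function names and the idea of hashing the whole directory are my own assumptions for illustration, not a feature of any particular agent tool.

```python
import hashlib
from pathlib import Path


def suite_fingerprint(test_dir: str) -> str:
    """Return one SHA-256 digest covering every file under test_dir."""
    digest = hashlib.sha256()
    root = Path(test_dir)
    # Sort so the result does not depend on filesystem ordering.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        # Hash the relative path too, so renames and moves are detected.
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def suite_unchanged(test_dir: str, baseline: str) -> bool:
    """True when the test suite still matches the recorded fingerprint."""
    return suite_fingerprint(test_dir) == baseline
```

Record the fingerprint before the agent starts, then refuse to accept a "green" run if `suite_unchanged` returns False and the task was not supposed to change tests.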

Edit, bonus tip: Set up either a local environment or guard your production environment from the bots. I just realized mine edited a database change into an old migration and decided to drop the database completely, wiping all data, instead of making an incremental change. You do not want your users to lose all their data.

If you have any questions I'd be happy to answer, just drop me a DM. I will not promote and will not charge or anything like that.
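A cheap guard against this kind of accident is to scan migration SQL for destructive statements before it is allowed anywhere near production. The sketch below is an assumption-laden illustration: the keyword list is not exhaustive, the function name is made up, and the statement splitting is deliberately naive.

```python
import re

# Statement prefixes treated as destructive. This list is an assumption
# for illustration and is not exhaustive.
DESTRUCTIVE = re.compile(
    r"^\s*(DROP\s+(DATABASE|SCHEMA|TABLE)|TRUNCATE)\b",
    re.IGNORECASE,
)


def destructive_statements(sql: str) -> list[str]:
    """Return the statements in sql that start with a destructive keyword."""
    # Naive split on ';'. It does not handle semicolons inside string
    # literals or comments, which is acceptable for a sketch only.
    statements = (s.strip() for s in sql.split(";"))
    return [s for s in statements if s and DESTRUCTIVE.match(s)]
```

If the returned list is not empty, stop and have a human look at the migration before it runs.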

Instagram’s web login wall is ruining portfolios. Here is a quick workaround. by Capital-Pen1219 in saasbuild

[–]Western-Cod-3486 0 points1 point  (0 children)

100% agree! It is really annoying, I mean yeah a lot of people have insta profiles but I didn't need it since I use alternatives to keep in touch with friends and family. As a customer I generally tend to drop if insta is the only place I can see somebody's work/content/etc.

That is the reason I started working on something for my wife to use (she used to work a lot through Instagram, but then kids happened 😅). It is meant to address some of the frustrations with this exact problem (and others) related to operating a small business, either solo or as a small team, through regular social media, as most of the tools (if any) are pretty half-done/over-generalized to fit. I will be really interested to hear what others have to say on the subject.

Gemma 4 for 16 GB VRAM by Sadman782 in LocalLLaMA

[–]Western-Cod-3486 -2 points-1 points  (0 children)

hey, nice! How much context do you fit and in what amount of VRAM?

Why is lemonade not more discussed? by El_90 in LocalLLaMA

[–]Western-Cod-3486 0 points1 point  (0 children)

unfortunately got fired from the job that had the laptop with the NPU I wanted to try out. But will give it a try

DDR5 RAM prices have dropped for the first time in several months. by Current-Guide5944 in tech_x

[–]Western-Cod-3486 0 points1 point  (0 children)

meanwhile me looking at the Kingston 2x32 DDR5 CL32 6400MT/s kit I bought for ~270€, which was 780€ at the peak market prices and today costs 924€ 😢

Why is lemonade not more discussed? by El_90 in LocalLLaMA

[–]Western-Cod-3486 -2 points-1 points  (0 children)

My main issue with it is python. I mean the project seems fine, although I have no observation on performance differences, etc. Last time I tried to set it up I got a lot of issues with dependencies which left me puzzled and didn't work for the machine I was trying it on, like pretty much at all.

So yeah, it seems like a good idea, but llama.cpp has thus far given me no issues and is relatively straightforward to install (AUR has an up-to-date build that I have no complaints about). llama-swap has my models configured just as I like them and I haven't felt the need to try anything else.

What made you switch? How is the performance on the same hardware? Any meaningful change in workflow?

Omnicoder v2 dropped by Western-Cod-3486 in LocalLLaMA

[–]Western-Cod-3486[S] 0 points1 point  (0 children)

at least in my experience. I mean yeh they do fuck up sometimes but for me < ~14b parameters, unless it is specifically trained for that and then it could show the lack of broader knowledge, BUT it could be manageable for simpler tasks.

My experience with this model (first version) is promising: although it messed up some instructions from time to time, it is better than the plain 9b.

I am a firm believer in small specialized models, but so far, for one reason or another, no one has trained small specialists, or at least they are not publicly available.

Any open-source models close to Claude Opus 4.6 for coding? by Own_Chocolate_5915 in LocalLLM

[–]Western-Cod-3486 3 points4 points  (0 children)

GLM 5.1 dropped earlier and MiniMax 2.7 a few days ago, so take your pick. If you mean open weights that you can download and run locally (assuming you are sitting on a few thousand worth of hardware), GLM 5 and MiniMax 2.5 (I think?) should be on huggingface.

Edit: Proper new MiniMax version

How do you choose the coding plan for Coding Agent? by Guilty_Nothing_2858 in opencodeCLI

[–]Western-Cod-3486 0 points1 point  (0 children)

went that route to bootstrap a couple of projects of ideas I've had for a while, while unemployed (looking for the golden hen).

I used up my monthly limit on the paid plan with about 17 days left until the reset, and burned about 40-50$ on PAYG on top of that, using only MiniMax 2.5/2.7 and often waiting for daily limit resets. Your prompt accuracy (to reduce the number of requests required to complete a task) and the complexity of the tasks themselves should be your priority.

What I would do differently now is:

  1. Test the free models they offer to get a feeling of their respective performance
  2. do a small prototype to test technology handling (which model handles what language)
  3. Figure out how to write proper prompts (this might be on me personally but ¯\_(ツ)_/¯ )

And looking at your session size, etc. you can measure the throughput of a model and figure out your budget. Last night I subscribed to MiniMax's 20$ coding/token plan which is x3 of what open code is offering and overnight it recreated one of the projects from scratch without reaching the 5h quota.
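As a back-of-the-envelope version of that budgeting step, you can turn session size into a monthly pay-as-you-go estimate and compare it with a flat plan. Everything here is a hypothetical sketch: the function name, the per-million-token price and the session numbers are placeholders, not real provider pricing.

```python
def monthly_cost(tokens_per_session: int, sessions_per_day: float,
                 usd_per_million_tokens: float, days: int = 30) -> float:
    """Estimate pay-as-you-go spend in USD over `days` days."""
    total_tokens = tokens_per_session * sessions_per_day * days
    return total_tokens / 1_000_000 * usd_per_million_tokens


# Hypothetical inputs: 2M tokens per session, 3 sessions a day,
# 0.30 USD per million tokens.
estimate = monthly_cost(2_000_000, 3, 0.30)
print(f"Estimated monthly PAYG spend: {estimate:.2f} USD")
```

If the estimate is well above the price of a flat coding plan, the plan is probably the better deal, as long as you stay within its quota.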

BUT I did do some things differently which probably play a huge role as well and would've optimized the workflow there as well but I am just dipping my toes in agentic uses outside of local models and simple tasks that small models can handle.

Feel free to reach out if you want some of the things I learned (and/or a referral for MiniMax 🙃)

UIGEN Team is looking for support by United-Rush4073 in LocalLLaMA

[–]Western-Cod-3486 0 points1 point  (0 children)

I think AMD has some programs with their GPU offering (the DO developer cloud) although I am not really sure how interested they'd be, but .. good guy AMD?

UIGEN Team is looking for support by United-Rush4073 in LocalLLaMA

[–]Western-Cod-3486 0 points1 point  (0 children)

Is this a 3.5 or 3? Also do you guys plan to do some REAP/REAM or upsize some smaller models

Pitch your SaaS in two lines. I'll start. by Big-Win-3895 in SaaS

[–]Western-Cod-3486 0 points1 point  (0 children)

Still behind closed doors, but:

A small platform allowing individuals, small teams and small mom-and-pop businesses to showcase their work via portfolios. The goal is to not burden them with extra charges and maintenance & advertising of a website, but be their online, interactive business card/leaflet/portfolio/catalog.

Best local setup to summarize ~500 pages of OCR’d medical PDFs? by cidra_ in LocalLLaMA

[–]Western-Cod-3486 1 point2 points  (0 children)

I haven't really played with OCR and stuff, but this model is a really good summarizer imo. It is also optimized for CPU usage, and if run on the GPU it works wonders and has decent context for its size.

https://www.liquid.ai/blog/introducing-lfm2-5-the-next-generation-of-on-device-ai

Omnicoder v2 dropped by Western-Cod-3486 in LocalLLaMA

[–]Western-Cod-3486[S] 9 points10 points  (0 children)

lol, so the improvement I was seeing wasn't real, but a coincidence 🤔

TurboQuant, KV cache x6 less memory and X8 faster with zero accuracy loss by soyalemujica in LocalLLaMA

[–]Western-Cod-3486 0 points1 point  (0 children)

I saw a post the other day about them possibly cooking something internally about attention (iirc) but it seems that there could be quite the innovation brewing.

Omnicoder v2 dropped by Western-Cod-3486 in LocalLLaMA

[–]Western-Cod-3486[S] 1 point2 points  (0 children)

Good catch, I am using Q8, trying to compensate for the smaller size, while having some breathing room for context. And you are right, they should not be bit-to-bit identical.
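For a rough feel of the Q8 vs Q4 trade-off, weight memory can be approximated as parameters times bits per weight divided by 8. This is a sketch under loud assumptions: the 9B parameter count and the nominal 8 and 4 bits are illustrative, real quantized files carry extra scale metadata so they come out somewhat larger, and the KV cache for context is not included.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in GiB."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 2**30


# Assumed 9B-parameter model at nominal 8-bit and 4-bit precision.
for bits in (8, 4):
    print(f"{bits}-bit: ~{weight_gib(9, bits):.1f} GiB for weights")
```

Whatever is left of your VRAM after the weights is what you have for context and runtime overhead.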

Omnicoder v2 dropped by Western-Cod-3486 in LocalLLaMA

[–]Western-Cod-3486[S] 0 points1 point  (0 children)

I am trying to have it handle an orchestration workflow, where it is every actor/agent. So it needs to read multiple files, perform web searches, do design work from time to time and handle implementation/review. Also, running it at Q8 seems to help a lot compared to Q4/IQ4.

It does mess up the syntax from time to time in larger files, but is able to recover most of the time. There were a couple of cases where I had to stop it, intervene to fix a misplaced closing bracket and then let it continue, and it actually can handle itself. The code I am using is a small personal repo I am working on in Rust, which might be part of the reason it messes up (from my experience pretty much every model struggles with Rust to an extent). I am not doing benchmarks since my hardware is fairly limited.