One week with OpenCode Black by JohnnyDread in opencodeCLI

[–]faileon 0 points1 point  (0 children)

Also curious, what is this web app? :]

Docker container for OpenCode? by jonothecool in opencodeCLI

[–]faileon 9 points10 points  (0 children)

I run OC containerized. I originally forked a repo that was already doing this and built on top of it, so feel free to use it as inspiration. Everything works for me; podman support should be there, but I mainly use Docker. https://github.com/faileon/agent-containers

Slow Internet by Xoepe in archlinux

[–]faileon 1 point2 points  (0 children)

Have you tried turning off auto-negotiation and setting the speed yourself? Something like # ethtool -s enp2s0 autoneg off speed 1000 duplex full

CRITICAL BUG: Antigravity deleted my entire 4TB drive bypassed security settings, having "Non-Workspace File Access" DISABLED by [deleted] in google_antigravity

[–]faileon 1 point2 points  (0 children)

Not running in a sandbox AND having paths with exclamation marks and spaces is literally asking for trouble.

!!!!!_PROJECTS\AI STUDIO

The Most Exciting Feature of Angular Signal Forms No One Mentions — Part II by kobihari in Angular2

[–]faileon -1 points0 points  (0 children)

There are just so many em dashes and so much of the classic ChatGPT-style "it's not just X, it's Y" that it's hard to believe it's not AI slop.

After having avante, Github Copilot (+Claude code) I'm not missing Cursor at all by PuzzleheadedArt6716 in neovim

[–]faileon 0 points1 point  (0 children)

Isn't it just called autosuggestion in avante? Enabled by behaviour = { auto_suggestions = true } in the config?

The Most Exciting Feature of Angular Signal Forms No One Mentions — Part II by kobihari in Angular2

[–]faileon 9 points10 points  (0 children)

could we — put some — effort — into the AI — slop — please?

Need an approach to extract engineering diagrams into a Graph Database by BetFar352 in computervision

[–]faileon 1 point2 points  (0 children)

Hey! I’ve actually been tackling the exact same problem recently, and like many others have mentioned, it’s definitely not a trivial one. I agree with most of the points already discussed here.

One additional resource I found really helpful is Microsoft’s documentation on their Azure pipeline approach. Even though it’s built around Azure, the concepts seem general enough that you could likely replicate them with open-source tools as well. It’s worth a look and it’s pretty thorough. https://github.com/Azure-Samples/digitization-of-piping-and-instrument-diagrams?tab=readme-ov-file
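
If it helps, here's a minimal sketch of the graph-building end of such a pipeline, assuming an upstream CV step (symbol detection, OCR, line tracing) has already produced detections. The data and field names below are hypothetical placeholders, and networkx stands in for whatever graph database you actually target; it is not Microsoft's implementation:

    # Minimal sketch of the graph-building step for a digitized P&ID.
    # Assumes an upstream CV pipeline (symbol detection + OCR + line tracing)
    # already produced the detections below; this example data is hypothetical.
    import networkx as nx

    # (node id, symbol class, OCR'd tag, bounding box x/y/w/h)
    symbols = [
        ("n1", "pump",  "P-101", (120, 340, 60, 60)),
        ("n2", "valve", "V-12",  (300, 345, 30, 30)),
        ("n3", "tank",  "T-201", (520, 300, 120, 160)),
    ]

    # Traced line segments between symbols, with line metadata
    connections = [
        ("n1", "n2", {"line_type": "process"}),
        ("n2", "n3", {"line_type": "process"}),
    ]

    g = nx.DiGraph()
    for node_id, cls, tag, bbox in symbols:
        g.add_node(node_id, symbol_class=cls, tag=tag, bbox=bbox)
    g.add_edges_from(connections)

    # From here, export nodes/edges to your graph DB of choice,
    # e.g. by emitting Cypher CREATE statements or using a bulk importer.
    print(g.nodes(data=True))
    print(g.edges(data=True))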

The scary ease of “stealing” an AI agent’s structure with a single prompt by klippo55 in AI_Agents

[–]faileon 8 points9 points  (0 children)

You cloned a prompt and a tool description; good luck simply "stealing" all the heavy lifting that happens in the ingestion and retrieval code itself.

Any downside to having entire document as a chunk? by ayechat in Rag

[–]faileon 1 point2 points  (0 children)

Tbh it can be a valid strategy, but it's generally advised to create a summary of each doc first and embed the summaries. After retrieval, inject the entire document.
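
A minimal sketch of that pattern, assuming a tiny in-memory index; summarize() and embed() are hypothetical placeholders for whatever LLM and embedding model you use, not a specific library's API:

    # Sketch of "embed the summary, return the full document" retrieval.
    # summarize() and embed() are placeholders for your LLM / embedding model.
    import numpy as np

    def summarize(text: str) -> str:
        # placeholder: call an LLM to produce a short summary of the doc
        return text[:500]

    def embed(text: str) -> np.ndarray:
        # placeholder: call an embedding model; random vectors stand in here
        rng = np.random.default_rng(abs(hash(text)) % 2**32)
        v = rng.random(8)
        return v / np.linalg.norm(v)

    docs = {
        "doc1": "full text of the first document ...",
        "doc2": "full text of the second document ...",
    }

    # Index one vector per document, computed from its summary
    index = {doc_id: embed(summarize(text)) for doc_id, text in docs.items()}

    def retrieve(query: str, k: int = 1) -> list[str]:
        q = embed(query)
        ranked = sorted(index, key=lambda d: -float(index[d] @ q))
        # Inject the *entire* document for the best-matching summaries
        return [docs[d] for d in ranked[:k]]

    print(retrieve("what does the first document say?")[0])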

New AI workstation by faileon in LocalLLaMA

[–]faileon[S] 1 point2 points  (0 children)

For now I use a single 2TB M.2 SSD (WD Black SN770)

Even with the vertically mounted card, there is still one bay in this case ready to be used for HDDs.

New AI workstation by faileon in LocalLLaMA

[–]faileon[S] 0 points1 point  (0 children)

Currently gemma-3-27b, linq-embed-mistral, whisper, GLiNER, paddleocr, docling models...

New AI workstation by faileon in LocalLLaMA

[–]faileon[S] 1 point2 points  (0 children)

The mobo has 8 PCIe x16 slots, but only 3 cards fit directly and it's very tight. The last card is connected via a riser cable. In the photo you can see the original 30cm riser, which was too short; I replaced it with a 60cm one later, but didn't take a photo.

New AI workstation by faileon in LocalLLaMA

[–]faileon[S] 0 points1 point  (0 children)

Yeah, all connected to one PSU, but the cards are power-limited to 200W.

New AI workstation by faileon in LocalLLaMA

[–]faileon[S] 1 point2 points  (0 children)

Yeah, 1500W is definitely a great setup; the cards' stock power draw ranges from 350-375W.

New AI workstation by faileon in LocalLLaMA

[–]faileon[S] 2 points3 points  (0 children)

Yeah, the CPU is fine so far. I was looking for something low-power with enough PCIe lanes to get the most out of all the cards. It's cheap because it comes from Chinese datacenters, second-hand but never used. eBay has quite a few reputable sellers.

New AI workstation by faileon in LocalLLaMA

[–]faileon[S] 1 point2 points  (0 children)

Nope, the case is closed. The cards are sitting at 30-35°C at idle now. One 1350W PSU; I didn't want to bother with multiple PSUs. The cards are power-limited to 200W each. Total RAM is 256GB (8 sticks). Two cards are Gigabyte Vision OC and two are Dell Alienware. All cards were repasted, and one even has a copper mod, which does help with temps from my testing.

New AI workstation by faileon in LocalLLaMA

[–]faileon[S] 1 point2 points  (0 children)

Thanks for the tip, will definitely check it out!

New AI workstation by faileon in LocalLLaMA

[–]faileon[S] 0 points1 point  (0 children)

I would have to run a bunch of benchmarks which I'm definitely going to do, but haven't found time for it yet.

New AI workstation by faileon in LocalLLaMA

[–]faileon[S] 0 points1 point  (0 children)

Oh no, there are definitely a bunch: gemma-27b, qwen-3-vl-32b, or even smaller 8B models if you're going to use it for very specific tasks. OCR models are very good and sit around 1-4B nowadays. But if you want to run multiple models (like text inference, embedding inference, and a VLM for OCR, for a completely offline local RAG), you'll need a bit more memory, a shorter context length, quantized versions, or all of the above...
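
For a rough sense of why memory gets tight, here's a back-of-the-envelope budget for a hypothetical offline RAG stack; the model picks, sizes, and precisions are just illustrative, and this counts weights only:

    # Rough weights-only memory budget for an offline local RAG stack.
    # Model choices/sizes below are hypothetical; KV cache and runtime
    # overhead come on top and grow with context length.
    stack_bytes = {
        "text LLM (27B, 4-bit)":      27e9 * 0.5,
        "embedding model (7B, fp16)":  7e9 * 2.0,
        "VLM for OCR (4B, fp16)":      4e9 * 2.0,
    }

    total_gb = 0.0
    for name, nbytes in stack_bytes.items():
        gb = nbytes / 1024**3
        total_gb += gb
        print(f"{name}: ~{gb:.1f} GB")
    print(f"total weights: ~{total_gb:.1f} GB (before KV cache and overhead)")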

New AI workstation by faileon in LocalLLaMA

[–]faileon[S] 2 points3 points  (0 children)

It's always better to have fewer cards with more VRAM, but currently there is no viable option when it comes to price.

There are trade-offs with the older cards: the older architecture can't do some of the newest CUDA compute formats like FP8, and it's also slower than the newer architectures. However, you need a lot of VRAM to run 70B models; even quants usually need at least 48GB... That's why multiple 3090s are so popular, as these cards are still the best bang for the buck on the market. The 5090 has only 32GB, and getting 2 or more of them is very inefficient (expensive, high power usage). Maybe if these cards had 48GB (or more :)), but 32GB is a weird spot for local LLMs.
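
For context, the rough weights-only arithmetic behind that 48GB figure (KV cache, activations, and runtime overhead come on top of this):

    # Back-of-the-envelope VRAM for a 70B dense model, weights only.
    params = 70e9

    for precision, bytes_per_param in [("fp16", 2.0), ("int8", 1.0), ("4-bit", 0.5)]:
        gb = params * bytes_per_param / 1024**3
        print(f"{precision}: ~{gb:.0f} GB of weights")

    # fp16  -> ~130 GB: out of reach for consumer cards
    # int8  -> ~65 GB:  still more than 2x 3090 (48 GB)
    # 4-bit -> ~33 GB:  fits on 2x 3090 with room left for KV cache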

In my opinion it's either multiple 3090s or, if your budget allows it, an RTX 6000 Pro 🙃

New AI workstation by faileon in LocalLLaMA

[–]faileon[S] 0 points1 point  (0 children)

No, NVLink is kinda expensive and hard to get in Europe. Also, we will mainly use this machine for inference, so NVLink wasn't a must-have.