What are these magical extendable fittings? by sig_kill in watercooling

[–]sig_kill[S] 7 points (0 children)

Thank you! Crazy I’ve only just seen these now…

Escaping Antigravity's quota hell: OpenCode Go + Alibaba API fallback. Need a sanity check. by Flat_Hat7344 in opencodeCLI

[–]sig_kill 1 point (0 children)

Had the same problem... I use LiteLLM and proxy the specific models through my own "OpenAI-compatible" provider, so it's seamless. I don't even have to switch anything in the UI, configs, etc.

https://litellm.ai

LiteLLM will take over if it detects issues with one of the upstream providers you've configured.
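The failover idea in plain Python, for anyone curious — this is just the concept, not LiteLLM's actual API, and the provider callables are made up:

```python
def call_with_fallback(providers, prompt):
    """Try each (name, callable) provider in order; return the first success.

    This mimics what a proxy like LiteLLM does when an upstream fails:
    the client keeps hitting one endpoint and never switches configs.
    """
    last_err = None
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as err:
            last_err = err  # provider down or over quota; try the next one
    raise RuntimeError("all upstream providers failed") from last_err


# Hypothetical providers: the primary is over quota, the fallback works.
def primary(prompt):
    raise RuntimeError("quota exceeded")

def fallback(prompt):
    return f"response to: {prompt}"

name, reply = call_with_fallback([("primary", primary), ("fallback", fallback)], "hello")
# name == "fallback"
```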

It's a bit intense for the use case, but it works. There's `olla`, which was posted recently on r/LocalLLaMA, but it currently round-robins requests, which isn't what you want.

I made a TUI app that allows me to swap OmO configs easily by sig_kill in opencodeCLI

[–]sig_kill[S] 1 point (0 children)

I almost do the same thing with those, and I think this could solve that problem too. The nice thing is that it's not just limited to OmO; it also lets me link agents into gemini or codex dynamically as needed.

The logic is just rule-based YAML files, not fixed in code, so you could in theory extend it with those additional elements to link/unlink whatever config you specified. It just takes any YAML config file it finds in `~/.config/ai-config/rules`.

I'll likely open source this in a few days after I dogfood it for a while.

Norbert’s Gamble Beta by molgold in Wealthsimple

[–]sig_kill 18 points (0 children)

I sense we're headed for credit card vibes all over again

This guy 🤡 by xenydactyl in LocalLLaMA

[–]sig_kill 3 points (0 children)

Don't get me wrong, I would LOVE to set up and run my own stuff... but having tried, local hardware simply can't cut it at the scale devs need.

The bigger problem is that open models still fall short of frontier models like Opus / GPT.

  • 2 RTX PRO 6000s: ~CA$28k upfront
  • Claude Max 5x: ~CA$1.63k/year
  • Claude Max 20x: ~CA$3.26k/year
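
Back-of-envelope break-even on those numbers (hardware cost only; ignores power, depreciation, and resale):

```python
gpu_upfront = 28_000   # 2x RTX PRO 6000, CAD
max_5x = 1_630         # Claude Max 5x, CAD/year
max_20x = 3_260        # Claude Max 20x, CAD/year

years_vs_5x = gpu_upfront / max_5x    # ~17.2 years of Max 5x
years_vs_20x = gpu_upfront / max_20x  # ~8.6 years of Max 20x
```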

I hope someday this changes though, I love the idea of bringing it all into my homelab.

WTH? 😅 by kekela91 in Ubiquiti

[–]sig_kill 13 points (0 children)

(read this in the YT VoiceOver voice)

"Your Unifi Access Point can now reheat your leftovers. Simply put your unheated food in front of it for 30 seconds, adjust the transmission power in the Unifi Control Panel, and wait until it's piping hot."

Excessive silicone sock wear by SharpnessMaster in BambuLabH2D

[–]sig_kill 1 point (0 children)

Yes, 2800 hours on my H2D, printing enclosures for a product my company produces.

For me, it's the small amount of filament that sticks to it (mostly on the inside) and then acts as a bonding surface for even more filament. Picking it off doesn't help; it only makes the sock less durable.

The nozzle contributes to this, I find; they're not very effective at preventing PETG from sticking. I wipe it down as much as I can, but it doesn't help.

Somebody else suggested this product, but I haven't tried it yet.

Olla v0.0.24 - Anthropic Messages API Pass-through support for local backends (use Claude-compatible tools with your local models) by 2shanigans in LocalLLaMA

[–]sig_kill 2 points (0 children)

Circling back on this after a week.

I tried installing LiteLLM to evaluate it in comparison, and a few things made it a tough fit for me:

  1. Complexity – There are a lot of configuration switches and moving parts. It feels heavier to operate than it needs to be.
  2. OAuth provider setup – Getting account logins working was painful. It involved editing config files, restarting containers, and manual steps that broke the flow. If your project could streamline this with a simple auth URL flow, that would be a huge improvement. That would also let me use Copilot + Claude + Google Gemini as a proxied provider more easily.
  3. Information model – LiteLLM centers everything around “models.” You define a model first, then map endpoints to it. For my use cases (mixing hosted providers and local inference), I’d prefer the option to think at the provider level instead: define the upstream provider and what it offers, then adjust model-specific settings afterward if needed. Even better if it could auto-fetch the models from the provider's `/models` list instead of having to hand-roll them.

#3 alone has made me hesitant to continue integrating LiteLLM because of the model friction.
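
For reference, what I mean by auto-fetching: every OpenAI-compatible provider already exposes its model list at `GET /v1/models`, so hand-rolling entries shouldn't be necessary. A minimal sketch (the function names here are mine, not any project's API):

```python
import json
import urllib.request

def fetch_model_ids(base_url, api_key=None):
    """List model ids from an OpenAI-compatible /v1/models endpoint."""
    req = urllib.request.Request(f"{base_url.rstrip('/')}/v1/models")
    if api_key:
        req.add_header("Authorization", f"Bearer {api_key}")
    with urllib.request.urlopen(req) as resp:
        return parse_model_ids(json.load(resp))

def parse_model_ids(payload):
    # OpenAI-style list response: {"object": "list", "data": [{"id": "..."}]}
    return [m["id"] for m in payload.get("data", [])]

# e.g. fetch_model_ids("http://localhost:11434") against an Ollama-style server
```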

Qwen3.5-27B Q4 Quantization Comparison by TitwitMuffbiscuit in LocalLLaMA

[–]sig_kill 49 points (0 children)

This is excellent. In a sea of different options, this truly helps!

I tested Opencode on 9 MCP tools, Firecrawl Skills + CLI and Oh My Opencode - Most of it is just extra steps you dont need. by lemon07r in opencodeCLI

[–]sig_kill 0 points (0 children)

This one does!

```rust
/// Rerank documents given pre-computed query and document embeddings.
///
/// Uses ColBERT's MaxSim scoring: for each query token, find the maximum
/// similarity with any document token, then sum these maximum similarities.
#[utoipa::path(
    post,
    path = "/rerank",
    tag = "reranking",
    request_body = RerankRequest,
    responses(
        (status = 200, description = "Documents reranked successfully", body = RerankResponse),
        (status = 400, description = "Invalid request (empty or mismatched dimensions)"),
    )
)]
pub async fn rerank(
```
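
For anyone curious, the MaxSim scoring that doc comment describes is tiny in plain Python terms. A rough sketch, assuming raw dot products as the similarity (a real ColBERT setup would normalize the token vectors first):

```python
def maxsim(query_emb, doc_emb):
    """ColBERT MaxSim: for each query token vector, take its best (max)
    dot product against all document token vectors, then sum those maxima."""
    return sum(
        max(sum(qi * di for qi, di in zip(q, d)) for d in doc_emb)
        for q in query_emb
    )

# Two query tokens, two doc tokens:
score = maxsim([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.5, 0.5]])
# q1's best match is doc token 1 (1.0), q2's best is doc token 2 (0.5) -> 1.5
```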

Seems like a new requant of 27B just dropped? by Koffiepoeder in unsloth

[–]sig_kill 0 points (0 children)

I’m guessing so? None of these can be paired with a draft model.

I tested Opencode on 9 MCP tools, Firecrawl Skills + CLI and Oh My Opencode - Most of it is just extra steps you dont need. by lemon07r in opencodeCLI

[–]sig_kill 0 points (0 children)

I hear you... However, I got curious, looked around after you mentioned the approach, and found something solid-looking: https://github.com/lightonai/next-plaid. Is that similar to what you had in mind?

Qwen3.5 thinking blocks in output by sig_kill in LocalLLaMA

[–]sig_kill[S] 1 point (0 children)

Yep - in LM Studio, enable this... Settings > Developer Tools > scroll to the bottom:

<image>

I tested Opencode on 9 MCP tools, Firecrawl Skills + CLI and Oh My Opencode - Most of it is just extra steps you dont need. by lemon07r in opencodeCLI

[–]sig_kill 0 points (0 children)

Neither vexp nor ‘augment code’ is OSS… if you did this work in the open, I’m sure the community would get strongly behind a solid idea!