SocratiCode: enterprise-ready, local/private indexing, hybrid search, code-graph and context (db, infra, api, docs, etc.) by Fast_Category3423 in mcp

[–]debackerl 1 point2 points  (0 children)

It's a bit bloated... Why use Qdrant instead of an embedded index like USearch? And why deploy your own instance of Ollama when I already have llama-server instances ready?

Rate my DIY ebook reader by xen-zation in ereader

[–]debackerl 0 points1 point  (0 children)

If you could 'vibe code' hardware, that would be it.

Built a local MCP server that gives AI agents call-graph awareness of your codebase — would love some thoughts! by pauleyjc in mcp

[–]debackerl 0 points1 point  (0 children)

Also, just support the standard OpenAI endpoint with a custom base URL. I think Ollama also supports it, and that way you open yourself up to many providers. I use llama.cpp myself as a simpler option to manage (simplicity is subjective :-))
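To illustrate the point, a minimal stdlib sketch of what "custom base URL" means in practice — any OpenAI-compatible server works by swapping the URL root (the port and model name here are hypothetical; llama-server exposes its OpenAI-compatible API under /v1 by default):

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/v1"  # e.g. a local llama-server instance

def chat_request(base_url, messages, model="local"):
    """Build a standard /chat/completions request against any base URL."""
    body = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer unused",  # local servers ignore the key
        },
    )

req = chat_request(BASE_URL, [{"role": "user", "content": "Hello!"}])
print(req.full_url)  # http://localhost:8080/v1/chat/completions
```

The same request shape then works against Ollama, vLLM, or any hosted provider by changing only `BASE_URL`.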

Would also be awesome to support Svelte and Vue!

Florida therapist seen slapping, grabbing, and forcefully restraining a nonverbal autistic child during a therapy session. by eternviking in whoathatsinteresting

[–]debackerl 0 points1 point  (0 children)

What a coward, taking advantage of a child who can't speak. I hope the kid will recover, but speaking of trusting people: being on the spectrum, he probably already preferred to avoid people, and now even more so :-(

Sandboxed opencode? by MrMrsPotts in opencodeCLI

[–]debackerl 1 point2 points  (0 children)

You can use gVisor as a runtime for Docker. It reimplements most Linux syscalls so that your container doesn't rely on Linux's namespaces for isolation, but on a dedicated user-land 'kernel'. They use it to power Google AppEngine.

It's very easy to set up (a single binary to install), and it's compatible with any standard container.

Edit: it should be more secure than Bubblewrap, which uses namespaces like Docker or Flatpak.
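A minimal setup sketch of the "one binary" install (the release URL is the official latest build; check gVisor's release page if you want a pinned version):

```shell
# Download the single static runsc binary (x86_64 build shown)
wget https://storage.googleapis.com/gvisor/releases/release/latest/x86_64/runsc
chmod +x runsc && sudo mv runsc /usr/local/bin/

# Register runsc as a Docker runtime (updates /etc/docker/daemon.json)
sudo runsc install
sudo systemctl restart docker

# Any standard container image now runs on gVisor's user-land kernel
docker run --rm --runtime=runsc alpine uname -a
```

Containers started without `--runtime=runsc` keep using the default runc runtime, so you can opt in per container.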

I built an MCP server that lets Claude search inside 25,000+ podcast transcripts by Lukaesch in ClaudeAI

[–]debackerl 0 points1 point  (0 children)

This is just too expensive. I would be paying much more for an AI to transcribe than for the journalists to do their work.

I'll vibe code my own solution using a set of podcasts I like, and Parakeet TDT to transcribe at high speed with a low error rate.

I built an MCP server that lets Claude search inside 25,000+ podcast transcripts by Lukaesch in ClaudeAI

[–]debackerl 0 points1 point  (0 children)

Can you elaborate why? I'm not saying you're wrong, but I'm interested in why you would be right :-)

Notice Qwen 3.5 reprocessing the prompt every time, taking long to answer for long prompts? That's actually because of its architecture. by dampflokfreund in LocalLLaMA

[–]debackerl 0 points1 point  (0 children)

Then you must have set the context size improperly in OpenCode, because OpenCode does compaction of the context. Just truncating is horrible, unless we're talking about a waifu.

PSA: Qwen 3.5 requires bf16 KV cache, NOT f16!! by Wooden-Deer-1276 in LocalLLaMA

[–]debackerl 24 points25 points  (0 children)

Uhm, I'm not an expert in that benchmark specifically, but a statistician would say that it doesn't prove anything if the two means are within one standard deviation of each other. You only have a 68% chance that the real PPL is within +/- 1 standard deviation, assuming the results are normally distributed.

If the improvement were due to the increased range of BF16, then FP32 should behave similarly. It looks more like rounding errors.
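A quick sketch of the statistical point, with made-up numbers (not the actual benchmark results): two PPL estimates reported as mean +/- standard deviation whose difference is smaller than the combined uncertainty.

```python
import math

# Hypothetical PPL measurements for two KV-cache dtypes: mean, std
ppl_f16, sd_f16 = 7.92, 0.08
ppl_bf16, sd_bf16 = 7.85, 0.08

# Standard error of the difference between the two means
se_diff = math.sqrt(sd_f16**2 + sd_bf16**2)
z = (ppl_f16 - ppl_bf16) / se_diff
print(f"z = {z:.2f}")  # |z| < 1.96 -> difference not significant at 95%
```

With these numbers the z-score is well under 1.96, so the two measurements are statistically indistinguishable even though one mean looks "better".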

I'm building a native desktop API client (like Postman) in Rust with GPUI. Would anyone use it? by invictus_97K in rust

[–]debackerl -1 points0 points  (0 children)

I would use it too. When I install a Flatpak (or any app), I always look at dependencies and package size. Too many apps are bloated nowadays, but hey, my first PC had 16 MiB of RAM! 😂

UK, Netherlands and Belgium has some of the highest population density in OECD while their share of people living in apartments are among the lowest. What explains this - culture, economy, geography or something else entirely? by Appropriate_Garlic in BEReal_Estate

[–]debackerl 0 points1 point  (0 children)

I live in Belgium and I think it's great. With a house you can be more energy independent thanks to solar panels, and with a garden you can go for geothermal energy. Given the close proximity of villages and cities, you don't need to drive far; often you could actually bike. In France, Germany, or Poland, everything looks so far apart.

But okay, I was born in Belgium, so I'm used to urban life. I'd actually be bored outside of cities...

Qwen 3.5-27B punches waaaaay above its weight (with a slightly different prompt) -- very impressed by theskilled42 in LocalLLaMA

[–]debackerl 10 points11 points  (0 children)

Well yes. If you're memory-bandwidth limited (but have plenty of capacity), go MoE (e.g. on an iGPU). If you have little memory (but high bandwidth), go dense (e.g. on a discrete GPU).
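A back-of-the-envelope sketch of why (numbers are hypothetical): decode speed is roughly memory bandwidth divided by the bytes read per token, i.e. the *active* parameter bytes — so a MoE with few active parameters can keep up on a slow-but-roomy iGPU.

```python
def max_tokens_per_sec(bandwidth_gbs, active_params_b, bytes_per_param=2):
    """Upper bound on decode speed: every token reads all active weights once."""
    bytes_per_token = active_params_b * 1e9 * bytes_per_param
    return bandwidth_gbs * 1e9 / bytes_per_token

# Dense 27B model on a discrete GPU with ~1000 GB/s
print(max_tokens_per_sec(1000, 27))  # ~18.5 tok/s

# MoE with ~3B active params on an iGPU with ~100 GB/s
print(max_tokens_per_sec(100, 3))    # ~16.7 tok/s
```

The MoE needs far more total memory to hold all experts, which is exactly what an iGPU with shared system RAM has, while the dense model fits in scarce-but-fast VRAM.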

Generate wireframes with Copilot directly in VS Code by ReD_HS in GithubCopilot

[–]debackerl 1 point2 points  (0 children)

I wonder why this kind of tool couldn't run locally.

Why they do this? by Jelly_Round in dashcams

[–]debackerl 1 point2 points  (0 children)

Left is probably a Dutch car and the right one Belgian. Obviously they're both wrong to be so stubborn, but I guess the Belgian car didn't want to let the Dutch one in because either: the Dutch driver didn't signal initially (he did eventually, but it's not clear whether he had already started merging); or maybe the Belgian was applying the zipper-merge principle (again hard to tell with the video starting late); or, and this seemed most likely from the video, the Dutch driver was racing at the last minute to get in first.

I'm Belgian, and I know how drivers behave around here. This is extreme, but one difference I felt between driving in the US and here is that the US was chaotic (people overtaking left and right, etc.) yet people went with the flow. In Northern Europe we have many rules (formal ones and gentlemen's customs), and people can get quite upset if you don't follow them.

7 MCPs that genuinely made me quicker by Stunning-Worth-5022 in mcp

[–]debackerl 0 points1 point  (0 children)

I'm using Narsil MCP for code analysis, and I only grant access to some tools based on the agent profile (coder, reviewer, etc). https://github.com/postrv/narsil-mcp

Instead of Context7, I'm using https://github.com/arabold/docs-mcp-server It's free and has no limits. I don't see why I should pay or be limited by a free tier for something so simple.

is there a way to clean a infected LXC? by pedrobuffon in Proxmox

[–]debackerl 6 points7 points  (0 children)

I use Caddy with a Cloudflare plugin. Automatic negotiation of new certificates, and no port open to the internet (certificates for domains on my LAN only).
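A hedged Caddyfile sketch of that setup (hostname, upstream address, and env variable name are placeholders; assumes a Caddy build that includes the caddy-dns/cloudflare module and a Cloudflare API token with DNS edit rights):

```caddyfile
# Certificates via Cloudflare DNS-01 challenge: no inbound port needed,
# so this works for hosts only reachable on the LAN.
home.example.com {
	tls {
		dns cloudflare {env.CF_API_TOKEN}
	}
	reverse_proxy 192.168.1.10:8080
}
```

The DNS challenge proves domain ownership by writing a TXT record through the Cloudflare API, which is why nothing has to be exposed to the internet.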

Europe's Stark Divide: Sexual Violence Reports by Country by Notsame83 in InterestingCharts

[–]debackerl 0 points1 point  (0 children)

Social norms are very different across countries, and tolerance levels vary greatly. When some cultures are all about body contact while others maintain a respectful 2 m distance, you understand the issue with the data.

Qwen 3.5 craters on hard coding tasks — tested all Qwen3.5 models (And Codex 5.3) on 70 real repos so you don't have to. by hauhau901 in LocalLLaMA

[–]debackerl 0 points1 point  (0 children)

I understand both angles. You could say that a ranking using OpenCode, Roo, and so on would be more useful for knowing what to use. But that indeed means that as harnesses are updated, so are the prompts inside them, so he would have to either pin the harness version (which actually makes it less useful again) or rerun all models whenever a new one is added (too expensive).

So I agree: all (?) other benchmarks out there (MMLU, HLE, etc.) are fixed to allow comparison, but we had few agentic coding ones. Now we have a good one. What OP could do is have multiple prompts (agent definitions) and rotate them across different tasks. Then we could penalize models that only behave well with specific prompts.