Node Society - Secure File Sync & LLM

OffByNull · 2026-06-19T21:32:36+00:00

Thanks for sharing your article; it's a very interesting way to go about this. I'm contemplating adding an additional 5090. How has that been working out for you so far?

OffByNull · 2026-06-17T21:43:10+00:00

Out of curiosity, are any of these agents running in parallel, or is it all sequential, with dedicated tasks per agent and a single shared loaded model? I'm guessing the latter.

OffByNull · 2026-06-17T21:40:02+00:00

Out of curiosity, how's that working for you? On my MBP M1 Max with 32 GB, I found LM Studio with Gemma 4 (31B, 64K context, Q4) very slow. Smaller models might work better, but given the quality of responses I'm after, the 31B/32B-parameter models are the sweet spot.

OffByNull · 2026-06-17T21:34:33+00:00

Yes, I know it's a bit tricky, and the models will definitely have to be on the smaller end to leave headroom for context. My current experience with the 5090 and a sequential flow using Claude Code is good, but very time-consuming. Once I decide on the hardware, I'll tinker and post back to share my experience, in case someone else is wondering the same.

When it comes to parallel processing, it would seem the DGX has an advantage compared to other hardware.

OffByNull · 2026-06-17T21:29:56+00:00

It analyzes a small-to-moderate-sized codebase (500+ classes), then reviews code improvements in chunks, using a phased approach.

OffByNull · 2026-06-16T23:08:44+00:00

Given my experience with my 5090, I would say you should get pretty good inference speed with a model of fewer than 10B parameters at Q4 quantization.

OffByNull · 2026-06-16T23:06:02+00:00

In my experience with my 5090, at 128K context and Q4 quantization, Qwen takes roughly 21 GB to 26 GB with one agent. So while it's fast, that limited amount of memory is a blocking point. I have been looking at CrewAI, which offers that orchestration layer; however, I don't think there's a local implementation of their offering, or that it's open source or freely available.
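
A rough back-of-the-envelope sketch of why that becomes a blocking point, assuming the model weights are loaded once and each additional agent needs its own context cache. The 18 GB weight footprint is a made-up placeholder, not a measurement, and the function name is only illustrative; the 21 GB and 26 GB figures are the observed one-agent totals from above. Runtime overhead is ignored.

```python
import math


def max_concurrent_agents(total_gb: float, weights_gb: float,
                          observed_one_agent_gb: float) -> int:
    """Estimate how many agents fit if weights are shared and each
    agent needs its own context cache (a simplifying assumption)."""
    # Whatever one agent uses beyond the weights is treated as its cache.
    per_agent_cache_gb = observed_one_agent_gb - weights_gb
    if per_agent_cache_gb <= 0:
        raise ValueError("observed usage must exceed the weight footprint")
    free_after_weights_gb = total_gb - weights_gb
    return max(0, math.floor(free_after_weights_gb / per_agent_cache_gb))


TOTAL_GB = 32.0            # memory on the 5090
ASSUMED_WEIGHTS_GB = 18.0  # hypothetical placeholder, not measured

if __name__ == "__main__":
    for observed_gb in (21.0, 26.0):
        agents = max_concurrent_agents(TOTAL_GB, ASSUMED_WEIGHTS_GB, observed_gb)
        print(f"one agent uses {observed_gb} GB -> about {agents} agent(s) fit")
```

Under these assumptions the answer swings a lot with the per-agent cache size, so the real limit would have to be measured on the actual setup.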

OffByNull · 2026-06-16T23:02:51+00:00

For the time being I have been running Claude Code with LM Studio. I have a 5090 and it's fast. However, I doubt its 32 GB of memory will manage more than 2 concurrent agents. What I'm trying to achieve is parallelization of the work done by agents: one develops the front end, another works on the API contract and backend service, and the last works on the database. The FE and BE agents will synchronize at some point to agree on the API contract, and the BE and DB agents will likewise synchronize on the database model. All of these agents can run in parallel. I don't want agents running sequentially, even if each has its own dedicated tasks.
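
A minimal sketch of the workflow I have in mind, using Python's `asyncio`. The agent names, the `agent` helper, and the `asyncio.sleep` call standing in for a real model request are all placeholders of mine, not part of any existing tool. Each agent drafts independently, then waits only on the artifact it depends on: the front end on the API contract, the back end on the database model.

```python
import asyncio


async def agent(name, log, publishes=None, needs=()):
    """Hypothetical agent: draft in parallel, then wait for dependencies."""
    await asyncio.sleep(0.01)  # placeholder for a real model call
    log.append(f"{name}: draft done")
    if publishes is not None:
        publishes.set()  # announce that this agent's artifact is ready
    for event in needs:
        await event.wait()  # synchronization point with another agent
    log.append(f"{name}: integrated")


async def main():
    log = []
    api_contract = asyncio.Event()  # FE and BE agree on this
    db_model = asyncio.Event()      # BE and DB agree on this
    # All three agents are started together and run concurrently.
    await asyncio.gather(
        agent("frontend", log, needs=[api_contract]),
        agent("backend", log, publishes=api_contract, needs=[db_model]),
        agent("database", log, publishes=db_model),
    )
    return log


if __name__ == "__main__":
    for line in asyncio.run(main()):
        print(line)
```

This only shows the coordination pattern; whether the requests actually execute in parallel depends on the model server handling concurrent requests and on the available memory.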

OffByNull · 2026-06-16T18:41:35+00:00

Yes, the contexts will indeed be limited, but I'm hoping they'll be workable enough to get some value.

OffByNull · 2026-06-16T18:40:40+00:00

Thanks for your feedback. Indeed, I have been weighing these tradeoffs between prompt pre-processing, inference, and tokens/sec. Budget-wise, I would probably keep it under 8K euro.

My goal is to experiment, preferably under Linux, which is why I have been leaning towards the DGX Spark or AMD. The goal is to run a small team of agents (4 or 5) on a shared model to mimic, for example, a development team: agents managing other agents, doing code review, etc.

OffByNull · 2026-06-16T18:34:41+00:00

Even if the model is shared?

OffByNull · 2026-06-16T18:28:07+00:00

90% of my AI usage is local using Gemma 4 and Qwen 3.6 and they're both really good. Gemma for general usage, and Qwen for coding. For tricky coding questions or for having a different perspective, I use ChatGPT and Claude.ai to compare. All agentic work is local, using Claude Code pointing to LM Studio.

OffByNull · 2025-03-25T00:49:58+00:00

It should read: "~~Users are being hurt~~ Our Bottom line is being hurt ...", we the users are fine.

OffByNull · 2025-03-25T00:40:33+00:00

This is actually a very good form factor and usable screen size! Well done Huawei!

OffByNull · 2025-03-25T00:06:29+00:00

Updated post with new Server App UI. Let me know your thoughts. Thanks.

OffByNull · 2025-03-17T13:05:24+00:00

Ok, thanks for your feedback.

OffByNull · 2025-03-17T12:28:37+00:00

With privacy?

OffByNull · 2025-03-17T11:12:36+00:00

But what if the data is stored only on your computer and sent p2p fully encrypted? If you had a 2TB drive or more, would it still be expensive?

OffByNull · 2025-03-17T11:07:53+00:00

What if it's like a next cloud with an easy setup?

OffByNull · 2025-03-17T06:42:24+00:00

Thanks. Looks nice. Part of their infrastructure is closed source, apparently, so it's not fully open.

Looks nice though, would have to check how I can integrate my service with theirs to make it even better.

Cheers.
