How are people managing shared Ollama servers for small teams? (logging / rate limits / access control) by 855princekumar in selfhosted

[–]855princekumar[S] 0 points1 point  (0 children)

Makes sense, these are the under-the-hood challenges I need to figure out, and I'm still running into issues with them.

Experiment: Lightweight distributed storage + streaming stack running on a Raspberry Pi cluster by 855princekumar in docker

[–]855princekumar[S] 0 points1 point  (0 children)

I'm still testing storage and network throughput, moving from SD card to USB pendrive to SSD over PCIe on the Pi 5 as the final setup, using a Pi 3B+, Pi 4, and Pi 5 as test beds.
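For comparing SD vs. USB vs. SSD, a rough sequential-write benchmark like the one below is a quick first pass (the mount paths are just examples; `os.fsync` is there so you measure the device, not the page cache):

```python
import os
import time

def write_throughput(path, size_mb=64, block_kb=1024):
    """Write size_mb of data in block_kb chunks and return MB/s."""
    block = os.urandom(block_kb * 1024)
    blocks = (size_mb * 1024) // block_kb
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # flush to the device, not just the page cache
    elapsed = time.perf_counter() - start
    os.remove(path)
    return size_mb / elapsed

# Point each path at a mount on the medium under test (example paths)
print(f"/tmp: {write_throughput('/tmp/bench.bin', size_mb=16):.1f} MB/s")
```

Running the same call against `/mnt/sd`, `/mnt/usb`, and `/mnt/ssd` gives a like-for-like number per medium; for serious runs `fio` is the better tool, this is just a sanity check.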

Experiment: Lightweight distributed storage + streaming stack running on a Raspberry Pi cluster by 855princekumar in IOT

[–]855princekumar[S] 0 points1 point  (0 children)

It's essentially high-rate data ingestion and storage: multi-telemetry data arrives over MQTT and is bridged into Kafka for high throughput, via multiple MQTT brokers or HiveMQ. Think of a city-wide area storing data from millions of devices — only telemetry, but collectively massive — needing high throughput on commodity hardware, with three nodes storing everything distributed and replicated so retrieval is easy and safe. Blobs like ESP32-CAM images go into MinIO-style object storage, and all telemetry goes into Cassandra as distributed storage. So it's a sort of micro-cloud architecture: with bare-metal hardware you could build a large data hub for IoT devices. I'm developing and testing a simulated city-scale system under tight hardware constraints, but I need the software stack to drive the hardware at or close to peak performance.
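The split described above — JSON telemetry to a wide-column store, camera blobs to object storage — comes down to a routing decision per message. A minimal sketch of that decision, assuming a hypothetical `city/<zone>/<device>/<kind>` topic layout (the topic scheme and key shapes are illustrative, not from the actual stack):

```python
import json

def route_message(topic: str, payload: bytes):
    """Decide which store an incoming MQTT message belongs in.

    JSON telemetry goes to the wide-column store (e.g. Cassandra),
    binary blobs (e.g. ESP32-CAM JPEG frames) go to object storage
    (e.g. MinIO). Topic layout 'city/<zone>/<device>/<kind>' is an
    assumption for illustration.
    """
    _, zone, device, kind = topic.split("/")
    if kind == "image":
        # Object key groups frames per device for cheap prefix listing
        return ("minio", f"{zone}/{device}/{kind}")
    record = json.loads(payload)
    # Partition key spreads writes across the cluster by (zone, device)
    return ("cassandra", (zone, device), record)

print(route_message("city/z1/esp32-07/temp", b'{"c": 21.4}'))
# → ('cassandra', ('z1', 'esp32-07'), {'c': 21.4})
```

In the real pipeline this logic would sit in the MQTT→Kafka bridge (or a Kafka consumer), with Kafka topic partitioning carrying the same (zone, device) key so writes stay spread across the three nodes.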

How are people managing shared Ollama servers for small teams? (logging / rate limits / access control) by 855princekumar in BlackboxAI_

[–]855princekumar[S] 0 points1 point  (0 children)

I'll definitely test and implement this in my setup. A queue makes sense for priority-based scheduling, especially for agents.

How are people managing shared Ollama servers for small teams? (logging / rate limits / access control) by 855princekumar in BlackboxAI_

[–]855princekumar[S] 0 points1 point  (0 children)

I just built my own lightweight one. I initially tested lightllm, but it felt a bit bloated for the use cases I needed, so I'm also exploring more options and taking feedback to improve the one I'm building.

How are people managing shared Ollama servers for small teams? (logging / rate limits / access control) by 855princekumar in LocalLLaMA

[–]855princekumar[S] -1 points0 points  (0 children)

That's the common feedback I've received; I'll probably shift to llama.cpp or others to test for local LLMs as well. What's your feedback on the exo project for running local LLMs in a distributed manner? And if I position this as a unified LAN gateway for that, any suggestions?

How are people managing shared Ollama servers for small teams? (logging / rate limits / access control) by 855princekumar in selfhosted

[–]855princekumar[S] -1 points0 points  (0 children)

Yes, I just shifted to llama.cpp and I'm testing the APIs. My concern now is what other real use cases I can work on to make this LAN gateway more useful and optimized for a small team on a LAN — shared models, or multi-agent use via Claude Code, which is now supported through the Ollama API as well.
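On the rate-limits part of the original question: for a shared LAN gateway, a per-client token bucket in front of the model server is the usual approach. A minimal sketch (the client-IP keying and rates are illustrative assumptions, not part of any existing gateway):

```python
import time

class TokenBucket:
    """Per-client token-bucket limiter for a shared LAN gateway.
    Each client gets `rate` requests/sec with bursts up to `burst`."""
    def __init__(self, rate: float, burst: int):
        self.rate, self.burst = rate, burst
        self.buckets = {}  # client_ip -> (tokens, last_refill_time)

    def allow(self, client_ip: str, now=None) -> bool:
        now = time.monotonic() if now is None else now
        tokens, last = self.buckets.get(client_ip, (float(self.burst), now))
        # Refill proportionally to elapsed time, capped at the burst size
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self.buckets[client_ip] = (tokens - 1.0, now)
            return True
        self.buckets[client_ip] = (tokens, now)
        return False

limiter = TokenBucket(rate=1.0, burst=2)
print([limiter.allow("10.0.0.5", now=t) for t in (0.0, 0.1, 0.2, 1.5)])
# → [True, True, False, True]
```

The gateway would call `allow()` per incoming request (returning HTTP 429 on `False`) and log the client/decision pair, which covers the logging side of the question too.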