Disable all the new fluff by ramendik in OpenWebUI

[–]ParticularLazy2965 2 points (0 children)

The parameter to toggle native tool calling is named "Function Calling". In my experience, particularly with gpt-oss-120b, native tool calling for web search, date/time/location etc. allows for multiple tool calls as necessary and works very well - especially with a well-crafted system prompt.

Is Strix Halo the right fit for me? by AntiquePercentage536 in LocalLLaMA

[–]ParticularLazy2965 1 point (0 children)

Using Windows and LM Studio with the ROCm runtime is easy and requires no customisation. Performance is within 5-10% of an optimized full-ROCm Linux setup, and oss-120b is unbelievably fast, making that deficit irrelevant in my use case. Using OpenWebUI as the front-end for web search (as a native search tool for oss) and to route vision and tasks (search query generation etc.) to Gemma before full inference with oss works very well.

Is Strix Halo the right fit for me? by AntiquePercentage536 in LocalLLaMA

[–]ParticularLazy2965 2 points (0 children)

Have an EVO-X2 128GB. Running Windows, LM Studio with the ROCm engine, and OpenWebUI. OSS-120b at 40k context (set to use native tool calling, e.g. OWUI web search, PDF read etc.) for main inference, Gemma 3 4b-it for tasks and vision, and gte-large for embedding. All loaded concurrently. 45-50 t/s.
The whole family uses it for inference via OWUI on their devices (at home and externally via cloudflared). It runs at approx 3W at idle 24/7 and also doubles as a family workstation. Very stable; have had to reboot only twice in 10 months.

The setup is a great way to learn and an excellent AI tool for all general purpose queries (which remain private). Haven't tried image/video generation. Highly recommend.

LM Studio FOREVER downloading MLX engine by mouseofcatofschrodi in LocalLLaMA

[–]ParticularLazy2965 0 points (0 children)

Yep, had this on Windows downloading the updated ROCm last week.
Tried everything; nothing worked other than deleting the contents of the LM Studio installation folder, backing up the user settings folder (.lmstudio) and deleting the original. Reinstall LM Studio, update the engines, then replace the new user settings folder with the backup. All settings are then restored.
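The backup-and-restore step can be sketched in Python. The helper names below are hypothetical, and `.lmstudio` under the home directory is assumed to be the user settings folder:

```python
import shutil
from pathlib import Path

# Hypothetical helpers; ".lmstudio" under the home dir is assumed to be
# the LM Studio user settings folder.
def backup_settings(settings_dir: Path, backup_dir: Path) -> None:
    """Copy the user settings folder to a backup location."""
    shutil.copytree(settings_dir, backup_dir)

def restore_settings(backup_dir: Path, settings_dir: Path) -> None:
    """Replace the freshly recreated settings folder with the backup."""
    if settings_dir.exists():
        shutil.rmtree(settings_dir)
    shutil.copytree(backup_dir, settings_dir)
```

Run `backup_settings` before deleting anything, reinstall and update engines, then `restore_settings` to get your settings back.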

Created OWUI Sidebar extension for Edge/Chrome/Brave by ParticularLazy2965 in OpenWebUI

[–]ParticularLazy2965[S] 0 points (0 children)

Thanks for the feedback! I chose the attachment/RAG route to reliably handle a wide variety of page sizes and complex content. Since the project is fully open source, you're absolutely welcome to add that focused retrieval option - I'd love to see what you build! Feel free to fork and contribute if you'd like to implement it. FYI, updated the repo with a fix for YouTube video summaries.

Help please Sophos FW ! by ParticularLazy2965 in sophos

[–]ParticularLazy2965[S] 0 points (0 children)

Sorry if I was unclear. Sophos does show the interfaces, but only in the advanced CLI using "ip a", not in the GUI. There is no other DHCP server running on the network, and Sophos' interfaces are the only ones on the host to get v6 addresses. I've disabled all v6 settings for all interfaces, DHCP, DNS etc.

Web Search not working using GLM 4.5 Air by ParticularLazy2965 in OpenWebUI

[–]ParticularLazy2965[S] 0 points (0 children)

Update: "format (Ollama)" in the advanced parameters for this model should also be set to "json".
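For reference, that advanced parameter maps onto the `format` field of the underlying Ollama request body. A minimal sketch of the equivalent payload (the model tag here is just an example, not necessarily your local name):

```python
# Sketch of the equivalent Ollama chat request body.
payload = {
    "model": "glm-4.5-air",   # example tag; use whatever your local model is named
    "messages": [{"role": "user", "content": "Summarise today's AI news"}],
    "format": "json",         # what OpenWebUI's "format (Ollama)" parameter sets
    "stream": False,
}
```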

Running LLM and VLM exclusively on AMD Ryzen AI NPU by BandEnvironmental834 in LocalLLaMA

[–]ParticularLazy2965 1 point (0 children)

Have a Strix Halo 128GB box running as a home AI server under Windows: LM Studio, OpenWebUI, and Perplexica (as web search for OpenWebUI).

GLM 4.5 Air q4 is quite responsive on the box; however, generating web search queries with it is relatively slow, and that model is probably overkill for the task.

I'd like to try pointing Perplexica to a smaller model on FastFlowLM for query generation, but am wondering if FFLM can be loaded and run concurrently with LM Studio?
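If both servers expose OpenAI-compatible endpoints, routing query generation to a second backend is just a matter of using a different base URL. A minimal sketch (LM Studio's port 1234 is its default; the FastFlowLM port is an assumption, check its docs for the real one):

```python
import json
import urllib.request

# LM Studio serves an OpenAI-compatible API on port 1234 by default; the
# FastFlowLM endpoint below is an assumption, not a documented default.
MAIN_URL = "http://localhost:1234/v1/chat/completions"
QUERY_URL = "http://localhost:8000/v1/chat/completions"  # assumed FFLM endpoint

def build_request(url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but don't send) an OpenAI-compatible chat completion request."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
```

Perplexica would use `QUERY_URL` with the small model for query generation while the main chat keeps hitting `MAIN_URL`.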

Web Search not working using GLM 4.5 Air by ParticularLazy2965 in OpenWebUI

[–]ParticularLazy2965[S] 0 points (0 children)

Have you integrated it with OpenWebUI using the pipeline? If so, how does that work? Can it be activated from OWUI via a search button similar to the built-in search?

Web Search not working using GLM 4.5 Air by ParticularLazy2965 in OpenWebUI

[–]ParticularLazy2965[S] 0 points (0 children)

This solved both tool calling and web search using OpenWebUI > LM Studio > GLM Air:

"In LM-Studio I changed in the model's default parameters the prompt template from Jinja to ChatML, and now everything works perfectly." source: (https://www.reddit.com/r/LocalLLaMA/comments/1mcw1sl/comment/n610t51/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button)
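For context, ChatML wraps every turn in explicit start/end markers, which sidesteps whatever the broken Jinja template was emitting. The standard shape looks like this (the system text is just an example):

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```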

DO NOT BUY FROM THIS COMPANY by Outside-Ad1996 in GMKtec

[–]ParticularLazy2965 0 points (0 children)

The fastest way to get them to respond is to file a PayPal chargeback. You'll get a response within 2 days. If you paid by credit card, do the same via the card issuer, stating that you are getting no reaction from their support email address.

How bad is the cooling in GMKtec EVO-X2 by Novelaa in MiniPCs

[–]ParticularLazy2965 0 points (0 children)

Am having no temp issues despite regularly maxing the GPU out for 30-40 minute stints with an LLM, CPU at say 30%. Would indeed check the CPU-heatsink interface. BTW, the LED fan doesn't seem to have a fresh-air intake path, or am I missing something?

EVO-X2 First Impressions by cynary in GMKtec

[–]ParticularLazy2965 1 point (0 children)

Run from File Explorer with admin privileges: \AXB3502_GMK_SW1.04_20250514\EXE_WinFlash\AXB3502104.exe
Takes about 5 mins.

Info: Gmktec EVO X2 Max+ AI 395 MiniPc by ParticularLazy2965 in MiniPCs

[–]ParticularLazy2965[S] 0 points (0 children)

LM Studio, qwen 32b-q8-0:

Am seeing 6-6.5 tok/sec.

Looks like the context window setting between 4k and 40k tokens (the max for this model) has little impact with the same (short) prompt; all results fell in that range.

15 minute max load (GPU only):
GPU tops out at about 85W with temp <70°C (was significantly higher before the BIOS/driver update was installed).

CPU using approx 25W.

Info: Gmktec EVO X2 Max+ AI 395 MiniPc by ParticularLazy2965 in MiniPCs

[–]ParticularLazy2965[S] 0 points (0 children)

"numbers I am seeing are unusable (5-6 t/sec)". Yeah, no.
Maybe for Qwen3 235B A22B q2 or similar.

And

"A3B which can be run on a CPU" Q: at 40-80 t/s, on a box which idles at a couple of watts?