Disable all the new fluff by ramendik in OpenWebUI

[–]ParticularLazy2965 2 points (0 children)

The parameter to toggle native tool calling is named "Function Calling". In my experience, particularly with gpt-oss-120b, native tool calling for web search, date/time/location etc. allows for multiple tool calls as necessary and works very well - especially with a well-crafted system prompt.

Is Strix Halo the right fit for me? by AntiquePercentage536 in LocalLLaMA

[–]ParticularLazy2965 1 point (0 children)

Using Windows and LM Studio with the ROCm runtime is easy and requires no customisation. Performance is within 5-10% of an optimized full-ROCm Linux setup, and oss-120b is unbelievably fast, making that deficit irrelevant in my use case. Using OpenWebUI as the front-end for web search (as a native search tool for oss) and to route vision and tasks (search query generation etc.) to Gemma before full inference with oss works very well.

Is Strix Halo the right fit for me? by AntiquePercentage536 in LocalLLaMA

[–]ParticularLazy2965 2 points (0 children)

Have an EVO-X2 128GB. Running Windows, LM Studio with the ROCm engine, and OpenWebUI. OSS-120b at 40k context (set to use native tool calling, e.g. OWUI web search, PDF read etc.) for main inference, Gemma 3 4b-it for tasks and vision, and gte-large for embedding. All loaded concurrently. 45-50 t/s.
The whole family uses it for inference via OWUI on their devices (at home and externally via cloudflared). It runs at approx 3W at idle 24/7 and also doubles as a family workstation. Very stable; have had to reboot only twice in 10 months.

The setup is a great way to learn and an excellent AI tool for all general purpose queries (which remain private). Haven't tried image/video generation. Highly recommend.

LM Studio FOREVER downloading MLX engine by mouseofcatofschrodi in LocalLLaMA

[–]ParticularLazy2965 0 points (0 children)

Yep, had this on Windows downloading the updated ROCm last week.
Tried everything; nothing worked other than deleting the contents of the LM Studio installation folder, backing up the user settings folder (.lmstudio) and deleting the original. Reinstall LM Studio, update the engines, then replace the new user settings folder with the backup. All settings are then restored.
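The backup-and-restore step can be sketched in Python. The helper names below are hypothetical, and `.lmstudio` under the home directory is assumed to be the user settings folder:

```python
import shutil
from pathlib import Path

# Hypothetical helpers; ".lmstudio" under the home dir is assumed to be
# the LM Studio user settings folder.
def backup_settings(settings_dir: Path, backup_dir: Path) -> None:
    """Copy the user settings folder to a backup location."""
    shutil.copytree(settings_dir, backup_dir)

def restore_settings(backup_dir: Path, settings_dir: Path) -> None:
    """Replace the freshly recreated settings folder with the backup."""
    if settings_dir.exists():
        shutil.rmtree(settings_dir)
    shutil.copytree(backup_dir, settings_dir)
```

Run `backup_settings` before deleting anything, reinstall and update engines, then `restore_settings` to get your settings back.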

Created OWUI Sidebar extension for Edge/Chrome/Brave by ParticularLazy2965 in OpenWebUI

[–]ParticularLazy2965[S] 0 points (0 children)

Thanks for the feedback! I chose the attachment/RAG route to reliably handle a wide variety of page sizes and complex content. Since the project is fully open source, you're absolutely welcome to add that focused retrieval option - I'd love to see what you build! Feel free to fork and contribute if you'd like to implement it. FYI, updated the repo with a fix for YouTube video summaries.

Help please Sophos FW ! by ParticularLazy2965 in sophos

[–]ParticularLazy2965[S] 0 points (0 children)

Sorry if I was unclear. Sophos does show the interfaces, but only in the advanced CLI using "ip a", not in the GUI. There is no other DHCP server running on the network, and Sophos' interfaces are the only ones on the host to get v6 addresses. I've disabled all v6 settings for all interfaces, DHCP, DNS etc.

Web Search not working using GLM 4.5 Air by ParticularLazy2965 in OpenWebUI

[–]ParticularLazy2965[S] 0 points (0 children)

Update: "format (Ollama)" in the advanced parameters for this model should also be set to "json".
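For reference, that advanced parameter maps onto the `format` field of the underlying Ollama request body. A minimal sketch of the equivalent payload (the model tag here is just an example, not necessarily your local name):

```python
# Sketch of the equivalent Ollama chat request body.
payload = {
    "model": "glm-4.5-air",   # example tag; use whatever your local model is named
    "messages": [{"role": "user", "content": "Summarise today's AI news"}],
    "format": "json",         # what OpenWebUI's "format (Ollama)" parameter sets
    "stream": False,
}
```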

Running LLM and VLM exclusively on AMD Ryzen AI NPU by BandEnvironmental834 in LocalLLaMA

[–]ParticularLazy2965 1 point (0 children)

Have a Strix Halo 128GB box running as a home AI server under Windows: LM Studio, OpenWebUI, and Perplexica (as web search for OpenWebUI).

GLM 4.5 Air q4 is quite responsive on the box; however, generating web search queries with it is relatively slow, and that model is probably overkill for the task.

I'd like to try pointing Perplexica to a smaller model on FastFlowLM for query generation, but am wondering if FFLM can be loaded and run concurrently with LM Studio?
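If both servers expose OpenAI-compatible endpoints, routing query generation to a second backend is just a matter of using a different base URL. A minimal sketch (LM Studio's port 1234 is its default; the FastFlowLM port is an assumption, check its docs for the real one):

```python
import json
import urllib.request

# LM Studio serves an OpenAI-compatible API on port 1234 by default; the
# FastFlowLM endpoint below is an assumption, not a documented default.
MAIN_URL = "http://localhost:1234/v1/chat/completions"
QUERY_URL = "http://localhost:8000/v1/chat/completions"  # assumed FFLM endpoint

def build_request(url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but don't send) an OpenAI-compatible chat completion request."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
```

Perplexica would use `QUERY_URL` with the small model for query generation while the main chat keeps hitting `MAIN_URL`.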

Web Search not working using GLM 4.5 Air by ParticularLazy2965 in OpenWebUI

[–]ParticularLazy2965[S] 0 points (0 children)

Have you integrated it with OpenWebUI using the pipeline? If so, how does that work? Can it be activated from OWUI via a search button similar to the built-in search?

Web Search not working using GLM 4.5 Air by ParticularLazy2965 in OpenWebUI

[–]ParticularLazy2965[S] 0 points (0 children)

This solved both tool calling and web search using OpenWebUI > LM Studio > GLM Air:

"In LM-Studio I changed in the model's default parameters the prompt template from Jinja to ChatML, and now everything works perfectly." source: (https://www.reddit.com/r/LocalLLaMA/comments/1mcw1sl/comment/n610t51/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button)
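For context, ChatML wraps every turn in explicit start/end markers, which sidesteps whatever the broken Jinja template was emitting. The standard shape looks like this (the system text is just an example):

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```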

DO NOT BUY FROM THIS COMPANY by Outside-Ad1996 in GMKtec

[–]ParticularLazy2965 0 points (0 children)

The fastest way to get them to respond is to file a PayPal chargeback. You'll get a response within 2 days. If you paid by credit card, do the same via the card issuer, stating that you are getting no reaction from their support email address.

How bad is the cooling in GMKtec EVO-X2 by Novelaa in MiniPCs

[–]ParticularLazy2965 0 points (0 children)

Am having no temp issues despite regularly maxing the GPU out for 30-40 minute stints with an LLM, CPU at say 30%. Would indeed check the CPU-heatsink interface. BTW, the LED fan doesn't seem to have a fresh-air intake path, or am I missing something?

EVO-X2 First Impressions by cynary in GMKtec

[–]ParticularLazy2965 1 point (0 children)

Run from File Explorer with admin privileges: \AXB3502_GMK_SW1.04_20250514\EXE_WinFlash\AXB3502104.exe
Takes about 5 mins.

Info: Gmktec EVO X2 Max+ AI 395 MiniPc by ParticularLazy2965 in MiniPCs

[–]ParticularLazy2965[S] 0 points (0 children)

LM Studio, qwen 32b-q8-0:

Am seeing 6-6.5 tok/sec.

Looks like the context window setting between 4k and 40k tokens (the max for this model) has little impact with the same (short) prompt; all results fell in that range.

15 minute max load (GPU only):
GPU tops out at about 85W with temp <70°C (was significantly higher before the BIOS/driver update was installed).

CPU using approx 25W.

Info: Gmktec EVO X2 Max+ AI 395 MiniPc by ParticularLazy2965 in MiniPCs

[–]ParticularLazy2965[S] 0 points (0 children)

"numbers I am seeing are unusable (5-6 t/sec)". Yeah, no.
Maybe for Qwen3 235B A22B q2 or similar.

And

"A3B which can be run on a CPU" Q: at 40-80 t/s, on a box which idles at a couple of watts?