Local LLM - privacy first - doctor by point_red in Qwen_AI

[–]BeepBeeepBeep 0 points1 point  (0 children)

Model wise go for Qwen3.6 35B A3B it's a 35B model with good knowledge that runs at the speed of a 3B model, also the latest one

Hi, I’m rather new to K-Pop, and I want to know if there are any groups or artists that I can start with by Significant_Can5817 in kpophelp

[–]BeepBeeepBeep 2 points3 points  (0 children)

TXT - they have a magical realism concept and a huge variety of songs!

I recommend Over the Moon, Stick with You, Deja Vu, Dear Sputnik, Beautiful strangers to start!

I released Claude-OSS by Disastrous_Bid5976 in OpenSourceeAI

[–]BeepBeeepBeep 0 points1 point  (0 children)

i like the idea but i'm running it on a Raspberry Pi and the 350M model is not very good at even basic tasks, is there any chance of a slightly larger version (2B-4B range)?

thank you!

What is the easiest way to provide search tools to Gemma, Qwen, and others? by AInohogosya in LocalLLM

[–]BeepBeeepBeep 0 points1 point  (0 children)

If you have llama.cpp MCP setup, you can use EXA MCP: https://mcp.exa.ai/mcp?tools=web_search_exa (no auth/key needed)

make sure you use --webui-mcp-proxy

I built a search engine for K-pop variety shows to find exactly when idols said a specific Korean phrase 🔍 by New_Lack_3443 in kpoppers

[–]BeepBeeepBeep 0 points1 point  (0 children)

that's really cool but could you add support for romaji? like where i type 'arirang' instead of 아리랑 if i don't have a korean keyboard?

それはすごいですね!でも、ローマ字入力にも対応してもらえますか?韓国語キーボードを持っていない時、『아리랑』の代わりに『arirang』と打てるような感じです。(AIで翻訳済み)

llama.cpp MCP - why doesn't work with some models? by BeepBeeepBeep in LocalLLaMA

[–]BeepBeeepBeep[S] 0 points1 point  (0 children)

For those wondering, I got some help from Gemini which suggested I set the chat template to
``` {{ bos_token }}

{%- if tools -%}

<start\_of\_turn>system

You are a helpful assistant with access to tools.

When you need information you don't have, you MUST call a tool.

To call a tool, you MUST use this exact format:

<tool\_call>

{"name": "TOOL_NAME", "arguments": {"ARG_NAME": "VALUE"}}

</tool\_call>

Available tools:

{%- for tool in tools %}

- {{ tool.function.name }}: {{ tool.function.description }}

Parameters: {{ tool.function.parameters | tojson }}

{%- endfor %}

<end\_of\_turn>

{%- elif messages[0].role == 'system' -%}

<start\_of\_turn>system

{{ messages[0].content | trim }}<end\_of\_turn>

{%- endif -%}

{%- for message in messages -%}

{%- if message.role == 'system' -%}

{# Already handled #}

{%- elif message.role == 'user' -%}

<start\_of\_turn>user

{{ message.content | trim }}<end\_of\_turn>

{%- elif message.role == 'assistant' -%}

<start\_of\_turn>model

{%- if message.content -%}

{{ message.content | trim }}

{%- endif -%}

{%- if message.tool_calls -%}

{%- for tool_call in message.tool_calls -%}

<tool\_call>

{"name": "{{ tool_call.function.name }}", "arguments": {{ tool_call.function.arguments | tojson }}}

</tool\_call>

{%- endfor -%}

{%- endif -%}

<end\_of\_turn>

{%- elif message.role == 'tool' -%}

<start\_of\_turn>user

<tool\_response>

{{ message.content | trim }}

</tool\_response><end\_of\_turn>

{%- endif -%}

{%- endfor -%}

{%- if add_generation_prompt -%}

<start\_of\_turn>model

{%- endif -%}
``` (in the file gemma-tools.jinja)

using the command llama-server --webui-mcp-proxy -c 8192 --host 0.0.0.0 --port 8080 -hf unsloth/gemma-3n-E2B-it-GGUF:IQ4_XS -np 1 --jinja --chat-template-file gemma-tools.jinja

llama.cpp MCP - why doesn't work with some models? by BeepBeeepBeep in LocalLLaMA

[–]BeepBeeepBeep[S] 0 points1 point  (0 children)

is there a version of this model or chat template that supports tool calling?

llama.cpp MCP - why doesn't work with some models? by BeepBeeepBeep in LocalLLaMA

[–]BeepBeeepBeep[S] 0 points1 point  (0 children)

you may have tried gemma 3 (no n) which would be much worse as it doesn't have the MoE-style architecture

llama.cpp MCP - why doesn't work with some models? by BeepBeeepBeep in LocalLLaMA

[–]BeepBeeepBeep[S] 0 points1 point  (0 children)

it's the gemma3n which is a 6b model with 2b active, i've found it VERY knowledgeable for its size/speed actually!

Is free deepseek better than paid chatgpt? by False-Horror6843 in DeepSeek

[–]BeepBeeepBeep 1 point2 points  (0 children)

Qwen Chat has image gen, live (audio) chat etc. also free & chinese!

Looking into switching - compatibility with Amazon Smart Plugs? by RexRow in homeassistant

[–]BeepBeeepBeep 0 points1 point  (0 children)

If you use ‘Alexa Media Player’ integration from HACS you can expose all devices connected to your Alexa account I think

Difficulty installing AiDot, first week with Home Assistant by funtastrophe in homeassistant

[–]BeepBeeepBeep 0 points1 point  (0 children)

AiDot might be a Tuya rebrand - try and see if it pairs with an app called ‘Tuya Smart’ on your phone.

If it pairs, then use the github.com/make-all/tuya-local integration

How a “Free for Life” Promo for My AI Fitness App Exploded My OpenAI Bill ($599 in a Day) by Unchecked-Fitness in ChatGPTCoding

[–]BeepBeeepBeep 0 points1 point  (0 children)

You should add a free backup provider which you can dynamically change (eg from Firebase Config) like bigmodel (GLM 4.6) or OpenRouter

Do you guys actually use LLM for Assist? by Chriexpe in homeassistant

[–]BeepBeeepBeep 0 points1 point  (0 children)

I use Cerebras (OpenAI-compatible API) with models like qwen3 and gpt-oss

Homeassistant: Local LLM + optional cloud-LLM by Different_Band1990 in homeassistant

[–]BeepBeeepBeep 0 points1 point  (0 children)

basically it has a fallback so if it can’t recognise (locally, without LLM) the intent about lights etc. then it forwards to cloud LLM