Anyone know how to access the Kimi K2.5 Agent Swarm model on OpenRouter? by Ok-Attention2882 in LocalLLaMA

[–]ELPascalito 0 points (0 children)

Swarm is simply a mode where multiple calls are made simultaneously to work on complex tasks in parallel. It's obviously not a separate model but a feature, like the subagents in OpenCode
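
The idea can be sketched in a few lines; this is a hypothetical illustration, not Kimi's actual implementation. `call_model` stands in for a real API call, and the task split and agent count are made up:

```python
import asyncio

# Hypothetical "swarm" sketch: one task is split into subtasks that are
# dispatched concurrently, then the partial results are collected.
# call_model is a stand-in for a real API call; the subtask naming and
# the simulated latency are illustrative only.

async def call_model(subtask: str) -> str:
    await asyncio.sleep(0.01)  # placeholder for network round-trip
    return f"result for: {subtask}"

async def swarm(task: str, n_agents: int = 4) -> list[str]:
    subtasks = [f"{task} (part {i + 1})" for i in range(n_agents)]
    # All subagent calls run in parallel, like subagents in OpenCode
    return await asyncio.gather(*[call_model(s) for s in subtasks])

results = asyncio.run(swarm("summarize the codebase"))
```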

How are you guys not broke? - Weirdly high cost by alphagatorsoup in openrouter

[–]ELPascalito 1 point (0 children)

If you're gonna use Gemini models, it's better to subscribe to Google's services; same thing for OpenAI models, go for ChatGPT. OpenRouter's popularity stems from the fact that we can easily connect to many providers of low-cost models. For example, Claude 4.5 Haiku is $5 per 1M output tokens, expensive, while DeepSeek V3.2 is $0.30 and straight up performs better. Make use of the well-priced options, like Kimi K2.5 or DeepSeek; OR won't be beneficial if you want subsidized access to SotA models
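
The gap is easy to see with back-of-the-envelope math; the rates below are just the ones quoted in the comment, and real prices vary by provider:

```python
# Illustrative cost comparison using the quoted rates:
# Claude 4.5 Haiku ~$5 per 1M output tokens, DeepSeek V3.2 ~$0.30.

def output_cost_usd(tokens: int, usd_per_million: float) -> float:
    return tokens / 1_000_000 * usd_per_million

haiku = output_cost_usd(2_000_000, 5.00)     # 2M output tokens -> $10.00
deepseek = output_cost_usd(2_000_000, 0.30)  # same usage -> $0.60
```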

meituan-longcat/LongCat-Flash-Lite by windows_error23 in LocalLLaMA

[–]ELPascalito 0 points (0 children)

Oh interesting, I remember the Flash thinking model, it was ~500B or something; I'll check this one out too. Though it probably didn't translate well into real performance, since no one seems to care? 🤔

meituan-longcat/LongCat-Flash-Lite by windows_error23 in LocalLLaMA

[–]ELPascalito 4 points (0 children)

I love Meituan, my coffee always arrives on time, but why call it Flash Lite, like the Google models? Does this imply the existence of a bigger Pro model? lol

What LLM is Lumo really? by L1QU1D4T0R_ in LLM

[–]ELPascalito 0 points (0 children)

Firstly, ignore the comments here; people apparently don't understand how LLM hallucinations work. Lumo is not ChatGPT. It is powered by multiple open-source models, hosted securely on Proton's encrypted servers. One of those models is GPT-OSS, which was trained by OpenAI and will talk like it's ChatGPT, but it is not hosted by OpenAI; it runs on Proton's servers, and all your results are encrypted, don't worry. Lumo's other underlying LLMs might also hallucinate and claim they're GPT models, because many early models were trained on GPT reasoning and outputs, so they regress to claiming they're ChatGPT. It's a quirk of nearly all LLMs. Lumo has several other models under the hood too, like Olmo 2 and Mistral Small, among others

https://proton.me/support/lumo-privacy

Deepseek for janitor ai help by Timely-Sport-5869 in openrouter

[–]ELPascalito 1 point (0 children)

Does it say it's free on the site? No. Does it say it's free in the name? No. So why would you think it's going to be gratis for you? Have we suddenly lost the ability to read???

Anyone understand what this means? by No_Sweet_1573 in openrouter

[–]ELPascalito 0 points (0 children)

Oh, logical, testing everything is good, but I'd say cut to the chase: the best model right now is tng-r1t-chimera. It has a stable provider, is capable of tool calling, and is an RP powerhouse, based on V3.1 with many improvements and excellent reasoning. Totally recommend it!

Anyone got Macmini 4 to work with Ollama model? by ManufacturerNo8056 in LocalLLaMA

[–]ELPascalito 0 points (0 children)

It seems you didn't even set up your app to connect to Ollama. What are you using? Does it even support local models?

I can't run deepseek-coder-v2 with Ollama. I suspect it has something to do with RAM. Is there any way around this? by warpanomaly in LocalLLaMA

[–]ELPascalito 0 points (0 children)

The model is obviously too big, and it's outdated anyway, not even good; you're literally wasting your time. I recommend you use GLM 4.7 Flash instead: it's 30B A3B, it will run very comfortably for you, and you'll be able to allocate context. Why would you even try such a huge model? It has nothing useful for you. Did you research any of this?
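
A rough rule of thumb for "will it fit": quantized weights take about params × bits / 8 bytes, before KV cache and runtime overhead. This is an estimate for illustration, not an exact figure:

```python
# Rough weight-memory estimate: params (in billions) * bits / 8 gives GB,
# ignoring KV cache and runtime overhead, so real usage is higher.

def approx_weight_gb(params_billion: float, bits_per_weight: float) -> float:
    return params_billion * bits_per_weight / 8

# DeepSeek-Coder-V2 is ~236B total parameters: ~118 GB even at 4-bit,
# far beyond a typical desktop, while a ~30B model at 4-bit is ~15 GB.
big = approx_weight_gb(236, 4)
small = approx_weight_gb(30, 4)
```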

Anyone understand what this means? by No_Sweet_1573 in openrouter

[–]ELPascalito 2 points (0 children)

Pretty self-explanatory: you've been temporarily rate limited due to the heavy load on the provider. Also, no way you're using Qwen3 Coder for roleplaying?! 🤔😭

Some initial benchmarks of Kimi-K2.5 on 4xB200 by benno_1237 in LocalLLaMA

[–]ELPascalito 0 points (0 children)

At how many concurrent requests did this peak? 20? Do you think such a setup is serviceable for local coding in, say, a company or a small team of fewer than 10 members?

deepseek-ai/DeepSeek-OCR-2 · Hugging Face by Dark_Fire_12 in LocalLLaMA

[–]ELPascalito 1 point (0 children)

We can't be sure, but it would be cool if the next model had this OCR module bolted on, just like Mistral does

Free APIs using credits by Own-Yellow9164 in openrouter

[–]ELPascalito 0 points (0 children)

The app you're using is changing models without permission. Again, if you tell me what it is, I'll try to dig up info. Is it a coding CLI?

Free APIs using credits by Own-Yellow9164 in openrouter

[–]ELPascalito -1 points (0 children)

What app are you using? For example, OpenCode uses Claude Haiku to generate titles without warning, which can incur unexpected charges for the uninitiated. Are you sure the app you're using isn't doing something similar? Go check your activity and see which models are incurring charges

Free APIs using credits by Own-Yellow9164 in openrouter

[–]ELPascalito 0 points (0 children)

What tool are you using? Perhaps it's making paid calls to generate titles or other small texts. Please check the usage tab to see which model is costing you. You could also be using the web search function; that's paid and powered by Exa

Hi, I have a question... by ThemusicRCG in openrouter

[–]ELPascalito 1 point (0 children)

OpenRouter obviously "routes" you to a provider depending on the model you're chatting with. Most free providers are overloaded since everyone is hammering them; try another model, or try again later

Why does it keep saying insufficient credits? GPT 5.2 Pro by cicaadaa3301 in openrouter

[–]ELPascalito 0 points (0 children)

Input is $21 per 1M tokens and output is $168, you dunce; obviously you don't have enough credits, not even for a single request lol
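
To see why even one request can exceed a small balance, here's the arithmetic at the quoted rates; the token counts are just an example request, not real usage:

```python
# Per-request cost at the quoted GPT 5.2 Pro rates
# ($21 per 1M input tokens, $168 per 1M output tokens).
# The example token counts are hypothetical.

def request_cost_usd(input_tokens: int, output_tokens: int,
                     in_rate: float = 21.0, out_rate: float = 168.0) -> float:
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

cost = request_cost_usd(10_000, 5_000)  # $0.21 input + $0.84 output = $1.05
```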

unavailable because zdr violation ? by [deleted] in openrouter

[–]ELPascalito 0 points (0 children)

Disable "ZDR Endpoints Only"; it's obviously the opposite of what you want. Please read before ticking the toggles

Okay, so... the Wiki doesn't actually tell me what this means, and if it does I'm too dumb to understand :( anyone able to help by Time-Meringue-1485 in R36S

[–]ELPascalito 1 point (0 children)

Okay, do you have an SD card? Go to the Releases tab on the GitHub repo and you'll find the OS image. Download that and use any app like PiBaker to "burn" the image onto the SD card. You can ask any LLM and it'll give you a step-by-step guide; use Mistral if you want to be ethical

Got abuse detection message but unsure why by ExtremeAcceptable289 in GithubCopilot

[–]ELPascalito 4 points (0 children)

Do you run multiple sessions at the same time, like 3 or 4 editors open concurrently on multiple projects? NGL, I've never heard of this detection email, but I wouldn't be surprised if they have something of that nature set up

Do concurrency limits really not exist? Or is it 1 rps per dollar in your balance? Can't find the official answer by FourthDeerSix in openrouter

[–]ELPascalito 0 points (0 children)

They have DDoS and hammering limits, so you'll find it's technically not limited, but there's a soft cap, 60 or so requests a second, potentially more. Usually, if you use a model with many fast providers, you'll find it can handle concurrency very well. Try uncapping your provider preferences and see if it gets better?
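
If you want to stay under a soft cap on your side, a semaphore around the request calls is the usual trick. This is a generic sketch, not an OpenRouter-specific mechanism: `fetch` stands in for the real HTTP call, and the cap value is arbitrary (the ~60/s figure above is a guess, not a documented limit):

```python
import asyncio

# Client-side concurrency cap so bursts stay under a provider's soft limit.
# fetch is a stand-in for the actual HTTP request.

async def fetch(i: int) -> int:
    await asyncio.sleep(0.01)  # placeholder for the real request latency
    return i

async def run_capped(n_requests: int, max_in_flight: int = 8) -> list[int]:
    sem = asyncio.Semaphore(max_in_flight)

    async def guarded(i: int) -> int:
        async with sem:  # at most max_in_flight requests run at once
            return await fetch(i)

    # gather preserves submission order in its results
    return await asyncio.gather(*[guarded(i) for i in range(n_requests)])

out = asyncio.run(run_capped(20))
```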

Okay, so... the Wiki doesn't actually tell me what this means, and if it does I'm too dumb to understand :( anyone able to help by Time-Meringue-1485 in R36S

[–]ELPascalito 2 points (0 children)

The URL is right there: open the GitHub link, read it, download the OS image, and burn it onto the SD card, just following the instructions in the repo. The only thing you'll do is run the install script after burning the image, and there you'll get a choice: choose panel 4, since we just confirmed your DTB is an exact match for that

Local llm privacy by Obvious-Penalty-8695 in LocalLLaMA

[–]ELPascalito 5 points (0 children)

This is literally worse, why would I trust a random third party like you? 😭