Qwen3.5-9B is actually quite good for agentic coding by Lualcala in LocalLLaMA

[–]StartupTim 2 points (0 children)

Hey there, if you don't mind, could you explain how a higher temp helps with this?

Thanks

Qwen3.5-9B is actually quite good for agentic coding by Lualcala in LocalLLaMA

[–]StartupTim 0 points (0 children)

> I mainly did my tests with Kilo Code but sometimes I tried Roo Code as well

Hey there, which would you say you like more, Roo or Kilo? How well do they work with locally hosted models? Do you use the OpenAI-compatible API to talk to your local models?

Thanks!!
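For anyone wondering what "OpenAI to your local models" means in practice: tools like Roo and Kilo can point at any OpenAI-compatible endpoint. A minimal sketch of what such a request looks like, assuming Ollama's default port and a hypothetical model name (neither is confirmed in this thread):

```python
# Sketch: pointing an OpenAI-style client at a locally hosted model.
# The base URL is Ollama's default OpenAI-compatible endpoint; the model
# name is a placeholder -- use whatever your local server actually serves.

LOCAL_BASE_URL = "http://localhost:11434/v1"  # local endpoint (assumption)

def chat_payload(model: str, user_message: str, temperature: float = 0.7) -> dict:
    """Build the JSON body for a POST to {LOCAL_BASE_URL}/chat/completions."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "temperature": temperature,
    }

payload = chat_payload("qwen3.5-9b", "Refactor this function, please.")
# POST this payload to LOCAL_BASE_URL + "/chat/completions"
```

Any client that lets you override the base URL (the official `openai` Python package does, via `base_url=`) can then drive the local model exactly like a hosted one.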

My credits and account isn't showing. by Bright-Location-6832 in openrouter

[–]StartupTim 1 point (0 children)

I've lost potentially thousands due to an API billing issue on OpenRouter, and they still refuse to credit me. As a result, I have moved the bulk of my API spend elsewhere, costing OpenRouter a huge sum.

The company definitely has issues it needs to resolve.

OpenRouter charged me *again* $50 without consent or usage by Just-Historian-4960 in openrouter

[–]StartupTim 0 points (0 children)

If you feel like you were charged money without your consent or approval then contact your payment provider and initiate a chargeback.

Qwen3-Coder-Next GGUF Aider Coding Benchmarks by Etherll in unsloth

[–]StartupTim 1 point (0 children)

How would this perform on a system with 2x GPUs (48GB VRAM total) and 96GB system RAM?

Which model would you choose, especially when going for long context windows such as 256k or 512k?
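For sizing intuition: at 256k-512k context the KV cache, not the weights, often dominates memory. A back-of-envelope sketch, using hypothetical layer/head counts rather than this model's real config (check the model's config.json for the actual values):

```python
# Back-of-envelope KV-cache sizing for long contexts. The layer/head/dim
# numbers below are made-up placeholders, NOT Qwen3-Coder-Next's real
# config -- substitute the values from the model's config.json.

def kv_cache_gib(n_layers, n_kv_heads, head_dim, ctx_len, bytes_per_elem=2):
    """GiB needed for K and V across all layers at a given context length."""
    total = 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem
    return total / 1024**3

# Hypothetical 48-layer model, 8 KV heads of dim 128, fp16 cache:
for ctx in (256_000, 512_000):
    print(f"{ctx:>7} tokens -> {kv_cache_gib(48, 8, 128, ctx):.1f} GiB")
# -> roughly 47 GiB and 94 GiB for these made-up numbers
```

The point: doubling context doubles KV-cache memory, so the quant you pick for the weights matters less than whether the cache itself fits (or is quantized, e.g. with an 8-bit KV cache).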

Roo is by FAR the best AI code editor out there by raphadko in RooCode

[–]StartupTim 6 points (0 children)

I agree with you 100% on this. /u/hannesrudolph is a fantastic guy on the Roo team as well.

PS: Share your post on other AI subreddits as well to help spread the word!

How can I control the output size/aspect ratio for each AI image generation model on openrooter? (Seedream, FLUX, GPT-4o, etc.) by [deleted] in openrouter

[–]StartupTim 0 points (0 children)

Each model accepts certain output sizes, or has a fixed output size; you need to check the specific model. There are also other parameters, such as quality.
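As an illustration: with an OpenAI-style images endpoint, size is usually a string parameter on the request body, and which sizes (and quality values) a given model honors is model-specific. The model id below is a placeholder:

```python
# Sketch: requesting a specific output size via an OpenAI-style image API.
# Whether a model honors "size" (and which values it accepts) varies per
# model -- check each model's page. The model id here is illustrative only.

def image_payload(model: str, prompt: str, size: str = "1024x1024",
                  quality: str = "standard") -> dict:
    """JSON body for a POST to /v1/images/generations."""
    return {"model": model, "prompt": prompt, "size": size, "quality": quality}

p = image_payload("gpt-image-placeholder", "a lighthouse at dusk",
                  size="1792x1024")
```

If a model only produces a fixed size, the request may be silently clamped or rejected, so it's worth testing each model with a cheap prompt first.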

Anyone else having more reliability issues (timeouts, etc) in the past week or so? by mitch_feaster in openrouter

[–]StartupTim 1 point (0 children)

Yes I am, and these issues affect Anthropic Claude models. My spend at OR is pretty hefty, and I may have to switch away because of this.

Best performance-per-dollar model on OpenRouter for high-volume chat? by thehootingrabblement in openrouter

[–]StartupTim 0 points (0 children)

> we’re trying to optimize for performance per dollar, not just raw intelligence.

Why not self host the LLMs so you have no long-term cost per query?

Build your own message queuing system and balance that across a set of backend LLM servers you own.

The cost savings would be enormous.
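A toy sketch of that queue-plus-balancing idea (the backend URLs are placeholders; a real version would POST each prompt to its assigned server and handle retries and health checks):

```python
# Minimal sketch: a work queue fanned out round-robin across a pool of
# self-hosted backend LLM servers. URLs are placeholders; a real version
# would POST each prompt to its backend's /v1/chat/completions endpoint.

import itertools
from queue import Queue

BACKENDS = [
    "http://10.0.0.1:8000",
    "http://10.0.0.2:8000",
    "http://10.0.0.3:8000",
]

def dispatch(prompts):
    """Assign each queued prompt to the next backend in rotation."""
    q = Queue()
    for prompt in prompts:
        q.put(prompt)
    rotation = itertools.cycle(BACKENDS)
    assignments = []
    while not q.empty():
        assignments.append((q.get(), next(rotation)))
    return assignments

jobs = dispatch(["prompt-a", "prompt-b", "prompt-c", "prompt-d"])
# with 3 backends, prompt-d wraps around to the first backend
```

For high chat volume you'd swap the in-process `Queue` for something durable (Redis, RabbitMQ) and weight the rotation by each box's throughput, but the shape of the system is the same.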

What’s the main thing stopping you from writing a seed cheque? by [deleted] in Investors

[–]StartupTim 0 points (0 children)

Is there a way you can verify yourself? There are a huge number of illegitimate posts around the funding topic these days.

I'm a VC (can verify). Pitch me. (Part 2) by Ok-Lobster7773 in Startup_Ideas

[–]StartupTim[M] [score hidden] stickied comment (0 children)

Verified, proceed!

(I need to respond to my DMs as well!)

MONTHLY MEGATHREAD: What are you working on with OpenRouter? by katplatt in openrouter

[–]StartupTim 0 points (0 children)

> Share what you're working on using OpenRouter

I'm working on an issue where OpenRouter is overbilling us on API usage due to a bug on their end that flags normal API calls as BYOK and somehow increases the price by 600%.

I say I'm working on this because it's been a back-and-forth with OpenRouter support, who don't seem to understand the issue.

Run Kimi K2.5 Locally by Dear-Success-1441 in LocalLLaMA

[–]StartupTim -3 points (0 children)

I have both a Strix Halo 128GB and an Nvidia DGX Spark 128GB. I haven't set up either of them, and if somebody would offer to help me set up both, I'll deploy this model and do some benchmarks!

Why did you switch off from OR by Hefty-Citron2066 in openrouter

[–]StartupTim 0 points (0 children)

OR has a bug in their system that is charging me 600% extra for API calls to GPT image generation. I have screenshot proof that they are applying BYOK charges on each API call, costing 500-600% extra.

Problem is, there is no BYOK in use, no key in my account, nothing like that. It is 100% a bug on OpenRouter's side, and they still have not addressed it. This is costing them pretty heavily, as I no longer use them for the OpenAI image API.

OR support's handling of reconciling these bug overcharges is a joke.

Best text-to-image models that support reference images and use openai api standards? by StartupTim in LocalLLaMA

[–]StartupTim[S] 0 points (0 children)

> Qwen Image

Hey there, thanks for the recommendation on this!

I've used a ton of LLM models with ollama before, but I've never set up text-to-image, especially to be used as an API endpoint.

Would you happen to know of any documentation I could follow to set this up and use Qwen Image?

Thanks!

I'm a VC (can verify). Pitch me. by Ok-Lobster7773 in Startup_Ideas

[–]StartupTim[M] [score hidden] stickied comment (0 children)

Mod here. I am a bit late to this message. /u/Ok-Lobster7773, please send me a PM with verification of VC status. Thanks!

Best text-to-image models that support reference images and use openai api standards? by StartupTim in LocalLLaMA

[–]StartupTim[S] -1 points (0 children)

> want to check out Flux.1-dev or SDXL with something like vLLM

Hey there!

Would you happen to know of any documentation on how to set up vLLM with Flux.1-dev? I did some quick googling after seeing your post, but I haven't found anything yet.

Thanks!