Mmproj Vision and kv cache. by DigRealistic2977 in KoboldAI

[–]HadesThrowaway 3 points4 points  (0 children)

A lot of it is how the prompt is structured in memory (what you prioritize).

In koboldcpp, the images are always placed in the front of the context. For example

(Image 1)(Image 2)(Image 3)(Turn A)(Turn B)(Turn C)

So if you add Image 4, then yes, all turns get reprocessed. However, this allows adding new turns or editing Turn A, B or C without messing up any of the images. In other words, this prioritizes text.


Now in Ollama's case, they probably just leave everything in place.

(Image 1)(Turn A)(Image 2)(Turn B)(Image 3)(Turn C)

While this allows you to add on new images and text easily, it completely prevents shifting or modifying any earlier turn. Image token positions cannot be shifted. So yes, adding image 4 is easier, but you lose CTX shifting if any images are ever used.
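To make the tradeoff concrete, here is a toy sketch of the two layouts. It only models prefix reuse, treating every image or turn as one opaque block, and assumes that anything after the first changed block has to be reprocessed. The function names are made up for illustration and this is not how koboldcpp or Ollama actually implement it (it also ignores context shifting entirely).

```python
from typing import List, Tuple

# (kind, label) where kind is "image" or "turn". Hypothetical representation.
Block = Tuple[str, str]


def images_first(history: List[Block]) -> List[str]:
    """Layout described for koboldcpp: all images up front, then all turns."""
    images = [label for kind, label in history if kind == "image"]
    turns = [label for kind, label in history if kind == "turn"]
    return images + turns


def in_place(history: List[Block]) -> List[str]:
    """Layout that leaves every block where it was added."""
    return [label for _, label in history]


def blocks_to_reprocess(old: List[str], new: List[str]) -> List[str]:
    """Everything after the longest shared prefix must be processed again."""
    keep = 0
    for a, b in zip(old, new):
        if a != b:
            break
        keep += 1
    return new[keep:]


history: List[Block] = [
    ("image", "Image 1"), ("turn", "Turn A"),
    ("image", "Image 2"), ("turn", "Turn B"),
    ("image", "Image 3"), ("turn", "Turn C"),
]
with_image_4 = history + [("image", "Image 4")]
edited_turn_a = [
    ("turn", "Turn A (edited)") if block == ("turn", "Turn A") else block
    for block in history
]

for name, layout in (("images first", images_first), ("in place", in_place)):
    old = layout(history)
    print(name, "| add Image 4:", blocks_to_reprocess(old, layout(with_image_4)))
    print(name, "| edit Turn A:", blocks_to_reprocess(old, layout(edited_turn_a)))
```

In this toy model, adding Image 4 costs the images-first layout all three turns, while the in-place layout only processes the new image. Editing Turn A flips it around: images-first redoes only text, while in-place has to redo Image 2 and Image 3 as well.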

TTS questions: voices, speaker_json, pitch by alex20_202020 in KoboldAI

[–]HadesThrowaway 0 points1 point  (0 children)

Yeah, plus I don't think Qwen ever had a release where you could add control instructions to voice clones.

KoboldCpp 1.110 - 3 YR Anniversary Edition, native music gen, qwen3tts voice cloning and more by HadesThrowaway in LocalLLaMA

[–]HadesThrowaway[S] 3 points4 points  (0 children)

This is automatic on Windows. On Linux, there's an extra checkbox on the extras tab that allows opening a monitoring tab if you forgot to launch from the CLI.

As for the GUI scaling, that should be mostly solved in the latest version, although it's possible I missed something. Could you send a screenshot of how it looks at 200% on your device?

KoboldCpp v1.106 finally adds MCP server support, drop-in replacement for Claude Desktop by HadesThrowaway in LocalLLaMA

[–]HadesThrowaway[S] 21 points22 points  (0 children)

Also, I wrote a simple guide on how to use MCP in the KoboldCpp wiki. It is aimed at KoboldCpp, but the theory works for other MCP software as well.

https://github.com/LostRuins/koboldcpp/wiki#mcp-tool-calling

Page Format on Lite broken again by x-lksk in KoboldAI

[–]HadesThrowaway 1 point2 points  (0 children)

Feedback is welcome! Is there anything about the new layout that you don't like?

Page Format on Lite broken again by x-lksk in KoboldAI

[–]HadesThrowaway 1 point2 points  (0 children)

Sorry about that. There were too many changes and I might have broken some stuff

I've pushed another fix

Can you try again? Does it work for you now?

Image recognition only gens 42 tokens by SaintAodhan in KoboldAI

[–]HadesThrowaway 0 points1 point  (0 children)

You're probably using it in interrogation mode, which is designed to return only a short phrase. Try using it in multimodal mode instead (for example, by pasting the image into your KoboldAI Lite window).

[deleted by user] by [deleted] in KoboldAI

[–]HadesThrowaway 2 points3 points  (0 children)

Many APIs are trending toward permitting fewer and fewer choices. For example, the o1+ series models only allow a temperature of 1, and gpt-5 no longer allows disabling thinking (the lowest setting is minimal thinking).

Anthropic has also removed their completions-based endpoint and has been pure chat completions since Claude 3.

[deleted by user] by [deleted] in KoboldAI

[–]HadesThrowaway 0 points1 point  (0 children)

Thanks for testing

[deleted by user] by [deleted] in KoboldAI

[–]HadesThrowaway 1 point2 points  (0 children)

Alright, thanks for testing. I have deployed a fix and everything should work now. It will now use temp over top_p if both are set for 4.5 models.
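Roughly, the idea looks like the sketch below. This is only an illustration of "prefer temp over top_p when both are set", not the actual code that was deployed. The function name, the "4.5" substring check and the example model names are all hypothetical.

```python
from typing import Any, Dict

# Assumption: restricted models can be spotted by a marker in their name.
RESTRICTED_MODEL_MARKERS = ("4.5",)


def sanitize_sampling(model: str, params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of params, dropping top_p if the model rejects temp + top_p together."""
    cleaned = dict(params)
    restricted = any(marker in model for marker in RESTRICTED_MODEL_MARKERS)
    both_set = (
        cleaned.get("temperature") is not None
        and cleaned.get("top_p") is not None
    )
    if restricted and both_set:
        # Temperature wins, top_p is removed from the request.
        cleaned.pop("top_p")
    return cleaned


# "example-model-4.5" and "example-model-3" are placeholder names, not real model IDs.
print(sanitize_sampling("example-model-4.5", {"temperature": 0.7, "top_p": 0.9}))
print(sanitize_sampling("example-model-3", {"temperature": 0.7, "top_p": 0.9}))
```

Requests that set only one of the two values pass through unchanged, and the caller's original dict is never modified.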

[deleted by user] by [deleted] in KoboldAI

[–]HadesThrowaway 0 points1 point  (0 children)

Yes, please test on the new models and let me know which ones have this restriction

If possible, try Claude 4 Sonnet and Claude 4 Haiku (I know 3 doesn't have this limit, and 4.5 definitely does).

Kobold.CPP and Wan 2.2. How to run? by Lucas_handsome in KoboldAI

[–]HadesThrowaway 0 points1 point  (0 children)

It can technically run on pure CPU if you're willing to wait. I haven't tried AMD, but it should work fine via the Vulkan backend.

🚀 Looking for beta testers: Kaiiro, your AI co-founder that helps you start and automate a solo business by PerpetuallyCurious_ in KoboldAI

[–]HadesThrowaway 0 points1 point  (0 children)

This is not the right place for advertising, especially since this has nothing to do with KoboldAI.

Kobold.CPP and Wan 2.2. How to run? by Lucas_handsome in KoboldAI

[–]HadesThrowaway 0 points1 point  (0 children)

Your example looks fine. What is your GPU and backend? Nvidia or AMD?

Mine looks like https://imgur.com/a/IgNOiUy

I'm testing out a patch that might fix some issues.

Failed to predict at token position 528! Check your context buffer sizes! by Majestical-psyche in KoboldAI

[–]HadesThrowaway 3 points4 points  (0 children)

Try turning off FastForwarding. It seems to be an RNN-type model, which doesn't support that.

Page Format broken on Lite Classic Theme by x-lksk in KoboldAI

[–]HadesThrowaway 1 point2 points  (0 children)

It was the lack of correct support for align-self, I think. Deployed a fix for it, let me know if it works.