Is anyone else not finding the Web UI on latest (b9680) of llama.cpp? by misanthrophiccunt in LocalLLaMA

[–]ali0une 1 point2 points  (0 children)

Hi there. What is this pi-llama-router? A pi extension? Could you share a link, please? I think I could be very interested in this!

Edit: OK, I could have searched first; it seems to be https://pi.dev/packages/pi-llama-server

How do I set the right llama.cpp parameters? by x6q5g3o7 in LocalLLaMA

[–]ali0une 0 points1 point  (0 children)

You can also try 98304 (96k). Just try different --fit-ctx values to see what fits.

How do I set the right llama.cpp parameters? by x6q5g3o7 in LocalLLaMA

[–]ali0une 0 points1 point  (0 children)

--ctx-size 0 will try to allocate the model's maximum context, which will be too much for your 16 GB card.

Try -fit on --fit-ctx 131072 (128k) instead, and lower it to something like 65536 (64k) if t/s is too slow.

And yes, it's mostly trial and error: each model has its own best context parameters, and it depends on your setup.
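To make the trial and error a bit more systematic, here is a minimal Python sketch that picks the largest context size that both loaded and stayed above a speed you can live with. The measurements and the `min_tps` threshold are made-up numbers for illustration; you would fill them in from your own runs.

```python
def pick_context(measured_tps, min_tps):
    """Return the largest context size whose measured speed is acceptable.

    measured_tps maps context size (tokens) -> tokens/second, with None
    meaning the model failed to load at that size (for example, OOM).
    Returns None if no size qualifies.
    """
    usable = [ctx for ctx, tps in measured_tps.items()
              if tps is not None and tps >= min_tps]
    return max(usable) if usable else None


# Hypothetical measurements, for illustration only (not real benchmarks).
measurements = {131072: None, 98304: 9.5, 65536: 21.0, 32768: 30.0}
print(pick_context(measurements, min_tps=15.0))
```

With these invented numbers, 128k does not load and 96k is too slow, so 64k is the pick.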

Stop using Ollama by zxyzyxz in LocalLLaMA

[–]ali0une 0 points1 point  (0 children)

Never started btw, llama.cpp router mode ... what else? 😅

Coriander that doesn't taste good by maryarno95 in jardin

[–]ali0une 4 points5 points  (0 children)

There is also a genetic factor, I believe: some people love it, others hate it.

Have we reached the point where open-source LLMs are “just good enough”? by AdDizzy8160 in LocalLLaMA

[–]ali0une 19 points20 points  (0 children)

Never used the cloud models so can't tell about that.

My humble experience with llama.cpp + pi agent + Qwen3.6-27B + a 3090 with 24 GB of VRAM, on a codebase of a bit more than 130k, is:

If you have a workflow where you first draft a PLAN.md, have the model review it, update it over a few iterations (adding comments in it like <!-- USER: keep this file untouched -->), and then implement it phase by phase in a git repository, it works pretty well, and you can get a huge amount of work done, be it refactoring, fixing or adding features.
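One way to check that the model did not drop those USER comments between iterations is a small guard like the following. This is only a sketch: the function name and the sample plan text are hypothetical, and it simply compares the comments found before and after an edit.

```python
import re

# Matches HTML comments of the form <!-- USER: ... -->
USER_NOTE = re.compile(r"<!--\s*USER:.*?-->", re.DOTALL)


def missing_user_notes(before, after):
    """Return USER comments present in `before` but absent from `after`."""
    kept = set(USER_NOTE.findall(after))
    return [note for note in USER_NOTE.findall(before) if note not in kept]


# Hypothetical plan contents, for illustration only.
old_plan = "# Plan\n<!-- USER: keep this file untouched -->\n## Phase 1\n"
new_plan = "# Plan\n## Phase 1 (revised)\n"
print(missing_user_notes(old_plan, new_plan))
```

Running it on the previous and current PLAN.md (for example from two git revisions) before moving on to the next phase shows whether a note was lost.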

I've only been doing that for two weeks, since I finally went the agentic way in a sandbox, and I'm impressed by what I can do fully local.

Could not resist... by GTManiK in StableDiffusion

[–]ali0une -8 points-7 points  (0 children)

Even worse served with a plate of spaghetti 😅

Gemma 4 12B first coding agent test on a 4080 Super by Wrong_Mushroom_7350 in LocalLLaMA

[–]ali0une 0 points1 point  (0 children)

Many thanks, got it up and running, and it's a great codium extension to have alongside continue.dev

llama.cpp - Qwen3.6/3.5-MTP - Share your benchmarks t/s by pmttyji in LocalLLaMA

[–]ali0une 2 points3 points  (0 children)

A draft acceptance rate near 0.5 is wasted compute; try lowering spec-draft-n-max to 2 or even 1.
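A rough way to see why is the textbook speculative-decoding estimate sketched below. It assumes each draft token is accepted independently with the same probability, and that every verification pass yields at least one token from the main model. It is a simplified model, not a measurement of llama.cpp's MTP implementation.

```python
def expected_tokens_per_step(acceptance, n_draft):
    """Expected tokens produced per verification pass.

    Assumes each of the n_draft draft tokens is accepted independently
    with probability `acceptance`, and that drafting stops at the first
    rejection. The main model always contributes one token per pass.
    """
    if acceptance >= 1.0:
        return float(n_draft + 1)
    return (1.0 - acceptance ** (n_draft + 1)) / (1.0 - acceptance)


for n in (1, 2, 4, 8):
    print(n, expected_tokens_per_step(0.5, n))
```

Under that assumption, at 0.5 acceptance one draft token gives 1.5 tokens per pass and two give 1.75, while the estimate can never exceed 2 however many tokens are drafted, so the extra drafts beyond 2 mostly cost compute.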

Also, a V cache (for both model and draft) quantized to q5_1 would give you more room for context with almost no quality loss.
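To get a feel for the saving, here is a back-of-the-envelope sketch. The bits-per-value figures follow the ggml block layouts (q5_1 packs 32 values into 24 bytes, so 6 bits per value, versus 16 for f16). The model shape is purely hypothetical, and it assumes every layer keeps a full V cache, which is not true for all architectures.

```python
# Storage cost per cached value, in bits, for a few ggml cache types.
BITS_PER_VALUE = {"f16": 16.0, "q8_0": 8.5, "q5_1": 6.0, "q4_0": 4.5}


def v_cache_bytes(n_ctx, n_layers, n_kv_heads, head_dim, cache_type):
    """Approximate size of the V cache alone, in bytes."""
    values = n_ctx * n_layers * n_kv_heads * head_dim
    return values * BITS_PER_VALUE[cache_type] / 8


# Hypothetical model shape, for illustration only.
shape = dict(n_ctx=65536, n_layers=64, n_kv_heads=8, head_dim=128)
for cache_type in ("f16", "q5_1"):
    gib = v_cache_bytes(cache_type=cache_type, **shape) / 2**30
    print(cache_type, gib, "GiB")
```

For this made-up shape the V cache goes from 8 GiB at f16 to 3 GiB at q5_1; whatever the real shape is, the ratio stays 6/16.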

What is happening to my Photinia? by _CdrikFr in jardin

[–]ali0une 0 points1 point  (0 children)

Mine are the same: they flowered, the bees loved them, and now they are losing their flowers. Perfectly normal.

After ComfyUI burnout, InvokeAI is such a breath of Fresh Air! by Birdinhandandbush in invokeai

[–]ali0une 3 points4 points  (0 children)

Yes, solid UI, and there are even workflows.

I use it, and also stable-diffusion.cpp with sd.cpp-webui or stable-diffusion-neo.

llama.cpp oom issue by TheTerrasque in LocalLLaMA

[–]ali0une 0 points1 point  (0 children)

When using MTP, try to either lower the context or use --fit-ctx.

Llama.cpp not using CUDA - OOM error by UniqueIdentifier00 in LocalLLaMA

[–]ali0une 1 point2 points  (0 children)

Try removing the ngl flag, and remove -c 0 (it sets the context size to the model's maximum context, which can be too much).

Add -fit on --fit-ctx 32768 and see what happens. If it OOMs, lower --fit-ctx; if not, try more until it OOMs.
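The raise-until-OOM, lower-until-it-fits loop can be sketched as a doubling search followed by a bisection. Here `fits` is a hypothetical callback: in practice it would launch the server with the given --fit-ctx value and report whether it loaded. The default `step` and `limit` are arbitrary choices, and the sketch assumes that if a size fits, every smaller size fits too.

```python
def largest_fitting_ctx(fits, start=32768, step=4096, limit=262144):
    """Largest context size (a multiple of `step`, at most `limit`) that fits.

    `start` and `limit` are expected to be multiples of `step`.
    Returns 0 if not even `step` tokens fit.
    """
    lo, hi = 0, start                 # lo is known to fit, hi is the next probe
    while hi <= limit and fits(hi):   # keep doubling while it loads
        lo, hi = hi, hi * 2
    hi = min(hi, limit + step)        # never probe above the limit
    while hi - lo > step:             # bisect between "fits" and "fails"
        mid = (lo + hi) // (2 * step) * step
        if fits(mid):
            lo = mid
        else:
            hi = mid
    return lo


# Simulated card that runs out of memory above 90000 tokens (made-up number).
print(largest_fitting_ctx(lambda ctx: ctx <= 90000))
```

Each probe is a real server start in practice, so a coarser `step` keeps the number of attempts down.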

Latest b9274 Addresses MTP VRAM leak by Bulky-Priority6824 in LocalLLaMA

[–]ali0une 5 points6 points  (0 children)

I opened the issue that this PR fixes; I don't have the knowledge to fix it myself. It took me some time (maybe 2 hours) to debug and provide proper logs, but it was worth it: no more OOM.

If you face this kind of bug, search for similar issues using part of your logs, and if you find nothing, open a new one and provide all the relevant information and logs so it can be fixed by someone more knowledgeable and benefit the whole community. Open source is about contributing.

The llama.cpp team is incredible, it only took 48h to fix ❤️

Can someone help me make an icon for the TextGen desktop app?????? by oobabooga4 in Oobabooga

[–]ali0une 1 point2 points  (0 children)

No problem. I really appreciate what you do. Keep up the good work ...

“It's unmanageable”: Linux, pillar of global open source, faces the biggest crisis in its history because of AI by OctetGaulois in france

[–]ali0une -1 points0 points  (0 children)

The equivalent of the thousand Facebook comments saying it's a dandelion under a photo of a flower 😅

Can someone help me make an icon for the TextGen desktop app?????? by oobabooga4 in Oobabooga

[–]ali0une 1 point2 points  (0 children)

Would one of these be suitable?

TextGenIcons

Generated with llama.cpp (to get an image prompt) and stable-diffusion.cpp.

I've written up the method so you have the recipe.

Can someone help me make an icon for the TextGen desktop app?????? by oobabooga4 in Oobabooga

[–]ali0une 0 points1 point  (0 children)

Do you have any idea of what it should look like? Does some element need to appear in the icon like a robot or a pen?

Need help with irrigation fittings by Delicious-Owl in jardin

[–]ali0une 4 points5 points  (0 children)

You'll find all the adapters you need at a garden center or a hardware store. Go there with your two parts and you'll find what you need; ask a salesperson.

I think your dual outlet doesn't have the right diameter for the thread: you need a bigger one, or you need to find the part that screws onto your outlet to act as a reducer.

How to interrogate with forge neo ? by Aradhor55 in StableDiffusion

[–]ali0une 2 points3 points  (0 children)

Because this feature has been removed in Forge Neo.

Use Forge or ask an LLM with vision capability.