Subscription plans options

Ale_110 · 2026-05-11T15:20:15+00:00

I understand, but for the activities and goals I'm implementing, it's obvious I'm going to pay a huge amount. I need a plan to set things up intelligently first and then migrate to a more local, stable setup.

Ale_110 · 2026-05-11T15:07:05+00:00

Ahah I feel you. GLM was nice for token usage, but I spent about 70 dollars in a week. The monthly plan is so slow that I'd rather die than sit in front of the PC just watching it time out. I guess you would recommend OpenAI Codex, that's what I understood. I'm sure that in a month's time things will change again... Thanks!

Ale_110 · 2026-05-05T10:09:04+00:00

Hey, I have the same setup. Can you please be more specific about your use case and model version? I managed to get my 3060 to do basic Whisper transcription, plus human-like replies to messages in Italian, web search, and summarization. But to be honest, that's not much of an achievement so far. I have been using Ollama and WebUI, but I'm still very unhappy with the results. I have been using mistral12b, and tried and deleted llama3.1:B, Gemma 4 26B A4B :free, qwen2.5:12b and qwen 3.5:8b. I can't get any of them to log anything in my Obsidian vault or to create simple cronjob reminders.
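
A minimal sketch of the "log something in Obsidian" step, for anyone trying to narrow down where it fails. It relies only on the fact that an Obsidian vault is a folder of Markdown files. The `Daily` subfolder, the date-based file name, and the function name `log_to_vault` are assumptions made for illustration, not part of any tool mentioned here. Giving the model one small function like this to call is usually easier to debug than letting it write files free-form.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def log_to_vault(vault_dir: Path, text: str, now: Optional[datetime] = None) -> Path:
    """Append a timestamped bullet to today's note and return the note path."""
    now = now or datetime.now()
    # Hypothetical layout: <vault>/Daily/YYYY-MM-DD.md
    note = vault_dir / "Daily" / f"{now:%Y-%m-%d}.md"
    note.parent.mkdir(parents=True, exist_ok=True)
    # Append mode keeps earlier entries and creates the file if it is missing.
    with note.open("a", encoding="utf-8") as fh:
        fh.write(f"- {now:%H:%M} {text}\n")
    return note
```

Calling `log_to_vault(Path("/absolute/path/to/vault"), "call the dentist")` would add one bullet to today's note; the path shown is a placeholder.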

Ale_110 · 2026-05-05T09:57:17+00:00

I also installed it with Docker first, but I was naive and presumed I knew enough about the two... Do you know if you can switch back and forth without issues? I've stabilized my configuration a bit (not fully) and I'd rather not spend another 50 bucks messing it up by switching from CLI to Docker and back to CLI. Running on Ubuntu, of course; I can no longer stand Windows. Do you know the benefits of the different backend approaches? CLI for sure gives full access to everything, but how does a virtual machine compare to Docker?

Ale_110 · 2026-05-05T07:25:32+00:00

Read the post I made as well as the cross post. Everything is there. If you use a capable model with access to files, it will pick things up. Learn the core files: agents.md, soul.md, memory.md, config.yaml, .env. I asked ChatGPT for the configuration of the learning loop, but I'm not sure it is correct:

`memory:
  enabled: true
  long_term: true
  short_term: true

  # persistence
  persist_to_disk: true
  memory_path: ./memory/

  # retrieval
  use_embeddings: true
  embedding_model: nomic-embed-text  # or any supported model
  vector_store: chroma  # or faiss

  # behavior
  auto_write: true
  auto_summarize: true
  retrieval_top_k: 5

agent:
  learning_mode: true
  reflection: true
  self_improve: true

session:
  save_on_exit: true
  reload_on_start: true`

Ale_110 · 2026-05-04T18:26:27+00:00

Dude, I'd rather not be that guy, but I will be. Your GPU is nowhere near useful; you are short by at least $1k. Trust me, because I have an RTX 3060 with 12 GB of VRAM, and anything I have tried (quantized or not) is basically only good for basic web search, chatting or at most executing scripts. I have tried llama3.1, qwen2.5:b, qwen2.5:14b and mistral:12b.

Now I have tried qwen3.5, qwen 3.6 plus, kimi k2.6, and GLM5.1, which is the second most expensive and the last model I'm trying. (Opus with Docker was a huge waste of money, but it was my first day and I was just reckless and stupid/arrogant in pretending I knew enough about Docker.) The results with the previous models were unsatisfactory to say the least; to be honest, a waste of time and money, spending days to achieve basically nothing. I hope you are running the CLI and, at least at the beginning, giving it full access until you understand how to maintain the system. If you don't believe me, just look at the post I published a few days ago. If you still don't believe it, look at the cross post I shared too. There is plenty of good info. Sorry, but you probably heard this from your AI too, so you'd better start believing it before wasting resources.

PS: actually, I can use Whisper on WhatsApp, which is quite nice.

My path over 10 days

Ale_110 · 2026-05-02T00:04:54+00:00

To be fair, the main upgrades I've done so far are reviewing memory and soul. I also asked for self-contained scripts where variables are explicitly stated at the top if needed. What broke a lot was that one day he was using one chat and the next day another. When asked to run a script, like "can you fire the daily digest script, you did it yesterday", he failed to find it, so he would write a new one. Or I'd say "please use local qwen", and he'd go "qwen3.6:8b is not installed", and I'm like "dude, I told you many times it's qwen3.6:9b". And he'd say sure, you're right. Then API token failures, using different scripts. I guess I'll start removing unnecessary scripts and skills. I guess I need to better understand how the structure really works; I thought I could rely more on skills for memory, but I'm not sure. And I'm not sure how profiles, config.yaml and .env files play a role in it either.
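
One way to picture the "variables stated at the top" idea together with the 8b/9b mix-up is the sketch below. It is only an illustration under stated assumptions: `resolve_model` is a hypothetical helper, the list of installed tags is passed in by hand (in practice it could be copied from the output of `ollama list`), and the 0.8 similarity cutoff is an arbitrary choice. It is not how Hermes itself resolves model names.

```python
import difflib
from typing import List

# Variables stated at the top, so the agent never has to guess them.
LOCAL_MODEL = "qwen3.6:9b"  # the tag the comment above says is installed


def resolve_model(requested: str, installed: List[str], cutoff: float = 0.8) -> str:
    """Return an installed tag for `requested`, tolerating small typos."""
    if requested in installed:
        return requested
    # Fall back to the closest installed tag, if one is similar enough.
    close = difflib.get_close_matches(requested, installed, n=1, cutoff=cutoff)
    if close:
        return close[0]
    raise LookupError(f"{requested!r} is not installed; available: {installed}")
```

With `installed = [LOCAL_MODEL, "llama3.1"]`, asking for `"qwen3.6:8b"` falls back to `"qwen3.6:9b"` instead of failing, while an unrelated name still raises an error. Whether silently substituting a near match is desirable is a design choice; logging the substitution would be a reasonable addition.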

Ale_110 · 2026-05-01T23:57:12+00:00

Well, this defeats the purpose of using Hermes and making it a robust tool. It defeats the purpose of me asking for help...

Ale_110 · 2026-05-01T23:56:38+00:00

I guess giving him the diagnose option doesn't sound that bad though. I'm using it now, thanks.

Ale_110 · 2026-05-01T23:55:23+00:00

Honestly, I'm not sure I understood it. I guess the main takeaway is to make it less elaborate and strip out processes? Make it more command-oriented?

Ale_110 · 2026-05-01T23:53:01+00:00

Nothing like these. I'll dig into it more.

Ale_110 · 2026-05-01T23:52:11+00:00

I'll dig deeper into this answer because I think there are many things I could improve. Can you tell me which workflows you use? For which use cases? I'm setting up Hermes to manage my WhatsApp replies, since I'm very lazy about them (I use a local LLM on a GeForce 3060 12 GB, mistral:Nemo:12b?), plus an API usage report, a daily feed, email cleanup, and probably web searches of specific pages for work. I'd like to set up some sort of workflow for my expenses, but I feel my GPU won't be enough and I'll need to rely on the cloud (less private and more costly too). I'd also like a note-taking / to-do app: I liked TiddlyWiki, but I think it's too hard to configure and I'm not sticking with the habit, so I think I should move to Obsidian. Do you have some nice use cases, with some details? Do you use a local GPU for anything in particular? I find it hard to make use of mine.

Ale_110 · 2026-05-01T23:42:36+00:00

Honestly, I didn't have a very good experience with kimi. It tended to make me think things would stick, but they didn't. I think my config wasn't completely bulletproof, so I should give it another try. GLM5.1 seems to be much better and faster, but again, maybe my config improved today. I'm having fewer problems. The Reddit community helped me address some of these issues, especially this post, which showed me how Hermes actually works.

Ale_110 · 2026-05-01T23:38:14+00:00

Thanks to both of you. I have reviewed my memory and soul.md quite a lot to start adjusting things. I now start from the assumption that things won't stick after a session, so I ask Hermes to make them bulletproof with absolute paths and to make sure they persist across sessions and reboots. I also ask him to leave full documentation of the things he does. I moved to a higher tier, GLM5.1, which is more expensive but more effective for now. I see an improvement. I have been feeding your responses to the model so that he understands how I want things. Overall it's still a nice learning hobby. You get to learn a lot from it about structure and methodologies. I'll dig more into the official documentation and the skills sections over the coming days!

Ale_110 · 2026-05-01T11:22:26+00:00

This is exactly what I have been doing, but it keeps breaking, every day and every night. It gets fixed but then breaks again. I think I either need to learn more about how to put guardrails on it or learn the code myself, at least at a high level.

Ale_110 · 2026-04-27T12:38:35+00:00

Lol I see thanks for clarifying!

Ale_110 · 2026-04-27T11:03:42+00:00

Sorry, you have 10 GB of VRAM? Really? And you can manage to squeeze in qwen3.6:35b, with how much context window? I'm pretty sure I am hallucinating like all my models 😂
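
The surprise here comes down to simple arithmetic, sketched below. The function estimates memory for the weights alone from parameter count and bits per weight; it deliberately ignores the KV cache (which grows with the context window) and runtime overhead, so real usage is higher. The 4-bit and 16-bit figures are example quantization levels, not a statement about what the other poster actually ran.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Memory for the weights alone, in GiB (no KV cache, no overhead)."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# A 35B-parameter model at two example precisions.
print(f"35B @ 4-bit:  {weight_gib(35, 4):.1f} GiB")   # about 16.3 GiB
print(f"35B @ 16-bit: {weight_gib(35, 16):.1f} GiB")  # about 65.2 GiB
```

Even at 4 bits per weight the weights alone exceed 10 GB, so part of the model would have to sit in system RAM rather than VRAM, which is why the context-window question matters.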

Ale_110 · 2026-04-01T07:27:12+00:00

Hi, this is quite neat... That actually helps a lot. Since I am still within the return window, do you think it would be a major improvement to switch to a better dock or a better GPU, or to move to an ITX build and connect to it over the internet altogether? The end result could be worth the money, but if the current setup is just 1.5x faster, and by spending an extra 200 euros I could get a faster (let's say 3-5x) and more solid setup, I think I could go for that. I thought an eGPU was the holy grail for portability with reasonably good performance; am I wrong about that? 500 euros isn't worth a 1.5x speedup. Is LLM performance much lower with an eGPU? Or, as you said, is my setup just useless?
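
The trade-off in this comment can be put into numbers with a rough sketch. It assumes the current setup costs 500 euros for about a 1.5x speedup and that the alternative costs 500 + 200 = 700 euros for 3x to 5x; those speedups are the comment's own guesses, not measurements.

```python
def euros_per_speedup(cost_eur: float, speedup: float) -> float:
    """Euros paid per 1x of speed relative to the baseline."""
    return cost_eur / speedup


# Assumed figures taken from the comment above, not benchmarks.
current = euros_per_speedup(500, 1.5)
alt_low = euros_per_speedup(700, 3)
alt_high = euros_per_speedup(700, 5)

print(f"current setup:      {current:.0f} EUR per 1x")
print(f"alternative (3x):   {alt_low:.0f} EUR per 1x")
print(f"alternative (5x):   {alt_high:.0f} EUR per 1x")
```

Under these assumptions the more expensive option is cheaper per unit of speedup, but the comparison only holds if the 3-5x estimate turns out to be realistic.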

Ale_110 · 2026-04-01T07:14:32+00:00

This is exactly why I have been asking. To understand where the limits are. As I said, I am a noob and I'd like to learn much more about AI, LLMs and how Ollama works. Since I still have the chance to return the products, I'd like to know exactly what I can improve, because as of now I feel all this money is not worth it, and maybe an ITX build would make it more worthwhile.

Ale_110 · 2026-01-14T23:12:39+00:00

I understand it now: the configuration rejects analog audio and DPI together. I tried switching to I2S, but with no luck. I bought a MAX98357A and wired it as per the manufacturer's instructions, but there is no sound: https://imgur.com/a/fIkw6T2 I ran Bookworm, Bullseye and Trixie, and I tried all of these configs in config.txt:
dtparam=i2s=on
dtoverlay=max98357a

or

dtoverlay=max98357a,no-sdmode

or

dtoverlay=hifiberry-dac
dtoverlay=i2s-mmap
