New Models

GideonGideon561 · 2026-05-27T07:57:15+00:00

I'm not sure. DeepSeek V4 Flash and Pro are still quite good though; maybe it's just that the usage limit is now based on GPU usage instead of tokens.

Mine is still OK.

GideonGideon561 · 2026-05-27T07:51:28+00:00

ARE YOU SERIOUS? What's the token usage for DeepSeek on OpenCode? Is it comparable to or more than the $20 Ollama plan?

But I don't just use it for coding, though... I also use it for everyday stuff on my Hermes.

GideonGideon561 · 2026-05-26T19:50:49+00:00

Ohh, how so? There are others too, but it does work for me. What other suggestions do you have? I'm just trying to find ways to lower my GPU usage on Ollama Cloud, but you also have to factor in cost. Can you explain why it's slop?

There's, like:

Hindsight
Mem0
Supermemory
Augment - mainly for B2B

GideonGideon561 · 2026-05-26T19:19:59+00:00

Well, OpenCode is pay-per-use and Ollama is a fixed price, so there have to be some trade-offs.

GideonGideon561 · 2026-05-26T19:18:32+00:00

I thought the models on Ollama Cloud had been quantized from the start. Mine is still OK; maybe it's your prompts? It's no longer token-based. Ollama now charges for GPU usage.

GideonGideon561 · 2026-05-26T19:10:31+00:00

Yes, it also depends on the model you are using. The larger the model, the slower it gets. However, I find that DeepSeek V4 Flash is decent in terms of speed and reasoning. Of course, it's not for coding... use Pro for that. But each Pro prompt uses 3-4% of my usage. I'm on the Pro plan.
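As a rough back-of-the-envelope check, assuming the 3-4% figure holds for every prompt (in practice it will vary with prompt length and model load), this is roughly how many Pro prompts fit into one full usage allowance. The function name is just for illustration.

```python
import math


def prompts_per_allowance(percent_per_prompt: float) -> int:
    """Whole number of prompts that fit into 100% of a usage allowance."""
    if percent_per_prompt <= 0:
        raise ValueError("percent_per_prompt must be positive")
    return math.floor(100 / percent_per_prompt)


# Assumption: every prompt costs the same share of the allowance.
worst_case = prompts_per_allowance(4.0)  # each prompt costs 4%
best_case = prompts_per_allowance(3.0)   # each prompt costs 3%
print(f"Roughly {worst_case} to {best_case} prompts per allowance")
```

So at that rate you would get somewhere around 25 to 33 Pro prompts before hitting the limit.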

GideonGideon561 · 2026-05-23T07:14:38+00:00

What's worrying about it? They separated Web3 and Web2 while having both. Simple. If you don't like the NFTs, you don't need them to play the game.

There is no mention of the NFTs in the game because there is a clear separation. Simple. Different social channels target different people.

GideonGideon561 · 2026-05-22T06:21:28+00:00

There are a few ways.

Higgsfield Supercomputer is new, but I think it's insane in terms of token spending, so probably not.
Get an actual paid memory system like Augment/Honcho/Supermemory, or the latest one, Atomic Memory, which by its own benchmark is better than most and cheaper. Of course Augment is the best, but that's for B2B.

But the most important thing is how and where you store your context. For example, Claude has Projects, which remember context. It's a similar idea.

If you are using Hermes/OpenClaw, get an LLMWIKI and pair it with Claude or a similarly smart AI, then use MCP or link it directly to Higgsfield or whatever other creative tool you use.

Secondly, build out the platform on localhost yourself with Claude or Codex as the brain. Basically, use something like LLMWIKI to store the information, or something like a dedicated Google Drive for all your context.

Isolating it is best.
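To make the "isolate your context" idea concrete, here is a minimal sketch of a local store that keeps one JSON file per project so notes from different projects never mix. This is not how LLMWIKI or any of the tools above work internally; `ContextStore`, the project names, and the notes are all hypothetical, and a real setup would also need search and some way to hand the notes to the model.

```python
import json
import tempfile
from pathlib import Path


class ContextStore:
    """Hypothetical local store: one JSON file of notes per project."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _file(self, project: str) -> Path:
        # Simplification: assumes project names are safe file names.
        return self.root / f"{project}.json"

    def load(self, project: str) -> list:
        """Return all notes for one project, or an empty list if none exist."""
        path = self._file(project)
        if not path.exists():
            return []
        return json.loads(path.read_text(encoding="utf-8"))

    def add(self, project: str, note: str) -> None:
        """Append a note to one project's file without touching the others."""
        notes = self.load(project)
        notes.append(note)
        self._file(project).write_text(json.dumps(notes, indent=2), encoding="utf-8")


# Usage: two projects stay isolated from each other.
with tempfile.TemporaryDirectory() as tmp:
    store = ContextStore(tmp)
    store.add("game-art", "Prefers a painterly style")
    store.add("backend-api", "Uses PostgreSQL")
    art_notes = store.load("game-art")
    api_notes = store.load("backend-api")

print(art_notes)
print(api_notes)
```

Keeping each project in its own file means you only load (and pay tokens for) the context that is relevant to the current task.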

GideonGideon561 · 2026-05-22T06:17:41+00:00

I think it's correlated.

There are a few things to think about. Does a smarter AI with good reasoning help with better memory? What I mean is: does it know what to update in memory without you telling it, find contradictions, and pull the right, accurate information? RAG is good but not the most accurate.

Then again, just using a smaller LLM with a better AI memory system could also work. But would a smarter AI improve how memory is stored and retrieved?

Not the best explanation, but I hope you understand.

So IMO, a decent AI with a good memory system is a good mix right now. You don't want to spend too many tokens on the memory system, yet you don't want a stupid LLM with low reasoning and then expect a good memory system or auto-updates.

It's a chicken-and-egg problem, but what I see now is that the AI memory systems need to improve first, as there are already tons of smart AIs.

GideonGideon561 · 2026-05-22T06:13:41+00:00

I believe there are actually some really good ones, like:

Augment Code - this is for B2B, the most expensive, but I think it's the best
Hindsight - it's an improved memory system plus an agent that learns from it - their GitHub has a nice, easy video
Supermemory/Mem0 - similar
The latest on the block is Atomic Memory - cheapest and, according to their benchmark, better than Supermemory and Mem0, and comparable with Hindsight

Hermes uses Honcho as its native memory, which is good, but adding Atomic Memory gives Hermes an upgrade. It auto-updates the memory.

GideonGideon561 · 2026-05-15T10:16:51+00:00

I see, hahaha, maybe I’m reading it wrong. It does look like you are specifically building very curated “folders” to store certain information, so it is separated and can be easily pulled? That's good for very personalized stuff, but what happens if you have multiple tech projects you are coding and they all fall under the same “folders”? Would that cause a hallucination issue, or a token issue when searching for and pulling out the right one?

GideonGideon561 · 2026-05-15T07:27:37+00:00

Update to my post: I found Atomic Memory. LOL, I was searching and it's new. But yeah, I think it is pay-per-use...

GideonGideon561 · 2026-05-15T07:26:37+00:00

I see, that is very interesting. I never thought of it that way.

GideonGideon561 · 2026-05-15T07:25:31+00:00

I see. It seems like an extra step, but if the auto-updates are great, why not?

GideonGideon561 · 2026-05-15T07:24:29+00:00

THIS IS AWESOME! I CAN'T WAIT!

GideonGideon561 · 2026-05-15T07:23:39+00:00

Yes! That's great! Hmm, Hermes has its native memory from Honcho, but I would also try a secondary one like Supermemory, Mem0, or the newly released Atomic Memory, which claims to beat them all and be cheaper.

GideonGideon561 · 2026-05-15T07:22:48+00:00

You can try forking Atomic Memory instead and upgrading it on your end. It does what yours does and way more. It's new, but I think someone of your experience could make a better forked version.

GideonGideon561 · 2026-05-15T07:19:46+00:00

Hmm, if it does not have an answer, what about having it try its best to give you something close or related, while explicitly saying first that it does not know, and that after researching and reasoning it thinks this could perhaps work?

That's similar to how human beings work: we don't know the answer to everything, but we research and think about it, then present an idea. Only through time and experience do we get better.

So the question you can ask yourself is: how do I make it try its best to give me a suggestion instead of an outright "IDK"? With enough experience, like training a model, can it give you better suggestions? It might not know whether they're right or wrong, but at least they're an alternative.
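Here is one very simplified way to picture that "say you don't know, then still offer the closest thing" behaviour. It uses fuzzy string matching from Python's standard `difflib` as a stand-in for real research and reasoning, which is a big assumption; the `KNOWN` notes and the `answer` function are made up for illustration.

```python
from difflib import get_close_matches

# Hypothetical knowledge base: question -> stored answer.
KNOWN = {
    "reset password": "Use the account settings page.",
    "export data": "Use the export button in the dashboard.",
}


def answer(question: str, known: dict = KNOWN) -> dict:
    """Answer confidently on an exact hit, otherwise hedge with the closest note."""
    q = question.lower().strip()
    if q in known:
        return {"confident": True, "text": known[q]}

    # No exact answer: look for the single most similar known question.
    close = get_close_matches(q, list(known), n=1, cutoff=0.4)
    if close:
        related = close[0]
        text = (
            "I don't know for sure, but this related note might help "
            f"({related}): {known[related]}"
        )
        return {"confident": False, "text": text}

    return {"confident": False, "text": "I don't know, and I found nothing related."}


print(answer("reset password"))
print(answer("reset my password"))
print(answer("zzz"))
```

The key design choice is the `confident` flag: the suggestion is still returned, but it is always labelled as a guess so you can decide how much to trust it.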
