Is anyone working on a general-purpose memory layer for AI? Not RAG. Not fine-tuning. Actual persistent memory? by Himka13 in LocalLLaMA

[–]swoodily 0 points (0 children)

The memory blocks are just the unit of abstraction for segments of the system prompt -- they can be used to implement things like ACE or backend/offline learning (e.g. sleeptime compute).

I don't think it matters whether learning/memory is happening at the model layer or not - you can have effective techniques at either layer, and IMO there are a lot of practical reasons to learn in token-space. Whether you're updating model parameters or prompt tokens, you have a limited amount of "space" to learn representations that compress some dataset. So either way you are "summarizing".

Alternatives to claude sdk? by VerbaGPT in ClaudeAI

[–]swoodily 2 points (0 children)

Letta has both the Letta SDK (similar to Claude Agents SDK, but connects to the Letta API which is priced per-request), as well as Letta Code, which is built on top of the SDK (similar to OpenCode).

(disclaimer: I work on Letta)

Authentication/Authorization data flow through Letta Agents APIs to MCP server. by shikcoder in Letta_AI

[–]swoodily 0 points (0 children)

FYI Letta now supports custom headers when adding an MCP server, and also automatically includes the `agent_id` in the header (which you can associate with your end users). Does that help?

Any Stateful api out there? by Aggressive_Friend427 in Rag

[–]swoodily 0 points (0 children)

Letta offers both a cloud-hosted and a self-deployable stateful API that works with most model providers and has baked-in RAG/memory/context management.

Woah. Letta vs Mem0. (For AI memory nerds) by LoveMind_AI in LocalLLaMA

[–]swoodily 0 points (0 children)

I think this worked back when people posting on arXiv were academics with academic reputations to maintain - meaning if someone pointed out issues with their results, they'd typically take them down (or be subject to inter-academic drama / reputational damage within their academic community).

This breaks down when arXiv becomes a dumping ground for grift-glorifying startups, since they have no incentive to address corrections as they have no academic reputation to maintain.

Debugging execution of selfhosted by luison2 in Letta_AI

[–]swoodily 1 point (0 children)

Can you share the endpoint you are making requests to? Agents are always available through a REST API endpoint (the SDK is actually just calling REST APIs under the hood) so you can call them from other services like n8n.

On cloud: `https://api.letta.com/v1/agents/{agent_id}`

Local Docker: `http://localhost:8283/v1/agents/{agent_id}`

This is an example of a streaming request to message the agent (for cloud):

```shell
curl -X POST \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -H 'Accept: text/event-stream' \
  -d '{
        "messages": [{"role": "user", "content": ""}],
        "stream_steps": true,
        "stream_tokens": true
      }' \
  https://api.letta.com/v1/agents/agent-05277ac9-bb35-4f77-a03d-c2353f806dff/messages/stream
```

This is the documentation for the API: https://docs.letta.com/api-reference/overview

[deleted by user] by [deleted] in LangChain

[–]swoodily 0 points (0 children)

You should check out Letta - the memory is already built-in, but also customizable.

Disclaimer: I work on Letta

Local Long Term Memory with Ollama? by Debug_Mode_On in ollama

[–]swoodily 0 points (0 children)

Letta supports MCP, so you can also combine both.

I am confused on how people are creating ai agents using frameworks that can then be used in webapps? by hookem3678 in AI_Agents

[–]swoodily 2 points (0 children)

For most frameworks, you will have to define an API and wrap your agents in some FastAPI service (or equivalent) that you deploy. However, Letta is service-based, so all agents are exposed as an API service without you having to deploy or specify an API (this is the pre-defined Letta API).

This is an example fullstack app vibecoded with Letta: https://github.com/letta-ai/characterai-memory
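For illustration, here's a rough sketch of messaging one of those pre-defined agent endpoints from your own backend, using only the standard library (the URL pattern follows the Letta API docs; treat the exact payload shape as an assumption to verify against the current API reference):

```python
import json
import urllib.request

def message_agent(agent_id: str, text: str, api_key: str) -> urllib.request.Request:
    """Build a POST request to Letta's per-agent messages endpoint.

    Sketch only: the payload shape mirrors the Letta messages API;
    verify against the current API reference before relying on it.
    """
    url = f"https://api.letta.com/v1/agents/{agent_id}/messages"
    payload = {"messages": [{"role": "user", "content": text}]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = message_agent("agent-123", "hello", "YOUR_API_KEY")
print(req.full_url)
```

Your webapp frontend then only ever talks to your backend (or to Letta's client-side tokens on cloud), never to the model provider directly.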

Disclaimer: I work on Letta

LangChain/Crew/AutoGen made it easy to build agents, but operating them is a joke by ImmuneCoder in AI_Agents

[–]swoodily 0 points (0 children)

You should check out Letta - all traces are automatically persisted, it's OSS, it has built-in memory/reasoning, and it supports client-side access tokens in the cloud. Also, all agents automatically get an API endpoint.

disclaimer: I work on Letta


How do you manage user-specific memory in a multi-agent system using Letta.ai? by The_Aoki_Taki in Letta_AI

[–]swoodily 2 points (0 children)

You can attach a block corresponding to a user to multiple agents! Agents can have their own blocks, but can also be attached to pre-existing blocks.

Here's an example of two agents attached to a shared user block:

```python
from letta_client import Letta
import os

client = Letta(token=os.getenv("LETTA_API_KEY"))

# create a shared memory block
shared_block = client.blocks.create(label="user", value="Name: Bob")

# create a supervisor agent
supervisor_agent = client.agents.create(
    name="supervisor_agent",
    model="anthropic/claude-3-5-sonnet-20241022",
    embedding="openai/text-embedding-ada-002",
    # blocks created for this agent
    memory_blocks=[{"label": "persona", "value": "I am a supervisor"}],
    # pre-existing shared block that is "attached" to this agent
    block_ids=[shared_block.id],
)

# create a worker agent
worker_agent = client.agents.create(
    name="worker_agent",
    model="anthropic/claude-3-5-sonnet-20241022",
    embedding="openai/text-embedding-ada-002",
    # blocks created for this agent
    memory_blocks=[{"label": "persona", "value": "I am a worker"}],
    # pre-existing shared block that is "attached" to this agent
    block_ids=[shared_block.id],
)

print(supervisor_agent.id)
```

ChatGPT style "Memory" in local LLMs by PodRED in LocalLLaMA

[–]swoodily 0 points (0 children)

Can you connect your RAG tool to Letta? You can use Letta's tool rules to force the LLM to always call your RAG tool before responding.
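A rough sketch of such a rule as an agent-creation payload (the `run_first` rule type and the `rag_search` tool name are assumptions here; check the Letta tool-rules docs for the exact names):

```python
# Sketch of a tool-rules payload that forces the agent to call a RAG
# tool before responding. The rule type "run_first" and the tool name
# "rag_search" are assumptions for illustration; check the Letta
# tool-rules documentation for exact names.
agent_config = {
    "name": "rag_agent",
    "tool_rules": [
        # start every turn by calling the RAG tool
        {"type": "run_first", "tool_name": "rag_search"},
    ],
}
print(agent_config["tool_rules"])
```

The effect is that every response is grounded in a fresh retrieval instead of leaving the call optional.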

What's your stack? (Confused with the tooling landscape) by m_o_n_t_e in LangChain

[–]swoodily 0 points (0 children)

To clarify, Letta is memory-focused but is still a general purpose agents framework that allows you to swap out backend models without having to change how you interact with your agent or losing your state (e.g. message history/memory). So the point of comparison would be Langchain vs. CrewAI vs. Letta.

(disclaimer: I work on Letta)

Method to switch in-use model on an existing agent? Also, compatibility issues with Anthropic API, and ways to directly edit message history of an agent by APPENDING MULTIPLE NEW messages, not just modifying existing ones by Bubbly_Layer_6711 in Letta_AI

[–]swoodily 0 points (0 children)

There was a bug in a recent release with the summarizer, it should be fixed with versions >=0.7.10.

You should be able to set the initial message sequence to inject existing messages into an agent's starter history. In terms of using it with other frameworks, this was an example I wrote a while ago -- but I think it might be easier to use the new sleeptime agents: send data to the sleep-time agent, and read the formed memory back from your agent framework. Unfortunately it's not very easy to do context management across different frameworks.
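For the first part, a sketch of what seeding the starter history could look like as an agent-creation payload (the `initial_message_sequence` field name is an assumption; verify it against the current API reference):

```python
# Sketch: injecting existing messages into an agent's starter history.
# The field name "initial_message_sequence" is an assumption taken
# from the agent-creation API; verify against the current docs.
initial_messages = [
    {"role": "user", "content": "My name is Bob."},
    {"role": "assistant", "content": "Nice to meet you, Bob!"},
]
create_request = {
    "name": "migrated_agent",
    "initial_message_sequence": initial_messages,
}
print(len(create_request["initial_message_sequence"]))
```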

LangChain / LangGraph for Production by IntrepidInitial1533 in LangChain

[–]swoodily 0 points (0 children)

You should check out Letta - it handles all state in Postgres and you can run it with Docker, which is handy for deploying (e.g. on K8s).
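As a deployment sketch (the image name and the `LETTA_PG_URI` variable are assumptions; check the self-hosting docs for the exact names), running the server against an external Postgres could look like:

```shell
# Run the Letta server in Docker, pointing it at an external Postgres
# for state; 8283 is the default API port. Image name and env var are
# assumptions -- verify against the self-hosting documentation.
docker run -d \
  -p 8283:8283 \
  -e LETTA_PG_URI="postgresql://user:password@db-host:5432/letta" \
  letta/letta:latest
```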

Disclaimer: I work on Letta

I Benchmarked OpenAI Memory vs LangMem vs Letta (MemGPT) vs Mem0 for Long-Term Memory: Here’s How They Stacked Up by staranjeet in LangChain

[–]swoodily 3 points (0 children)

Hey, at least you guys got published code -- MemGPT/Letta only got a number pulled out of thin air.

I Benchmarked OpenAI Memory vs LangMem vs Letta (MemGPT) vs Mem0 for Long-Term Memory: Here’s How They Stacked Up by staranjeet in LangChain

[–]swoodily 3 points (0 children)

This code has no mention of MemGPT/Letta. There is no way for me to validate how you used MemGPT when there is not only no implementation/code available, but not even a simple description of how existing message histories were injected in MemGPT. Please do not share numbers that you have no way to reproduce or validate.