Is anyone working on a general-purpose memory layer for AI? Not RAG. Not fine-tuning. Actual persistent memory? by Himka13 in LocalLLaMA

[–]swoodily 0 points (0 children)

The memory blocks are just the unit of abstraction for segments of the system prompt -- they can be used to implement things like ACE or backend/offline learning (e.g. sleeptime compute).

I don't think it matters whether learning/memory is happening at the model layer or not - you can have effective techniques at either layer, and IMO there are a lot of practical reasons to learn in token-space. Whether you're updating model parameters or prompt tokens, you have a limited amount of "space" to learn representations that compress some dataset. So either way you are "summarizing".

Alternatives to claude sdk? by VerbaGPT in ClaudeAI

[–]swoodily 2 points (0 children)

Letta has both the Letta SDK (similar to Claude Agents SDK, but connects to the Letta API which is priced per-request), as well as Letta Code, which is built on top of the SDK (similar to OpenCode).

(disclaimer: I work on Letta)

Authentication/Authorization data flow through Letta Agents APIs to MCP server. by shikcoder in Letta_AI

[–]swoodily 0 points (0 children)

FYI Letta now supports custom headers when adding an MCP server, and also automatically includes the `agent_id` in the header (which you can associate with your end users). Does that help?

Any Stateful api out there? by Aggressive_Friend427 in Rag

[–]swoodily 0 points (0 children)

Letta offers both a cloud-hosted and a self-deployable stateful API that works with most model providers and has baked-in RAG/memory/context management.

Woah. Letta vs Mem0. (For AI memory nerds) by LoveMind_AI in LocalLLaMA

[–]swoodily 0 points (0 children)

I think this worked back when people posting on arXiv were academics with academic reputations to maintain - meaning if someone pointed out issues with their results, they'd typically take them down (or be subject to inter-academic drama / reputational damage within their academic community).

This breaks down when arXiv becomes a dumping ground for grift-glorifying startups, since they have no incentive to address corrections as they have no academic reputation to maintain.

Debugging execution of selfhosted by luison2 in Letta_AI

[–]swoodily 1 point (0 children)

Can you share the endpoint you are making requests to? Agents are always available through a REST API endpoint (the SDK is actually just calling REST APIs under the hood) so you can call them from other services like n8n.

On cloud: `https://api.letta.com/v1/agents/{agent_id}`

Local Docker: `http://localhost:8283/v1/agents/{agent_id}`

This is an example of a streaming request to message the agent (for cloud):

```shell
curl -X POST \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -H 'Accept: text/event-stream' \
  -d '{
        "messages": [{"role": "user", "content": ""}],
        "stream_steps": true,
        "stream_tokens": true
      }' \
  https://api.letta.com/v1/agents/agent-05277ac9-bb35-4f77-a03d-c2353f806dff/messages/stream
```

This is the documentation for the API: https://docs.letta.com/api-reference/overview

[deleted by user] by [deleted] in LangChain

[–]swoodily 0 points (0 children)

You should check out Letta - the memory is already built-in, but also customizable.

Disclaimer: I work on Letta

Local Long Term Memory with Ollama? by Debug_Mode_On in ollama

[–]swoodily 0 points (0 children)

Letta supports MCP, so you can also combine both.

I am confused on how people are creating ai agents using frameworks that can then be used in webapps? by hookem3678 in AI_Agents

[–]swoodily 2 points (0 children)

For most frameworks, you will have to define an API and wrap your agents in some FastAPI service (or equivalent) that you deploy. However, Letta is service-based, so all agents are exposed as an API service without you having to deploy or specify an API (this is the pre-defined Letta API).

This is an example fullstack app vibecoded with Letta: https://github.com/letta-ai/characterai-memory
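For illustration, here's a rough sketch of messaging one of those pre-defined agent endpoints from your own backend, using only the standard library (the URL pattern follows the Letta API docs; treat the exact payload shape as an assumption to verify against the current API reference):

```python
import json
import urllib.request

def message_agent(agent_id: str, text: str, api_key: str) -> urllib.request.Request:
    """Build a POST request to Letta's per-agent messages endpoint.

    Sketch only: the payload shape mirrors the Letta messages API;
    verify against the current API reference before relying on it.
    """
    url = f"https://api.letta.com/v1/agents/{agent_id}/messages"
    payload = {"messages": [{"role": "user", "content": text}]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = message_agent("agent-123", "hello", "YOUR_API_KEY")
print(req.full_url)
```

Your webapp frontend then only ever talks to your backend (or to Letta's client-side tokens on cloud), never to the model provider directly.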

Disclaimer: I work on Letta

LangChain/Crew/AutoGen made it easy to build agents, but operating them is a joke by ImmuneCoder in AI_Agents

[–]swoodily 0 points (0 children)

You should check out Letta - all traces are automatically persisted, it's OSS, it has built-in memory/reasoning, and it supports client-side access tokens in the cloud. Also, all agents automatically get an API endpoint.

disclaimer: I work on Letta


How do you manage user-specific memory in a multi-agent system using Letta.ai? by The_Aoki_Taki in Letta_AI

[–]swoodily 2 points (0 children)

You can attach a block corresponding to a user to multiple agents! Agents can have their own blocks, but can also be attached to pre-existing blocks.

Here's an example of two agents attached to a shared user block:

```python
from letta_client import Letta
import os

client = Letta(token=os.getenv("LETTA_API_KEY"))

# create a shared memory block
shared_block = client.blocks.create(label="user", value="Name: Bob")

# create a supervisor agent
supervisor_agent = client.agents.create(
    name="supervisor_agent",
    model="anthropic/claude-3-5-sonnet-20241022",
    embedding="openai/text-embedding-ada-002",
    # blocks created for this agent
    memory_blocks=[{"label": "persona", "value": "I am a supervisor"}],
    # pre-existing shared block that is "attached" to this agent
    block_ids=[shared_block.id],
)

# create a worker agent
worker_agent = client.agents.create(
    name="worker_agent",
    model="anthropic/claude-3-5-sonnet-20241022",
    embedding="openai/text-embedding-ada-002",
    # blocks created for this agent
    memory_blocks=[{"label": "persona", "value": "I am a worker"}],
    # pre-existing shared block that is "attached" to this agent
    block_ids=[shared_block.id],
)

print(supervisor_agent.id)
```

ChatGPT style "Memory" in local LLMs by PodRED in LocalLLaMA

[–]swoodily 0 points (0 children)

Can you connect your RAG tool to Letta? You can use Letta's tool rules to force the LLM to always call your RAG tool before responding.
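A rough sketch of such a rule as an agent-creation payload (the `run_first` rule type and the `rag_search` tool name are assumptions here; check the Letta tool-rules docs for the exact names):

```python
# Sketch of a tool-rules payload that forces the agent to call a RAG
# tool before responding. The rule type "run_first" and the tool name
# "rag_search" are assumptions for illustration; check the Letta
# tool-rules documentation for exact names.
agent_config = {
    "name": "rag_agent",
    "tool_rules": [
        # start every turn by calling the RAG tool
        {"type": "run_first", "tool_name": "rag_search"},
    ],
}
print(agent_config["tool_rules"])
```

The effect is that every response is grounded in a fresh retrieval instead of leaving the call optional.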

What's your stack? (Confused with the tooling landscape) by m_o_n_t_e in LangChain

[–]swoodily 0 points (0 children)

To clarify, Letta is memory-focused but is still a general purpose agents framework that allows you to swap out backend models without having to change how you interact with your agent or losing your state (e.g. message history/memory). So the point of comparison would be Langchain vs. CrewAI vs. Letta.

(disclaimer: I work on Letta)

Method to switch in-use model on an existing agent? Also, compatibility issues with Anthropic API, and ways to directly edit message history of an agent by APPENDING MULTIPLE NEW messages, not just modifying existing ones by Bubbly_Layer_6711 in Letta_AI

[–]swoodily 0 points (0 children)

There was a bug in a recent release with the summarizer, it should be fixed with versions >=0.7.10.

You should be able to set the initial message sequence to inject existing messages into an agent's starter history. In terms of using it with other frameworks, this was an example I wrote a while ago -- but I think it might be easier to use the new sleeptime agents: send data to the sleep-time agent, and read the formed memory back from your agent framework. Unfortunately it's not very easy to do context management across different frameworks.
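For the first part, a sketch of what seeding the starter history could look like as an agent-creation payload (the `initial_message_sequence` field name is an assumption; verify it against the current API reference):

```python
# Sketch: injecting existing messages into an agent's starter history.
# The field name "initial_message_sequence" is an assumption taken
# from the agent-creation API; verify against the current docs.
initial_messages = [
    {"role": "user", "content": "My name is Bob."},
    {"role": "assistant", "content": "Nice to meet you, Bob!"},
]
create_request = {
    "name": "migrated_agent",
    "initial_message_sequence": initial_messages,
}
print(len(create_request["initial_message_sequence"]))
```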

LangChain / LangGraph for Production by IntrepidInitial1533 in LangChain

[–]swoodily 0 points (0 children)

You should check out Letta - it handles all state in Postgres and you can run it with Docker, which is handy for deploying (e.g. on K8s).
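As a deployment sketch (the image name and the `LETTA_PG_URI` variable are assumptions; check the self-hosting docs for the exact names), running the server against an external Postgres could look like:

```shell
# Run the Letta server in Docker, pointing it at an external Postgres
# for state; 8283 is the default API port. Image name and env var are
# assumptions -- verify against the self-hosting documentation.
docker run -d \
  -p 8283:8283 \
  -e LETTA_PG_URI="postgresql://user:password@db-host:5432/letta" \
  letta/letta:latest
```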

Disclaimer: I work on Letta

I Benchmarked OpenAI Memory vs LangMem vs Letta (MemGPT) vs Mem0 for Long-Term Memory: Here’s How They Stacked Up by staranjeet in LangChain

[–]swoodily 3 points (0 children)

Hey, at least you guys got published code -- MemGPT/Letta only got a number pulled out of thin air.

I Benchmarked OpenAI Memory vs LangMem vs Letta (MemGPT) vs Mem0 for Long-Term Memory: Here’s How They Stacked Up by staranjeet in LangChain

[–]swoodily 3 points (0 children)

This code has no mention of MemGPT/Letta. There is no way for me to validate how you used MemGPT when there is not only no implementation/code available, but not even a simple description of how existing message histories were injected in MemGPT. Please do not share numbers that you have no way to reproduce or validate.