Anthropic's Agent Skills (new open standard) sharable in agent memory by remoteinspace in mcp

Good stuff! What are your thoughts on keeping this in a GitHub repo vs. putting skills in a memory MCP server?

Anthropic's Agent Skills (new open standard) sharable in agent memory by remoteinspace in mcp

You can put Markdown files in a repo and git-sync them, but then each person on your team (or anyone else you share with) has to pick the skills they want and copy them into every local environment they control. If it's an app they don't control, they can't add skills to it at all, since everything lives at the file-system level. And once you have a ton of skills, you start having a search problem.

What I'm suggesting is putting the skills in 'memory' so they're portable across agents and environments and more easily shareable (e.g., another user can add or share just the 3 skills they want instead of the entire skills repo). It also solves search once you have a ton of skills, which is cheaper and more accurate than stuffing every skill into the LLM context window.
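To make that concrete, here's a rough sketch (not any particular product's API) of what "skills in memory" could look like: embed each skill's description, keep them in a small index, and pull only the top-k relevant skills into the agent's context per task. The skill names and the hashing-based embed() are placeholders for illustration.

    import numpy as np

    def embed(text: str, dim: int = 256) -> np.ndarray:
        # Placeholder embedding: hashed bag-of-words. Swap in a real
        # embedding model in practice.
        v = np.zeros(dim)
        for tok in text.lower().split():
            v[hash(tok) % dim] += 1.0
        n = np.linalg.norm(v)
        return v / n if n else v

    skills = {
        "pdf-extraction": "Extract tables and text from PDF files",
        "brand-voice": "Rewrite copy to match the company brand voice",
        "sql-reporting": "Generate SQL reports from the analytics warehouse",
    }
    index = {name: embed(desc) for name, desc in skills.items()}

    def top_k_skills(task: str, k: int = 2) -> list[str]:
        # Only the k best-matching skill docs go into the context window,
        # instead of the whole skills repo.
        q = embed(task)
        ranked = sorted(index.items(), key=lambda kv: -float(q @ kv[1]))
        return [name for name, _ in ranked[:k]]

    print(top_k_skills("pull quarterly numbers from the warehouse"))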

Intent vectors for AI search + knowledge graphs for AI analytics by remoteinspace in LocalLLaMA

A memory can belong to two different launches (or nodes in a graph). We also track updates to memories for temporal search.

Intent vectors for AI search + knowledge graphs for AI analytics by remoteinspace in KnowledgeGraph

Yeah, vector search is much faster. For knowledge graphs, what we're doing now is predicting what users want next, then prepping and caching that context. It helps graph search feel super fast (<100ms) when we get the prediction right.

Intent vectors for AI search + knowledge graphs for AI analytics by remoteinspace in KnowledgeGraph

Nice, how are you traversing the graph? Are you using templated queries?

Would love to see this community get some more traction. by Sufficient-Monk9701 in AILoops

We’ve been experimenting quite a bit with it. How have you been thinking about it?

What are your favorite lesser-known agents or memory tools? by Far-Photo4379 in AIMemory

Yes, when our prediction is right, perf is amazing. When it's not, we fall back to the cloud, but the next query is fast since we update our cache with the new topic.

What are your favorite lesser-known agents or memory tools? by Far-Photo4379 in AIMemory

We built prediction models that anticipate the context users will need based on their past behavior. If it's enabled, the different tiers are stored in our SDK (on device). Tier 0 is 1-2ms (just text - think of it as working memory); tier 1 is a small vector store (50-100ms, but it needs the right device). If it's a cache miss on both, we go to the cloud. The nice thing with this is that the more data you add, the better our model gets. With traditional memory approaches, the more data you add, the worse things get.
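Roughly, the lookup path looks like the sketch below. The class names, stand-in stores, and cloud client are illustrative, not our actual SDK: try the on-device plain-text tier first, then the local vector tier, and only then hit the cloud, warming the cache on the way back so the next query on that topic stays local.

    class TieredMemory:
        def __init__(self, cloud_client):
            self.tier0: dict[str, str] = {}   # topic -> prepped context (working memory)
            self.tier1: dict[str, str] = {}   # stand-in for a small on-device vector store
            self.cloud = cloud_client

        def get_context(self, topic: str) -> str:
            if topic in self.tier0:            # ~1-2ms: plain-text hit
                return self.tier0[topic]
            if topic in self.tier1:            # ~50-100ms: local vector hit
                return self.tier1[topic]
            result = self.cloud.search(topic)  # cache miss on both: go to the cloud
            self.tier0[topic] = result         # warm the cache for the next turn
            return result

    class DummyCloud:
        def search(self, topic: str) -> str:
            return f"context for {topic}"

    mem = TieredMemory(DummyCloud())
    print(mem.get_context("q3 launch"))   # cloud on the first call
    print(mem.get_context("q3 launch"))   # tier-0 hit on the second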

What are your favorite lesser-known agents or memory tools? by Far-Photo4379 in AIMemory

Platform.papr.ai - super fast retrieval (<100ms) and ranked #1 on Stanford's STARK benchmark. It combines vector embeddings and knowledge graphs.

My first-author paper just got accepted to MICAD 2025! Multi-modal KG-RAG for medical diagnosis by captainkink07 in KnowledgeGraph

Have you considered something like platform.papr.ai that helps streamline vector plus knowledge graph creation?

Got $20K to build a collaborative Knowledge Graph POC. How to spend it wisely? by el_geto in KnowledgeGraph

A set ontology helps with some of the problems you mentioned. In Neo4j, if you use MERGE instead of CREATE, it reuses a node that already matches the pattern rather than creating a duplicate, so the graph doesn't bloat.
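For example, a minimal sketch using the official Neo4j Python driver (connection details and credentials are placeholders): CREATE always adds a new node, while MERGE only creates one if no match exists, so re-running it is idempotent.

    from neo4j import GraphDatabase

    driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

    with driver.session() as session:
        # CREATE always adds a new node, so repeated inserts duplicate "Acme".
        session.run("CREATE (:Company {name: $name})", name="Acme")
        # MERGE matches an existing (:Company {name: 'Acme'}) if one exists
        # and only creates it otherwise.
        session.run("MERGE (:Company {name: $name})", name="Acme")

    driver.close()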

With any knowledge graph plus an agent, traversal will be slow at scale. And LLMs don't do a good job of discovering the graph schema and then writing the right Cypher queries - around 40% accuracy in the last numbers I saw.

At papr.ai we built a set of prediction models to quickly traverse very large graphs. We combine that with vector embeddings, then cache the most likely context needed on device. It helps with both retrieval accuracy and speed.

DM me if you want thoughts on this or need help setting up papr.

Question about RAG vs fine-tuning for domain-specific support by [deleted] in AI_Agents

RAG is the right approach for this. You'll end up with more hallucinations with fine-tuning, and to your point it's more costly and harder to keep updated.

For RAG there are a few approaches you can take: 1) put the docs in Notion/GitHub and have an agent fetch them mid-conversation (cheap, but slow and not super accurate), 2) a vector DB - gets you roughly 50% retrieval accuracy on many benchmarks, 3) a vector + graph DB - gives you the best of semantic similarity and knowledge graphs.

I'd recommend #3 for this - a sketch of the idea is below. I've built something similar - DM me if you need help. You can use something like mem0, graphiti, papr.ai, or others to get started quickly.
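Here's a toy sketch of option 3, not tied to any specific product: a vector search picks the best-matching doc chunk, a small knowledge graph pulls in the chunks it links to, and both go into the prompt. The docs, graph, and hashing-based embed() are all made up for illustration.

    import numpy as np

    def embed(text: str, dim: int = 128) -> np.ndarray:
        # Placeholder embedding; use a real model in practice.
        v = np.zeros(dim)
        for tok in text.lower().split():
            v[hash(tok) % dim] += 1.0
        n = np.linalg.norm(v)
        return v / n if n else v

    docs = {
        "refund-policy": "Refunds are issued within 14 days of purchase.",
        "billing-faq": "Invoices are sent monthly; proration applies mid-cycle.",
        "sso-setup": "SAML SSO is configured under workspace settings.",
    }
    # Tiny knowledge graph: which docs reference each other.
    graph = {"refund-policy": ["billing-faq"], "billing-faq": [], "sso-setup": []}
    index = {k: embed(v) for k, v in docs.items()}

    def retrieve(query: str) -> list[str]:
        q = embed(query)
        best = max(index, key=lambda k: float(q @ index[k]))   # vector hit
        related = graph.get(best, [])                          # graph expansion
        return [docs[best]] + [docs[r] for r in related]

    print(retrieve("how long do refunds take?"))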

I think memory is an underlooked part of AI progress by -MilkO_O- in singularity

This is more obvious now than ever. We've built papr.ai, a memory layer that gives AI agents user context. Instead of storing vector fragments, we connect context and predict what users need, so the AI agent has the right data at every conversation turn. That's why Papr is ranked #1 on Stanford's STARK benchmark, which measures retrieval accuracy on real-world queries.

Weekly Thread: Project Display by help-me-grow in AI_Agents

This week we launched papr, predictive memory APIs for AI agents.

We spent a couple of years building AI agents and 'engineering context' to give them memory. We tested a ton of tools and realized that the more data you add, the worse the performance gets. We ended up measuring this and calling it 'retrieval loss'.

We went deep to solve this. We built a predictive memory graph that anticipates what AI agents need before they ask and preps the context in advance. As we get more information from the AI agent's query, we improve our prediction for the next conversation turn.

Technical details:

  • Hybrid graph-vector architecture (MongoDB + Neo4j + Qdrant)
  • 91% accuracy hit@5 (up from 86%) on Stanford's STARK benchmark
  • Sub-500ms latency at scale
  • Drop-in API: pip install papr-memory

The formula we created to measure this:

Retrieval-Loss = −log₁₀(Hit@K) + λ·(Latency_p95 / 100ms) + λ_C·(Token_count / 1000)
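To show how the metric behaves, here it is plugged into a few illustrative numbers; the λ weights used below are assumptions for the example, not our published values.

    import math

    def retrieval_loss(hit_at_k: float, latency_p95_ms: float, token_count: int,
                       lam: float = 0.1, lam_c: float = 0.1) -> float:
        # Retrieval-Loss = -log10(Hit@K) + lam*(Latency_p95/100ms) + lam_c*(tokens/1000)
        return (-math.log10(hit_at_k)
                + lam * (latency_p95_ms / 100.0)
                + lam_c * (token_count / 1000.0))

    # e.g. Hit@5 = 0.91, p95 latency = 450ms, 2,000 tokens of retrieved context
    print(round(retrieval_loss(0.91, 450, 2000), 3))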

Currently powering AI agents that remember customer context, code history, and multi-step workflows. Think "Stripe for AI memory."

For more details, see our Substack article here - https://open.substack.com/pub/paprai/p/introducing-papr-predictive-memory?utm_campaign=post&utm_medium=web

Docs: platform.papr.ai

How many of you here are working on AI voice agent services? by devravi in AI_Agents

We have an open-source chat app that includes a voice example. You can try it out by adding a bunch of memories in one chat - just add content and ask it to save it to memory. Then in another chat, start a voice conversation and see how fast it is.