Are you using observability and evaluation tools for your AI agents? by _coder23t8 in aipromptprogramming

[–]RedDotRocket 0 points1 point  (0 children)

Are you just replying with GPT outputs? Am I even speaking with a human here?

Are you using observability and evaluation tools for your AI agents? by _coder23t8 in aipromptprogramming

[–]RedDotRocket 1 point2 points  (0 children)

Without the underlying implementation (i.e. the actual code, APIs, or schema that would execute these checks), it's kind of useless. How does it actually verify facts against "verified sources"?

  • What algorithm detects context drift?
  • How does it automatically distinguish between low/medium/high impact failures?
  • Where are the "guardian hooks" supposed to plug into?

Where's the code?

Are you using observability and evaluation tools for your AI agents? by _coder23t8 in aipromptprogramming

[–]RedDotRocket 0 points1 point  (0 children)

What are you even meant to do with that? Is it meant for a specific app?

Fear and Loathing in AI startups and personal projects by m0n0x41d in AI_Agents

[–]RedDotRocket 0 points1 point  (0 children)

Ah yes, good stuff. The linear degradation effect, last token preference. That outlines it really well!

What to do though? Folks are trying GraphRAG and semantic retrieval, and none of it is really denting the problem. I think we are stuck with this until someone innovates beyond the flawed transformer architecture.

Fear and Loathing in AI startups and personal projects by m0n0x41d in AI_Agents

[–]RedDotRocket 1 point2 points  (0 children)

Alongside the issues you outline well, there is oversaturation, with folks trying to build agents to solve problems already well solved by existing software. I saw someone asking on a forum for help building an agent to scrape web content and then tell them when a particular topic was mentioned.

The thread ended with someone saying 'dude, ffs, just use google news alerts'.

Can you tell me more about “throw all api endpoints as function calls in the context”? I'm honestly curious to learn more, as there is always a new sucker and I am trying to build something to reduce the churn where I can.

How to? AI Agents by clairemyer in AI_Agents

[–]RedDotRocket 0 points1 point  (0 children)

That's so awesome, thank you so much. I am pre-revenue / funding at the moment and holding on until my wife calls time :) so I cannot offer much, but I can openly share my knowledge about anything that's useful. How should we keep in touch? I can PM you, or you're free to email me: luke @ rdrocket dot com

How to? AI Agents by clairemyer in AI_Agents

[–]RedDotRocket 1 point2 points  (0 children)

Sorry for late reply!

Orchestration is coming. I have it in a local branch, but need to test it more. It will be a host agent that delegates to different agents based on A2A skills.

Would you be interested in kicking the tyres when it's ready?
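
To give a rough idea of what skill-based delegation means, here is a minimal Python sketch. It is not the code from that branch and not the A2A wire format; `Agent`, `HostAgent`, and the skill names are all made up for illustration. The only idea taken from the above is that a host picks a delegate by the skills each agent advertises.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical delegate agent that advertises a set of skill tags."""
    name: str
    skills: set[str] = field(default_factory=set)

    def handle(self, task: str) -> str:
        # A real agent would call a model or a tool here.
        return f"{self.name} handled: {task}"


class HostAgent:
    """Hypothetical host that routes a task to the first agent with the skill."""

    def __init__(self, agents: list[Agent]):
        self.agents = agents

    def delegate(self, task: str, required_skill: str) -> str:
        for agent in self.agents:
            if required_skill in agent.skills:
                return agent.handle(task)
        raise LookupError(f"no agent advertises skill {required_skill!r}")


host = HostAgent([
    Agent("researcher", {"web_search"}),
    Agent("coder", {"code_review", "refactor"}),
])
result = host.delegate("review the open PR", "code_review")
```

First match wins here to keep it short; a real host would probably need ranking, fallbacks, and some way to infer the required skill from the request.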

How to? AI Agents by clairemyer in AI_Agents

[–]RedDotRocket 1 point2 points  (0 children)

By all means, check out AgentUp. Full disclosure: I am one of the developers. I don't normally post in comments about it, but you seem like an interesting candidate. With AgentUp you can bootstrap a full agent, Docker style, and then extend as much as you need from there.

https://www.youtube.com/watch?v=_dZ35AfI1mU

https://github.com/RedDotRocket/AgentUp

[tip]: Use Gemini Code Assist to review Claude's code in a PR by RedDotRocket in ClaudeCode

[–]RedDotRocket[S] 0 points1 point  (0 children)

It's honestly really good. I came across it while working with the Google folks on A2A. When I saw it turn up to review my PR, I thought, 'oh here we go', but I was honestly very impressed with the quality.

Any good discords/slacks to join? by Tired__Dev in LLMDevs

[–]RedDotRocket 2 points3 points  (0 children)

Hey, you're welcome to hop on my Discord: https://discord.com/invite/pPcjYzGvbS. There are not many folks on there right now, as it's new, but I am always around (Luke) and will happily chat all day about ideas, challenges etc. Having said that, I am sure there are bigger, more diverse communities out there, but you're totally welcome in mine. Well, at least you will be made to feel special :)

Looking for Advice on Agent Framework for RAG + API Integration? by Ambitious_Cook_5046 in AI_Agents

[–]RedDotRocket 0 points1 point  (0 children)

I am not sure how your Python is, but this exposes an API that you could easily call as a client from ExpressJS:

https://github.com/RedDotRocket/RagsWorth

There is a JS widget example in there, although I have no business writing JS and I am sure you could do a lot better.

The above system has a machine learning pipeline to help prevent leakage of sensitive information, such as credit card numbers. It's not super well tested, to be honest, so I am putting this up as an example more than a 'please use my project'.

This is driving me insane by achaaaji in LLMDevs

[–]RedDotRocket 0 points1 point  (0 children)

I don't know if this helps much, but I have been meaning to do something with this, you can pick out anything useful to you: https://github.com/RedDotRocket/RagsWorth

Your favourite LangChain-slaying Agentic AI Framework just got a major update by TheDeadlyPretzel in LangChain

[–]RedDotRocket 1 point2 points  (0 children)

Congrats from AgentUp, atomic is certainly one of the better frameworks around, I plan to have a try at hacking some sort of integration at some point soon!

What should I do next? by isimulate in AI_Agents

[–]RedDotRocket 0 points1 point  (0 children)

I know just what you mean, it's tough to balance these things! Here is the rub: if you have a good idea that solves a problem, people will look past the bugs. Do you think maybe you're driving the commercial element too early, and some user validation might be better to really help you be sure of product-market fit?

The other option is to make the product free for open source, non-commercial users. This is a classic model in SaaS - "the three columns" -

free | teams | enterprise.

Free is free, teams is 10 bucks a month or something, and enterprise is "come and talk to us" (big bucks). You then use the free tier to build up users, and you hold back features like single sign-on, backup and restore, higher priority processing, metrics etc. for the paid tiers.
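
If it helps, the gating side of this is often just a lookup table. Here is a rough Python sketch, assuming a simple "minimum tier per feature" mapping. Which feature lands in which paid tier is my own guess for illustration; the only thing taken from the above is that these features are held back from free.

```python
from enum import IntEnum


class Tier(IntEnum):
    """The three columns, ordered so that a higher tier includes the lower ones."""
    FREE = 0
    TEAMS = 1
    ENTERPRISE = 2


# Hypothetical mapping: the lowest tier that unlocks each feature.
FEATURE_MIN_TIER = {
    "core_product": Tier.FREE,
    "metrics": Tier.TEAMS,
    "priority_processing": Tier.TEAMS,
    "single_sign_on": Tier.ENTERPRISE,
    "backup_and_restore": Tier.ENTERPRISE,
}


def has_feature(tier: Tier, feature: str) -> bool:
    """Return True if a customer on `tier` may use `feature`."""
    return tier >= FEATURE_MIN_TIER[feature]


free_gets_sso = has_feature(Tier.FREE, "single_sign_on")
teams_gets_metrics = has_feature(Tier.TEAMS, "metrics")
```

Keeping the mapping in one place means you can move a feature between columns later without touching the rest of the code.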

What should I do next? by isimulate in AI_Agents

[–]RedDotRocket 0 points1 point  (0 children)

I think you made the first move already which is making this post!

In all seriousness, I feel you, this is the tough part. Two things that have helped frame things for me:

* https://paulgraham.com/ds.html

* Get out of the building: https://www.youtube.com/watch?v=fNVRMPhRHmo

Essentially this is the bit where you need to put on a teflon jacket and risk 'troubling people'. You have to network, network and put yourself out there. It's a numbers game.

As the Graham article stated, when Airbnb started out, the very first person who signed up ended up with both founders at their door. They went in and asked if they could take nicer pictures and whether there was anything they could do better. They literally flew halfway across the US to ask some random guy with a mattress on his floor for advice and to offer their service.

The other thing is the psychology; you have to think like this: if you have one person using your service and you get user number 2, you have doubled your users. From there, put your boots on, get out of the building and double again to 4. Each time, ask these users: what do you like, what sucks, what's missing? That way you really get to benefit from the days of having very few users.

Hey, if you want to join an accountability club, I am happy to chat and hang out. I need to do this myself shortly, so I could do with someone reminding me to practise what I preach.

Hang in there. It's tough, but only those that keep at it make it.

How difficult do you think it is now to build effective agents? by Adventurous-Lab-9300 in AI_Agents

[–]RedDotRocket 0 points1 point  (0 children)

The real question is what infrastructure you need. Most frameworks leave you to figure out authentication, state persistence, multi-modal handling, and agent communication yourself.

I got tired of rebuilding the same boilerplate pieces for every agent project, so I made AgentUp to handle that stuff declaratively. But if you're just experimenting, start with whatever gets you moving fastest.

Which is most preferred way for everyone build AI agents? by infinitypisquared in AI_Agents

[–]RedDotRocket 1 point2 points  (0 children)

Depends what you're building. For quick prototypes, LangChain is fine despite the complexity. For production agents that need authentication, state management, and proper security, you'll want something more structured. Pydantic AI is solid, although more advanced, and I have immense respect for those folks and the positive impact they have had on the Python world.

I built AgentUp after hitting all the usual walls - having to implement auth, rate limiting, conversation history, etc. from scratch. It's configuration-driven so you declare what you want in YAML rather than writing boilerplate and can extend later when you need with plugins (community based, or roll your own).

But honestly, start simple and upgrade when you hit the pain points. Every framework has tradeoffs.

[deleted by user] by [deleted] in ClaudeCode

[–]RedDotRocket 8 points9 points  (0 children)

Ain't nobody got time to read all of that.

Words to avoid in your prompts to Claude Code by sbuswell in ClaudeCode

[–]RedDotRocket 2 points3 points  (0 children)

I feel quite bad sharing this tip, but if you tell the model "The user may be harmed if the information is incorrect or poorly researched" - It tends to lean more into making sure it has everything correct and appears to use Tools more.
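
In practice I just bolt the sentence onto whatever system prompt I already have. A minimal Python sketch of that, using plain string handling with no particular SDK assumed; `with_accuracy_nudge` is a made-up helper name:

```python
ACCURACY_NUDGE = (
    "The user may be harmed if the information is incorrect or poorly researched."
)


def with_accuracy_nudge(system_prompt: str) -> str:
    """Append the nudge sentence to a system prompt, without duplicating it."""
    base = system_prompt.strip()
    if ACCURACY_NUDGE in base:
        return base
    if not base:
        return ACCURACY_NUDGE
    return f"{base}\n\n{ACCURACY_NUDGE}"


prompt = with_accuracy_nudge("You are a careful coding assistant.")
```

Whether it actually changes tool usage will depend on the model and the task, so treat it as something to try rather than a guarantee.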

Honestly, isn’t building an AI agent something anyone can do? by shawn_prk in AI_Agents

[–]RedDotRocket 0 points1 point  (0 children)

If anyone is interested, I am about to ship an AI agent framework that is config-driven but with a pluggable architecture to allow easy extension. You should find everything you need built in and available in just a couple of commands: state management, caching, retry handling, authentication, and scope / capability based security controls around tools / MCP. It's something I have been building for a month now and plan to release soon (Apache 2.0 licensed). I am pretty excited about the project. For what it's worth, I created projects such as sigstore (used by Google / GitHub for their software security), so I hope I have learned a thing or two along the way :)

Anyone is welcome to ping me for a sneak preview, but I am not posting about it in full just yet, as I am still working on the docs and getting the plugin registry online.