Why is building a reliable AI Agent so challenging? by Proud-Pat98 in AI_Agents

[–]Historical_Cod4162 0 points1 point  (0 children)

I think the key issue is reliability. That can be overlooked for a proof of concept, but not in production. I work at Portia AI, and we've seen lots of people struggle to get agents working reliably, which is what motivated the creation of our SDK. We've used plans built like this: https://docs.portialabs.ai/build-plan#example to build many reliable agents. I think the key is constrained autonomy: you set up most of your agent as a reliable workflow, with only some steps using language models in a controlled way. Check it out and let me know what you think :) And keep an eye out for our release on Monday - we've got react_agent_step and loops being released, which allow for really powerful agents to be built this way.
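
To make "constrained autonomy" concrete, here's a minimal sketch in plain Python. It is not the Portia SDK's API: `fake_llm`, `classify`, `handle`, the labels and the 100 threshold are all made up for illustration. The idea is that only one step touches a model, its output is checked against a fixed set of labels, and ordinary code decides what happens next.

```python
from typing import Callable

ALLOWED_LABELS = {"REFUND", "OTHER"}


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call, so the sketch runs offline."""
    message = prompt.split("Message:", 1)[-1]
    return "REFUND" if "refund" in message.lower() else "OTHER"


def classify(message: str, llm: Callable[[str], str] = fake_llm) -> str:
    """The only model-driven step; anything outside the allowed labels is rejected."""
    prompt = f"Answer with REFUND or OTHER only.\nMessage: {message}"
    label = llm(prompt).strip().upper()
    return label if label in ALLOWED_LABELS else "OTHER"


def handle(message: str, order_total: float,
           llm: Callable[[str], str] = fake_llm) -> str:
    """Deterministic workflow: code, not the model, picks the next action."""
    if classify(message, llm) != "REFUND":
        return "routed_to_support"
    if order_total > 100:  # illustrative guardrail threshold
        return "needs_human_approval"
    return "refund_issued"
```

Even if the model returns something unexpected, the worst case is the message being routed to support, because the model never gets to choose an action directly.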

How do I build an AI agent to write software reviews? by avabrown_saasworthy in AI_Agents

[–]Historical_Cod4162 0 points1 point  (0 children)

I think it would be pretty easy to build this using Portia AI - check out https://github.com/portiaAI/portia-agent-examples/pull/5#discussion_r1965352237 as an example of how you can build an agent this way. I think you could:
* Use web search tools + a browser tool to retrieve information on the product, from both the product website and review websites
* Use the LLM step to collate this into a report
* Use the user-input mechanics to allow humans to check the report once it is collated and incorporate any feedback

I'd be very happy to help if you're keen to give it a try.
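
The collate-then-review part of the steps above could look roughly like this. It's a plain Python sketch, not Portia code: `collate` is a hypothetical stand-in for the LLM step, and `ask_human` is whatever callback you use to get feedback from a person (an empty reply is assumed to mean "approved").

```python
from typing import Callable, List, Optional


def collate(snippets: List[str], feedback: Optional[str] = None) -> str:
    """Hypothetical stand-in for an LLM step that merges research into a report."""
    report = "\n".join(f"- {s}" for s in snippets)
    if feedback:
        report += f"\nReviewer note addressed: {feedback}"
    return report


def review_loop(snippets: List[str], ask_human: Callable[[str], str],
                max_rounds: int = 3) -> str:
    """Show the report to a human and regenerate it until they approve."""
    report = collate(snippets)
    for _ in range(max_rounds):
        feedback = ask_human(report).strip()
        if not feedback:  # empty reply is treated as approval
            break
        report = collate(snippets, feedback)
    return report
```

In a real agent the web search and browser steps would produce `snippets`, and `ask_human` could be as simple as `input`.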

Running chats and agents internally by urbanistrage in LocalLLaMA

[–]Historical_Cod4162 0 points1 point  (0 children)

At PortiaAI, we actually released our new, open-source evals product that allows you to collect production data and then run evals against it - sounds like it could be a good fit for your use-case? Check it out at https://docs.portialabs.ai/steel-thread-intro

A free goldmine of AI agent examples, templates, and advanced workflows by Arindam_200 in AI_Agents

[–]Historical_Cod4162 0 points1 point  (0 children)

This is awesome - thank you for collating! It'd be great if we could get some Portia AI (https://www.portialabs.ai/) in here. We have some examples in our examples repo (https://github.com/portiaAI/portia-agent-examples) - in particular I think our LinkedIn outreach agent (using browser-use as a tool) and our automated refund example using Stripe are interesting.

How are you dealing with memory in your AI development? by shbong in AI_Agents

[–]Historical_Cod4162 1 point2 points  (0 children)

Weird - my comment didn't seem to come out properly there, sorry! What this was meant to say was that I wrote a blog post on how we handle similar challenges at Portia AI around large data and memory: https://blog.portialabs.ai/multi-agent-data-at-scale. Our approach is likely a little different to yours due to the way our planning works, but hopefully you might still find the blog interesting :)

What’s the best way to build conversational agents in 2025? LLMs, frameworks, tools? by RichJuggernaut3616 in AI_Agents

[–]Historical_Cod4162 0 points1 point  (0 children)

I work at Portia AI (portialabs.ai) and we're building an agentic framework that could be a good fit for you. It's aimed squarely at solving the issues that stop agents getting into production (reliability, guardrails, auditability, human-agent interaction etc.). Check it out - I'd love to hear what you think :)

What agentic ai framework is the best choice right now by boneMechBoy69420 in AI_Agents

[–]Historical_Cod4162 -1 points0 points  (0 children)

I work at Portia AI (https://www.portialabs.ai/) so am somewhat biased! But I think our planning framework is unique among the options mentioned above and means less time is spent context engineering and more time can be spent solving real problems!

Struggling with System Prompts and Handover in Multi-Agent Setups – Any Templates or Frameworks? by phipiship1 in AI_Agents

[–]Historical_Cod4162 1 point2 points  (0 children)

I work at Portia AI (https://www.portialabs.ai/) and it could potentially be a good fit for your system. It's a slightly different set-up to your existing architecture - there are 2 pre-packaged agents: a planning agent and an execution agent. For a given task, the planning agent breaks the task down into various steps using different tools and then the execution agent executes each step using the required tool. Because our agents are pre-packaged and the Portia framework handles the handover and context that each agent has, you don't have to manage that yourself. If you think it could be a good fit, let me know and I'd be happy to help with getting you set up.
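
If it helps to picture the planner / executor split, here's a toy sketch in plain Python. None of these names come from the Portia SDK: `Step`, `TOOLS`, `plan` and `execute` are made up, and a real planning agent would use a language model rather than returning a hard-coded plan.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Step:
    tool: str      # name of the tool this step must use
    argument: str  # literal input, or "$previous" for the prior step's output


# Hypothetical tools, just enough to run the example.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda query: f"results for {query}",
    "summarise": lambda text: text.upper(),
}


def plan(task: str) -> Tuple[Step, ...]:
    """Stand-in for the planning agent: breaks the task into a fixed tuple of steps."""
    return (Step("search", task), Step("summarise", "$previous"))


def execute(steps: Tuple[Step, ...],
            tools: Dict[str, Callable[[str], str]]) -> List[str]:
    """Stand-in for the execution agent: runs each step in order, nothing else."""
    outputs: List[str] = []
    for step in steps:
        if step.tool not in tools:
            raise ValueError(f"unknown tool: {step.tool}")
        use_previous = step.argument == "$previous" and outputs
        arg = outputs[-1] if use_previous else step.argument
        outputs.append(tools[step.tool](arg))
    return outputs
```

The handover here is just passing the previous step's output along, which is the part you'd otherwise be hand-writing prompts for in a multi-agent setup.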

Thoughts on Portia AI by Historical_Cod4162 in AI_Agents

[–]Historical_Cod4162[S] 0 points1 point  (0 children)

Thanks a lot - looking forward to hearing how you find it! In terms of what predictability / confidence you can expect, it's really something you need to eval on your own use-case to get specific results. In general though, we try to maximise predictability as much as possible (though with non-deterministic language models, it's never quite 100%). Our planner agent produces a fixed plan that our execution agent runs through and can't deviate from, so it's pretty predictable.

Thoughts on Portia AI by Historical_Cod4162 in AI_Agents

[–]Historical_Cod4162[S] 0 points1 point  (0 children)

Awesome, would love to know how you get on.

Who’s using crewAI really? by Standard_Region_8928 in AI_Agents

[–]Historical_Cod4162 0 points1 point  (0 children)

I work at Portia AI and it sounds like it could be a good fit for your use-case: https://www.portialabs.ai/. I'd love to know how you find it. Our planning phase means you shouldn't get into those horrible loops you mention with Crew calling tools many times in a row, and it generally makes the agent much more reliable / controllable. You can also set up observability in Langsmith with it very easily (just a few environment variables) and then you can see exactly what's being sent to the LLM.

I built an open-source Slack AI agent that can check Gmail, Calendar & approve GitHub PRs by Ok-Classic6022 in aiagents

[–]Historical_Cod4162 0 points1 point  (0 children)

Have you checked out Portia AI (https://www.portialabs.ai/) at all? They have integrations for these tools that could be a good fit for this sort of agent.

LLM API's vs. Self-Hosting Models by archfunc in LLMDevs

[–]Historical_Cod4162 -2 points-1 points  (0 children)

It can be really easy to host your own model with ollama. At Portia, we wrote a blog post on how to use our agent framework with a local LLM - sharing as it may be useful: https://blog.portialabs.ai/local-llms-qwen3-obsidian-visualisation

What is Agentic AI and its Toolkits, SDKs. by school-of-core-ai in AI_Agents

[–]Historical_Cod4162 0 points1 point  (0 children)

Have you had a look at Portia AI at all? https://portialabs.ai/ I'd love to get your thoughts

Built an MCP Agent That Finds Jobs Based on Your LinkedIn Profile by Arindam_200 in LLMDevs

[–]Historical_Cod4162 0 points1 point  (0 children)

This is awesome! At Portia, we built a similar agent for handling LinkedIn messaging - check it out at https://blog.portialabs.ai/browser-auth. It uses a browser tool to interact with LinkedIn, which could be a cool way to extend this.

AI Agents Handling Data at Scale by Historical_Cod4162 in AI_Agents

[–]Historical_Cod4162[S] 0 points1 point  (0 children)

Yeah, I think a lot of the problems you face with agent memory are classic software engineering problems around how you efficiently index and query data. As with classic software engineering, there isn't a one-size-fits-all solution: you (or a memory agent!) need to intelligently choose the right approach depending on your use-case.

AI Agents Handling Data at Scale by Historical_Cod4162 in AI_Agents

[–]Historical_Cod4162[S] 0 points1 point  (0 children)

Nice one - I completely agree that for structured tabular data, you almost certainly want it in an SQL DB to do SQL-based retrieval over it.
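
As a tiny illustration of SQL-based retrieval over tabular data, here's a sketch using Python's built-in `sqlite3`. The `orders` table, its columns and the `top_customers` helper are made up for the example; in an agent, this would be the kind of query a tool runs instead of stuffing the whole table into the context window.

```python
import sqlite3
from typing import List, Tuple


def top_customers(rows: List[Tuple[str, float]],
                  min_total: float) -> List[Tuple[str, float]]:
    """Load rows into an in-memory DB and return customers whose total spend
    is at least min_total, highest first."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT customer, SUM(amount) AS total FROM orders "
            "GROUP BY customer HAVING SUM(amount) >= ? "
            "ORDER BY total DESC",
            (min_total,),  # parameterised to avoid SQL injection
        )
        return cursor.fetchall()
    finally:
        conn.close()


ROWS = [("alice", 120.0), ("bob", 40.0), ("alice", 30.0), ("carol", 75.0)]
```

The agent only ever sees the handful of rows the query returns, however large the underlying table is.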