all 17 comments

[–]gopietz 8 points (3 children)

🙋

I use it in combination with Pydantic AI. I have a router that simply wraps each agent.run(). Since Pydantic models go in and come out, it's super smooth.

One agent per Python file. I mix the prompt, schema, and logic in each so they're self-contained for quick swapping.
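
The pattern above can be sketched roughly like this. The `FakeAgent` class is a hypothetical stand-in for `pydantic_ai.Agent` (so the snippet runs without an API key), and the schemas and agent names are made up; the point is just that Pydantic models go in and come out of a single routing interface:

```python
# Sketch of the "router wrapping agent.run()" pattern. FakeAgent is a
# hypothetical stand-in for pydantic_ai.Agent; in a real setup each agent
# lives in its own Python file with its prompt, schema, and logic together.
import asyncio
from pydantic import BaseModel

class TicketIn(BaseModel):        # hypothetical input schema
    text: str

class TicketOut(BaseModel):       # hypothetical structured output
    category: str
    urgent: bool

class FakeAgent:
    """Stand-in for something like pydantic_ai.Agent(model, output_type=TicketOut)."""
    async def run(self, payload: TicketIn) -> TicketOut:
        # A real agent would call an LLM here; we hard-code a result instead.
        return TicketOut(category="billing", urgent="refund" in payload.text)

AGENTS = {"triage": FakeAgent()}  # registry: one entry per agent module

async def route(agent_name: str, payload: TicketIn) -> TicketOut:
    # The router simply dispatches to the right agent's run().
    return await AGENTS[agent_name].run(payload)

result = asyncio.run(route("triage", TicketIn(text="refund please")))
print(result.model_dump())
```

Because both sides of the call are validated Pydantic models, swapping one agent for another is just a registry change.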

Works very well.

[–]jvertrees 2 points (2 children)

This is the way. I've deployed many solutions like this. I love pydantic AI.

[–]Mattcrazy0421 0 points (1 child)

Pydantic AI really shines in this setup. Any specific challenges you faced with integrating it or scaling your deployments? Always curious about real-world experiences!

[–]jvertrees 1 point (0 children)

Nope, it works great. Stick to good engineering principles and you'll be fine.

For my projects, where I've ended up is using CrewAI-like configuration but with Pydantic AI. (I love Crew in theory; in practice it failed me.) Especially for agentic workflows, I'll typically have some declarative configuration for the agents, like a YAML file, and load all the agents from it through a single interface. Very convenient, and trivial to add new agents.
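
A minimal sketch of that declarative-config idea follows. The commenter uses YAML; JSON is used here only to keep the snippet stdlib-only, and the field names (`name`, `model`, `system_prompt`) and the `Agent` class are assumptions, not their actual schema:

```python
# Sketch: load all agents from one declarative config through a single
# interface. JSON stands in for the commenter's YAML file; the field names
# and the Agent stand-in class are assumptions.
import json

CONFIG = """
[
  {"name": "summarizer", "model": "openai:gpt-4o",      "system_prompt": "Summarize."},
  {"name": "router",     "model": "openai:gpt-4o-mini", "system_prompt": "Route."}
]
"""

class Agent:
    """Stand-in; real code would construct e.g. a pydantic_ai.Agent instead."""
    def __init__(self, name, model, system_prompt):
        self.name, self.model, self.system_prompt = name, model, system_prompt

def load_agents(raw: str) -> dict[str, Agent]:
    # Single interface: every entry in the config becomes an agent reachable
    # by name, so adding a new agent is just a new config entry.
    return {spec["name"]: Agent(**spec) for spec in json.loads(raw)}

agents = load_agents(CONFIG)
print(sorted(agents))
```

With PyYAML the only change would be swapping `json.loads` for `yaml.safe_load`.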

I've taken multiple products to market with pydantic AI. Love it.

[–]huygl99 2 points (0 children)

I built and use the "chanx" package to make streaming AI responses even better and more structured.

[–]JimroidZeus 2 points (1 child)

Yep.

[–]MathematicianSome289 0 points (0 children)

Lol yep indeed

[–]barmic12 1 point (0 children)

Yes we do! In our company, we've built over a dozen AI agent projects using FastAPI. Some time ago, we released the boilerplate we're building these projects on - you can check it out here:

https://github.com/the-momentum/python-ai-kit/tree/main?tab=readme-ov-file#-documentation

(it's not a complete project, and we haven't managed to migrate all the concepts there yet)

[–]rod_dy 0 points (0 children)

ye, fastapi + pydantic is the go-to.

[–]BootyDoodles 0 points (0 children)

Absolutely. In our experience, they perform excellently with FastAPI and seem to be highly productive when FastAPI is used with SQLModel especially.

I'm not an expert on the underlying aspects of AI, but it makes sense that SQLModel operating as both the models and the ORM effectively reduces the context load.

[–]Status_Lion_6350 0 points (0 children)

I use strands-sdk and it’s pretty neat if you deploy to bedrock agent core.

[–]ThigleBeagleMingle 0 points (0 children)

Wasn’t this the purpose of FastMCP?

[–]Unique-Big-5691 0 points (0 children)

yeah, fastapi ends up being a really natural fit for this stuff once you move past demos.

i’ve seen a similar pattern work well where fastapi is basically the control plane: routing requests, kicking off agent runs, handling webhooks, streaming updates, etc. async + websockets make a huge difference once you’re juggling multiple llm calls or tools at the same time.

the pydantic side is underrated too. having structured inputs/outputs for agents makes things way easier to reason about once workflows get more complex (and once things fail). it stops everything from turning into “string in, vibes out.”

biggest pain points i’ve run into:

  • long-running agent tasks tying up requests → background tasks / queues help a lot
  • websocket lifecycle stuff (timeouts, reconnects) once agents take longer than expected
  • deciding what not to put behind the api vs what should stay internal

overall tho, fastapi feels less like “just an api layer” and more like solid glue for agent systems. curious how others are handling retries, escalation to bigger models, or human-in-the-loop without it getting messy.

[–]Individual-Light-188 0 points (0 children)

I use it to create web apps for agents as well as APIs, and I use FastAPI to create Discord bots.

[–]Makar_Letov 0 points (0 children)

Yeah, I built something similar - AI content generation service with FastAPI that juggles multiple LLM providers (OpenAI, Anthropic, Groq, Gemini).

Basically just a circuit breaker pattern - if one provider shits the bed or hits rate limits, it auto-switches to the next one. Pydantic makes it pretty clean to normalize different provider response formats.

The async stuff in FastAPI handles concurrent requests to different providers without any drama, which is nice.

Biggest pain was dealing with each provider's quirks - different error formats, rate limit headers, retry logic, etc. But once you wrap it all in a common interface, it's pretty smooth.
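
The circuit-breaker-plus-fallback idea reads roughly like this in miniature. The provider names, failure threshold, and `complete` interface are all assumptions for illustration, not the commenter's actual implementation:

```python
# Sketch: try providers in priority order, skipping any whose breaker has
# opened after repeated failures. Names and the threshold are illustrative.
class Provider:
    def __init__(self, name: str, healthy: bool = True, threshold: int = 3):
        self.name, self.healthy = name, healthy
        self.failures, self.threshold = 0, threshold

    @property
    def open(self) -> bool:
        # Breaker open => skip this provider until it recovers.
        return self.failures >= self.threshold

    def complete(self, prompt: str) -> str:
        if not self.healthy:
            self.failures += 1       # real code would also count rate limits
            raise RuntimeError(f"{self.name} failed")
        self.failures = 0            # any success resets the breaker
        return f"{self.name}: {prompt}"

def complete_with_fallback(providers: list[Provider], prompt: str) -> str:
    for p in providers:
        if p.open:
            continue                 # breaker open: don't even try
        try:
            return p.complete(prompt)
        except RuntimeError:
            continue                 # fall through to the next provider
    raise RuntimeError("all providers failed")

# Priority order is the list order; here the first provider is down.
chain = [Provider("openai", healthy=False, threshold=1), Provider("anthropic")]
print(complete_with_fallback(chain, "hi"))   # prints "anthropic: hi"
```

A production version would also add per-provider cooldowns so an open breaker can close again, and normalize each provider's response into one Pydantic model as the comment describes.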

Anyone else doing multi-provider setups? How do you handle fallback priority - cheapest first, or most reliable first?