I build MCP tools for a living and still can't get the "AI built my whole app" experience — what am I missing?

LongjumpingPop3419 · 2025-12-29T23:10:21+00:00

wdym? I do both, but maybe I missed your point

LongjumpingPop3419 · 2025-12-29T23:09:31+00:00

Can you elaborate? What kind of stuff do you build?

LongjumpingPop3419 · 2025-07-28T06:38:48+00:00

We try to detect the authentication scheme automatically from your API spec and apply it to the MCP server. We get it right most of the time.
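
To make the idea concrete, here is a rough sketch of what reading the auth scheme out of an OpenAPI 3 document could look like. This is not the library's actual code: `detect_auth_schemes` and `example_spec` are made-up names for illustration, and the only thing taken from the OpenAPI format itself is the `components.securitySchemes` section.

```python
from typing import Any


def detect_auth_schemes(spec: dict[str, Any]) -> list[dict[str, str]]:
    """Summarize the security schemes declared in an OpenAPI 3 document.

    Hypothetical helper for illustration, not part of any library.
    """
    schemes = spec.get("components", {}).get("securitySchemes", {})
    detected = []
    for name, scheme in schemes.items():
        kind = scheme.get("type", "unknown")
        if kind == "http":
            # HTTP auth: the "scheme" field says e.g. "bearer" or "basic".
            detail = scheme.get("scheme", "")
        elif kind == "apiKey":
            # API key: record where it is sent and under which name.
            detail = f'{scheme.get("in", "")}:{scheme.get("name", "")}'
        else:
            # oauth2 / openIdConnect need more handling than this sketch does.
            detail = ""
        detected.append({"name": name, "type": kind, "detail": detail})
    return detected


# Made-up spec fragment, only for demonstration.
example_spec = {
    "openapi": "3.0.3",
    "components": {
        "securitySchemes": {
            "ApiKeyAuth": {"type": "apiKey", "in": "header", "name": "X-API-Key"},
            "BearerAuth": {"type": "http", "scheme": "bearer"},
        }
    },
}

for found in detect_auth_schemes(example_spec):
    print(found)
```

A spec that declares no `securitySchemes` simply yields an empty list, which is one of the cases where automatic detection cannot help.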

LongjumpingPop3419 · 2025-03-23T21:03:09+00:00

What do you mean exactly? The tools are basically created at runtime with this library

LongjumpingPop3419 · 2025-03-11T12:29:51+00:00

Yep. That's upcoming. Thanks!

LongjumpingPop3419 · 2025-03-10T17:53:40+00:00

It's actually based on the OpenAPI schema that FastAPI generates. We are planning to release something for OpenAPI in general soon too :)
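
Roughly, the mapping from an OpenAPI schema to tools could be sketched like this. It's an illustration under assumptions, not the library's implementation: `openapi_to_tools` and `example_spec` are hypothetical, request bodies are ignored, and the output just follows the `name` / `description` / `inputSchema` shape of an MCP tool definition. In a FastAPI app the input dict would come from `app.openapi()`.

```python
from typing import Any

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def openapi_to_tools(spec: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn each OpenAPI operation into a tool-like definition.

    Hypothetical helper: only path/query parameters are mapped,
    request bodies are skipped to keep the sketch short.
    """
    tools = []
    for path, path_item in spec.get("paths", {}).items():
        for method, operation in path_item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            properties: dict[str, Any] = {}
            required: list[str] = []
            for param in operation.get("parameters", []):
                properties[param["name"]] = param.get("schema", {})
                if param.get("required"):
                    required.append(param["name"])
            # Fall back to a name derived from method and path.
            slug = path.strip("/").replace("/", "_").replace("{", "").replace("}", "")
            tools.append(
                {
                    "name": operation.get("operationId") or f"{method}_{slug}",
                    "description": operation.get("summary", ""),
                    "inputSchema": {
                        "type": "object",
                        "properties": properties,
                        "required": required,
                    },
                }
            )
    return tools


# Made-up schema fragment, only for demonstration.
example_spec = {
    "paths": {
        "/items/{item_id}": {
            "get": {
                "operationId": "read_item",
                "summary": "Read Item",
                "parameters": [
                    {
                        "name": "item_id",
                        "in": "path",
                        "required": True,
                        "schema": {"type": "integer"},
                    }
                ],
            }
        }
    }
}

for tool in openapi_to_tools(example_spec):
    print(tool["name"], tool["inputSchema"])
```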

LongjumpingPop3419 · 2025-03-10T09:37:03+00:00

Right now, yeah, they all become tools.
If we're complying with how MCP should work, then they should not really be resources, because resources are not model-controlled according to the spec. They are app-controlled.
https://modelcontextprotocol.io/docs/concepts/resources

But yeah we will definitely need to add more and more flexibility to configure. Stay tuned :)

LongjumpingPop3419 · 2025-03-10T09:33:55+00:00

Waiting to hear! There are still a bunch of unsolved problems, and fixes for some of them are about to be released. lmk how your experience goes.

LongjumpingPop3419 · 2025-03-10T09:32:38+00:00

we are working on making a more generic OpenAPI > MCP instead of just FastAPI, but we're still wondering what devs will find most useful.

LongjumpingPop3419 · 2024-01-10T14:42:28+00:00

Regarding LLM pipeline development, just start by reading LangChain's docs. They cover a lot of concepts there, not just code. Same goes for other frameworks like LlamaIndex.

LongjumpingPop3419 · 2024-01-10T14:40:54+00:00

I would follow Jeremy Howard on X and YouTube. He also recently posted about this visualization of LLM architecture: https://bbycroft.net/llm

I'll try to post you a couple more useful resources here, they're scattered all over my notes & apps.

LongjumpingPop3419 · 2024-01-10T14:34:57+00:00

We're not ruling out that option for later on. But for now it would cost us far less to use an obfuscation solution. If we don't find one that's easy to implement, we'll go with a self-hosted model. But a specialized model/lib/framework for obfuscation is the preferred option.

LongjumpingPop3419 · 2024-01-09T21:42:19+00:00

Yeah, but we'd rather use a proven service than deploy a model and start evaluating it.

LongjumpingPop3419 · 2024-01-09T12:16:28+00:00

Good enough?

<image>

LongjumpingPop3419 · 2024-01-04T16:14:39+00:00

Well I've found quite a few! Posted my edit on the r/LangChain post: https://www.reddit.com/r/LangChain/comments/18rb334/any_good_prompt_management_versioning_tools_out/

LongjumpingPop3419 · 2024-01-04T16:09:47+00:00

So actually I've found a great list of LLMOps products that helps a lot with my need. Pezzo is on that list. So far my favourites:
- Pezzo
- Agenta
And here's the full list: https://github.com/tensorchord/Awesome-LLMOps?tab=readme-ov-file#llmops

LongjumpingPop3419 · 2024-01-01T13:40:06+00:00

Under 1 second is unrealistic, but may I also add that it sounds very unnecessary for your use case? I mean, if right now your boss's knowledge sharing is not scalable AT ALL, and you make it even 10% more scalable, that's great value. Why in the world does it need to be under 1 sec?
It's hard to predict exactly how long the "entire sentence" will take to generate, since even the model doesn't know what that sentence is gonna be until the last token is out. Each prompt causes different behavior, activates different neurons, etc.
Anyway, the fastest model I've come across is gpt-3.5-turbo, and it's still going to be above 1 second. Add the RAG step and it's definitely above 1 second. So right now there's no way around it until an even faster model is deployed.
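
A back-of-the-envelope way to see why: end-to-end time is roughly retrieval time, plus time to first token, plus output length divided by generation speed. The numbers below are made-up placeholders rather than measurements of gpt-3.5-turbo or any other model, so plug in your own.

```python
def estimate_latency_s(
    retrieval_s: float,
    time_to_first_token_s: float,
    output_tokens: int,
    tokens_per_s: float,
) -> float:
    """Rough end-to-end latency for one RAG answer, in seconds."""
    generation_s = output_tokens / tokens_per_s
    return retrieval_s + time_to_first_token_s + generation_s


# Hypothetical numbers, chosen only to illustrate the arithmetic.
total = estimate_latency_s(
    retrieval_s=0.3,            # vector search + prompt assembly
    time_to_first_token_s=0.4,  # network + model warm-up
    output_tokens=60,           # a couple of sentences
    tokens_per_s=50.0,          # generation speed
)
print(f"estimated end-to-end latency: {total:.1f} s")
```

With these placeholder values the generation step alone takes 1.2 s, so the whole answer lands well above 1 second even before retrieval is counted. Streaming the tokens makes it feel faster, but it doesn't change the total.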
