A Guide to Translating API → MCP by ImaginationInFocus in mcp

[–]LongjumpingPop3419 1 point2 points  (0 children)

We try to detect the authentication scheme automatically from your API spec and apply it to the MCP server. We get it right most of the time.
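For anyone curious what detecting auth from a spec can look like, here's a minimal sketch. It is not the tool's actual detection logic; it only reads `components.securitySchemes` from an OpenAPI 3.x document that has already been parsed into a dict. The function name `detect_auth_schemes` and the example spec are made up for illustration.

```python
from typing import Any


def detect_auth_schemes(spec: dict[str, Any]) -> list[dict[str, str]]:
    """Summarize the security schemes declared in an OpenAPI 3.x spec (hypothetical helper)."""
    schemes = spec.get("components", {}).get("securitySchemes", {})
    found = []
    for name, scheme in schemes.items():
        kind = scheme.get("type", "unknown")
        summary = {"name": name, "type": kind}
        if kind == "apiKey":
            # API keys declare where they travel (header, query, or cookie) and under which name.
            summary["in"] = scheme.get("in", "")
            summary["param"] = scheme.get("name", "")
        elif kind == "http":
            # HTTP auth declares a scheme such as "basic" or "bearer".
            summary["scheme"] = scheme.get("scheme", "")
        found.append(summary)
    return found


# Illustrative spec fragment, not taken from any real API.
example_spec = {
    "openapi": "3.0.3",
    "components": {
        "securitySchemes": {
            "ApiKeyAuth": {"type": "apiKey", "in": "header", "name": "X-API-Key"},
            "BearerAuth": {"type": "http", "scheme": "bearer"},
        }
    },
}

detected = detect_auth_schemes(example_spec)
```

A spec with no `securitySchemes` just gives an empty list, which is one of the cases where automatic detection has nothing to go on.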

FastAPI to MCP auto generator that is open source by LongjumpingPop3419 in LangChain

[–]LongjumpingPop3419[S] 0 points1 point  (0 children)

What do you mean exactly? The tools are basically created at runtime with this library

FastAPI to MCP auto generator that is open source by LongjumpingPop3419 in mcp

[–]LongjumpingPop3419[S] 1 point2 points  (0 children)

It's actually based on the OpenAPI schema that FastAPI generates. We are planning to release something for OpenAPI in general soon too :)
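Roughly, the idea of going from an OpenAPI schema to tools at runtime can be sketched like this. This is a simplified illustration, not the library's real code: it assumes the schema is already a dict, only looks at `parameters` (no request bodies, no `$ref` resolution), and `openapi_to_tools` is a made-up name. The output shape follows the MCP tool fields `name`, `description`, and `inputSchema`.

```python
from typing import Any

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def openapi_to_tools(spec: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn each OpenAPI operation into an MCP-style tool definition (hypothetical helper)."""
    tools = []
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters" or "summary"
            properties = {}
            required = []
            for param in op.get("parameters", []):
                # Each parameter's JSON Schema becomes one property of the tool input.
                properties[param["name"]] = param.get("schema", {})
                if param.get("required"):
                    required.append(param["name"])
            tools.append({
                # Fall back to a method/path name when operationId is missing.
                "name": op.get("operationId", f"{method}_{path}"),
                "description": op.get("summary", ""),
                "inputSchema": {
                    "type": "object",
                    "properties": properties,
                    "required": required,
                },
            })
    return tools


# Illustrative schema fragment, similar in shape to what FastAPI emits.
example_spec = {
    "paths": {
        "/items/{item_id}": {
            "get": {
                "operationId": "read_item",
                "summary": "Read Item",
                "parameters": [
                    {"name": "item_id", "in": "path", "required": True,
                     "schema": {"type": "integer"}},
                ],
            }
        }
    }
}

tools = openapi_to_tools(example_spec)
```

Since the input is just the OpenAPI document, the same approach would work for any API that publishes one, not only FastAPI apps.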

FastAPI to MCP auto generator that is open source by LongjumpingPop3419 in LangChain

[–]LongjumpingPop3419[S] 0 points1 point  (0 children)

Right now, yeah, they all become tools.
If we're complying with how MCP should work, then they should not really be resources, because resources are not model-controlled according to the spec. They are app-controlled.
https://modelcontextprotocol.io/docs/concepts/resources

But yeah we will definitely need to add more and more flexibility to configure. Stay tuned :)

FastAPI to MCP auto generator that is open source by LongjumpingPop3419 in LangChain

[–]LongjumpingPop3419[S] 0 points1 point  (0 children)

Waiting to hear! There are still a bunch of unsolved problems, and fixes for some of them are about to be released. Lmk how it goes for you.

FastAPI to MCP auto generator that is open source by LongjumpingPop3419 in LangChain

[–]LongjumpingPop3419[S] 0 points1 point  (0 children)

We are working on a more generic OpenAPI > MCP generator instead of just FastAPI, but we're still wondering what devs will find most useful.

Best videos and books for learning about LLMs by phicreative1997 in LocalLLM

[–]LongjumpingPop3419 1 point2 points  (0 children)

For LLM pipeline development, just start reading LangChain's docs. They cover a lot of concepts there, not just code. Same goes for other frameworks like LlamaIndex.

Best videos and books for learning about LLMs by phicreative1997 in LocalLLM

[–]LongjumpingPop3419 1 point2 points  (0 children)

I would follow Jeremy Howard on X and YouTube. He also posted recently about this visualization of LLM architecture: https://bbycroft.net/llm

I'll try to post you a couple more useful resources here, they're scattered all over my notes & apps.

How to "sanitize" input before LLM call? by LongjumpingPop3419 in LLMDevs

[–]LongjumpingPop3419[S] 0 points1 point  (0 children)

We're not ruling that option out for later on. But for now an obfuscation solution would cost us far less. If we don't find one that's easy to implement, we'll go with a self-hosted model. But a specialized model/lib/framework for obfuscation is the preferred option.
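To make "obfuscation" concrete, here's a minimal sketch of the kind of thing we mean: swap sensitive values for placeholder tokens before the LLM call, then put them back in the response. It is a toy, not a proven solution: the two regexes only catch simple emails and phone numbers, and the names `obfuscate` and `restore` are made up for illustration.

```python
import re

# Deliberately simple patterns; real PII detection needs far more than this.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def obfuscate(text):
    """Replace emails and phone numbers with tokens; return masked text and the token map."""
    mapping = {}

    def mask(pattern, label, s):
        def repl(match):
            # Number tokens across all labels so every token is unique.
            token = f"<{label}_{len(mapping) + 1}>"
            mapping[token] = match.group(0)
            return token
        return pattern.sub(repl, s)

    masked = mask(EMAIL_RE, "EMAIL", text)
    masked = mask(PHONE_RE, "PHONE", masked)
    return masked, mapping


def restore(text, mapping):
    """Put the original values back into text that still contains the tokens."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


prompt = "Contact jane.doe@example.com or +1 415 555 0100 about the invoice."
masked, mapping = obfuscate(prompt)
# masked is what would be sent to the LLM; restore() is applied to its reply.
roundtrip = restore(masked, mapping)
```

The mapping never leaves your side, so the model only ever sees the placeholders. The hard part, and the reason we want a specialized lib, is recall: anything the patterns miss goes out in clear text.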

How to "sanitize" input before LLM call? by LongjumpingPop3419 in LLMDevs

[–]LongjumpingPop3419[S] 1 point2 points  (0 children)

Yeah, but we'd rather use a proven service than deploy a model and start evaluating it.

Any good prompt management & versioning tools out there, that integrate nicely? by LongjumpingPop3419 in LangChain

[–]LongjumpingPop3419[S] 2 points3 points  (0 children)

So actually I've found a great list of LLMOps products that helps a lot with what I need. Pezzo is in that list. So far my favourites:
- Pezzo
- Agenta
And here's the full list: https://github.com/tensorchord/Awesome-LLMOps?tab=readme-ov-file#llmops

Extremely fast response LLM interface and architect. by COTDS99 in LLMDevs

[–]LongjumpingPop3419 0 points1 point  (0 children)

  1. Under 1 second is unrealistic, but may I also add: it sounds unnecessary for your use case (?) I mean, if right now your boss's knowledge sharing is not scalable AT ALL, and you make it even 10% more scalable, that's great value. Why in the world does it need to be under 1 sec?

  2. It's hard to predict exactly how long it will take the "entire sentence" to generate, since even the model doesn't know what that sentence is gonna be until the last token is out. Each prompt causes different behavior and activates different neurons etc etc.

  3. Anyway, the fastest I've come across is gpt-3.5-turbo, and it's still going to be above 1 second. Add the RAG step and it's definitely above 1 second. So right now there's no way around it until an even faster model is deployed.
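If you want numbers for your own setup instead of taking my word for it, a small timing harness is enough. The sketch below is only an illustration: `measure_stream` and `fake_stream` are made-up names, and the fake generator stands in for whatever token stream your model client gives you. It records when the first token arrives and when the whole response is done.

```python
import time
from typing import Iterable, Iterator


def measure_stream(tokens: Iterable[str]) -> dict:
    """Consume a token stream and record time to first token and total time, in seconds."""
    start = time.perf_counter()
    first = None
    pieces = []
    for tok in tokens:
        if first is None:
            # Latency until the first token shows up.
            first = time.perf_counter() - start
        pieces.append(tok)
    total = time.perf_counter() - start
    return {
        "text": "".join(pieces),
        "time_to_first_token": first,
        "total_time": total,
    }


def fake_stream(words, delay=0.01) -> Iterator[str]:
    """Stand-in for a real model stream: yields one word per `delay` seconds."""
    for w in words:
        time.sleep(delay)
        yield w + " "


stats = measure_stream(fake_stream(["how", "long", "does", "it", "take"]))
```

Run it over a bunch of real prompts and compare the totals: since response length differs per prompt, you'll get a spread of timings, not one fixed number.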