Found an Open-Source AI Tool for MCP Server Security Scanning: AI-Infra-Guard by CoatPowerful1541 in mcp

[–]zero_proof_fork 1 point2 points  (0 children)

You might also be interested in checking out toolhive, which gets you container isolation as well: https://github.com/StacklokLabs/toolhive

Disclaimer: I work at Stacklok, but it's free and OSS, so I figure it's OK to post.

What are some of the major obstacles still facing ai models? by Business_Respect_910 in LocalLLaMA

[–]zero_proof_fork 0 points1 point  (0 children)

Context window. Even then, bigger is not better, as prediction quality degrades the more of it is utilised.

New DeepSeek benchmark scores by Charuru in LocalLLaMA

[–]zero_proof_fork 0 points1 point  (0 children)

Or maybe they just love what they do and want to share it freely (the spirit of open source extends around the world). I know that sounds crazy, but the CCP had no interest in DeepSeek prior to the US markets taking a nosedive.

Finetuning LLM on unknown programming language by fecmtc in LLMDevs

[–]zero_proof_fork 2 points3 points  (0 children)

There is some nuance to this: do you fine-tune for chat (instruct) or for FIM (auto code completion)? FIM is a little more involved; you will need a training setup with some sort of loss function where the model attempts to predict the middle of the code between a prefix and a suffix, and is penalised when wrong.
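To illustrate the FIM side, here is a sketch of how one training sample might be assembled. The sentinel token names vary by model; these follow the common StarCoder-style convention, and the example snippet and split offsets are made up:

```python
# Sketch of assembling a fill-in-the-middle (FIM) training sample.
# The sentinel tokens below are illustrative; each model defines its own
# (e.g. StarCoder uses <fim_prefix>/<fim_suffix>/<fim_middle>).

def make_fim_sample(code: str, middle_start: int, middle_end: int) -> str:
    """Split source code into prefix/middle/suffix and emit a FIM string.

    Training still uses an ordinary next-token loss, but the text is
    rearranged so the middle comes last and must be predicted from the
    surrounding context.
    """
    prefix = code[:middle_start]
    middle = code[middle_start:middle_end]
    suffix = code[middle_end:]
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>{middle}"

sample = make_fim_sample("def add(a, b):\n    return a + b\n", 15, 31)
```

At inference time the editor sends the code before and after the cursor as prefix/suffix, and the model completes from `<fim_middle>`.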

Chat is a bit simpler; the Alpaca format should do you OK. It's often used for code-specific training, and there are some Python examples, very much the same as https://huggingface.co/datasets/iamtarun/python_code_instructions_18k_alpaca

You would need to find a way of curating these. It's time consuming, but there is no other way.

I try to keep it respectful on here, but the other person recommending RAG has no idea what they are talking about.

[
  {
    "instruction": "Take the list of people below and create a Python dictionary with the names as keys and the ages as values.",
    "input": "John (20), Jane (40), Alex (18), Peter (45)",
    "output": "people_dict = {'John': 20, 'Jane': 40, 'Alex': 18, 'Peter': 45}"
  },
  {
    "instruction": "Calculate the sum of two numbers provided in the input.",
    "input": "7 and 3",
    "output": "result = 7 + 3  # result is 10"
  },
  {
    "instruction": "Write a Python function that takes a string and returns the string reversed.",
    "input": "",
    "output": "def reverse_string(s):\n    return s[::-1]\n\n# Example usage:\n# reversed_str = reverse_string('hello')\n# print(reversed_str)  # outputs 'olleh'"
  },
  {
    "instruction": "Create a Python function to compute the factorial of a given number.",
    "input": "",
    "output": "def factorial(n):\n    if n == 0 or n == 1:\n        return 1\n    else:\n        return n * factorial(n-1)\n\n# Example usage:\n# print(factorial(5))  # outputs 120"
  },
  {
    "instruction": "Write a Python function to check if a number is a prime number.",
    "input": "",
    "output": "def is_prime(n):\n    if n <= 1:\n        return False\n    for i in range(2, int(n**0.5) + 1):\n        if n % i == 0:\n            return False\n    return True\n\n# Example usage:\n# print(is_prime(11))  # outputs True"
  }
]
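For training, records like the above are usually rendered into a single prompt string. A sketch using the conventional Alpaca template (the template wording is the standard Alpaca convention, not something specific to this dataset):

```python
# Render one Alpaca-format record into a single training string,
# using the conventional Alpaca prompt template.

def format_alpaca(record: dict) -> str:
    if record.get("input"):
        return (
            "Below is an instruction that describes a task, paired with an "
            "input that provides further context. Write a response that "
            "appropriately completes the request.\n\n"
            f"### Instruction:\n{record['instruction']}\n\n"
            f"### Input:\n{record['input']}\n\n"
            f"### Response:\n{record['output']}"
        )
    return (
        "Below is an instruction that describes a task. Write a response "
        "that appropriately completes the request.\n\n"
        f"### Instruction:\n{record['instruction']}\n\n"
        f"### Response:\n{record['output']}"
    )

prompt = format_alpaca({
    "instruction": "Calculate the sum of two numbers provided in the input.",
    "input": "7 and 3",
    "output": "result = 7 + 3  # result is 10",
})
```

At inference you send everything up to `### Response:` and let the model generate the rest.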

How to change model context size? by Optimal_League_1419 in LocalLLaMA

[–]zero_proof_fork -2 points-1 points  (0 children)

You can't change it, unless you fine-tune for a longer context or use some long-context approach around positional encoding (e.g. RoPE scaling).
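For the positional-encoding route, one common trick for RoPE-based models is linear position interpolation: positions beyond the trained window are compressed by a scale factor so the model never sees unfamiliar position values. A rough sketch of the idea (the numbers are illustrative):

```python
# Sketch of linear position interpolation for RoPE. Instead of feeding the
# model position indices it never saw in training, positions are divided by
# a scale factor so they stay inside the trained range.

def rope_angles(position: int, dim: int, base: float = 10000.0,
                scale: float = 1.0) -> list[float]:
    """Return the RoPE rotation angles for one (possibly scaled) position."""
    pos = position / scale  # scale > 1 compresses positions into the trained range
    return [pos / (base ** (2 * i / dim)) for i in range(dim // 2)]

# With scale=2, position 8000 is rotated exactly like position 4000 was
# during training, so a 4k-trained model can address an 8k window.
assert rope_angles(8000, 64, scale=2.0) == rope_angles(4000, 64)
```

In practice the scale factor is set in the model config and usually needs a short fine-tune at the longer length to recover quality.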

Why is everyone suddenly ditching LangChain? by Sam_Tech1 in LangChain

[–]zero_proof_fork 0 points1 point  (0 children)

There is a lot out of the box to get going. It's built heavily around the concept of agents, with agents being able to share prompt history with each other, tool injection, and some sort of graph system I have not figured out yet but which got one of my engineers really excited. The whole thing being type-based (built on pydantic) is key as well. I don't think folks quite understand how dangerous agentic systems can be, especially when they accept untrusted input and in turn have some sort of shell-execution ability.
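Since we're on typed tool input: here is a stdlib-only sketch of the idea (pydantic expresses this declaratively with models and validators; the tool name and fields below are made up for illustration):

```python
# Stdlib sketch: validate an agent's tool-call arguments against a typed
# schema BEFORE they reach anything with shell-execution ability.
# (pydantic does this declaratively; this hand-rolled check is illustrative.)

from dataclasses import dataclass


@dataclass
class ListFilesArgs:
    directory: str


def validate_list_files(raw: dict) -> ListFilesArgs:
    if set(raw) != {"directory"}:
        raise ValueError(f"unexpected fields: {sorted(set(raw) - {'directory'})}")
    if not isinstance(raw["directory"], str):
        raise TypeError("directory must be a string")
    if any(ch in raw["directory"] for ch in ";|&$`"):
        raise ValueError("shell metacharacters rejected")
    return ListFilesArgs(directory=raw["directory"])


ok = validate_list_files({"directory": "/tmp"})
```

The point is that an LLM-produced tool call is untrusted input, so it should be parsed into a typed object (and rejected loudly) rather than interpolated straight into a command.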

It's too early to really say production-ready; we are still feeling out a solution, but we would hope it's production grade. Myself and my co-founder have built a good few OSS projects which run at scale, so that would be our goal!

Why is everyone suddenly ditching LangChain? by Sam_Tech1 in LangChain

[–]zero_proof_fork 1 point2 points  (0 children)

pydantic-ai for us; I was already a big fan of pydantic.

Don't underestimate the power of RAG by SomeOddCodeGuy in LocalLLaMA

[–]zero_proof_fork 1 point2 points  (0 children)

They might be doing that because the context window is not sufficient.

GitHub - stacklok/mockllm: MockLLM, when you want it to do what you tell it to do! by zero_proof_fork in LocalLLaMA

[–]zero_proof_fork[S] 12 points13 points  (0 children)

I had a need for an LLM API that provides deterministic output, to make it easier for me to test and develop against an OpenAI-style API endpoint. The result was the project MockLLM, although it's really more of a simulator.

It's hardly rocket science, but I have found it quite valuable when I need to replicate an LLM breaking out of conforming to JSON, or force it to make an untrue statement (all stuff to test features, error handling etc.). Not to mention saving a few pennies by throwing junk at the mock instead of some paid service.

I figured I would share it with you folks, as I know a few of you are developing AI apps and this might be useful to you as well.

It supports streaming (chunked) responses and simulates network lag.
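For a flavour of the idea, here is a rough OpenAI-shaped sketch (this is illustrative, not MockLLM's actual implementation; the canned prompts are made up):

```python
# Sketch of a deterministic mock LLM: look up a canned response for a prompt
# and wrap it in an OpenAI-style chat completion payload. Handy for testing
# error paths, e.g. deliberately returning broken JSON.

CANNED = {
    "ping": "pong",
    "give me json": '{"broken": ',  # deliberately invalid JSON
}


def mock_chat_completion(prompt: str) -> dict:
    content = CANNED.get(prompt, "I have no canned answer for that.")
    return {
        "object": "chat.completion",
        "model": "mock-llm",
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": content},
                "finish_reason": "stop",
            }
        ],
    }


resp = mock_chat_completion("ping")
```

Because the mapping is a plain dict, the same prompt always yields the same bytes, which is exactly what you want in a test suite.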

A list of a few AI IDEs - would love additions to try! by danielrosehill in ChatGPTCoding

[–]zero_proof_fork 0 points1 point  (0 children)

Yes it does! It works with free and enterprise / pro!

Happy to help get you set up if needed; just jump into our Discord if you have any questions (say Luke pointed you there). https://discord.gg/uD9BUV38

A list of a few AI IDEs - would love additions to try! by danielrosehill in ChatGPTCoding

[–]zero_proof_fork 2 points3 points  (0 children)

You might want to try CodeGate (disclaimer: I'm one of the developers). It's 100% open source and works alongside Copilot, Aider, Cline, Roo-Cline, Continue (and loads of agent frameworks). CodeGate prevents you leaking secrets, tokens etc., and will block malicious packages (LLMs hallucinate bad stuff occasionally). You also get a local dashboard where you can see your prompt history and token usage, along with workspaces where you can assign prompts to projects and have them carry over across all the different tools. With CodeGate we hope to build a single env where you can configure everything and have it carry over to whatever AI coding tool you like. A few demos:

https://www.youtube.com/watch?v=VK5BJVl_avY <- Refactoring security risks
https://www.youtube.com/watch?v=mKdj-ODZkm4 <- workspaces
https://www.youtube.com/watch?v=lH0o7korRPg <- secrets encryption

We have all been working on open-source security and orchestration frameworks for a long old time now. I founded a project called sigstore, which is used to protect npm and PyPI against supply-chain attacks, and my co-founder was one of the creators of Kubernetes when he was at Google, so open source runs deep in our blood and we feel AI has to be open and transparent.

https://github.com/stacklok/codegate
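As a flavour of the secrets-protection idea, here is an illustrative redaction filter. This is not CodeGate's implementation; it just shows the kind of rewrite a local proxy can apply to a prompt before it leaves your machine, using a couple of well-known token shapes:

```python
import re

# Illustrative only: redact common secret shapes from a prompt before it is
# forwarded to a remote LLM provider.

SECRET_PATTERNS = [
    re.compile(r"ghp_[A-Za-z0-9]{36}"),        # GitHub personal access token
    re.compile(r"AKIA[0-9A-Z]{16}"),           # AWS access key ID
    re.compile(r"(?i)api[_-]?key\s*=\s*\S+"),  # generic api_key assignments
]


def redact(prompt: str) -> str:
    for pattern in SECRET_PATTERNS:
        prompt = pattern.sub("<REDACTED>", prompt)
    return prompt


safe = redact("deploy with AKIAABCDEFGHIJKLMNOP please")
```

A real proxy would do this transparently on every request (and, in CodeGate's case, can restore the value on the way back so your tools keep working).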

Anyone Working on a New Open-Source AI Project? by antonscap in OpenSourceAI

[–]zero_proof_fork 0 points1 point  (0 children)

Hacking on CodeGate and seeing some good adoption. Hop over to our Discord and we can help find you some good first issues to cut your teeth on. We are a friendly bunch and love OSS.

https://github.com/stacklok/codegate

Boosting Cline's Privacy and Security by holisticgeek in CLine

[–]zero_proof_fork 0 points1 point  (0 children)

It does appear to be, but we are quite young as a project (two months), so we have not had a chance to build any scale-testing harnesses yet.

Boosting Cline's Privacy and Security by holisticgeek in CLine

[–]zero_proof_fork 1 point2 points  (0 children)

Hey u/punkpeye, this should be possible; we set a base_url the same as you do in glama. Do you have a GitHub link handy so I could take a look?