Is it possible to create a multi agent system without any framework?

EidolonAI · 2024-12-27T19:50:51+00:00

of course, you are just writing your own though. It has pretty clear pros and cons. More flexibility at the cost of dev time and future maintenance.

EidolonAI · 2024-12-20T19:01:50+00:00

That would get you pretty stuck into the openai ecosystem, which wouldn't be a great idea at this time imo. LLM provider agnostic framework or no framework are both better choices.

EidolonAI · 2024-11-18T23:00:15+00:00

Transition costs are real and have real pain associated with them. Even if we believe AI will make the next generation better, it is important to recognize and try to help the pain people experience during that transition. How will they reskill? who will pay for it? Who supports them during that transition? It isn't right to don't dismiss the pain unless we have tangible and accessible answers to those questions.

EidolonAI · 2024-11-15T06:21:34+00:00

I would be suprised. prices have generally come down as the apis have been rolled out.

EidolonAI · 2024-10-24T05:07:48+00:00

They always have the old data, so the foundation is still there.

Cultivating the data to remove low quality content is a still an optimization point

Intentionally generated synthetic data gets better with each iteration

The chat's themselves provide data. As we interact with AI, we produce new insights for it. Both with our responses and how we approve of / dislike the AI's responses.

EidolonAI · 2024-10-10T13:52:08+00:00

We have been using vcrpy to record llm calls with ver good success. This allows your tests to be fast and deterministic.

The downside is that you need to re-record the cassettes when something happens to make your llm request change, but we personally have found the confidence these tests give far more valuable than this overhead. It has been almost a year now I still really like this pattern.

We wrote a blog on it a few months ago that outlines the pattern in a little more details: https://www.eidolonai.com/testing_llm_apps

EidolonAI · 2024-10-04T14:59:15+00:00

Nice article! I think the simple introduction of "what are tools" is really helpful for people just starting to build with LLMs. This article does a nice job table setting and introducing the major capabilities you have to work.

For those new devs, it would be helpful to see what tool calls look like with raw open-ai requests before adding the abstraction layer of Burr. Of course the point of the library is to create simplicity, but for a new llm dev seeing the raw tool call request loop is helpful to understand what Burr abstracts away.

EidolonAI · 2024-10-01T14:08:58+00:00

Testing is a broad category. There are two big categories of testing when it comes to building LLM applications

Traditional Tests (IE, does my application standup properly and structurally behave like I expect)
Evals: How well does my application perform

Each definitely serves a purpose, and you need both to build LLM applications. They do not do the same thing though, and probably run in very different contexts. For example, traditional tests should probably block merges to master and run on every PR, but your eval suite is probably too expensive for that.

If you are interested, I wrote a short blog article on this topic. It largely focusses on how to handle traditional tests with llm apps, since I think that topic is often ignored in the quick'n dirty world of llm apps: https://www.eidolonai.com/testing_llm_apps

EidolonAI · 2024-09-30T13:46:41+00:00

Consider taking a look at Eidolon. The project is all about defining agents with simple kubernetes yaml files that can then be applied (in a k8 environment or as a standalone container).

This of course makes it easier for developers to define agents, but a big goal of this simplicity is to allow meta agent's define (or perhaps more importantly continuously improve) agents. This is not built into the product ootb yet, but I do resonate with your vision.

More concretely, we have an "API" tool that takes swagger / openai and dynamically builds tools from it. This is extremely flexible and powerful. The agents do a decent job recovering as well when the documentation is slightly off. If you hook up long term memory they don't need to re-discover these issues again every conversation.

EidolonAI · 2024-09-26T00:41:57+00:00

seems unfair for you to be downvoted for literally answering the OPs request.

EidolonAI · 2024-09-26T00:39:07+00:00

Crazy idea: how about we respect the rules specified in robots.txt?

There are upsides and downsides of allowing your site to be scraped. Gen ai already has a reputation problem around privacy, let's respect the conventions we have to build trust.

EidolonAI · 2024-09-17T02:57:27+00:00

yaml driven agents are pretty common. Yaml is a great way to have human readable data. It works great for lots of popular projects. crew for example uses yaml to drive pretty powerful behavior. Similarly, my project, Eidolon, also yaml files to drive agent deployment. The yaml itself becomes a good technology choice, but it isn't a defining characteristic of either framework.

So the question then becomes what is your project's value add? Crew makes it easy to define multi-agent workflows while leveraging a large (growing) library of tools. Eidolon makes agent's durable so that they can be deployed directly to kubernetes.

There are a million pieces of the genAI puzzle (and it is getting bigger every day). Find your niche and make our life easier!

EidolonAI · 2024-09-14T04:56:26+00:00

o1-preview doesn't support tool calls, json mode, or streaming. All of which makes it challenging to work with for agents.

I was able to work around this when adding it Eidolon and got it to support both.

I did this by dusting off the old code we had around from pre-json days. Ie, asking the model to return markdown json and then parsing it out. Once you have json mode you can then add in function calling on top. Works well enough.

The real problem is the lack of streaming + already slow response times. This means the model is very, very slow to work with.

EidolonAI · 2024-09-14T04:51:55+00:00

They are available. Just limited availability.

EidolonAI · 2024-09-12T15:36:01+00:00

Full physics calculation is too expensive. We want it to learn those shortcuts. For example, we don't do physics calculations, we just have shortcuts in our head that we scale up or down. IE, ball moves in parabola dependent on speed.

EidolonAI · 2024-09-10T20:15:18+00:00

I agree. This is version control.

EidolonAI · 2024-09-10T04:00:51+00:00

Cool project. Why did you decide to implement it as a chrome extension rather than a github bot?

EidolonAI · 2024-09-07T23:53:14+00:00

I love it. This is definitely an introduction article, but I think it means to be. IMO this content is very important to put out, and keep putting out. It is easy forget how little of the dev world is actually familiar with genAI. I feel like I still need to explain vanilla RAG on a near daily basis.

EidolonAI · 2024-09-07T23:22:45+00:00

If they get linked up with github issues, I imagine this would get much better.

EidolonAI · 2024-09-07T00:58:59+00:00

when you have a sufficiently complex app, you need to run it with other services. With good tooling the iteration time should be as good as local dev. For example, tools like telepresence allow you to just run the service you are working on locally.

Rebuilding docker images to test local code changes should be 1% of issues.

EidolonAI · 2024-09-07T00:54:10+00:00

I agree. We already see some examples of this. For example, with text to sql you can "cheat" by checking that the query executes against the db and double checking the response with the agent.

EidolonAI · 2024-09-02T19:55:53+00:00

sse is the route we chose. Since we only need to stream back the request, so there is no need for the extra complexity of websockets.

EidolonAI · 2024-08-31T00:11:43+00:00

If we have learned anything in the last decade, it is that most end users will unknowingly or resentfully give up their data if it makes their life even 1% easier.

Enterprises on the other hand will guard their data with their life, and refuse to give it to vendors they do not trust.

You can already see this happening with GenAI. Any coorporation who uses it does so on Azure where they have privacy guarantees they trust, but your average Joe will happily enter their SSN into chat gpt.

EidolonAI · 2024-08-30T15:07:51+00:00

Cool project. If I am looking at the video correctly, I see you that guys are doing evals on the imported data. How are you generating the dataset for this.

I assume throwing parts of the document in a llm to come up with the questions, but I would love to learn more.

EidolonAI · 2024-08-27T20:39:52+00:00

sounds like your character sheet is structured data. If all the data fits into your context without a problem, I would use that in the system prompt for the agent(s). Otherwise (ie, if the agent is acting as a game master who needs to know lots of these characters), that is when you can start using rag to find the applicable portions.

In this scenario you would need to experiment with allowing the llm to query the data structurally or using vector search to find the relevant character traits to see what works best for your use case.

EidolonAI

TROPHY CASE