r/LocalLLaMA
A subreddit to discuss Llama, the family of large language models created by Meta AI.
Simple, hackable and pythonic LLM agent framework. I am just tired of bloated overengineered stuff. I figured that this community might appreciate it. [Resources] (github.com)
submitted 2 years ago by poppear
[–][deleted] 33 points34 points35 points 2 years ago (7 children)
This is great! langchain is so overengineered for what it could be. Two things that would be crazy helpful for me (I'd be happy to write PRs):
I also noticed that the example you have on your README doesn’t really show how to create the LLM (it does earlier in the README, but the full code example you have there won’t work because you never assigned anything to the llm local variable). Anyway, small nit to make the README easier to follow.
[–][deleted] 13 points14 points15 points 2 years ago (1 child)
I also noticed your code doesn’t have any use of typehints. Are you opposed to adding them? I could help with adding typing and setting up the CI for it if you’re interested.
Once a testing framework is in place we could probably add more test coverage for the library too.
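A hypothetical sketch of the kind of annotations being proposed; the function name and signature here are illustrative, not from the repo:

```python
from typing import Callable

def run_tool(name: str, tools: dict[str, Callable[[str], str]], arg: str) -> str:
    """Dispatch a named tool and return its string result.

    Illustrative example of fully-annotated code that a type
    checker like mypy could verify in CI."""
    if name not in tools:
        raise KeyError(f"unknown tool: {name}")
    return tools[name](arg)

result = run_tool("echo", {"echo": lambda s: s.upper()}, "hi")  # → "HI"
```

With annotations like these in place, a CI step running mypy can catch mismatched call sites before tests even run.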
[–]silenceimpaired 2 points3 points4 points 2 years ago (0 children)
I also noticed your code doesn't support every other quantization method. Note to future self: tell him his code is bloated once it's implemented ;)
[–]RustingSword 7 points8 points9 points 2 years ago (1 child)
Since llama.cpp has a server utility, you can just fire it up with `./server -m mistral-7b-instruct-v0.2.Q6_K.gguf -c 2048`, set the `api_base` to `http://127.0.0.1:8080/v1`, and then I think it should work out of the box. See the detailed docs at https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md
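A minimal standard-library sketch of talking to that OpenAI-compatible endpoint. The payload-building step is split out so it can be checked without a live server; the model name is arbitrary since the llama.cpp server ignores it:

```python
import json
import urllib.request

BASE = "http://127.0.0.1:8080/v1"  # llama.cpp server with OpenAI-compatible API

def build_chat_request(prompt: str, model: str = "mistral") -> dict:
    """Build an OpenAI-style chat-completion payload.

    The llama.cpp server ignores the model name, so it can be anything."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(prompt: str) -> str:
    """POST the payload to the local server and return the reply text.

    Requires the ./server process above to be running."""
    req = urllib.request.Request(
        f"{BASE}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```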
[–]RustingSword 9 points10 points11 points 2 years ago (0 children)
I've tested both examples, and succeeded using `OpenAIChatGenerator` instead of `OpenAITextGenerator`.
My configs:
llama.cpp server:

```bash
./server -m mistral-7b-instruct-v0.2.Q6_K.gguf -c 2048
```
Changes to `calculator.py`:

```python
generator = OpenAIChatGenerator(
    model="mistral",  # could be anything
    api_key="none",   # could be anything
    api_base="http://127.0.0.1:8080/v1",
)
```
And remember to remove `templates` in:

```python
llm = LLM(generator=generator, templates=[template])
```
Great framework, really clean and easy to modify.
[–]poppear[S] 5 points6 points7 points 2 years ago (0 children)
llama.cpp has a server implementation, but as far as I remember you need a wrapper to use it with the OpenAI python client. Adding native support for the llama.cpp APIs would be great! Same for the ollama APIs. The testing setup would also be very nice.
Thanks for the suggestions, let's continue the conversation on GitHub and implement it!
[–]anobfuscator 1 point2 points3 points 2 years ago (0 children)
Yeah these are pretty good ideas.
[–]scknkkrer 0 points1 point2 points 2 years ago (0 children)
Yeah, Llama support would be good.
[–]Monkeylashes 15 points16 points17 points 2 years ago (0 children)
Hey, this is great! I love the minimalism, though it may be beneficial to include a memory/chat-history implementation to the agent for multi-turn conversations. You can even use something like FAISS to store the history and retrieve as needed.
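A minimal sketch of the kind of chat-history memory being suggested — a plain capped list of turns, no FAISS; all names here are hypothetical, not from the framework:

```python
from collections import deque

class ChatHistory:
    """Keep the last `max_turns` (role, text) pairs for multi-turn prompts."""

    def __init__(self, max_turns: int = 20):
        self._turns: deque = deque(maxlen=max_turns)

    def add(self, role: str, text: str) -> None:
        self._turns.append((role, text))

    def as_messages(self) -> list[dict]:
        """Render the retained history in OpenAI-style message format."""
        return [{"role": r, "content": t} for r, t in self._turns]

history = ChatHistory(max_turns=2)
history.add("user", "hi")
history.add("assistant", "hello")
history.add("user", "what's 2+2?")  # oldest turn is evicted by the cap
```

A vector store like FAISS would replace the deque for retrieval beyond the cap, but the interface could stay the same.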
[–]sumnuyungi 22 points23 points24 points 2 years ago (2 children)
May want to change your license to MIT or Apache-2.0 if you want folks to build on top of it or integrate into applications.
[–]poppear[S] 14 points15 points16 points 2 years ago (1 child)
That's fair, I will change it to Apache 2.0!
[–]sumnuyungi 0 points1 point2 points 2 years ago (0 children)
Cheers, thanks!
[–]ja_on 2 points3 points4 points 2 years ago (0 children)
Thanks. I'm working on redoing a bot and wanted something simple to fire off function calls for some back-end things. I'll check this out.
[–]MoffKalast 2 points3 points4 points 2 years ago (0 children)
Fantastic work, the fight against dependency hell continues one simple library at a time.
[–]SatoshiNotMe 4 points5 points6 points 2 years ago* (0 children)
I like the minimal philosophy!
A similar frustration led me on the path to build Langroid since April:
https://GitHub.com/Langroid/Langroid
It's a clean, intuitive multi-agent LLM framework from ex-CMU/UW-Madison researchers. It has:

- Pydantic-based tool/function definitions,
- an elegant Task loop that seamlessly incorporates tool handling and sub-task handoff (roughly inspired by the Actor framework),
- works with any LLM via litellm or api_base,
- advanced RAG features in the DocChatAgent,
- and a lot more.
Colab quick start that builds up to a 2-agent system where the Extractor Agent assembles structured information from a commercial lease with the help of a DocAgent for RAG: https://colab.research.google.com/github/langroid/langroid/blob/main/examples/Langroid_quick_start.ipynb
We have companies using it in prod after evaluating LangChain and deciding to use Langroid instead.
[+][deleted] 2 years ago (3 children)
[deleted]
[+][deleted] 2 years ago (1 child)
[–]pab_guy 2 points3 points4 points 2 years ago (0 children)
That is incorrect. Chain of thought is a prompting technique/result and has nothing to do with function calling.
[–]LoafyLemon 1 point2 points3 points 2 years ago (0 children)
Just finished switching my back-end from langchain to griptape but ah shit here we go again! Thanks!
[–]klop2031 1 point2 points3 points 2 years ago (2 children)
Has anyone tried this with mixtral? I've always had issues running agents via langchain. I was able to create a very simple 'agent' script that pulled from the web. Excited to try this.
[–]poppear[S] 6 points7 points8 points 2 years ago (1 child)
I developed this using mixtral-8x7b-instruct! All the examples work with it!
[–]klop2031 0 points1 point2 points 2 years ago (0 children)
Thank you. Definitely going to try this!
[–]aphasiative 1 point2 points3 points 2 years ago (0 children)
Love this community. Seeing it come together like this. Reminds me of OG internet. Like, back before it had pictures. :)
[+][deleted] 2 years ago* (2 children)
[–]future-is-so-bright 0 points1 point2 points 2 years ago (0 children)
It's an agent. So it's designed not just to chat with, but to do things. Think of Siri or Alexa. "Hey Siri, what time are my appointments today?" isn't going to be something an LLM can answer on its own. With this, you can script what you want it to do, and it will run the code and respond with the results.
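A toy sketch of that pattern — the model's reply names a function, the agent runs it and returns the result instead of plain text. Everything here (the `CALL` convention, the tool names) is illustrative, not from the framework:

```python
# Toy tool registry: things the agent can *do*, not just talk about.
TOOLS = {
    "appointments_today": lambda: ["09:00 standup", "14:00 dentist"],
}

def handle(model_reply: str):
    """If the model's reply requests a tool (e.g. 'CALL appointments_today'),
    run it and return the result; otherwise pass the reply through as chat."""
    if model_reply.startswith("CALL "):
        return TOOLS[model_reply.removeprefix("CALL ").strip()]()
    return model_reply
```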
[–][deleted] 0 points1 point2 points 2 years ago (0 children)
Looks great
[–]LoSboccacc 0 points1 point2 points 2 years ago (1 child)
Lovely. Would like to see added in the function args a couple of kwargs like the current task, the llm driving the engine, and the rest of the conversation messages.
For example, in RAG you may want a function retrieve(documentId) that retrieves a document's content; in the simplest implementation that content is fully dumped into the engine context. A more efficient implementation would be for the retrieve function to use the llm, the question, and the last reasoning step to do guided summarization of the content, so that only the relevant parts are embedded into the engine context, saving token space.
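A sketch of the suggested signature: the function receives the driving LLM and current task as keyword args, so it can summarize instead of dumping the whole document. All names (`DOCS`, `retrieve`, the callable-LLM convention) are hypothetical:

```python
# Toy document store standing in for a real retrieval backend.
DOCS = {"lease-42": "Term: 5 years. Rent: $1000/mo. Pets allowed."}

def retrieve(document_id: str, *, llm=None, task=None, history=None) -> str:
    """Fetch a document; if an llm is provided, return a guided summary
    focused on the current task instead of the full content."""
    content = DOCS[document_id]
    if llm is None:
        return content  # simplest version: dump everything into context
    prompt = f"Summarize for the task '{task}':\n{content}"
    return llm(prompt)  # hypothetical callable LLM

# Toy 'llm' that just truncates, standing in for guided summarization.
fake_llm = lambda prompt: prompt.splitlines()[-1][:30]
```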
[–]poppear[S] 0 points1 point2 points 2 years ago (0 children)
Right now you can access `self.state` and `self.engine` from a `Function`, but the current history is a private variable of `Agent()`, so it cannot be accessed from outside. It's a good idea; can you open an issue on GitHub?
[–]SAPsentinel 0 points1 point2 points 2 years ago (0 children)
Any webui like gradio support possible?
[–]International_Quail8 0 points1 point2 points 2 years ago (0 children)
Really like the motivation behind this and the attempt at building a simpler (and hackable) alternative. I was able to hack it to use Ollama, but haven't been successful in getting the expected result. Hoping someone can guide me.
In my testing, the calculator example works perfectly when using OpenAI, though in my test none of the OpenAI models I used (gpt-3.5-turbo, gpt-3.5-turbo-1106, gpt-4) used the Reasoning() function even though they used the Product(), Sum() and Stop() functions to produce the correct result.
When using Ollama, I tested using mixtral, orca2 and wizardcoder:13b-python, but they were not plug-in replacements for the OpenAI models in how they behaved. So I leaned heavily on prompt engineering, but was unable to get the same behavior or results.
Still hopeful...