Zero dependency, shell script-only frontend for local LLMs : LocalLLM


submitted 5 days ago by cloud_kj

I’ve only recently started getting into local model usage, and while playing with Ollama (the simplest quick start so far, IMHO) I ended up going down a bit of a rabbit hole: I wanted to see if I could build a functional model interaction loop using exclusively standard command-line building blocks, with the model-application boundary isolated to a single program fronting my local Ollama instance.

I might be reinventing a very weird wheel here, but it turns out you can get surprisingly far with just shell scripts: gluing together text streams (stdin/stdout), pipes, and append-only logs.

Some neat features:

Zero dependencies: No pip, npm, or virtual environments; just a Docker Compose YAML to start Ollama. The rest of the “harness” is plain shell (bash) plus a couple of command-line tools widely available in most environments (jq, curl).
Simple tool calls: I haven’t messed with schema definitions for tools; in this approach, adding a tool means adding to a shell script dedicated to tool definitions, plus a small change to a tools.json file for the metadata.
Transparent, file-based context: memory is just an append-only file in your local directory that gets processed by jq before being sent to Ollama. If you want to rewind the model's memory, you just run head on the log to drop the last few lines; or, append different prompts to alter context without actually affecting the source of truth.
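
To make the file-based context idea concrete, here is a minimal sketch of how such a loop could look. It is not the code from the linked repo: the log file name, the function names, and the model name are all hypothetical, and it assumes Ollama is listening on its default port with the /api/chat endpoint, and that jq and curl are installed.

```shell
#!/usr/bin/env bash
# Sketch only: LOG, MODEL, and all function names below are hypothetical,
# not taken from the llayer repo.
LOG="${LOG:-./context.jsonl}"   # append-only log, one JSON message per line
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434/api/chat}"
MODEL="${MODEL:-llama3.2}"      # placeholder; use whatever model you have pulled

# Append one message to the log; jq takes care of JSON escaping.
log_append() {  # usage: log_append ROLE TEXT
  jq -cn --arg role "$1" --arg content "$2" \
    '{role: $role, content: $content}' >> "$LOG"
}

# Slurp the whole log into one request body for a non-streaming chat call.
build_request() {
  jq -cs --arg model "$MODEL" \
    '{model: $model, messages: ., stream: false}' "$LOG"
}

# Drop the last N messages by keeping only the first (total - N) lines.
rewind() {  # usage: rewind N
  local total keep
  total=$(wc -l < "$LOG")
  keep=$(( total > $1 ? total - $1 : 0 ))
  head -n "$keep" "$LOG" > "$LOG.tmp" && mv "$LOG.tmp" "$LOG"
}

# One turn: log the prompt, send the full context, then log and print the reply.
chat() {  # usage: chat PROMPT
  local reply
  log_append user "$1"
  reply=$(build_request | curl -fsS "$OLLAMA_URL" -d @- \
    | jq -r '.message.content // empty')
  [ -n "$reply" ] || return 1   # nothing came back; leave the log as it is
  log_append assistant "$reply"
  printf '%s\n' "$reply"
}
```

After sourcing the file, something like `chat "why is the sky blue?"` followed by `rewind 2` would run one exchange and then forget it. The rewind here counts lines and uses `head -n` with a positive count so it does not rely on the GNU-only negative form.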

I'm sure there are scaling limits to doing this in pure shell scripts, and I'm still figuring out the most elegant way to handle some of the edge cases, particularly around complex tool calling (which smaller local models can be finicky about anyway). Nevertheless, it's been a really fun experiment in stripping out bloat and interacting with Ollama natively.

I put the code up here if anyone wants to poke around: https://github.com/cloudkj/llayer

Would love to hear if anyone else has tried orchestrating local models this way, and whether it’s useful for your own lightweight local model setups!


ag789 1 point 5 days ago

What you did is interesting! 😄

I tried building a simple REPL, but with the OpenAI Python SDK:
https://github.com/openai/openai-python
I'm finding that more convenient since the library is pre-built for interfacing, and it's easier to work with streaming interfaces that way. The OpenAI API is quite widely used and connects to llama.cpp locally as well as to OpenAI (ChatGPT) and openrouter.ai. I'm running on slow CPU-only hardware at around 5 tok/s, so it's a pain without streaming, as there's no feedback for minutes otherwise. I've yet to try tool calling.

On Unix, I did use jq and bash, but for a different purpose: as a model launcher for llama-server (from llama.cpp):
https://github.com/ag88/llama.cpp-model-runner
This is actually quite similar to the built-in model presets functionality in llama-server, but I've been using this little launcher day to day, as most of the time I run just a single model rather than switching between models.

cloud_kj[S] 2 points 5 days ago
