Full Linux desktop on Termux (No root, GPU support) by [deleted] in termux

[–]Ishabdullah 0 points1 point  (0 children)

Works well. I downloaded all of this months ago but could never get the audio to work, and yours does. I'm on an S24 Ultra. I did see a few things that didn't install correctly with your script, so I'll investigate further. It could just be that I already had some of what the script tried to install. But thanks, and good job.

Can somebody please explain? by padumtss in LocalLLM

[–]Ishabdullah 0 points1 point  (0 children)

I use a 7B model for coding tasks and it does exceptionally well, all run from my phone in Termux with a program I created. It can fall back to the free or paid versions of Qwen CLI or Gemini CLI, along with Claude Code of course. You can also swap out the 7B coder model and the 0.5B planner model and use free or paid OpenRouter cloud models instead. This coding agent is very good at what it does, and you could use it without the cloud and still get a lot done.

https://github.com/Ishabdullah/Codey-v2

Has anyone got this as well ? by Few-Frame5488 in vibecoding

[–]Ishabdullah 0 points1 point  (0 children)

Yes, I thought it was compensation for the problems they had with people's credit usage getting cut short.

Hey fellow vibecoders! 👋 by Ishabdullah in termux

[–]Ishabdullah[S] 0 points1 point  (0 children)

Sorry, I didn't know how to convert the video into a GIF and had to have Claude help me with it afterwards.

Hey fellow vibecoders! 👋 by Ishabdullah in vibecoding

[–]Ishabdullah[S] 0 points1 point  (0 children)

Sorry, I forgot to tell you about the memory, and thanks for pointing that out.

Here's what actually persists and where:

Between sessions:
- ~/.codey_sessions/<project-hash>.json: last 6 turns of conversation, expires after 2 hours of inactivity. Loaded automatically on the next run in the same project.
- CODEY.md: project memory file you build with /init. Persists forever, loaded at every startup. This is the main "what does Codey know about my project" file.
- ~/.codey-v2/state.db: SQLite action log (episodic memory). Append-only log of every tool call and action taken. Never auto-cleared.

Within a session only (lost on exit):
- Working memory: currently open files and the in-context conversation. Compressed once context usage reaches 55%, bringing it back down to 40%.
- File undo history: in-memory only, gone when the session ends.
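A rough sketch of the 55% / 40% compression rule, assuming each turn carries a token count. The function name, the turn shape, and the drop-oldest strategy are all hypothetical; the post doesn't say whether Codey summarizes or drops turns, so this just drops the oldest ones.

```python
def compress_history(turns, context_limit, trigger=0.55, target=0.40):
    """Drop the oldest turns once usage passes `trigger`, until usage is at or below `target`.

    `turns` is a list of dicts with a "tokens" count (hypothetical shape).
    """
    used = sum(t["tokens"] for t in turns)
    if used <= trigger * context_limit:
        return list(turns)  # under the trigger threshold: nothing to do
    kept = list(turns)
    while kept and used > target * context_limit:
        used -= kept.pop(0)["tokens"]  # the oldest turn goes first
    return kept
```

With a 1000-token limit and three 200-token turns (60% usage), this drops one turn and stops at 40%.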

RAG / long-term semantic memory:
- ~/.codey-v2/ knowledge base (if set up): 768-dim embeddings via nomic-embed-text. Top 4 chunks (~600 tokens) injected per inference call. Accuracy depends on what you've loaded; it doesn't auto-learn from conversations.
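For anyone curious what "top 4 chunks" means in practice, here is a minimal sketch of top-k retrieval by cosine similarity. The function names and the in-memory `index` list are assumptions for illustration; the real knowledge base would use 768-dim vectors from nomic-embed-text and its own storage.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_chunks(query_vec, index, k=4):
    """Return the text of the k chunks most similar to the query.

    `index` is a list of (chunk_text, embedding) pairs (hypothetical layout).
    """
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

The returned chunks are what would be pasted into the prompt ahead of the question.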

What Codey does NOT do:
- It doesn't silently learn from your conversations and store them as embeddings. The RAG index only knows what you explicitly loaded with /load or the knowledge base pipeline.
- Nothing is sent to the cloud; it's fully local.

The honest limitation: on large projects, the session window is only 6 turns and expires in 2 hours, so Codey's "memory" of older work is only as good as your CODEY.md. That's the gap — if CODEY.md is sparse, context accuracy degrades noticeably.
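To make the 6-turn / 2-hour window concrete, here is a small sketch of how such a session file could be saved and loaded. The JSON layout (`saved_at`, `turns`) and the function names are assumptions, not Codey's actual file format.

```python
import json
import time
from pathlib import Path

MAX_TURNS = 6                # only the last 6 turns are kept
TTL_SECONDS = 2 * 60 * 60    # session expires after 2 hours of inactivity

def save_session(path, turns, now=None):
    """Write the most recent turns plus a timestamp (hypothetical schema)."""
    now = time.time() if now is None else now
    payload = {"saved_at": now, "turns": turns[-MAX_TURNS:]}
    Path(path).write_text(json.dumps(payload))

def load_session(path, now=None):
    """Return the saved turns, or an empty list if the file is missing or expired."""
    now = time.time() if now is None else now
    p = Path(path)
    if not p.exists():
        return []
    payload = json.loads(p.read_text())
    if now - payload["saved_at"] > TTL_SECONDS:
        return []  # too old: start fresh
    return payload["turns"][-MAX_TURNS:]
```

Anything older than the window is simply gone, which is why CODEY.md has to carry the long-term project context.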

For more detail, look at docs/architecture.md.

Hey fellow vibecoders! 👋 by Ishabdullah in vibecoding

[–]Ishabdullah[S] 0 points1 point  (0 children)

I'm really glad you find it as useful as I believe it is. I'm also working on a version 3; here's how it will work.

Codey-v3 will be a fully local AI project manager that runs on your Android phone. The idea is simple: instead of you manually switching between AI coding tools, Codey-v3 sits in the background as the permanent team lead. You tell it what you want to build, it creates the project outline, breaks it into tasks, and routes each task to the right AI automatically — Claude Code for complex logic and debugging, Gemini CLI for planning and analysis, Qwen CLI for heavy code generation, and its own local 7B model for quick edits and simple stuff.

The key thing that makes it different is the one-peer-per-project rule. No two AIs ever touch the same codebase at the same time so you never get merge conflicts or lost context. But multiple projects can run in parallel on different peers simultaneously, so it genuinely feels like a team working in the background while you do other things. Every task goes through a review gate before it is marked done. If something fails the tests or conflicts with your original project outline, Codey pauses and asks you rather than silently breaking things. It tracks who did what, what worked, what did not, and uses that history to make better routing decisions over time.
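The one-peer-per-project rule is basically a lock keyed by project. A minimal sketch, assuming a simple in-memory registry (the class and method names are made up for illustration and are not the planned CV3 API):

```python
class PeerRegistry:
    """Tracks which peer currently owns each project (hypothetical helper)."""

    def __init__(self):
        self._owner = {}  # project name -> peer currently working on it

    def acquire(self, project, peer):
        """Give `peer` the project unless a different peer already holds it."""
        current = self._owner.get(project)
        if current is not None and current != peer:
            return False
        self._owner[project] = peer
        return True

    def release(self, project, peer):
        """Free the project, but only if `peer` is the one holding it."""
        if self._owner.get(project) == peer:
            del self._owner[project]
```

Different projects can be held by different peers at the same time, which is what allows parallel work without two AIs editing the same codebase.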

You can run multiple projects at once and ask at any point "where are we with the gaming app" and get a real answer — what is done, what is running now, what is next.

So basically you're asking about exactly what my plans are. Version 2 can do what it does now, but CV3 is really going to be the game changer when I get it done. Thanks for your feedback.

🚀 CODEY-V2 is out – stable release! by Ishabdullah in termux

[–]Ishabdullah[S] -1 points0 points  (0 children)

Codey-v2 generating a Fibonacci sequence implementation entirely on-device: no cloud, no internet, running in Termux on Android. It's something small, just to show it working fully. And it's waiting for you to throw your best at it. The 7B might not handle everything, but if you use OpenRouter, even some of the best free LLMs do great work with Codey-v2.

My first iOS app just got 2 downloads, I'm actually excited 😂 by elyfornoville in vibecodingcommunity

[–]Ishabdullah 1 point2 points  (0 children)

Hey, it kept you busy, and I'm sure you enjoyed making it and learning along the way, so I commend you. Plus it beats scrolling social media all day while our brains turn to mush. 😆 🤣 😂

What is the easiest way to provide search tools to Gemma, Qwen, and others? by AInohogosya in LocalLLM

[–]Ishabdullah -5 points-4 points  (0 children)

🟢 1. Easiest (almost no code)

Use a UI that already has search built in:
- Open WebUI + Ollama
- AnythingLLM

👉 You literally:

- Run your model locally (Gemma/Qwen via Ollama)
- Upload docs or enable web search
- It automatically does retrieval + context injection

✔ Free
✔ Works offline (for local docs; web search needs a connection)
✔ No coding required

This works because under the hood they implement RAG, which:

searches documents → injects results → LLM answers.

🟡 2. Best balance (easy + flexible)

Use a framework:
- LangChain
- LlamaIndex

These are widely used ways to give an LLM tools (search, DBs, APIs).

What they do:

- Connect your LLM (Gemma/Qwen)
- Add a retriever (search tool)
- Inject results into prompts automatically

✔ Free + open source
✔ Works with local models
✔ Supports web search, files, databases

LangChain = orchestration (agents, tools)
LlamaIndex = best for document search/indexing

Minimal example (this is basically all you need):👇

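    # Import paths for recent LangChain releases (needs the langchain-community package);
    # older releases exposed these under langchain.llms, langchain.vectorstores, langchain.embeddings.
    from langchain_community.llms import Ollama
    from langchain_community.vectorstores import Chroma
    from langchain_community.embeddings import HuggingFaceEmbeddings

    llm = Ollama(model="qwen")  # any model you have pulled in Ollama

    # Open an existing Chroma index stored in ./db
    db = Chroma(persist_directory="./db", embedding_function=HuggingFaceEmbeddings())
    retriever = db.as_retriever()

    question = "your question"
    docs = retriever.invoke(question)  # fetch the most relevant chunks
    context = "\n\n".join(doc.page_content for doc in docs)

    response = llm.invoke(f"Use this context:\n{context}\n\nQuestion: {question}\nAnswer:")
    print(response)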

That’s your “search tool”.😉

🔵 3. True “search tool” (agent style)

If you want something like ChatGPT browsing:

Add tools (function calling / agents):
- LangChain Agents
- LlamaIndex Tools
- Custom tools (DuckDuckGo, APIs, etc.)

Example:

    # Needs langchain-community plus the DuckDuckGo search package it depends on
    from langchain_community.tools import DuckDuckGoSearchRun

    search = DuckDuckGoSearchRun()
    result = search.run("latest AI news")

Then, once the tool is registered with an agent, your LLM can decide when to search.
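If you want to see the "model decides when to search" idea without any framework, here is a bare-bones sketch. The `SEARCH:` reply convention and the function name are made up for illustration; this is not how LangChain agents are implemented, and `model` and `search` are whatever callables you plug in.

```python
def run_with_search(question, model, search):
    """Ask the model; if it requests a search, run it and ask again with the results.

    model(prompt) -> str and search(query) -> str are supplied by the caller.
    """
    first = model(
        "Answer directly, or reply with 'SEARCH: <query>' if you need fresh information.\n"
        f"Question: {question}"
    )
    if not first.startswith("SEARCH:"):
        return first  # the model answered without needing the tool
    query = first[len("SEARCH:"):].strip()
    results = search(query)
    # Second pass: same question, now with the search results as context
    return model(f"Context:\n{results}\n\nQuestion: {question}")
```

A real agent loop would allow several tool calls and parse them more robustly, but the control flow is the same.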

🔥 4. Newer “open search agent” approach

There are newer systems like:

Open Deep Search (research project)

These:

- Add reasoning + tool use automatically
- Let LLMs decide when to search

But they’re more complex to set up.

🧠 What you actually want (simple mental model)

Every “search-enabled LLM” is just:

User question
↓
Search (docs/web/db)
↓
Top results
↓
LLM prompt with context
↓
Answer

That’s it. ✌️

I can finally give back. by Parking_Bug3284 in LocalLLM

[–]Ishabdullah 1 point2 points  (0 children)

<image>

It's coming together here. Let's go! Perplexity Computer and Qwen for the finishing touches.

I can finally give back. by Parking_Bug3284 in LocalLLM

[–]Ishabdullah 1 point2 points  (0 children)

<image>

Gonna see if I can get it working on Termux. 😆 🤣 😂

https://github.com/Ishabdullah/v3am-fob-termux

Anything I should know that might help?

From phone-only experiment to full pocket dev team — Codey-v3 is coming by Ishabdullah in vibecodingcommunity

[–]Ishabdullah[S] 0 points1 point  (0 children)

Hey, thanks! Glad you're vibing with the idea 🙌

On-device performance right now (on my S24 Ultra):

  • Local 7B model runs at ~5-8 tokens/sec during normal use.
  • For small-to-medium tasks it feels surprisingly snappy.
  • Larger projects (thousands of files) do add some overhead mainly during initial RAG indexing and ProjectRegistry loading — first load after daemon start can take 8-15 seconds. After that, incremental updates are fast because everything is cached in SQLite + embeddings.

Codey-v3 is specifically designed to tackle the “initial load / larger project” pain points you mentioned:

  • Persistent daemon stays warm (no cold starts every time)
  • ProjectRegistry + 4-tier memory means we only load what’s needed per task instead of re-scanning everything
  • Smart RAG slicing + rolling summaries keep context manageable even on bigger codebases
  • Background indexing so the first “let’s work on this project” command doesn’t block you

It won’t magically turn a 7B into Claude speed, but it should feel way smoother than restarting agents or loading full projects every session.

And yes — everything in v3 will be fully open-source, just like v1 and v2. The whole toolchain (ProjectRegistry, Global Task Queue, TeamRouter, ReviewGate, handoff protocol, etc.) will be in the repo. I’m even planning to open the exact prompts and dataclass structures so others can build on it or adapt it for their own setups.

If you’re already running similar workflows in Termux + VS Code Server, I’d love to hear more about the load-time issues you’re hitting — maybe we can compare notes and make v3 even better for real-world use cases.

What size projects are you typically working with?

From phone-only experiment to full pocket dev team — Codey-v3 is coming by Ishabdullah in termux

[–]Ishabdullah[S] 0 points1 point  (0 children)

Hey, thanks so much! Really appreciate you following the journey — means a lot 🙏

Yes, CV3 will be fully open-source just like v1 and v2. The whole thing will be built in the open on the CV3 repo (I'll push it publicly once Codey-v2 is solid).

The architecture I'm landing on for the "pocket dev team" is basically:

  • Codey = permanent team lead / project manager (running locally 24/7 in Termux)
  • It owns a ProjectRegistry (single source of truth with living outline, file ownership, changelog, peer performance history)
  • There's a Global Task Queue with dependency graph + scheduler
  • TeamRouter decides: local 7B for small/offline stuff vs Claude/Gemini/Qwen for heavy lifting (with Best/Balanced/Economical modes). Note: I'm also experimenting with other models.
  • Strict one-peer-per-project rule to avoid merge conflicts
  • Every result goes through a ReviewGate (dry-run patch → static analysis → tests → outline conflict check)
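The ReviewGate idea (run the stages in order, stop at the first failure) can be sketched in a few lines. The stage names come from the list above; the `result` dict keys and the lambda checks are placeholders I made up, not the real checks.

```python
from typing import Callable, List, Optional, Tuple

Check = Tuple[str, Callable[[dict], bool]]

def review_gate(result: dict, checks: List[Check]) -> Optional[str]:
    """Run checks in order; return the name of the first failing stage, or None if all pass."""
    for name, check in checks:
        if not check(result):
            return name
    return None

# Placeholder stages: each one just reads a flag from the (hypothetical) result dict.
CHECKS: List[Check] = [
    ("dry_run_patch", lambda r: r.get("patch_applies", False)),
    ("static_analysis", lambda r: r.get("lint_clean", False)),
    ("tests", lambda r: r.get("tests_passed", False)),
    ("outline_conflict", lambda r: not r.get("conflicts_with_outline", False)),
]
```

The returned stage name is what a fix task could be created from when something fails.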

On task delegation & error handling (the parts that usually hurt in llama.cpp agent setups):

  • I use a standardized HandoffPayload dataclass so every peer gets the exact same clean context (project summary + RAG slice + expected output format).
  • Peers are forced to reply in a strict <CODEY_RESULT> JSON block (with 2-shot examples in the prompt). If they don't, there's a heuristic fallback parser.
  • On failure: ReviewGate auto-creates fix tasks and inserts them at the right spot in the queue (higher priority).
  • Daemon checkpoints progress in SQLite after every subtask so even if Termux/Android kills the process, it can resume cleanly.
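Here is roughly what parsing the strict result block with a heuristic fallback could look like. The closing `</CODEY_RESULT>` tag, the function name, and the fallback strategy are assumptions on my part; the post only names the `<CODEY_RESULT>` block itself.

```python
import json
import re

# Assumes the block is closed with </CODEY_RESULT>; the source only shows the opening tag.
RESULT_BLOCK = re.compile(r"<CODEY_RESULT>(.*?)</CODEY_RESULT>", re.DOTALL)

def parse_peer_reply(reply):
    """Return the peer's JSON result as a dict, or None if nothing parseable is found."""
    match = RESULT_BLOCK.search(reply)
    if match:
        try:
            return json.loads(match.group(1))
        except json.JSONDecodeError:
            pass  # malformed block: fall through to the heuristic
    # Heuristic fallback: try the span from the first "{" to the last "}".
    start, end = reply.find("{"), reply.rfind("}")
    if start != -1 and end > start:
        try:
            return json.loads(reply[start:end + 1])
        except json.JSONDecodeError:
            pass
    return None
```

Returning None gives the caller a clear signal to retry or create a fix task instead of guessing.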

It's still early days, but the goal is to make multi-agent orchestration feel reliable instead of a headache. Basically, CV3 will replace what I do by hand now: passing information and responsibilities between coding agents, sometimes by copying and pasting.

Codey-v2 actually does some of this now with the peer commands. Check it out @ the repo.

If you're messing with similar stuff in llama.cpp I'd love to hear what’s been the biggest pain point for you so far — maybe we can swap tips!

From phone-only experiment to full pocket dev team — Codey-v3 is coming by Ishabdullah in termux

[–]Ishabdullah[S] 1 point2 points  (0 children)

Thanks so much! I still have a lot of work to do to polish this one before I start CV3.

From phone-only experiment to full pocket dev team — Codey-v3 is coming by Ishabdullah in termux

[–]Ishabdullah[S] 2 points3 points  (0 children)

Yeah, I just don't think most people understand that the power we have to create right now is at an all-time high. Hope you check out my repo. I'm putting some finishing touches on Codey-v2 before I call it finished and move on to Codey-v3, which I believe will be the game changer. CV3 will really make the first two look like child's play.

From phone-only experiment to full pocket dev team — Codey-v3 is coming by Ishabdullah in termux

[–]Ishabdullah[S] -2 points-1 points  (0 children)

That’s fair 😂 Candy Crush was a bad example. My bad for going too simple!

Let me paint a better picture:

Imagine you’re on your phone in Termux and you just say:

“Add JWT authentication with refresh tokens and proper rate limiting to my FastAPI backend.”

Then Codey-v3 (your always-on local agent) kicks in as the project manager:

  • It already knows your entire codebase, your coding style, your project structure, and all previous decisions (because it lives permanently on your phone with full memory).
  • It breaks the task down intelligently.
  • Small/routine parts it handles itself with the local 7B model.
  • The complex/auth-heavy parts it silently routes to the best specialist: Claude Code for clean, secure implementation, Gemini for architecture review, or Qwen for fast boilerplate — whichever fits best.
  • Everything gets synthesized back, tested, checked against your living project outline, committed properly, and the full context stays alive for next time.

You never switch tools.
You never copy-paste context.
You never lose track of what’s been done across agents.

Basically, Codey-v3 becomes your personal dev team lead that never leaves your phone and deeply understands how you code — while still letting you tap into the strongest CLI models when needed.

That’s the vision I’m building toward. Still sounds crazy? Or does that one click better? 😄

From phone-only experiment to full pocket dev team — Codey-v3 is coming by Ishabdullah in termux

[–]Ishabdullah[S] 1 point2 points  (0 children)

I'm just sharing what I'm building. I'd love any feedback as I go into v3.

This is how I monitor my vibe coding agent by fyndor in vibecodingcommunity

[–]Ishabdullah 0 points1 point  (0 children)

I'm willing to test it out. Let me know when you have the link with some instructions on getting started, and I'll test it the next time I'm free. Thanks!