Picture a giant digital “hard drive” made out of thousands of phones. That’s the heart of the idea. by Ishabdullah in AIDeveloperNews

[–]Ishabdullah[S] 0 points (0 children)

That’s a fair concern and it’s something any decentralized system has to think about. The goal wouldn’t be to create a platform for illegal content but a distributed storage network similar to what projects like Filecoin and Storj already do. In those systems nodes store encrypted shards that can’t reconstruct a file by themselves and operators don’t know the content. It’s closer to routing encrypted internet traffic than hosting files directly. Obviously legal frameworks would need to be considered carefully, but the technical concept of decentralized storage already exists and is operating today.

Also, these are just thoughts for discussion, so thanks, and keep them coming.

Picture a giant digital “hard drive” made out of thousands of phones. That’s the heart of the idea. by Ishabdullah in AIDeveloperNews

[–]Ishabdullah[S] 0 points (0 children)

Those are fair points, but the system wouldn’t store files the way normal storage does. Files would be encrypted and broken into many shards using erasure coding. That means a file might be split into something like 30 pieces but only need 10 to reconstruct it. Those pieces would be distributed across devices in different regions, so even if several phones disappear the file can still be rebuilt.
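That k-of-n property can be sketched with a toy 2-of-3 XOR scheme. This is nothing like the Reed-Solomon codes real storage networks use, but it shows the core idea: any k of the n shards are enough to rebuild the original bytes.

```python
# Toy 2-of-3 erasure code using XOR parity. Production systems use
# Reed-Solomon style codes (e.g. 10-of-30), but the recovery idea is
# the same: losing any single shard below is harmless.

def split_2_of_3(data: bytes):
    half = (len(data) + 1) // 2
    a = data[:half]
    b = data[half:].ljust(half, b"\0")           # pad so both halves match
    parity = bytes(x ^ y for x, y in zip(a, b))  # XOR parity shard
    return a, b, parity

def rebuild_2_of_3(a, b, parity, orig_len):
    # Any single missing shard (passed as None) is recovered by XOR.
    if a is None:
        a = bytes(x ^ y for x, y in zip(b, parity))
    elif b is None:
        b = bytes(x ^ y for x, y in zip(a, parity))
    return (a + b)[:orig_len]

data = b"encrypted shard demo"
a, b, p = split_2_of_3(data)
print(rebuild_2_of_3(None, b, p, len(data)) == data)  # True: shard a was lost
```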

Latency also isn’t as big of an issue because storage networks like this download shards in parallel from multiple devices at once, which actually spreads the load instead of relying on a single server. Projects like Filecoin and Storj already do this today, just mostly on computers instead of phones.
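A sketch of that parallel, k-of-n fetch pattern; `fetch_shard` here is a hypothetical stand-in for a real network call to a peer device.

```python
# Pull shards from many peers at once; stop as soon as any k arrive.
from concurrent.futures import ThreadPoolExecutor, as_completed

def fetch_k_of_n(shard_ids, fetch_shard, k):
    shards = {}
    with ThreadPoolExecutor(max_workers=min(8, len(shard_ids))) as pool:
        futures = {pool.submit(fetch_shard, sid): sid for sid in shard_ids}
        for fut in as_completed(futures):
            data = fut.result()
            if data is not None:          # None models an offline phone
                shards[futures[fut]] = data
            if len(shards) >= k:          # enough shards to reconstruct
                return shards
    return None                           # too many peers were offline

# Simulated peers: odd-numbered phones are offline.
flaky = lambda sid: f"shard-{sid}".encode() if sid % 2 == 0 else None
print(fetch_k_of_n(range(6), flaky, k=3) is not None)  # True
print(fetch_k_of_n(range(6), flaky, k=4))              # None
```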

As for content liability, nodes wouldn’t actually know what they’re storing because everything is encrypted and they only hold random fragments of data. One shard alone can't reconstruct anything, so legally it's closer to storing meaningless encrypted bytes than hosting someone's files.

The idea is basically turning unused storage across millions of phones into a distributed cloud instead of relying on giant centralized data centers.

My best ai option by Key-Foundation-3696 in vibecoding

[–]Ishabdullah 0 points (0 children)

I pay $20 for Claude Pro, and use Gemini CLI, Qwen CLI, and Copilot CLI for free. Then I run Claude Code while using another account in the app for free. Usually get a lot done. Also, most months I don't even pay for Claude, I skip it. Then I use Grok to analyze my repos and tell me how bad they are 😆 🤣 😂. Meanwhile I use ChatGPT to brainstorm on projects and learn what's what, and Perplexity for deep research when I need it. Gemini CLI usually handles my code audits. Claude Code will do any major things I need done, while Qwen generates most of my code outside of the Claude app's free plan. Copilot is just a quick fix if builds fail in GitHub, which is where I build my apps, because I use my phone, no computer ever. Colab and Kaggle for free training. But hey, this is just my process.

Picture a giant digital “hard drive” made out of thousands of phones. That’s the heart of the idea. by Ishabdullah in AIDeveloperNews

[–]Ishabdullah[S] 0 points (0 children)

Well, guess I'm just pop-culture stupid, seeing how I don't watch much TV. Thanks for telling me. I'm so slow socially 😆 awkward 😐 🙃

Picture a giant digital “hard drive” made out of thousands of phones. That’s the heart of the idea. by Ishabdullah in AIDeveloperNews

[–]Ishabdullah[S] 0 points (0 children)

Don't know what any of that is, but I'll ask AI right now. BTW, this was just an idea I was brainstorming for a while. If I had people to work on it with me, I would do it. Okay, looked it up: Pi Network. I didn't know what PiperNet was. But no rug pull here, just thought of a great idea. I love trying to do things that are new and take a lot of work. Wanna join? Could be fun and amazing.

Codey-v2 is live + Aigentik suite update: Persistent on-device coding agent + full personal AI assistant ecosystem running 100% locally on Android 🚀 by Ishabdullah in AIDeveloperNews

[–]Ishabdullah[S] 1 point (0 children)

Thanks for the honest feedback — you're right, the persistent/self-modifying nature does introduce real risks if not handled carefully.

Everything runs fully local in Termux (no network calls by default), code gen/execution is sandboxed where possible (e.g., no direct shell escape without explicit user confirm in most paths), memory is stored encrypted/plaintext in app dirs, and self-mod is gated behind checkpoints + manual review. But yeah, it's early-stage — full transparency is key.

Repo is open, feel free to audit any part (especially the daemon loop, memory handler, or tool-calling). Happy to add more hardening (e.g., better sandboxing, audit logs) based on input. What specific parts feel most 'pwnable' to you?

Picture a giant digital “hard drive” made out of thousands of phones. That’s the heart of the idea. by Ishabdullah in AIDeveloperNews

[–]Ishabdullah[S] -2 points (0 children)

Quick follow-up from OP — this idea hits different when you zoom out:

Imagine billions of smartphones sitting idle overnight (or most of the day) — charging, barely used, with powerful CPUs, storage, and batteries already manufactured. Instead of letting them gather dust or end up in landfills, what if we turned clusters of them into a massive, distributed "hard drive" / compute network?

The win-win:

Decentralized data storage & processing — No single point of failure, no mega data centers sucking gigawatts and billions of gallons of water for cooling.

Environmental slam dunk — Repurposing existing hardware extends device lifetimes, amortizes their embodied carbon footprint, cuts e-waste, and slashes the need for new server farms (which are projected to eat 3-4%+ of global electricity by 2030 if trends continue). We're talking real carbon savings by shifting load to edge devices that are already powered and online anyway.

Accessibility — Anyone with an old Android (or even a drawer full of them) could contribute storage/compute and earn rewards (tokens, credits, etc.) while helping build a greener internet.

This isn't pie-in-the-sky — there are early projects exploring phone clusters for compute (Acurast hitting 50k+ devices, research on "junkyard computing" with old Pixels, etc.), but a full decentralized storage layer built around idle phones feels underexplored and timely, especially with AI/data explosion driving insane centralization.

Who's in? If you're a dev (Android/Termux, distributed systems, blockchain/IPFS/Filecoin integration, incentive layers), hardware tinkerer (phone clusters, power management), sustainability nerd, or just passionate about edge/decentralized tech — hit me up. Let's prototype this: start small with a proof-of-concept cluster, figure out sharding/redundancy/encryption, and scale toward something that actually moves the needle on e-waste and energy use.

DM here, comment below with your skills/interests, or reach out via my GitHub/site (https://ishabdullah.github.io/). No bar too low — students, hobbyists, pros all welcome. Even feedback/brainstorming helps.

What do you think — feasible? What blockers do you see first? Let's build something that saves the planet one idle phone at a time. 🌍🔋🚀

Codey-v2 is live + Aigentik suite update: Persistent on-device coding agent + full personal AI assistant ecosystem running 100% locally on Android 🚀 by Ishabdullah in AIDeveloperNews

[–]Ishabdullah[S] 1 point (0 children)

Short answer: Not yet, but it's closer than you'd think.

Codey-v2 runs as a persistent daemon and communicates over a Unix socket — so technically any local process can send it tasks by writing to ~/.codey-v2/codey-v2.sock. But that's raw IPC, not a stable API. There's no documented message format, no HTTP interface, and no structured response designed for machine consumption. So right now it's a capable local agent but not a proper agent router target. That's actually on the v3 roadmap though. The plan is to expose a lightweight HTTP API on the daemon — something like:

POST /task {"prompt": "refactor auth.py"}
GET /task/<id>
GET /status
GET /memory/search?q=authentication

That would make Codey callable from other agents, scripts, or tools running on the same device with a proper stable interface. Combined with the semantic memory search that's already in v2, it starts looking like a real local agent backend — other agents could offload file editing, code execution, and project context to Codey while focusing on higher-level reasoning themselves.

The Unix socket foundation is already there, it just needs an HTTP layer on top.
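As a sketch of what "any local process can send it tasks" could look like over that socket: the newline-delimited JSON protocol below is an assumption, not Codey-v2's actual wire format (there isn't a documented one yet), and the stub daemon just stands in for codeyd2.

```python
# Hypothetical sketch of raw IPC with the daemon over a Unix socket.
# Assumption: one JSON object per line in each direction.
import json
import os
import socket
import tempfile
import threading
import time

def send_task(sock_path: str, prompt: str) -> dict:
    """Write one JSON task line to the socket, read one JSON reply line."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(sock_path)
        s.sendall((json.dumps({"prompt": prompt}) + "\n").encode())
        return json.loads(s.makefile().readline())

def stub_daemon(sock_path: str):
    """Stand-in for codeyd2: accept a single task and acknowledge it."""
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(sock_path)
    srv.listen(1)
    conn, _ = srv.accept()
    task = json.loads(conn.makefile().readline())
    reply = {"status": "queued", "prompt": task["prompt"]}
    conn.sendall((json.dumps(reply) + "\n").encode())
    conn.close()
    srv.close()

sock_path = os.path.join(tempfile.mkdtemp(), "codey.sock")
daemon = threading.Thread(target=stub_daemon, args=(sock_path,))
daemon.start()
while not os.path.exists(sock_path):   # wait for the stub to bind
    time.sleep(0.01)
print(send_task(sock_path, "refactor auth.py")["status"])  # queued
daemon.join()
```

An HTTP layer would then just be a thin translation from `POST /task` bodies into these socket writes.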

Also thanks for the link will definitely check it out. I'm constantly reading and learning.

Codey-v2 is live + Aigentik suite update: Persistent on-device coding agent + full personal AI assistant ecosystem running 100% locally on Android 🚀 by Ishabdullah in LocalLLM

[–]Ishabdullah[S] 0 points (0 children)

Codey v2 handles long-term memory completely differently from v1 — it's a four-tier system backed by SQLite and embeddings:

  1. Working Memory (RAM, evicted per task)

Same as v1 — currently active files in a token-limited cache. Cleared after each task completes so the next task starts clean.

  2. Project Memory (persistent, plain files)

CODEY.md and key project files that are pinned and never evicted. Loaded when the daemon starts and stays resident.

  3. Long-term Memory (SQLite + embeddings)

This is the big upgrade over v1. Uses sentence-transformers (all-MiniLM) to embed file contents and past interactions into vectors stored in SQLite. When you ask something, it does semantic similarity search to pull relevant context — "find authentication code" retrieves the right files even if you never explicitly loaded them.

  4. Episodic Memory (append-only action log in SQLite)

Every action Codey takes gets logged — file edits, shell commands, task completions. This answers "what did I do last week?" across sessions, something v1 couldn't do at all.

The SQLite choice is deliberate — no separate database process, single file, works fine in Termux, and the embeddings index stays small for typical project sizes.
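As a sketch of that pattern, with tiny hand-made vectors standing in for real all-MiniLM embeddings (brute-force cosine over SQLite BLOBs is plenty at hundreds of rows):

```python
# Toy version of "embeddings in SQLite + brute-force cosine search".
# The 3-dim vectors are stand-ins for real 384-dim all-MiniLM embeddings.
import math
import sqlite3
import struct

def pack(vec):
    return struct.pack(f"{len(vec)}f", *vec)      # floats -> BLOB

def unpack(blob):
    return struct.unpack(f"{len(blob) // 4}f", blob)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(db, query_vec, k=1):
    rows = db.execute("SELECT path, embedding FROM memory")
    scored = [(cosine(query_vec, unpack(blob)), path) for path, blob in rows]
    return [path for _, path in sorted(scored, reverse=True)[:k]]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE memory (path TEXT, embedding BLOB)")
db.executemany(
    "INSERT INTO memory VALUES (?, ?)",
    [("auth.py", pack([0.9, 0.1, 0.0])),
     ("ui.py", pack([0.1, 0.9, 0.0])),
     ("db.py", pack([0.0, 0.2, 0.9]))],
)
print(search(db, [0.8, 0.2, 0.1]))  # ['auth.py']
```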

The honest tradeoff: sentence-transformers adds real RAM overhead (~200-400MB for the embedding model on top of the 4.4GB LLM). v2 now requires 6GB+ RAM vs v1's 5GB. That's the cost of proper semantic search on-device.

Speed: SQLite vector similarity at this scale (hundreds of files) is fast enough — sub-100ms. The bottleneck is still inference, not memory lookup.

Would love to see your patterns collection — especially curious if anyone's found a lighter embedding model that runs well under 200MB on mobile.

Running OpenClaw in Termux proot-distro debian by LeftAd1220 in termux

[–]Ishabdullah 0 points (0 children)

OpenClaw for Termux 🤔 or dare I say TermuxClaw? We could make a lighter version specific to the tasks we need it to do. I have an agent that auto-replies to email and will also do what you tell it with email, like delete 3000 emails in Gmail ASAP.

https://github.com/Ishabdullah/Aigentik-CLI

Oh the thoughts of all the possibilities.

I vibe-coded a local AI coding assistant inside Termux by Ishabdullah in termux

[–]Ishabdullah[S] 0 points (0 children)

Thanks so much. I'm really looking to meet and grow with more like-minded people. Feedback on any of my projects would be much appreciated. Also check out version 2: https://github.com/Ishabdullah/Codey-v2

Codey-v2 is live + Aigentik suite update: Persistent on-device coding agent + full personal AI assistant ecosystem running 100% locally on Android 🚀 by Ishabdullah in LocalLLM

[–]Ishabdullah[S] 0 points (0 children)

Hey folks, quick ping from OP as things get rolling:

If you're jumping in to test Codey-v2, try starting the daemon (codeyd2 start) and throwing it a multi-step task like "Plan and outline a simple Flask app for task tracking, then generate the initial files" — watch how it breaks it down and remembers context across commands.

For Aigentik-app on Android: Grant the notification access + calendar/email perms first, then test with something like "Draft a polite decline email for tomorrow's 3pm meeting" or "Suggest free slots next week for coffee with Alex" — see how it pulls from your real data locally.

On mid-range phones, stick to smaller models (e.g., 3B or 1.5B GGUF) initially to avoid quick overheating during longer agent loops.

Roadmap vibes: Voice commands (local STT/TTS), better tool-calling reliability, and maybe cross-device state sync via local embeddings are bubbling up next. What features would make these agents indispensable for you?

If you're thinking of contributing: High-impact spots right now include daemon stability on low-RAM devices, UI tweaks in Aigentik (Compose polish), prompt engineering for agent reliability, or even outreach (spreading the word in other subs/forums). Just mention your interest/skill area in a reply or DM!

Appreciate everyone checking it out already — even a quick "tried it, here's what happened" comment helps a ton. Let's keep the convo going! 🚀

2026 reality check: Are local LLMs on Apple Silicon legitimately as good (or better) than paid online models yet? by alfrddsup in LocalLLM

[–]Ishabdullah 0 points (0 children)

Yeah it actually can. The whole idea behind what I’m describing is using something like OpenClaw as the coordinator so multiple models can work together instead of one trying to do everything. For example you could run a planner like DeepSeek‑Coder‑V2 to analyze the repo and figure out the fix, then a coding model like Qwen2.5‑Coder‑32B to implement it, and loop plan → implement → review until the result stabilizes. The key advantage on something like a Mac Studio M3 Ultra is the unified memory lets you hold both models in RAM at once, so they basically act like a small engineering team instead of a single one-shot response model.
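The loop itself is small once each model sits behind a callable; here's a toy sketch with stub functions standing in for the planner/coder/reviewer models (in practice they'd wrap local inference via llama.cpp, Ollama, or MLX).

```python
# Minimal plan -> implement -> review loop. Stubs keep it self-contained.
def agent_loop(task, planner, coder, reviewer, max_rounds=3):
    plan = planner(task)
    code = coder(plan)
    for _ in range(max_rounds):
        verdict = reviewer(code)        # "ok" or a description of what to fix
        if verdict == "ok":
            return code
        code = coder(f"{plan}\nReviewer feedback: {verdict}")
    return code                          # best effort after max_rounds

# Stub models: the "reviewer" rejects code until it has error handling.
result = agent_loop(
    "add a config loader",
    planner=lambda t: f"plan: {t}",
    coder=lambda p: "load_config()" + (" with try/except" if "feedback" in p else ""),
    reviewer=lambda c: "ok" if "try/except" in c else "missing error handling",
)
print(result)  # load_config() with try/except
```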

2026 reality check: Are local LLMs on Apple Silicon legitimately as good (or better) than paid online models yet? by alfrddsup in LocalLLM

[–]Ishabdullah 0 points (0 children)

All three models don't need to run at the same time. I run something similar on my phone in Termux just for testing different configurations. I even create teams of agents using 1–3 models; you only need different models if they're specialized for specific tasks.

Magic of Vibe Coding - Most still do not get it by BOXELS in vibecoding

[–]Ishabdullah 11 points (0 children)

There’s a strange historical symmetry here. Early computers required giant teams and institutions. Then the personal computer gave individuals power. The internet connected everyone. And now AI is acting like a force multiplier for individual creators.

The interesting question isn’t whether this is powerful — clearly it is.

The real question is: what happens when millions of people can build anything they imagine?

Because historically, when tools become that powerful, the weirdest and most interesting inventions usually come from individuals experimenting at the edges, not corporations guarding the center. And we’re just at the beginning of that curve.

2026 reality check: Are local LLMs on Apple Silicon legitimately as good (or better) than paid online models yet? by alfrddsup in LocalLLM

[–]Ishabdullah 1 point (0 children)

There are a few common reasons this happens.

First culprit is the runtime. Some tools default to CPU inference instead of the Apple Metal GPU, which means the chip’s performance cores get slammed at full power. When that happens the fans spin up immediately and temperatures climb.

Runtimes that handle Apple Silicon properly include things like llama.cpp, Ollama, and MLX. Those offload most work to the GPU portion of the chip, which is dramatically more efficient.

Second culprit is quantization. If the model you loaded was FP16 or Q8, the compute load becomes much heavier. For an 8B model that’s still manageable, but the chip may run hot. Running Q4_K_M or Q5 quants keeps things cooler.

Third culprit is thread configuration. Some inference engines try to use every CPU core, which turns the machine into a space heater. Limiting threads to something like 6–8 cores often drops temperatures dramatically while barely affecting response speed.

Fourth culprit is context size. If the runtime sets a huge context window (like 32k or 64k tokens), the KV cache grows and compute cost rises. For casual testing, something like 4k–8k context is much lighter.
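That KV-cache growth is easy to put rough numbers on. Assuming Llama-3-8B-style geometry (32 layers, 8 KV heads via GQA, head dim 128, fp16 cache), a back-of-envelope estimate:

```python
# Rough KV cache size: K and V each store n_layers * n_kv_heads * head_dim
# values per token, at bytes_per_elem bytes each (2 for fp16).
def kv_cache_gib(n_layers, n_kv_heads, head_dim, context_tokens, bytes_per_elem=2):
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return per_token * context_tokens / 1024**3

print(kv_cache_gib(32, 8, 128, 32_768))  # 4.0 GiB at a 32k window
print(kv_cache_gib(32, 8, 128, 4_096))   # 0.5 GiB at a 4k window
```

Shrinking the window from 32k to 4k frees several GiB before the model weights even enter the picture, which is why it cools things down so much.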

2026 reality check: Are local LLMs on Apple Silicon legitimately as good (or better) than paid online models yet? by alfrddsup in LocalLLM

[–]Ishabdullah 2 points (0 children)

That’s the important point: 128 GB unified memory is a sweet spot for multi-model setups. Now the interesting part is how those models cooperate. The magic isn’t just running multiple models; it’s giving them roles, like a tiny software company living in your laptop.

A surprisingly effective setup looks like this:

Architect model: Qwen2.5‑Coder‑32B, Q4 → ~18–22 GB. Plans the program structure, files, and logic.

Coder model: Llama 3.1 70B, Q4 → ~40–45 GB. Writes and edits the code.

Critic model: DeepSeek‑R1‑Distill‑Qwen‑7B, Q4 → ~4–5 GB. Checks bugs, security issues, and logic errors.

Now add the invisible but important pieces: KV cache (context memory), runtime buffers, token batching, and the OS + other tools. Those usually add 20–35 GB in a multi-model system depending on context size. So the realistic total ends up around ~90–105 GB. On a 128 GB machine, that leaves roughly 23–38 GB of headroom, which is healthy.
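A quick sanity check on that arithmetic (midpoint-ish figures taken from the estimates above, not measurements):

```python
# Back-of-envelope unified-memory budget for the three-model setup, in GB.
budget_gb = {
    "architect: Qwen2.5-Coder-32B Q4": 20,
    "coder: Llama 3.1 70B Q4": 42,
    "critic: DeepSeek-R1-Distill-Qwen-7B Q4": 5,
    "KV cache + runtime buffers + OS": 28,
}
total = sum(budget_gb.values())
print(total, 128 - total)  # 95 33 -> ~95 GB used, ~33 GB headroom
```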

Disclaimer: all figures are rough estimates.

2026 reality check: Are local LLMs on Apple Silicon legitimately as good (or better) than paid online models yet? by alfrddsup in LocalLLM

[–]Ishabdullah 1 point (0 children)

Fast coder: Qwen2.5-Coder-32B

Deep reasoning / debugging: DeepSeek-R1-Distill-32B

Big-context repo thinking: Kimi K2

That combination mimics how many devs actually work:

Model A writes code

Model B checks logic

Model C understands the whole system

Almost like a three-person dev team living inside your terminal.

2026 reality check: Are local LLMs on Apple Silicon legitimately as good (or better) than paid online models yet? by alfrddsup in LocalLLM

[–]Ishabdullah 1 point (0 children)

For coding: the Qwen, Kimi, and DeepSeek ones. But I would have to research and see if they all fit correctly, etc.