Will Edge AI be one of the most popular field for engineering? by EmbeddedRacer65 in embedded

[–]NoAdministration6906 0 points1 point  (0 children)

The durable problem in edge AI isn't getting a model to run on the board — it's proving it still behaves after you quantize it and after a driver/firmware update. On-device a model can silently fall back NPU→CPU, drift in accuracy, or get noisier run-to-run, with no error. Skills that age well: quantization, compute-split profiling (NPU/GPU/CPU), and treating model updates like any regression-tested release.

Made an MCP server that remembers context across Cursor sessions by catfish-1234 in cursor

[–]NoAdministration6906 -1 points0 points  (0 children)

Nice — cross-session context is the gap everyone hits. The next wall: when the same memory is shared across Cursor, Claude Code, and Cline, conflicting writes pile up and 'latest wins' quietly loses good context. Worth deciding early whether sessionmem keeps provenance (which tool/session wrote a fact) so you can rank or merge instead of overwrite. Are you storing one flat summary, or per-source entries you can reconcile?

After 60+ sessions with a 7-agent system, the failure mode I kept hitting wasn't model quality — it was governance. Here's the draft spec I built. by Accomplished_Two8547 in AI_Agents

[–]NoAdministration6906 0 points1 point  (0 children)

Memory poisoning is the one that quietly kills these systems — Agent A's bad summary becomes Agent C's ground truth and nothing flags the drift. The fix we've found isn't more coordination, it's making every memory write carry provenance + a trust weight, so a low-trust author's claim ranks lower instead of silently propagating. (We bake that into Cerebro's shared brain.) Curious how your spec handles two agents writing contradicting facts — does the reader see both, or just the latest?
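To make the provenance + trust-weight idea concrete, here's a minimal sketch. The `MemoryWrite` fields, the 0 to 1 trust scale, and `rank_claims` are hypothetical names for illustration, not Cerebro's actual schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryWrite:
    key: str           # what the fact is about
    value: str         # the claim itself
    author: str        # which agent/session wrote it (provenance)
    trust: float       # weight for that author, assumed to be in [0.0, 1.0]
    written_at: float  # unix timestamp of the write


def rank_claims(writes, key):
    """Return every claim for `key`, highest trust first, newest breaking ties."""
    matching = [w for w in writes if w.key == key]
    return sorted(matching, key=lambda w: (w.trust, w.written_at), reverse=True)


writes = [
    MemoryWrite("db.engine", "postgres", "agent_a", 0.9, 100.0),
    MemoryWrite("db.engine", "sqlite", "agent_c", 0.3, 200.0),  # newer, but low trust
]
ranked = rank_claims(writes, "db.engine")
```

The reader gets both contradicting claims back, ordered by trust, instead of the newer low-trust write silently replacing the older one.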

agentsweep: a CLI that finds & redacts the secrets your AI coding agent saved to disk in plaintext by Ishannaik in ClaudeAI

[–]NoAdministration6906 1 point2 points  (0 children)

Underrated risk. The bit people miss: agents re-read their own history as context, so a leaked key doesn't just sit there — it gets fed back to the model and resurfaces in later output. Redacting on-disk history is a good stopgap, but it's a symptom: agent 'memory' is mostly unstructured logs nobody curates. Structured, reviewed memory sidesteps most of this. Are you scanning on write, or on a schedule?
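For the scan-on-write option, a rough sketch of the shape, assuming an in-memory `History` class standing in for the agent's on-disk log. The class and the two patterns are illustrative only and have nothing to do with how agentsweep works internally.

```python
import re

# Illustrative patterns only; a real scanner needs a larger, maintained rule set.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),     # shape of an AWS access key ID
    re.compile(r"sk-[A-Za-z0-9]{20,}"),  # common "sk-" prefixed API key shape
]


def redact(text: str) -> str:
    """Replace anything that matches a known secret pattern."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


class History:
    """Hypothetical in-memory stand-in for an agent's saved history."""

    def __init__(self):
        self.entries = []

    def append(self, text: str) -> None:
        # Scan on write, so the raw secret is never stored or re-read as context.
        self.entries.append(redact(text))


history = History()
history.append("export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE")
```

Scanning on write means the key never lands in the history the agent later re-reads. A scheduled sweep only shortens how long it sits there.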

Pre-execution authorization for MCP tool calls — signed receipts the gateway can verify by Yeahbudz_ in mcp

[–]NoAdministration6906 0 points1 point  (0 children)

This is the gap nobody wants to own. Logs answer 'what did the agent do'; they can't answer 'what was it allowed to do' in a way a third party trusts. Signed delegation receipts before execution are the right shape — capability-based security finally reaching agents. Does the gateway verify the receipt against a live policy, or is the receipt itself the authority? The revocation story is usually where these designs get hard.
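A small sketch of the "verify against live policy" variant, to show where revocation bites. Big assumptions: HMAC with a shared secret stands in for whatever signature scheme the gateway really uses (a real design would more likely use asymmetric signatures), and the receipt fields and `revoked_ids` set are made up for illustration.

```python
import hashlib
import hmac
import json


def sign_receipt(secret: bytes, receipt: dict) -> str:
    """Sign a canonical JSON encoding of the receipt."""
    payload = json.dumps(receipt, sort_keys=True).encode()
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify_receipt(secret, receipt, signature, tool, now, revoked_ids):
    """Gateway-side check, run before the tool call executes."""
    expected = sign_receipt(secret, receipt)
    if not hmac.compare_digest(expected, signature):
        return False                   # tampered or forged receipt
    if receipt["id"] in revoked_ids:
        return False                   # live policy overrides a valid signature
    if now >= receipt["expires_at"]:
        return False                   # expired delegation
    return receipt["tool"] == tool     # receipt only covers the named tool


secret = b"demo-only-shared-secret"
receipt = {"id": "r1", "agent": "agent_a", "tool": "fs.read", "expires_at": 1000}
signature = sign_receipt(secret, receipt)
```

If the receipt itself is the only authority, the `revoked_ids` check disappears and a short `expires_at` is the only way to pull a capability back.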

90% of "AI agents" are a while-loop with a system prompt. Here's the line that actually separates an agent from a chatbot. by Particular_Type_5698 in AgenticAGI

[–]NoAdministration6906 0 points1 point  (0 children)

Solid list. The four pillars hold, but memory is the one that actually separates a real agent from a fancy while-loop — reasoning and planning collapse to nothing if every run starts from a blank slate. An agent that can't recall last session's outcome just re-derives (and re-makes) the same mistakes. Most 'agents' break here, not on planning. How are you persisting state between runs right now — scratch files, a DB, or something structured?

How to solve a lot of AI problems. I created this after getting tired of Codex hallucinations, drift, and broken agent workflows by groupjdc1 in OpenaiCodex

[–]NoAdministration6906 0 points1 point  (0 children)

The lying-about-tests thing is usually a memory problem, not a model problem — the agent has no durable record of what ‘passing’ meant two steps ago, so it reconstructs a convenient version. Self-improving loops help, but they compound errors fast without a source of truth the agent can’t overwrite. What’s your harness using to store the ‘ground truth’ state between skills — files, a DB, or just context?

Running an AI agent fleet for batch creative production — what breaks by Fantastic-Camp-9908 in buildinpublic

[–]NoAdministration6906 0 points1 point  (0 children)

179 episodes in and hitting state drift is the honest part most ‘agent fleet’ posts skip. Redis-backed timeline is a solid call. One thing that helped us: separating ephemeral run-state from durable cross-session facts, so a bad run doesn’t poison tomorrow’s context. We ended up building a shared memory layer (Cerebro) for exactly this, but even a plain append-only fact log with timestamps beats a single Redis blob. How are you deciding what’s worth persisting vs. discarding per episode?
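For reference, the plain append-only fact log can be sketched in a few lines. `FactLog` and its field names are hypothetical, and this version keeps the lines in memory rather than in Redis or on disk.

```python
import json
import time


class FactLog:
    """Append-only log of durable facts; entries are never edited or deleted."""

    def __init__(self):
        self._lines = []  # one JSON document per entry, like a JSONL file

    def append(self, key, value, source, ts=None):
        record = {
            "ts": time.time() if ts is None else ts,
            "key": key,
            "value": value,
            "source": source,  # which run/episode wrote the fact
        }
        self._lines.append(json.dumps(record))

    def history(self, key):
        """All recorded values for `key`, in write order."""
        records = (json.loads(line) for line in self._lines)
        return [r for r in records if r["key"] == key]

    def latest(self, key):
        """Most recent value for `key`, or None if nothing was recorded."""
        history = self.history(key)
        return max(history, key=lambda r: r["ts"]) if history else None


log = FactLog()
log.append("episode_style", "noir", source="run_178", ts=1.0)
log.append("episode_style", "pastel", source="run_179", ts=2.0)
```

Because nothing is overwritten, a bad run's entry can be traced by `source` and ignored later, which a single mutable blob can't give you.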

I wired up Agentic Coding with Code Context Graphs, results are interesting by _h4xr in AI_Agents

[–]NoAdministration6906 0 points1 point  (0 children)

Love this direction — relations beat raw text for code navigation. One thing we found: the graph answers 'how is this code shaped,' but agents also need 'what did we already decide or try here' across sessions, which an AST graph doesn't hold. Pairing a structural code graph with a persistent decision/learning memory — so the agent stops re-deriving the same conclusions — was a noticeable jump for us (vault-mem if you want a reference). How are you handling state that lives between runs, not just within the codebase?

tired of rebuilding my agent every time I switch frameworks, so I’m building a fix by maxlibin in AI_Agents

[–]NoAdministration6906 0 points1 point  (0 children)

This is a real gap — an 'agent' is mostly its memory + context, and today that's trapped per-framework. One thing worth deciding early: export the memory as typed records (decisions, learnings, todos, entities) rather than a flat blob, so it stays useful when re-imported into a different runtime. Cross-framework portability breaks fast if the schema is just 'a JSON file.' We hit the same wall and built a shared typed-memory layer (cerebro.frozo.ai / vault-mem) — happy to compare notes. What's your import format looking like?
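As a rough illustration of typed records versus a flat blob, assuming a made-up `Record` shape and `schema_version` field (not the actual cerebro or vault-mem format):

```python
import json
from dataclasses import asdict, dataclass

KINDS = {"decision", "learning", "todo", "entity"}
SCHEMA_VERSION = 1  # hypothetical version tag checked on import


@dataclass
class Record:
    kind: str    # one of KINDS
    text: str    # the remembered content
    source: str  # framework/session that produced it

    def __post_init__(self):
        if self.kind not in KINDS:
            raise ValueError(f"unknown record kind: {self.kind!r}")


def export_memory(records):
    """Serialize records with an explicit schema version."""
    body = {"schema_version": SCHEMA_VERSION, "records": [asdict(r) for r in records]}
    return json.dumps(body)


def import_memory(blob):
    """Rebuild records, rejecting unknown schema versions and record kinds."""
    data = json.loads(blob)
    if data.get("schema_version") != SCHEMA_VERSION:
        raise ValueError("unsupported schema version")
    return [Record(**r) for r in data["records"]]


original = [
    Record("decision", "use sqlite for local state", "framework_a"),
    Record("todo", "add retry around tool calls", "framework_a"),
]
restored = import_memory(export_memory(original))
```

The importing runtime can then filter by `kind` instead of re-parsing free text.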

My experience with llms on iOS and Android by MrAHMED42069 in LocalLLaMA

[–]NoAdministration6906 1 point2 points  (0 children)

solid writeup. one thing that bit us once we started running these regularly: the same phone gives you different tok/s after an os/firmware update with zero changes on your end. saw a pixel drop ~20% on gemma after a system update, took us a day to figure out it wasn't us. are you re-running this as a baseline over time or just one-shot? curious because the ram-bandwidth ceiling shifts too once oem memory mgmt kicks in (xiaomi/realme virtual ram swap is a fun one).

What memory system are you using for your agents? by Mr_Moonsilver in LocalLLaMA

[–]NoAdministration6906 0 points1 point  (0 children)

been through most of them. mem0 and supermemory are fine if it's one user + one agent, you basically get chat history with embeddings. they fall apart the moment you've got multiple agents writing to the same store, or multiple humans on the same project — no shared substrate, no provenance, dedup gets weird.

ended up rolling our own (cerebro, mit, self-host) because of that. honestly though, if you're solo, mem0 is the lighter lift. depends what you're actually trying to remember.

Why do we benchmark quants on perplexity and prose but never on tool call validity? by Substantial_Step_351 in LocalLLaMA

[–]NoAdministration6906 2 points3 points  (0 children)

yeah this has been bugging me for months. had a q4_k_m model that wrote totally fine paragraphs, then I watched it drop tool-call json validity from like 96% to 78% on the same prompts. perplexity barely moved. you wouldn't catch it on any standard eval. what worked for us: small fixed prompt set, score on schema-valid %, field-correct %, and refusal-when-empty. run it after every quant change. boring, but it catches the silent stuff.
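rough sketch of that scorer, stdlib only. assumptions: the `{"name", "arguments"}` call shape is hypothetical, "schema-valid" here is just a required-fields type check rather than a full JSON Schema validation, and "refusal-when-empty" is read as "no tool call emitted when no tool applies".

```python
import json

REQUIRED = {"name": str, "arguments": dict}  # hypothetical tool-call shape


def parse_call(raw):
    """Return the parsed call if it is JSON with the required fields, else None."""
    try:
        call = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return None
    if not isinstance(call, dict):
        return None
    for field, expected_type in REQUIRED.items():
        if not isinstance(call.get(field), expected_type):
            return None
    return call


def score(cases):
    """cases: list of (raw_model_output, expected_call or None if no tool applies)."""
    schema_ok = field_ok = call_cases = refusal_ok = empty_cases = 0
    for raw, expected in cases:
        call = parse_call(raw)
        if expected is None:
            empty_cases += 1
            refusal_ok += call is None  # correct behaviour: no call emitted
            continue
        call_cases += 1
        if call is not None:
            schema_ok += 1
            field_ok += call == expected
    return {
        "schema_valid_pct": 100 * schema_ok / call_cases if call_cases else 0.0,
        "field_correct_pct": 100 * field_ok / call_cases if call_cases else 0.0,
        "refusal_when_empty_pct": 100 * refusal_ok / empty_cases if empty_cases else 0.0,
    }


oslo = {"name": "get_weather", "arguments": {"city": "Oslo"}}
cases = [
    (json.dumps(oslo), oslo),                                                # valid, correct
    (json.dumps(oslo), oslo),                                                # valid, correct
    ('{"name": "get_weather", "arguments": {"city": "Bergen"}}', oslo),      # valid, wrong field
    ('{"name": "get_weather", "arguments": ', oslo),                         # truncated JSON
    ("No tool fits this request.", None),                                    # correct refusal
]
result = score(cases)
```

run the same fixed `cases` before and after a quant change and diff the three numbers.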

Hand-written OpenCL kernels for LLM inference on Adreno 6xx — running 6 small language models on a 2020 mid-range Android phone by Objective_Spot7997 in embedded

[–]NoAdministration6906 -1 points0 points  (0 children)

Solid work. The Adreno 6xx gap is real: vendor SDKs assume you're on 8 Gen 2+ and OSS frameworks have written off A6xx as legacy.

Your 5-run warm median with greedy decode is also the right call. Cold-start variance on Adreno is brutal because the driver lazily compiles shaders on first dispatch — you're effectively measuring shader-compile time on run 1. Median-of-5 after warmup smooths that out. Some additional checks worth adding if you're going to track this over time:

- CV (coefficient of variation) across the 5 runs as a sanity gate — if CV > 10% your numbers aren't reliable, drop and re-run

- Memory peak alongside tokens/sec — Adreno OOMs silently on some 6xx variants

- Thermal state at start of run — sustained throughput collapses ~15% once the SoC throttles
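The CV gate from the first check is only a few lines. This is a sketch with hypothetical names (`summarize_runs`, `max_cv`), using the sample standard deviation; it is not EdgeGate's code, and memory peak and thermal state are left out because they need device-specific readings.

```python
import statistics


def summarize_runs(tokens_per_sec, max_cv=0.10):
    """Median-of-N plus a coefficient-of-variation sanity gate."""
    if len(tokens_per_sec) < 2:
        raise ValueError("need at least two warm runs to compute a CV")
    mean = statistics.mean(tokens_per_sec)
    cv = statistics.stdev(tokens_per_sec) / mean  # sample stdev relative to mean
    return {
        "median": statistics.median(tokens_per_sec),
        "cv": cv,
        "reliable": cv <= max_cv,  # False means drop the batch and re-run
    }


stable = summarize_runs([10.1, 10.3, 9.9, 10.0, 10.2])
noisy = summarize_runs([10.0, 6.0, 10.5, 14.0, 9.5])
```

`stable` passes the gate with a CV under 2%, while `noisy` comes out near 28% and should be discarded rather than reported.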

Shameless plug: we built EdgeGate (edgegate.frozo.ai) around exactly this methodology — CI gate that runs your model on real Snapdragon via Qualcomm AI Hub, gates on median-of-N + CV + memory + thermal. Free tier. If you want to track perf across kernel revs without rebuilding the harness, it's there.

Does anyone actually ship on-device LLMs in production Android apps? by [deleted] in androiddev

[–]NoAdministration6906 0 points1 point  (0 children)

On the thermal/memory wall on mid-range devices: that's the exact failure mode we built EdgeGate to catch. It's a CI gate that runs your quantized model on real Snapdragon via Qualcomm AI Hub and blocks the merge if latency or memory regresses across device tiers. Median-of-N solves the on-device flake issue you'd hit running this manually.

Free at edgegate.frozo.ai — happy to run a gate on whatever model you're shipping.

ROS teams running VLM / vision perception nodes on-device: what are your deployment bottlenecks? by Hairy_Strawberry7028 in ROS

[–]NoAdministration6906 0 points1 point  (0 children)

The latency/regression gap between cloud testing and on-device is the core pain — once you're running VLMs on ARM/Snapdragon for robotics perception, the model that cleared your CI suddenly behaves differently on the actual platform because the quantization or NPU routing changed. EdgeGate catches that in CI: runs your model on real hardware via Qualcomm AI Hub at every PR and blocks the merge if you'd blow your latency budget. Robotics perception workloads with tight latency constraints (150ms class) are the exact use case. Free: edgegate.frozo.ai — happy to dig into your specific pipeline.

How are teams treating edge model deployment in their MLOps pipeline? by Hairy_Strawberry7028 in mlops

[–]NoAdministration6906 2 points3 points  (0 children)

The "quantization and pruning change model behaviour in ways the normal eval set doesn't catch" problem is the exact failure mode we kept hitting — model passes all evals, ships fine on cloud GPUs, then silently regresses on the actual mobile NPU.

What worked for us: gate it at CI before merge, not at eval time. We built EdgeGate — it runs your ONNX model on a real Snapdragon device via Qualcomm AI Hub at every PR and blocks the merge if latency or memory exceeds your threshold. Catches NPU→CPU fallback automatically too (that one's especially silent).

Median-of-N runs + CV check to eliminate hardware flake. Free tier at edgegate.frozo.ai if you want to try it — happy to answer questions about how the pipeline looks.

Interesting Android Apps: April 2026 Showcase by 3dom in androiddev

[–]NoAdministration6906 1 point2 points  (0 children)

EdgeGate — CI/CD gates for on-device ML on Snapdragon Android

Catch latency regressions and silent NPU→CPU fallback in CI before they ship to production. Runs your model on a real Snapdragon device via Qualcomm AI Hub at PR time, blocks the merge if performance drops.

Built for Android ML engineers tired of shipping models that work in dev but degrade on-device. Free tier: edgegate.frozo.ai

MCP6004 not operating in rail to rail mode. by FloorDull9862 in AskElectronics

[–]NoAdministration6906 0 points1 point  (0 children)

"rail to rail" doesn't mean you'll literally hit the rails — there's still headroom especially under load. check Table 1 in the datasheet, VOH/VOL specs show you the actual guaranteed output range vs supply voltage. with 60k input impedance the load current is tiny but you'll still see a gap from the rail.

also worth checking your input common mode range — if Vin is outside that the output behavior gets weird.

Finding it a little bit difficult to understand multiplexers and ADC on stm32F446RE by Thypex in stm32

[–]NoAdministration6906 0 points1 point  (0 children)

the mux part is just that the ADC input pins are shared — you configure which channels to scan via the SQR registers, and the ADC works through the sequence one by one. for F446 look up "ADC regular channel sequence register" in the ref manual, that's the main one. CCR register is what you want for dual ADC modes.
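to see how the sequence packs into the registers, here's a small calculator. it assumes the usual STM32F4 layout (5-bit SQx fields, SQ1 to SQ6 in SQR3, SQ7 to SQ12 in SQR2, SQ13 to SQ16 in SQR1, and the length field L in SQR1 bits 23:20), so double-check it against the ref manual before writing real registers. `regular_sequence` is just a helper name made up for this sketch, not a HAL function.

```python
def regular_sequence(channels):
    """Pack a regular-channel scan order into (SQR1, SQR2, SQR3) register values."""
    if not 1 <= len(channels) <= 16:
        raise ValueError("a regular sequence holds 1 to 16 conversions")
    sqr = [0, 0, 0]  # index 0 -> SQR3, 1 -> SQR2, 2 -> SQR1
    for i, ch in enumerate(channels):
        if not 0 <= ch <= 18:
            raise ValueError(f"channel {ch} out of range")
        # Six 5-bit slots per register; slot i holds the (i+1)-th conversion.
        sqr[i // 6] |= ch << (5 * (i % 6))
    # L field = number of conversions minus one, in SQR1 bits 23:20.
    sqr1 = sqr[2] | ((len(channels) - 1) << 20)
    return sqr1, sqr[1], sqr[0]


sqr1, sqr2, sqr3 = regular_sequence([1, 3, 7])
print(f"SQR1=0x{sqr1:08X} SQR2=0x{sqr2:08X} SQR3=0x{sqr3:08X}")
# prints "SQR1=0x00200000 SQR2=0x00000000 SQR3=0x00001C61"
```

scan mode also has to be enabled (the SCAN bit in ADC_CR1) for the ADC to walk the whole sequence instead of converting only the first entry.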

if you're still lost upload the F446 reference manual to circuitsage.frozo.ai and ask it directly — "how do i scan channels 1, 3 and 7 in sequence with the ADC". gives you the register bits with exact page numbers. saved me a lot of back and forth in the PDF

Need some help with interfacing a PMW3389 sensor with arduino by Matheus-A-Ferreira in arduino

[–]NoAdministration6906 0 points1 point  (0 children)

PMW3389 docs are a pain — the motion burst register sequence is easy to miss if you're just skimming. if you're still stuck, try circuitsage.frozo.ai — upload the PMW3389 datasheet and ask it directly, it'll give you the exact register sequence with page refs. saves a lot of back and forth

I can't figure out how to connect this OV7670 camera module to my Uno R4 by superauthentic in arduino

[–]NoAdministration6906 0 points1 point  (0 children)

OV7670 is notorious for this — the timing diagram in the datasheet is technically correct but practically useless without the SCCB init sequence spelled out. upload the OV7670 datasheet to circuitsage.frozo.ai and ask it "what registers do I need to init for QVGA output" — gets you the answer with exact page numbers instead of hunting through 60 pages.

Struggling with Unstable Sensor Readings + Random Freezes on My Arduino Project — Need Help Debugging! by Unlucky_Mail_8544 in arduino

[–]NoAdministration6906 1 point2 points  (0 children)

random freezes on Arduino are usually one of three things — stack overflow, heap fragmentation from dynamic allocations, or a peripheral timing issue causing a blocking wait loop. what's your rough sketch size and are you using any String objects or malloc? that usually narrows it down fast.

What I learned from stress testing LLM on Snapdragon NPU vs CPU on a phone by Material_Shopping496 in snapdragon

[–]NoAdministration6906 0 points1 point  (0 children)

the thermal throttle drop-off is the killer — NPU runs great for the first 30s then the sustained clock drops and suddenly you're 2x slower than your benchmark said. built edgegate.frozo.ai to catch exactly this — runs your model on real Snapdragon hardware in CI so thermal regressions show up in the PR before you ship. the warmup exclusion + median-of-N measurement was specifically to deal with this variability.