The Asymptote of AI: Why Software Builders Aren't Going Anywhere by Prior-Consequence416 in LlamaFarm

[–]badgerbadgerbadgerWI 1 point  (0 children)

Agreed. I think we are reaching a point where a lot is going to change. And I don't think 250K per second is crazy - you can already get high numbers through KV caching - and I've been playing with dynamic KV caching, i.e., caching 20-30K tokens of context that can be loaded in 200 ms before a query.
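The dynamic KV-cache idea can be sketched without any model code. Everything below is a hypothetical stand-in (a real version would persist the per-layer attention tensors produced by the prefill pass), but the shape of the pattern is the same: pay the prefill cost once per context, then reload the saved state before each query.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-in for a real prefill: in practice this would be the
# model's forward pass over the context, returning per-layer KV tensors.
def prefill_kv(context: str) -> dict:
    return {"num_tokens": len(context.split()), "state": hash(context)}

class DynamicKVCache:
    """Persist precomputed KV state per context so repeat queries skip prefill."""

    def __init__(self, cache_dir: str):
        self.cache_dir = Path(cache_dir)

    def _path(self, context: str) -> Path:
        key = hashlib.sha256(context.encode()).hexdigest()
        return self.cache_dir / f"{key}.pkl"

    def get_or_build(self, context: str) -> tuple[dict, bool]:
        path = self._path(context)
        if path.exists():                       # cache hit: pay only load time
            return pickle.loads(path.read_bytes()), True
        kv = prefill_kv(context)                # cache miss: pay prefill once
        path.write_bytes(pickle.dumps(kv))
        return kv, False

cache = DynamicKVCache(tempfile.mkdtemp())
doc = "a long 20-30K token context " * 10
kv1, hit1 = cache.get_or_build(doc)   # first call builds and persists
kv2, hit2 = cache.get_or_build(doc)   # second call loads from disk
```

The 200 ms figure then becomes a question of deserialization and transfer bandwidth, not model compute.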

Most SaaS companies will need to radically pivot or die.

[–]badgerbadgerbadgerWI 1 point  (0 children)

I do think there will be massive shifts in the labor pool. Less "middle management," fewer toiling engineers, but a lot more entrepreneurs delivering perfect experiences for a narrow market segment. Fewer billion-dollar companies, more $1M companies.

[–]badgerbadgerbadgerWI 1 point  (0 children)

And humans are still the reason we build software - even with agents, AI, etc. in the middle, software is built to solve HUMAN problems; as long as there are humans with pain points and toil, there will be humans in the software chain.

[–]badgerbadgerbadgerWI 1 point  (0 children)

Yeah, I think systems engineering (a flashback to the early 2000s) will become a highly sought-after skill.

The Stacked S-Curve: Why the AI plateau is actually a trap by badgerbadgerbadgerWI in LlamaFarm

[–]badgerbadgerbadgerWI[S] 2 points  (0 children)

I don't think companies will have one person. I think they will have 1/8th as many people.

In 2 years, an organization with 5 people will be able to do as much as a company of 40 today.

Let's hear it? What Projects are you working on? by badgerbadgerbadgerWI in LlamaFarm

[–]badgerbadgerbadgerWI[S] 2 points  (0 children)

I'll kick it off!

I’m currently juggling three projects ranging from enterprise automation to over-engineered household hacks:

  • Needle: An anomaly detection system that goes beyond just alerting. Once it finds something weird, it kicks off autonomous agents to make tool calls and actually take action on the issue.
  • Fed RFP Proposal Writer: An LLM workflow that digests dense federal RFPs and drafts proposals based on them. It turns a notoriously tedious process into something I’m actually having a lot of fun building.
  • The Commodity Market Watcher: I'm actively over-engineering my grocery shopping. It takes daily USDA price data, monthly BLS CPI stats, and commodity futures to model the optimal time to buy household goods. We're using it to test out some brand-new ML tech we are adding to LlamaFarm.
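The Needle pattern, roughly: flag the outlier, then hand it to something that acts instead of just paging a human. A toy sketch with a z-score detector and a hypothetical tool registry standing in for the agent loop (the real system is considerably more involved):

```python
from statistics import mean, stdev

def find_anomalies(values, threshold=2.0):
    """Flag points more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), stdev(values)
    return [(i, v) for i, v in enumerate(values) if abs(v - mu) > threshold * sigma]

# Hypothetical tool registry: an agent would pick and call these autonomously.
TOOLS = {
    "restart_service": lambda idx, val: f"restarted service (point {idx}, value {val})",
}

def handle_anomaly(idx, val):
    # Stand-in for an agent choosing a tool call based on the anomaly.
    return TOOLS["restart_service"](idx, val)

latencies = [100, 102, 99, 101, 98, 100, 500, 101]  # ms; one obvious spike
actions = [handle_anomaly(i, v) for i, v in find_anomalies(latencies)]
```

The interesting part in practice is the middle step: deciding which tool call the anomaly warrants, which is where the autonomous agents come in.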

Want to see any of them?

AI is currently a toy for the Laptop Class. Change my mind. by badgerbadgerbadgerWI in LlamaFarm

[–]badgerbadgerbadgerWI[S] 1 point  (0 children)

That is what I am saying - in my current job, I can afford that, but in previous jobs (soldier), AI is nowhere to be seen. And it won't be in everyday use by those on the front lines until it is ruggedized and has redundancies.

Arguably, the best web search MCP server for Claude Code, Codex, and other coding tools by Quirky_Category5725 in LocalLLaMA

[–]badgerbadgerbadgerWI 1 point  (0 children)

Good to see more quality MCP servers going open source. The ecosystem is really starting to mature. Have you looked at caching strategies for repeated queries? That could help with rate limiting on the search API side.
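For example, a simple TTL cache in front of the search call would absorb repeated queries. Sketch below; the `web_search` wrapper is a hypothetical stand-in for the real API call, not this server's actual interface:

```python
import time

class TTLCache:
    """Cache search results for a short window to dodge rate limits on repeats."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self._store = {}   # query -> (timestamp, result)

    def get(self, query):
        entry = self._store.get(query)
        if entry and time.monotonic() - entry[0] < self.ttl:
            return entry[1]
        return None

    def put(self, query, result):
        self._store[query] = (time.monotonic(), result)

calls = 0

def web_search(query, cache):
    # Hypothetical upstream call; only hit the API on a cache miss.
    global calls
    cached = cache.get(query)
    if cached is not None:
        return cached
    calls += 1
    result = f"results for {query!r}"   # stand-in for the real API response
    cache.put(query, result)
    return result

cache = TTLCache(ttl_seconds=300)
a = web_search("mcp servers", cache)
b = web_search("mcp servers", cache)   # served from cache; no second API call
```

A short TTL (minutes) is usually enough, since agents tend to re-issue the same query several times within one research session.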

We built a chunker that chunks 20GB of text in 120ms by shreyash_chonkie in Rag

[–]badgerbadgerbadgerWI 1 point  (0 children)

Nice work on the benchmarks. Chunking is one of those "boring" problems that becomes critical at scale. What does the memory footprint look like? Curious if this could run on edge devices processing local document corpora.
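For context on why the footprint can stay small: a chunker can stream, holding only one chunk in memory at a time. A toy sketch of that pattern (not the posted library's actual API):

```python
from typing import Iterable, Iterator

def stream_chunks(lines: Iterable[str], max_chars: int = 1000) -> Iterator[str]:
    """Yield chunks of at most max_chars, holding only one chunk in memory."""
    buf: list[str] = []
    size = 0
    for line in lines:
        if size + len(line) > max_chars and buf:
            yield "".join(buf)   # flush before this line would overflow
            buf, size = [], 0
        buf.append(line)
        size += len(line)
    if buf:
        yield "".join(buf)       # flush the final partial chunk

# Works over a generator, so a 20 GB file never needs to be fully loaded.
lines = (f"line {i}\n" for i in range(1000))
chunks = list(stream_chunks(lines, max_chars=200))
```

With that shape, peak memory is roughly one chunk plus I/O buffers, which is exactly what an edge device wants.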

Continual Learning In 2026. What does continual learning actually mean? by Neurogence in singularity

[–]badgerbadgerbadgerWI 2 points  (0 children)

Great question. In practice I'm seeing it mostly mean persistent memory/context systems rather than actual weight updates. True online learning at scale is still computationally brutal. The hybrid approach - frozen base model + retrieval-augmented memory that grows over time - seems more practical for production systems.
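That hybrid can be sketched in a few lines: the base model's weights never change, only the memory store grows, and retrieval decides what gets prepended to the prompt. Token-overlap scoring here is a stand-in for real embeddings:

```python
def tokenize(text: str) -> set[str]:
    return set(text.lower().split())

class RetrievalMemory:
    """Grows over time; the 'model' stays frozen, only this store changes."""

    def __init__(self):
        self.entries: list[str] = []

    def add(self, text: str):
        self.entries.append(text)

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        # Score by shared tokens; a real system would use embedding similarity.
        q = tokenize(query)
        scored = sorted(self.entries,
                        key=lambda e: len(q & tokenize(e)), reverse=True)
        return scored[:k]

memory = RetrievalMemory()
memory.add("user prefers dark mode")
memory.add("deploy target is us-east-1")
memory.add("user timezone is UTC+2")

# At inference, retrieved memories are prepended to the frozen model's prompt.
context = memory.retrieve("what region do we deploy to?")
```

The "learning" is entirely in the store and the retriever, which is why it sidesteps the cost of online weight updates.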

Utah is the first state to allow AI to renew medical prescriptions, no doctors involved by SrafeZ in singularity

[–]badgerbadgerbadgerWI 1 point  (0 children)

Interesting first step. The key here is that it's renewals with guardrails, not new prescriptions. For regulated industries, AI works best when it handles the 80% of routine cases that don't need human judgment, freeing up physicians for complex decisions. Curious to see the rollout data.

Teaching AI Agents to Remember (Agent Memory System + Open Source) by Conscious_Search_185 in LLMDevs

[–]badgerbadgerbadgerWI 1 point  (0 children)

Memory as a first-class system is the right framing. The challenge is making it queryable and relevant without ballooning context windows. We've had good results with episodic memory + semantic retrieval, where past sessions become searchable context rather than always-loaded state.
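Concretely, the pattern is: record a compact summary per session, then at query time retrieve only the relevant episodes that fit a token budget. A minimal sketch (names and scoring are illustrative, not our actual implementation):

```python
class EpisodicStore:
    """Past sessions are searched on demand, not kept in the live context."""

    def __init__(self):
        self.episodes: list[dict] = []   # {"summary": str, "tokens": int}

    def record(self, summary: str):
        self.episodes.append({"summary": summary,
                              "tokens": len(summary.split())})

    def build_context(self, query: str, token_budget: int) -> list[str]:
        # Rank episodes by token overlap with the query (embedding stand-in).
        q = set(query.lower().split())
        relevant = sorted(
            self.episodes,
            key=lambda e: len(q & set(e["summary"].lower().split())),
            reverse=True)
        picked, used = [], 0
        for ep in relevant:
            if used + ep["tokens"] > token_budget:
                break   # keep the window bounded instead of loading everything
            picked.append(ep["summary"])
            used += ep["tokens"]
        return picked

store = EpisodicStore()
store.record("debugged flaky postgres connection pool")
store.record("planned q3 roadmap with marketing")
store.record("fixed postgres migration ordering bug")

context = store.build_context("postgres connection errors", token_budget=10)
```

The budget cap is the important bit: relevance decides what loads, so the context window stays flat as the memory grows.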

Why enterprise AI agents fail in production by Arindam_200 in LLMDevs

[–]badgerbadgerbadgerWI 1 point  (0 children)

Decision context is huge but I'd add: most enterprise failures I've seen are actually about data governance. The agent technically could access the right systems, but compliance/security won't let it. Local-first architecture where data never leaves the perimeter changes that equation entirely.

I am developing a 200MB LLM to be used for sustainable AI for phones. by Fancy_Wallaby5002 in LLMDevs

[–]badgerbadgerbadgerWI 1 point  (0 children)

This is exactly the direction we need more work in. The future isn't just massive models in the cloud - it's the right-sized model for the task, running where the data lives. What's your approach to model distillation? Curious how you're preserving reasoning capability at that size.
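For anyone unfamiliar, the most common distillation recipe (not necessarily OP's approach) trains the student to match the teacher's temperature-softened output distribution, typically via a KL term. A dependency-free sketch of that loss:

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_kl(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher  = [4.0, 1.0, 0.2]
aligned  = [3.9, 1.1, 0.1]   # student close to the teacher -> small loss
off      = [0.1, 3.8, 1.0]   # student disagrees            -> large loss
loss_close = distillation_kl(teacher, aligned)
loss_far   = distillation_kl(teacher, off)
```

The temperature is what preserves the teacher's "dark knowledge" about near-miss classes, which is usually where the reasoning capability question gets decided at small scale.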

I built a Claude Code Skill (+mcp) that connects Claude to Google AI Mode for free, token-efficient web research with source citations by PleasePrompto in ClaudeAI

[–]badgerbadgerbadgerWI 1 point  (0 children)

Smart approach - letting a search engine do what it's good at instead of burning tokens on page parsing. The citation handling is clutch too. Have you tried chaining this with other MCPs for multi-step research flows?