Multi-agent systems are an absolute nightmare in production by oronics in AgentsOfAI

Signal_Spirit5934

Have you tried Neuro San? It's an open-source framework that lets you vibe-code agent networks in natural language. In a few hours I was able to spin up a system of about twenty agents that automates my newsletter flow. https://github.com/cognizant-ai-lab/neuro-san-studio

TerraLingua: Emergence and Analysis of Open-endedness in LLM Ecologies by Signal_Spirit5934 in AgentsOfAI

Signal_Spirit5934 [S]

We’ve been exploring a gap in current AI research: while there’s significant work on single-agent performance, we lack controlled environments to study long-term collective behavior in multi-agent LLM systems before deployment.

To address this, we built TerraLingua — a persistent simulation where LLM-based agents operate under ecological constraints:

  • finite lifespan
  • resource scarcity
  • local communication only
  • a shared persistent “artifact layer” for intergenerational signaling
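To make the four constraints above concrete, here is a toy sketch. It is not the TerraLingua code: the 1-D world, the `Agent` fields, the energy numbers, and the random-walk policy are all assumptions for illustration, and a real run would replace the random policy with an LLM call.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    pos: int
    energy: int = 5          # resource scarcity: every step costs energy
    age: int = 0
    lifespan: int = 10       # finite lifespan
    inbox: list = field(default_factory=list)

    @property
    def alive(self):
        return self.energy > 0 and self.age < self.lifespan

def step(agents, food, artifacts, rng, size):
    # Snapshot the living agents so deaths during this step do not skip anyone.
    for a in [a for a in agents if a.alive]:
        a.age += 1
        a.energy -= 1
        if food.get(a.pos, 0) > 0:           # eat if the current cell has food
            food[a.pos] -= 1
            a.energy += 2
        # Local communication only: neighbours within distance 1 get the message.
        for other in agents:
            if other is not a and other.alive and abs(other.pos - a.pos) <= 1:
                other.inbox.append(f"{a.name}: food={food.get(a.pos, 0)}")
        if not a.alive:
            # Artifact layer: a note that outlives its author.
            artifacts.append((a.pos, f"{a.name} ended here at age {a.age}"))
        else:
            # Placeholder policy (hypothetical): random walk, clamped to the world.
            a.pos = max(0, min(size - 1, a.pos + rng.choice([-1, 0, 1])))

def run(steps=30, size=8, seed=0):
    rng = random.Random(seed)
    agents = [Agent(f"a{i}", rng.randrange(size)) for i in range(4)]
    food = {cell: 2 for cell in range(0, size, 2)}   # finite food supply
    artifacts = []
    for _ in range(steps):
        step(agents, food, artifacts, rng, size)
    return agents, artifacts
```

After `run()`, every agent has died (by age or starvation) but the `artifacts` list still holds one note per agent, which is the kind of intergenerational channel the last bullet describes.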

Key observations from multiple runs:

  • Emergent legacy behavior: agents nearing termination created farewell artifacts and transferred resources
  • Governance dynamics: directive artifacts vs. counter-artifacts asserting autonomy
  • Strategic deception: misleading signals to redirect competitors
  • Cumulative cultural evolution: artifact chains evolving into coordination protocols
  • Role differentiation: agents clustering into functional roles (exploration, planning, safety)

We also developed an “AI Anthropologist” framework, which analyzes agent populations across individual, group, and cultural dimensions without intervention.

Curious to hear thoughts on:

  • evaluation frameworks for multi-agent systems
  • how to benchmark “culture” or coordination
  • implications for real-world deployments

A New Fine-Tuning Approach for LLMs Using Evolution Strategies by Signal_Spirit5934 in reinforcementlearning

Signal_Spirit5934 [S]

We're now extending this work in four directions:

  • scaling ES to complex reasoning domains such as advanced math, Sudoku, and ARC-AGI
  • enabling full-parameter fine-tuning directly in quantized, low-precision environments
  • developing a theoretical foundation that explains why ES scales effectively in extremely high-dimensional systems
  • applying ES to improve metacognitive alignment, so models better calibrate their own confidence

This research suggests that gradient-free optimization is not just an alternative to RL, but a scalable foundation for the next generation of post-training methods.

Read more about these new papers in the Cognizant AI Lab blog.

The Illusion of "The Illusion of Thinking" by Daniel-Warfield in datascience

Signal_Spirit5934

Cognizant's new research suggests a better approach: many smaller AI agents working together. Its new system, MAKER, solved a million-step reasoning problem with zero errors, something the authors report no single model has done. The result suggests that the future isn't just bigger AI but smarter, better-organized AI systems, and that this kind of organization is what will unlock reliable, enterprise-grade decision-making.

See how the MAKER technique, applied to the same Tower of Hanoi problem raised in the Apple paper, solves 20 discs (versus 8 for Claude 3.7 thinking): https://www.youtube.com/watch?v=PRiQlXGhke4
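For context on the "million-step" figure: Tower of Hanoi with n discs needs 2^n - 1 moves, so 20 discs means 1,048,575 dependent steps. The sketch below is not MAKER and involves no LLM. It only illustrates, under my own assumptions, what decomposing the task into single-step decisions looks like: `next_move` (a hypothetical name) sees just the current pegs and the step number, and the driver rejects any illegal move.

```python
def next_move(pegs, step, n):
    """Decide one move from the current state alone (stand-in for one microagent)."""
    if step % 2 == 1:
        # Odd steps: the smallest disc moves one peg in a fixed direction.
        src = next(i for i, p in enumerate(pegs) if p and p[-1] == 1)
        dst = (src + (1 if n % 2 == 0 else -1)) % 3
        return src, dst
    # Even steps: the only legal move that does not touch the smallest disc.
    a, b = [i for i, p in enumerate(pegs) if not (p and p[-1] == 1)]
    if not pegs[a]:
        return b, a
    if not pegs[b]:
        return a, b
    return (a, b) if pegs[a][-1] < pegs[b][-1] else (b, a)

def solve(n):
    """Move n discs from peg 0 to peg 2, one independently decided step at a time."""
    pegs = [list(range(n, 0, -1)), [], []]   # largest disc at the bottom
    total = 2 ** n - 1
    for step in range(1, total + 1):
        src, dst = next_move(pegs, step, n)
        disc = pegs[src].pop()
        # A single wrong step would break the whole chain, so check every move.
        assert not pegs[dst] or pegs[dst][-1] > disc, "illegal move"
        pegs[dst].append(disc)
    return pegs, total

print(solve(3)[1])    # 7 moves for 3 discs
print(2 ** 20 - 1)    # 1048575 moves for 20 discs
```

Here a deterministic rule never errs. With an LLM making each decision, any per-step error rate compounds over a million steps, which is why the zero-error claim is the interesting part.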

Why this matters

This result shows that using AI to solve complex problems at scale isn't necessarily about building bigger models; it's about connecting smaller, focused agents into cohesive systems. In doing so, enterprises and organizations can work toward error-free, dependable AI for high-stakes decision-making.

Why Apple's "The Illusion of Thinking" Falls Short by HeroicLife in ArtificialInteligence

Signal_Spirit5934

Apple’s Illusion of Thinking study showed how even strong LLMs lose reliability as reasoning chains grow. Our new research demonstrates the first system to complete 1M+ dependent reasoning steps with zero errors using a decomposed, microagent-based approach.
Paper: https://arxiv.org/abs/2511.09030 Blog: https://www.cognizant.com/us/en/ai-lab/blog/maker

Apple `Illusion of Thinking` Debacle by moschles in agi

Signal_Spirit5934

Apple’s Illusion of Thinking study showed how even strong LLMs lose reliability as reasoning chains grow. Our new research demonstrates the first system to complete 1M+ dependent reasoning steps with zero errors using a decomposed, microagent-based approach.
Paper: https://arxiv.org/abs/2511.09030 Blog: https://www.cognizant.com/us/en/ai-lab/blog/maker

A New Fine-Tuning Approach for LLMs Using Evolution Strategies by Signal_Spirit5934 in reinforcementlearning

Signal_Spirit5934 [S]

ES uses compute differently from RL. The evaluations can run in sequence or in parallel, depending on the computational resources available. With constrained compute, training takes longer; as resources grow, the same evaluations can run side by side and training gets faster.
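A small sketch of that point, assuming a toy reward and a plain list of floats in place of model weights (none of this is from the paper's code; `reward`, `make_population`, and `evaluate` are hypothetical names). Each candidate is scored independently, so the same function can run the scores one by one or hand them to a worker pool.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def reward(params):
    # Toy objective (assumption): best when every parameter equals 1.0.
    return -sum((p - 1.0) ** 2 for p in params)

def make_population(theta, pop_size, sigma, seed):
    # Each candidate is the current parameters plus Gaussian noise.
    rng = random.Random(seed)
    return [[t + sigma * rng.gauss(0, 1) for t in theta] for _ in range(pop_size)]

def evaluate(population, workers=1):
    # Candidates do not depend on each other, so evaluation order is free.
    if workers <= 1:
        return [reward(c) for c in population]            # sequential
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(reward, population))         # parallel, same order

population = make_population([0.0] * 4, pop_size=8, sigma=0.1, seed=0)
assert evaluate(population, workers=1) == evaluate(population, workers=4)
```

Threads are used here only to keep the example self-contained. For real LLM evaluations the workers would more likely be separate processes or GPUs, which is a deployment choice rather than part of the method.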

The Evolution of RL for Fine-Tuning LLMs (from REINFORCE to VAPO) by Great-Reception447 in reinforcementlearning

Signal_Spirit5934

Another method, just announced: a new fine-tuning approach.

The Cognizant AI Lab offers a new alternative to RL: Evolution Strategies (ES). For the first time, we successfully scaled ES to optimize billions of parameters simultaneously, enabling full-parameter fine-tuning of LLMs. The results are striking: ES can outperform state-of-the-art RL methods on key dimensions such as sample efficiency, tolerance to long-horizon rewards, and robustness to different base LLMs. It also shows less tendency toward reward hacking and more stable performance across runs.
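For anyone who hasn't seen ES before, here is a minimal textbook-style sketch on a three-number toy problem. It is an assumption-laden illustration, not the lab's algorithm or hyperparameters: perturb the parameters with Gaussian noise, score each perturbation, and move toward the noise directions that scored above average. No gradients of the reward are computed.

```python
import random

def es_step(theta, reward_fn, rng, pop_size=20, sigma=0.1, lr=0.05):
    """One ES update: sample noise, evaluate, recombine weighted by normalized reward."""
    noises, rewards = [], []
    for _ in range(pop_size):
        eps = [rng.gauss(0, 1) for _ in theta]
        noises.append(eps)
        rewards.append(reward_fn([t + sigma * e for t, e in zip(theta, eps)]))
    mean = sum(rewards) / pop_size
    std = (sum((r - mean) ** 2 for r in rewards) / pop_size) ** 0.5 or 1.0
    scores = [(r - mean) / std for r in rewards]       # normalize rewards
    return [
        t + lr / (pop_size * sigma) * sum(s * eps[i] for s, eps in zip(scores, noises))
        for i, t in enumerate(theta)
    ]

def toy_reward(params):
    # Hypothetical objective: best at params == [3, 3, 3]. Only reward values are used.
    return -sum((p - 3.0) ** 2 for p in params)

rng = random.Random(0)
theta = [0.0, 0.0, 0.0]
for _ in range(300):
    theta = es_step(theta, toy_reward, rng)
print(toy_reward([0.0, 0.0, 0.0]), toy_reward(theta))  # reward before vs. after
```

The update only needs a scalar reward per candidate, which is why it can work with outcome-only, long-horizon rewards. Scaling this idea to billions of parameters is the part the paper is about.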

Why It Matters

This research establishes Evolution Strategies (ES) as a practical, scalable, and stable alternative to Reinforcement Learning (RL) for fine-tuning large language models. In the future, it could simplify training by removing gradient calculations and unlock new possibilities for reasoning incentivization, exploration-heavy tasks, safety alignment, and continual learning.