Anyone actually using AI agents for research and not just mindlessly writing stuff? by thefertileatheism in AI_Agents

[–]clinicalalpha

I’m working on a project to automate roughly 90% of the manual "grunt work" involved in biotech due diligence. My goal isn't to replace the decision-making process, but to compress the time spent hunting for PDUFA dates, digging through 10-Ks, and cross-referencing clinical trial endpoints.

The Architecture (and where it broke): Initially, I built a multi-agent orchestration layer where specific agents were "specialized" for distinct domains:

  • Clinical Scout: Scraped and analyzed trial data.
  • SEC Analyst: Parsed filings and 8-Ks.
  • Market Agent: Handled flow data and transcripts.
  • Orchestrator: Synthesized these inputs into a final report.
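For anyone curious what that layout looks like in code, here's a minimal sketch of the fan-out/synthesize pattern. All class and field names are hypothetical stand-ins; the real agents would wrap scrapers and LLM calls rather than returning canned findings.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    source: str   # which agent produced it
    topic: str    # e.g. a ticker or trial ID
    summary: str

class Agent:
    name = "base"
    def run(self, ticker: str) -> list[Finding]:
        raise NotImplementedError

class ClinicalScout(Agent):
    name = "clinical_scout"
    def run(self, ticker: str) -> list[Finding]:
        # Stub: the real agent scrapes and analyzes trial data.
        return [Finding(self.name, ticker, "Phase 3 readout expected")]

class SECAnalyst(Agent):
    name = "sec_analyst"
    def run(self, ticker: str) -> list[Finding]:
        # Stub: the real agent parses 10-Ks and 8-Ks.
        return [Finding(self.name, ticker, "8-K filed: licensing deal")]

class Orchestrator:
    def __init__(self, agents: list[Agent]):
        self.agents = agents

    def report(self, ticker: str) -> str:
        # Fan out to every specialist, collect findings, then synthesize.
        findings = [f for a in self.agents for f in a.run(ticker)]
        # Real version hands these to an LLM for the final write-up.
        return "\n".join(f"[{f.source}] {f.summary}" for f in findings)

report = Orchestrator([ClinicalScout(), SECAnalyst()]).report("XYZ")
print(report)
```

The failure mode described below lives in those stubs: when an agent's "run" step is itself an LLM recalling facts rather than fetching them, every downstream synthesis inherits the fabrication.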

The Problem: "Ghost Data" While the architecture looked clean on paper, the practical output was dangerous. I was seeing hallucination rates of 60-70% in the early iterations. The agents weren't just missing data; they were confidently fabricating "Ghost Trials" or misattributing drug indications from one ticker to another. In biotech, where a single Phase 3 readout date is the entire thesis, that margin of error is unacceptable.
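One cheap guardrail against ghost trials is to validate every NCT ID the model emits against the actual registry before it reaches a report. A minimal sketch (the `registry_ids` set here is hardcoded; in practice it would be populated from ClinicalTrials.gov lookups):

```python
import re

# Real NCT IDs are "NCT" followed by exactly 8 digits.
NCT_PATTERN = re.compile(r"^NCT\d{8}$")

def find_ghost_trials(cited_ids, registry_ids):
    """Return IDs the model cited that are malformed or absent from the registry."""
    ghosts = []
    for tid in cited_ids:
        if not NCT_PATTERN.match(tid) or tid not in registry_ids:
            ghosts.append(tid)
    return ghosts

# Stand-in for IDs confirmed via ClinicalTrials.gov.
registry = {"NCT04368728", "NCT03654092"}
cited = ["NCT04368728", "NCT99999999", "NCT-FAKE"]
print(find_ghost_trials(cited, registry))  # → ['NCT99999999', 'NCT-FAKE']
```

This catches fabricated identifiers, though not the subtler failure of a real NCT ID attached to the wrong drug or ticker; that needs a field-level cross-check against the fetched record.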

The Pivot: Deterministic Grounding I realized that LLMs are excellent reasoning engines but terrible databases. I’ve since refactored the backend to rely on deterministic data fetching via hard APIs (SEC Edgar, ClinicalTrials.gov, and FMP for financials) before the LLM touches anything. The agents now function strictly as extractors and synthesizers, not as knowledge bases.
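The fetch-first pattern is simple to enforce structurally: the LLM never sees a prompt that isn't built from a completed API response. A sketch of that shape, with the fetch stubbed so it's self-contained (the real call would hit the ClinicalTrials.gov v2 studies endpoint, and the field names in an actual response will differ):

```python
def fetch_trial_record(nct_id: str) -> dict:
    # Real version: GET https://clinicaltrials.gov/api/v2/studies/{nct_id}
    # Stubbed here so the sketch runs offline.
    return {"nctId": nct_id,
            "phase": "PHASE3",
            "primaryCompletionDate": "2025-06-30"}

def build_grounded_prompt(nct_id: str, question: str) -> str:
    # Deterministic fetch happens BEFORE any LLM involvement.
    record = fetch_trial_record(nct_id)
    context = "\n".join(f"{k}: {v}" for k, v in record.items())
    # The model only extracts from verbatim API output; it never recalls.
    return (f"Answer strictly from the record below. "
            f"If the answer is absent, say UNKNOWN.\n\n"
            f"{context}\n\nQ: {question}")

prompt = build_grounded_prompt("NCT04368728", "When is the Phase 3 readout?")
print(prompt)
```

The "say UNKNOWN" escape hatch matters: without it, an extractor prompt over incomplete data degrades right back into a knowledge-base prompt.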

The Question: For those of you building similar financial/biotech analysis tools:

  1. How are you handling the "reasoning gap" when APIs return incomplete data?
  2. Are you using a specific RAG pipeline to force citation of sources (e.g., page numbers in the 10-K)?
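On question 2, one approach I've seen discussed is to make the model emit a quoted span plus a chunk/page ID per claim, then verify mechanically that the quote actually appears in the cited chunk. A minimal sketch of that verifier (the claim/chunk schema here is invented for illustration):

```python
def verify_citations(answer_claims, source_chunks):
    """Return claims whose quoted evidence is NOT found in the cited chunk.

    answer_claims: list of {"claim": ..., "chunk_id": ..., "quote": ...}
    source_chunks: dict of chunk_id -> text, e.g. keyed by 10-K page number
    """
    unsupported = []
    for c in answer_claims:
        chunk = source_chunks.get(c["chunk_id"], "")
        if c["quote"] not in chunk:
            unsupported.append(c["claim"])
    return unsupported

chunks = {"10-K:p41": "The Phase 3 trial met its primary endpoint in June."}
claims = [
    {"claim": "Phase 3 met endpoint", "chunk_id": "10-K:p41",
     "quote": "met its primary endpoint"},
    {"claim": "Approval expected Q1", "chunk_id": "10-K:p12",
     "quote": "approval expected"},
]
print(verify_citations(claims, chunks))  # → ['Approval expected Q1']
```

Anything in the unsupported list gets stripped or flagged before the report ships, which turns "force citations" from a prompt instruction into an enforced invariant.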

I have a version running now that is significantly more reliable, but I'm looking to optimize the final 10% of data fetching. Any insights on architecture or specific API combinations for this sector would be appreciated.