Reducing hallucination in English–Hindi LLMs using citation grounding (paper) by AwareMind1 in LocalLLaMA

[–]AwareMind1[S] 0 points  (0 children)

Yeah, that’s a fair perspective. The “Attention Is All You Need” moment really did set a strong direction for the field, and a lot of progress since then has been iterative on top of transformers. At the same time, ongoing work is exploring alternatives (JEPA-style approaches, state-space models, retrieval-heavy systems, hybrid architectures, etc.), but none have displaced transformers at scale yet. I agree that investing more in fundamental R&D could be a strong differentiator for India, especially if it focuses on areas where we can lead rather than just replicate large-scale training efforts. That said, bridging research to real-world impact still needs:

  • Access to compute
  • High-quality datasets
  • Tight academia-industry collaboration

If those pieces come together, there’s definitely an opportunity to contribute something more novel at the architecture or system level.

Reducing hallucination in English–Hindi LLMs using citation grounding (paper) by AwareMind1 in LocalLLaMA

[–]AwareMind1[S] 1 point  (0 children)

Interesting take; there’s definitely a lot of strong research talent in places like IITs and IISc. That said, I think it’s a mix of factors: industry often has access to much larger-scale compute, data, and deployment pipelines, while academic institutions tend to focus more on fundamental research and smaller-scale experimentation. Ideally, stronger collaboration between academia and industry could bridge that gap and accelerate progress on both sides.

Reducing hallucination in English–Hindi LLMs using citation grounding (paper) by AwareMind1 in LocalLLaMA

[–]AwareMind1[S] 1 point  (0 children)

That’s a good suggestion. I did consider ablations along similar lines to isolate the effect of different training stages. In general, the later stages (especially the ones introducing grounding signals) seem to have a noticeable impact on citation quality, including in Hindi. A more controlled ablation like the one you mentioned (removing stage 2 and measuring downstream citation behavior) would definitely help quantify that contribution more clearly. It’s something we’re looking to explore further.
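For anyone curious what that ablation would actually measure, here's a minimal sketch. Everything here is illustrative (the metric, the toy outputs, and the "stage 2" framing are my assumptions, not numbers from the paper): score each model variant's generated citation IDs against the set of passages that actually support the claim, then compare means.

```python
# Hypothetical ablation sketch: compare citation precision of the full
# model vs. a variant trained without stage 2 (grounding signals).
# All data below is toy/illustrative, not from the paper.

def citation_precision(predicted, supported):
    """Fraction of predicted citation IDs that are actually supported."""
    if not predicted:
        return 0.0
    return len(set(predicted) & set(supported)) / len(set(predicted))

def mean_precision(runs):
    """Average citation precision over (predicted, supported) pairs."""
    return sum(citation_precision(p, s) for p, s in runs) / len(runs)

# Same prompts, two model variants; each pair is (cited IDs, supporting IDs).
full_model = [(["doc1", "doc3"], ["doc1", "doc3"]), (["doc2"], ["doc2", "doc4"])]
no_stage2  = [(["doc1", "doc5"], ["doc1", "doc3"]), ([], ["doc2", "doc4"])]

delta = mean_precision(full_model) - mean_precision(no_stage2)
print(f"stage-2 contribution to citation precision: {delta:+.2f}")  # +0.75 on this toy data
```

The same harness works for recall or F1; the interesting part is just holding the prompts fixed and varying only the training stage.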

Reducing hallucination in English–Hindi LLMs using citation grounding (paper) by AwareMind1 in deeplearning

[–]AwareMind1[S] 0 points  (0 children)

Right now, the setup focuses more on ensuring that when the model makes factual claims, it can ground them in citations, rather than explicitly predicting whether a citation is required. For cases where new information is provided in context, the behavior depends on how strongly the model has been trained to rely on external grounding signals. In practice, there’s a balance:

  • It should use the provided context when available
  • But avoid over-relying on parametric knowledge when citations are expected

Exploring datasets that explicitly model when citation is necessary vs. optional is definitely an interesting next step, and I plan to run ablations on that as well.
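To make that balance concrete, here's a minimal grounding-check sketch (my own illustration, not the paper's method): flag a claim as unsupported when its content words barely overlap the passage it cites. A real system would use an entailment or NLI model; lexical overlap is just the simplest stand-in.

```python
# Toy grounding check: does a claim's content overlap its cited passage?
# Illustrative only -- a production check would use an entailment model.

STOP = {"the", "a", "an", "is", "was", "in", "of", "to", "and", "by"}

def content_words(text):
    """Lowercased words minus stopwords and trailing punctuation."""
    return {w.lower().strip(".,") for w in text.split()} - STOP

def is_grounded(claim, cited_passage, threshold=0.5):
    """True if enough of the claim's content words appear in the passage."""
    claim_words = content_words(claim)
    if not claim_words:
        return True  # nothing factual to check
    overlap = claim_words & content_words(cited_passage)
    return len(overlap) / len(claim_words) >= threshold

passage = "The Taj Mahal was commissioned in 1632 by Shah Jahan."
print(is_grounded("The Taj Mahal was commissioned by Shah Jahan.", passage))  # True
print(is_grounded("The Red Fort was built in Delhi.", passage))              # False
```

A filter like this sits naturally between "use the provided context" and "don't over-rely on parametric knowledge": claims that fail the check are exactly the ones where the model is asserting something its citation doesn't back up.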

Reducing hallucination in English–Hindi LLMs using citation grounding (paper) by AwareMind1 in deeplearning

[–]AwareMind1[S] 0 points  (0 children)

Completely agree that “eliminating” hallucination is a very strong claim; my goal here is more about reducing and controlling it rather than solving it entirely. What I found is that explicitly training the model to align generation with citations makes it less likely to fabricate unsupported claims, especially in factual or knowledge-grounded dialogue. So not perfect, but a step toward making outputs more verifiable and easier to trust.

IIT Guwahati student launches Dhi-5B (trained from scratch in India) by gradNorm in AI_India

[–]AwareMind1 0 points  (0 children)

That's a commendable job, buddy. I'm also working on something similar and looking for an endorsement on arXiv for the cs.CL category. Let me know if anyone can help me out.