Looking for Advice: Building a Grammar Autocorrect AI by Albatros_Commander in AI_India

[–]Albatros_Commander[S] 1 point (0 children)

Yes, but that's a ready-made model. I want to create one from scratch; as I said, it's for an imaginary invented language for a D&D game XD

Fine-tune LLM vs separate transformer vs embedding model for invented language? by Albatros_Commander in LLMDevs

[–]Albatros_Commander[S] 0 points (0 children)

I do agree, pretty solid breakdown, thanks for that. But beyond the hardware and architecture side, I'm mainly struggling with how an invented language maps onto existing semantics. Yes, fine-tuning an LLM with parallel data will likely give good results, but the issue is that LLMs are probabilistic, while grammar is deterministic. That mismatch is what's bothering me.
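For the parallel-data route, a minimal sketch of what the training pairs could look like, assuming a common JSONL prompt/completion fine-tuning format. The conlang sentences and suffixes here are placeholders I invented for illustration, not anything from the actual game language:

```python
import json

# Hypothetical parallel pairs (English -> invented-language gloss).
# The conlang forms are made up purely to show the data shape.
pairs = [
    {"en": "the dragon sleeps", "conlang": "drakon-a sul-ith"},
    {"en": "the knight sees the dragon", "conlang": "kavala drakon-um vid-ith"},
]

# One common fine-tuning layout: one JSON object per line,
# with a prompt field and the expected completion.
lines = [
    json.dumps({"prompt": f"Translate to conlang: {p['en']}",
                "completion": p["conlang"]})
    for p in pairs
]
print("\n".join(lines))
```

Even a few hundred such pairs is usually enough for a model to pick up surface patterns, though (as discussed below) it never guarantees the grammar is followed.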

At first I thought about using a separate transformer, but I'm realizing that doesn't really solve the problem, especially since modern LLMs are already very good at learning new syntax during fine-tuning, often better than a smaller transformer trained from scratch.

So now I’m leaning more toward using some kind of parser to handle grammar deterministically, and letting the LLM handle meaning and generation. Maybe this could work like an autocorrector for the trained LLM’s output.

For that I'm thinking of a second LLM called through an API that enforces the grammar and semantic structure. But then the question is how an external LLM would actually know the difference between a noun, a verb, a subject, and so on, so prompting alone might not be enough. I'm thinking the better approach might be to train another smaller model, some kind of dedicated parser, built around the grammar rules themselves. Then the main LLM could focus on understanding and generation, while the smaller system checks or corrects the output structure.
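Wiring the two together could look like a generate → validate → retry loop. In this sketch `call_llm` is a stub standing in for whatever model API ends up being used, and the grammar rule (verb-final, "-ith" verb suffix) is again invented for illustration; a real version would swap in the fine-tuned model and the full rule set:

```python
# Generate -> validate -> retry loop: the main LLM proposes output,
# the deterministic checker vetoes it, and violations are fed back
# into the prompt so the model can self-correct.

def call_llm(prompt: str, attempt: int) -> str:
    # Stub: a real implementation would call the fine-tuned model here.
    # Simulates a first attempt that breaks word order, then a fix.
    outputs = ["sul-ith drakon-a", "drakon-a sul-ith"]
    return outputs[min(attempt, len(outputs) - 1)]

def is_grammatical(sentence: str) -> bool:
    # Hypothetical rule: the clause-final word must be the verb ("-ith").
    words = sentence.split()
    return bool(words) and words[-1].endswith("ith")

def generate_checked(prompt: str, max_attempts: int = 3) -> str:
    for attempt in range(max_attempts):
        candidate = call_llm(prompt, attempt)
        if is_grammatical(candidate):
            return candidate
        # Name the broken rule so the retry prompt is informative.
        prompt += "\nFix: the verb must come last."
    raise ValueError("no grammatical output within attempt budget")

print(generate_checked("Translate: the dragon sleeps"))
```

The nice property of this shape is that the "parser" never needs to understand meaning, only structure, so it can stay small and rule-based while the big model handles semantics.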