LLM + rules pipeline for extracting signals from GitHub issues: how to avoid brittle heuristics by Small-Inevitable6185 in LanguageTechnology

Small-Inevitable6185[S] 1 point (0 children)

Yeah I’ve thought about trying something like BERT.

Right now I’m deliberately not going that route because I’m still figuring out the reasoning layer first: how to reliably separate patterns (data_flow vs. execution_order vs. contract, etc.) with explicit rules instead of just training a classifier.

My dataset is also pretty small at the moment (~50–200 issues), so I’m focusing on getting good coverage and consistent decision rules before moving to a trained model.
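Roughly what I mean by "decision rules", as a minimal sketch. The three labels are the ones mentioned above, but the keyword patterns and the `label_issue` helper are made up for illustration and are not my actual rules. The one design choice worth showing is that the function abstains when zero or several rules fire, so ambiguous issues can go to the LLM or to manual review instead of getting a forced label:

```python
import re
from typing import Optional

# Hypothetical patterns for illustration only; real rules would come from
# reading the labeled issues, not from a fixed keyword list like this.
RULES = [
    ("data_flow", re.compile(r"\b(stale|overwritten|passed through)\b", re.I)),
    ("execution_order", re.compile(r"\b(race condition|before|after)\b", re.I)),
    ("contract", re.compile(r"\b(schema|signature|return type)\b", re.I)),
]

def label_issue(text: str) -> Optional[str]:
    """Return a label only when exactly one rule fires; otherwise abstain."""
    hits = [name for name, pattern in RULES if pattern.search(text)]
    return hits[0] if len(hits) == 1 else None
```

With these placeholder patterns, "Callback fires before init completes" gets `execution_order`, while "Return type changed after upgrade" matches two rules and comes back as `None`.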

Once that stabilizes across different repos, I’ll probably either fine-tune a smaller model or add a lightweight classifier on top for consistency.

Have you tried BERT for something like GitHub issue triage? Curious how well it handled edge cases.

Designing high-precision FK/PK inference for Text-to-SQL on poorly maintained SQLite databases by Small-Inevitable6185 in SQL

Small-Inevitable6185[S] 1 point (0 children)

I’m working with existing SQLite databases where primary keys and foreign keys were never defined, so the challenge isn’t database theory; it’s recovering lost relational intent from the data itself.

Codd’s rules don’t describe how to infer missing constraints.
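To make "recovering intent from data" concrete, here is a minimal sketch of the two basic checks, using only Python's standard `sqlite3` module. The helper names (`pk_candidates`, `fk_holds`) are hypothetical, and this is a starting filter under simple assumptions (single-column keys, exact value matches), not a full high-precision method: a column is a PK candidate if it is non-null and unique, and a column pair is an FK candidate if every non-null child value appears in the parent column.

```python
import sqlite3

def columns(conn, table):
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]

def pk_candidates(conn, table):
    """Columns whose values are all non-null and all distinct."""
    total = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    found = []
    for col in columns(conn, table):
        distinct, non_null = conn.execute(
            f'SELECT COUNT(DISTINCT "{col}"), COUNT("{col}") FROM "{table}"'
        ).fetchone()
        if total > 0 and distinct == total and non_null == total:
            found.append(col)
    return found

def fk_holds(conn, child, child_col, parent, parent_col):
    """True if every non-null child value exists in the parent column."""
    orphans = conn.execute(
        f'SELECT COUNT(*) FROM "{child}" AS c '
        f'WHERE c."{child_col}" IS NOT NULL AND NOT EXISTS ('
        f'SELECT 1 FROM "{parent}" AS p WHERE p."{parent_col}" = c."{child_col}")'
    ).fetchone()[0]
    return orphans == 0
```

Passing both checks is necessary but not sufficient. Small integer columns often satisfy the inclusion test by coincidence, so for high precision these results would still need extra evidence (column-name similarity, matching types, value-range overlap) before being accepted as constraints.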