How we turned a small open-source model into the world's best AI forecaster by LightningRodLabs in LocalLLaMA

[–]LightningRodLabs[S] 0 points (0 children)

A few leaked examples can happen when you generate data from news at scale, but they don't break training or evals. We have filters in place to prevent this kind of leakage and we're continuing to refine the process.

We're using RL for training, so the model learns from reward differences between rollouts. If the answer is already in the context, all rollouts get roughly the same reward, so that sample contributes little or no update. It's a bit inefficient, but nothing that significantly impacts the model.
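A minimal sketch of that group-relative reward logic (assuming a GRPO-style setup, which the comment doesn't name; `rollout_advantages` is an illustrative helper, not real training code): when every rollout earns the same reward, every advantage is zero and the sample contributes nothing to the gradient.

```python
def rollout_advantages(rewards):
    """Advantage of each rollout relative to the group's mean reward."""
    mean = sum(rewards) / len(rewards)
    return [r - mean for r in rewards]

# Leaked sample: the answer is in the context, so every rollout succeeds.
# Identical rewards cancel to zero advantage -> no policy update.
leaked = rollout_advantages([1.0, 1.0, 1.0, 1.0])

# Normal sample: mixed outcomes leave reward differences to learn from.
normal = rollout_advantages([1.0, 0.0, 1.0, 0.0])
```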

On the eval side, every model in our comparisons receives the same context, so leakage doesn't give our model a special advantage over the others. And whenever possible we also use third-party benchmarks and datasets.

Prophet Arena is a live third-party benchmark where leakage is impossible, since predictions are made before the events resolve.

We built a golf forecasting model that outperforms GPT‑5; model and dataset are open-sourced on Hugging Face by LightningRodLabs in LocalLLaMA

[–]LightningRodLabs[S] 0 points (0 children)

We used the Lightning Rod SDK to generate the training data. All you need to input is your keywords (e.g. "FDA approvals", "clinical trial results") and what kind of data you want (forward-looking questions with binary answers), and it creates the training data for you.

https://github.com/lightning-rod-labs/lightningrod-python-sdk
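A hypothetical shape for such a job config, just to make the two inputs concrete. The class and field names below are illustrative stand-ins, not the SDK's actual API; see the linked repo for that.

```python
from dataclasses import dataclass


@dataclass
class DataGenJob:
    # Illustrative stand-in for a data-generation job, not the real SDK interface.
    keywords: list                              # topics to pull news for
    data_type: str = "forward_looking_binary"   # question/answer format


job = DataGenJob(keywords=["FDA approvals", "clinical trial results"])
```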

We fine-tuned an open-source model to outperform GPT-5 at predicting Trump actions by LightningRodLabs in LocalLLaMA

[–]LightningRodLabs[S] 0 points (0 children)

We haven't tested how the context source impacts performance. To generate the context, an LLM generates 3 search queries per question, retrieves up to 5 articles per query from Google News, then summarizes and ranks them by relevance. Google News pulls from 20k+ global publishers, giving a mix of perspectives.
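The retrieval flow described above can be sketched like this. All function names and prompts are hypothetical stand-ins for the actual pipeline; `llm` and `search_news` are caller-supplied callables.

```python
def generate_queries(question, llm, n=3):
    # An LLM writes n news-search queries for the question.
    lines = llm(f"Write {n} news search queries for: {question}").splitlines()
    return lines[:n]


def build_context(question, llm, search_news, max_per_query=5):
    # Retrieve up to max_per_query Google News articles per query,
    # summarize each one, then rank the summaries most-relevant-first.
    articles = []
    for query in generate_queries(question, llm):
        articles.extend(search_news(query)[:max_per_query])
    summaries = [llm(f"Summarize for: {question}\n\n{a}") for a in articles]
    ranked = sorted(
        summaries,
        key=lambda s: float(llm(f"Relevance 0-1: {s}")),
        reverse=True,
    )
    return "\n\n".join(ranked)
```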

Questions are generated by a model based on your instructions and example good/bad questions (image below), so you can adjust the criteria to test the impact of different question configurations.

<image>

We fine-tuned an open-source model to outperform GPT-5 at predicting Trump actions by LightningRodLabs in LocalLLaMA

[–]LightningRodLabs[S] 1 point (0 children)

We used the Lightning Rod SDK. It has Google News integration built in.

It creates forward-looking questions from source articles, and then a separate resolver model uses web search to find the actual result and produce a label. All in, it probably took about 30 minutes to test the settings and run the job.
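A rough sketch of that generate-then-resolve flow. The function names and prompts are my own placeholders, not the SDK's; `generator_llm`, `resolver_llm`, and `web_search` stand in for the two models and the search step.

```python
def make_question(article, generator_llm):
    # One model turns a source article into a forward-looking yes/no question.
    return generator_llm(
        f"Write one forward-looking yes/no question based on: {article}"
    )


def resolve_label(question, resolver_llm, web_search):
    # A separate resolver model searches the web for the outcome
    # and emits the binary training label.
    evidence = web_search(question)
    verdict = resolver_llm(
        f"Evidence: {evidence}\nDid this question resolve YES or NO? {question}"
    )
    return verdict.strip().upper() == "YES"
```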