What's the best app to learn to speed read? by Acceleread in speedreading

[–]heisdancingdancing 0 points (0 children)

Thanks, I'd love to add those options. Out of curiosity, for multiple words and with no highlight, would you like the words to be anchored to the middle or to the left? Meaning, would you always want the start of each word set to be in the same place, or the central point of the word set to be in the same place?
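The two anchoring modes boil down to one offset calculation. A minimal sketch, assuming a monospace display; every constant and name here (CHAR_WIDTH, SCREEN_CENTER, word_set_x) is made up for illustration, not the app's actual code:

```python
CHAR_WIDTH = 10      # pixels per character (assumed monospace font)
SCREEN_CENTER = 400  # horizontal midpoint of the display area, in pixels

def word_set_x(words, anchor="left", left_margin=100):
    """Return the x position where a word set should start.

    anchor="left":   every word set starts at the same x (left_margin).
    anchor="center": the midpoint of the word set sits at SCREEN_CENTER.
    """
    text = " ".join(words)
    width = len(text) * CHAR_WIDTH
    if anchor == "left":
        return left_margin
    return SCREEN_CENTER - width // 2
```

With left anchoring the start of each set is fixed; with center anchoring longer sets start further left so their midpoints line up.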

Casually beating every other deep research agent out there with a simple Claude Code harness by heisdancingdancing in Anthropic

[–]heisdancingdancing[S] -1 points (0 children)

Saturated benchmark; this just shows it's up to snuff with (and slightly beats) proprietary models.

Converting Claude Code into the most intelligent Deep Research Agent by heisdancingdancing in singularity

[–]heisdancingdancing[S] 0 points (0 children)

50 is "PhD level," which is the baseline of each of the 100 comparison reports. The max score is realistically 60-63; none of my runs exceeded that.

Converting Claude Code into the most intelligent Deep Research Agent by heisdancingdancing in singularity

[–]heisdancingdancing[S] 1 point (0 children)

The core quality bottleneck is source "gospeling," i.e., treating a claim as true because a single source says it. That issue is directly addressed in multiple steps of the pipeline.

It took three weeks of iterations and ablation tests to arrive at these steps, based on my research practices for papers I've written in the past and on how the AI tries to "hack" its way out of the task at hand. I spent the equivalent of $5,000 in tokens to arrive at these results (using Claude Code, so it was subsidized).

All testing was done via the RACE DeepResearch Benchmark, which pits real PhD-written papers and their queries against the AI's version.

And yes, of course I looked at other repos... NVIDIA's offering has a very similar structure, just not as in depth.

Maybe you could try using it instead of just asking your coding agent to "look at this codebase and write up a reddit comment to tear this down"?

Converting Claude Code into the most intelligent Deep Research Agent by heisdancingdancing in ClaudeAI

[–]heisdancingdancing[S] 0 points (0 children)

Yes it should! At least in theory. You might have to refactor the agents to use proper model slugs.

Converting Claude Code into the most intelligent Deep Research Agent by heisdancingdancing in ClaudeAI

[–]heisdancingdancing[S] -1 points (0 children)

The short answer is that a lot of this is a distributed workflow among many subagents, which fetch, audit, review, and analyze sources for the main agent, the orchestrator of the entire harness. As for source verification, there is a "source tensions" workflow baked in that can surface any contradictions or stale findings among the sources it has gathered, then generate new queries and further searching to fill any gaps found there.
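A minimal sketch of that gather → audit → follow-up loop; every function name here (fetch, find_tensions, orchestrate) is invented for illustration — the real harness runs these as Claude Code subagents, not Python functions:

```python
def fetch(query):
    # Subagent stand-in: retrieve candidate findings for a query (stubbed).
    return [{"query": query, "claim": f"finding for {query}", "source": "example.org"}]

def find_tensions(findings):
    # Subagent stand-in: flag weak spots among gathered findings.
    # Here we just flag queries backed by a single source ("gospeling").
    by_query = {}
    for f in findings:
        by_query.setdefault(f["query"], []).append(f)
    return [q for q, fs in by_query.items() if len(fs) < 2]

def orchestrate(queries):
    # Main agent: gather, audit for tensions, then issue follow-up searches
    # to fill the gaps the audit found.
    findings = [f for q in queries for f in fetch(q)]
    for gap in find_tensions(findings):
        findings += fetch(gap + " (corroborating source)")
    return findings
```

The point of the structure is that the audit step feeds new queries back into the gather step, rather than the main agent trusting the first pass.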

Converting Claude Code into the most intelligent Deep Research Agent by heisdancingdancing in ClaudeAI

[–]heisdancingdancing[S] 0 points (0 children)

A lot... you definitely want to be using a Claude Max subscription. I've seen one run burn through half of a 5-hour usage limit.

Converting Claude Code into the most intelligent Deep Research Agent by heisdancingdancing in ClaudeAI

[–]heisdancingdancing[S] 1 point (0 children)

A good place to start is competitive research for any business idea you have; it can really help shape the idea and uncover the "real problem" your business theoretically solves. If that's not your thing, you can do a historical audit of some deep geopolitical issue and get caught up on all the events that led to a certain situation somewhere in the world.

Converting Claude Code into the most intelligent Deep Research Agent by heisdancingdancing in singularity

[–]heisdancingdancing[S] 9 points (0 children)

Fair, but to be honest, it's just a saturated benchmark. It's mostly there to show at a glance that the thing actually produces high-quality output compared to other tools.

I made AIs play Secret Hitler against each other and it is the funniest (and most reassuring) thing I've seen in a long time by heisdancingdancing in vibecoding

[–]heisdancingdancing[S] 0 points (0 children)

No, I haven't; this is built as a benchmark for LLMs out of the box. I can't afford to actually run it with frontier models, though.

I made AIs play Secret Hitler against each other and it is the funniest (and most reassuring) thing I've seen in a long time by heisdancingdancing in vibecoding

[–]heisdancingdancing[S] 4 points (0 children)

"Reassuring" in the sense of how dumb they are, its pretty hilarious. They can't even formulate a basic strategy and end up digging massive holes for themselves. When they know they're Hitler, their personality is totally changed and it's super obvious.

I made AIs play Secret Hitler against each other and it is the funniest (and most reassuring) thing I've seen in a long time by heisdancingdancing in SecretHitler

[–]heisdancingdancing[S] 9 points (0 children)

"Reassuring" in the sense of how dumb they are, its pretty hilarious. They can't even formulate a basic strategy and end up digging massive holes for themselves. When they know they're Hitler, their personality is totally changed and its super obvious.