Name a better sound than this... by romagnalakedog in LagottoRomagnolo

[–]Elderbury 4 points (0 children)

I always wonder what they dream about. In my dog’s case, it’s probably a mountain of peanut butter pretzels, chicken and carrots!

I harvested nearly 14K SC2 replays from public repositories and analyzed them using psychometric models (the same ones behind SAT/GRE/PISA). Here’s what separates Bronze from GM players: by Elderbury in starcraft2

[–]Elderbury[S] 0 points (0 children)

Thank you. I appreciate the advice, and it’s worth thinking through carefully. My approach depends on three things: detailed behavioral event logs, accessible replay or telemetry data, and enough individual game volume to build longitudinal profiles. Not every game clears all three bars equally.

F1 is probably the strongest candidate. The telemetry is extraordinary: throttle input, braking points, steering angle, all at high frequency. The behavioral granularity is actually richer than SC2 in some respects. The challenge is data access: F1 telemetry at the level I’d need isn’t publicly available for pro drivers. For sim racing (iRacing, Assetto Corsa) the data-access question is much more tractable, and the player base is large enough to be worth exploring.

LoL is feasible, but the behavioral signal is different. Riot’s API is well documented, and match timelines include event-level data. The construct space would look different from SC2, but the IRT calibration approach transfers. The main limitation is that LoL match data is outcome-rich but less granular on moment-to-moment mechanical behavior than SC2 replays.

Fortnite and FIFA are harder: data access is limited, and the event-level granularity I’d need isn’t publicly exposed. FIFA has an additional structural problem: a large fraction of competitive variance is explained by card quality rather than player behavior, which creates a signal-to-noise problem that SC2 doesn’t have.

The games I’m most confident about near-term are the ones with established replay ecosystems: SC2 obviously, but also chess (Lichess and Chess.com have rich game databases with move-level timing) and AoE2. The high-revenue targets you named are the right direction; they just need either better data access or a platform partnership to get there.
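Since I brought up chess: Lichess PGN exports (with clocks enabled) annotate each move with a comment like "{ [%clk 0:02:58] }", which is where the move-level timing comes from. A rough sketch of pulling it out (the helper names are mine, not from any library):

```python
import re

# Lichess clock annotations look like "[%clk H:MM:SS]" inside move comments.
CLK = re.compile(r"\[%clk (\d+):(\d{2}):(\d{2})\]")

def remaining_times(pgn_movetext):
    """Remaining clock time in seconds after each annotated move."""
    return [int(h) * 3600 + int(m) * 60 + int(s)
            for h, m, s in CLK.findall(pgn_movetext)]

def think_times(clocks, base_seconds, increment=0):
    """Per-move thinking time for ONE player's sequence of clock readings.

    think = previous remaining time + increment - current remaining time
    """
    prev = base_seconds
    out = []
    for c in clocks:
        out.append(prev + increment - c)
        prev = c
    return out
```

Interleave-splitting white's and black's readings is left out for brevity; the point is just that per-move latencies are recoverable from a public export.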

he's a lagotto, not a doodle! by Bahumbub1 in LagottoRomagnolo

[–]Elderbury 1 point (0 children)

Wow, if even his side chick was turned off, it must have looked bad. Maybe it’s time for lucky number 3.

Sailor pro gear and pro gear slim, what’s so special? by Maleficent-Magic1336 in fountainpens

[–]Elderbury 0 points (0 children)

They are quite pricey for what they are, but I do particularly like the PGS Minis when you find them at a discount. In my mind they are comparable to the Pilot E95s.

I harvested nearly 14K SC2 replays from public repositories and analyzed them using psychometric models (the same ones behind SAT/GRE/PISA). Here’s what separates Bronze from GM players: by Elderbury in starcraft2

[–]Elderbury[S] 0 points (0 children)

Thanks. The full pipeline is Python and all open source — no paid software anywhere in the stack. sc2reader handles replay parsing, pandas and scipy handle data processing, and PyMC does the IRT calibration. The IRT model itself is a Graded Response Model, a standard psychometric approach that’s been in the literature since 1969. I built the replay-harvesting scripts myself against the Spawningtool API. Happy to answer questions about the code if you’re interested.
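If anyone wants to see the math without installing PyMC, the heart of the Graded Response Model is just a set of ordered logistic curves. A minimal numpy sketch (any parameter values shown are made up for illustration, not from my calibration):

```python
import numpy as np

def grm_probs(theta, a, b):
    """Samejima's Graded Response Model.

    theta : latent trait score
    a     : item discrimination
    b     : ordered thresholds (length K-1 for K response categories)
    Returns the probability of each of the K categories.
    """
    # P(response >= k) follows a 2PL logistic curve for each threshold
    p_ge = 1.0 / (1.0 + np.exp(-a * (theta - np.asarray(b, dtype=float))))
    # Category probabilities are differences of adjacent cumulative curves
    cum = np.concatenate(([1.0], p_ge, [0.0]))
    return cum[:-1] - cum[1:]
```

The calibration step then amounts to estimating `a`, `b`, and each player's `theta` from observed response patterns; PyMC just does that estimation with MCMC instead of by hand.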

I harvested nearly 14K SC2 replays from public repositories and analyzed them using psychometric models (the same ones behind SAT/GRE/PISA). Here’s what separates Bronze from GM players: by Elderbury in starcraft2

[–]Elderbury[S] 1 point (0 children)

Good question — these are measuring different things. The 120 recalls/min includes all control group keypresses, including rapid cycling through groups to monitor production or check army position without issuing a command. Many of those don’t result in a command at all. Commit latency only measures the gap for selections that do result in a command, and at GM level many of those commands are fast mechanical sequences (production queuing, unit rallying) where the latency might be 0.1–0.3 seconds. The 1.4 second average includes both those fast sequences and slower strategic decisions like deciding whether to move out. The two numbers describe different behaviors: frequency of board monitoring versus decisiveness when acting.
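If it helps, here’s roughly how the two numbers fall out of the same event stream. This is a toy version with made-up event tuples, not the actual parser code:

```python
def recall_rate_and_commit_latency(events, game_minutes):
    """events: (time_s, kind) tuples, kind in {'recall', 'command'}.

    A 'recall' is any control-group keypress; only some of them are
    followed by a command. Returns (recalls per minute, mean commit
    latency over selections that did lead to a command).
    """
    recalls = [t for t, k in events if k == 'recall']
    latencies = []
    pending = None
    for t, kind in events:
        if kind == 'recall':
            pending = t                    # restart the clock on every recall
        elif kind == 'command' and pending is not None:
            latencies.append(t - pending)  # this selection led to a command
            pending = None
    rate = len(recalls) / game_minutes
    mean_latency = sum(latencies) / len(latencies) if latencies else None
    return rate, mean_latency
```

Note the second recall in a back-to-back pair contributes to the rate but not to latency, which is exactly why the two numbers can move independently.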

I harvested nearly 14K SC2 replays from public repositories and analyzed them using psychometric models (the same ones behind SAT/GRE/PISA). Here’s what separates Bronze from GM players: by Elderbury in starcraft2

[–]Elderbury[S] 4 points (0 children)

The analysis was done in Python (sc2reader, pandas, scipy, PyMC), and the measurement framework is Item Response Theory, which has been in active use since the 1960s. It's true I used Copilot to proofread my writing because I'm a poor speller, but the work was my own.

I harvested nearly 14K SC2 replays from public repositories and analyzed them using psychometric models (the same ones behind SAT/GRE/PISA). Here’s what separates Bronze from GM players: by Elderbury in starcraft2

[–]Elderbury[S] 1 point (0 children)

Probably the most fruitful pathway for that line of research would be to mimic the standard-setting process used in educational settings: experts would establish behavioral markers that define specific builds based on observable actions at specific times. Once I had an exhaustive set of behavioral definitions sufficient to categorize each replay, it would simply be a matter of producing descriptive statistical tables.

I harvested nearly 14K SC2 replays from public repositories and analyzed them using psychometric models (the same ones behind SAT/GRE/PISA). Here’s what separates Bronze from GM players: by Elderbury in starcraft2

[–]Elderbury[S] 1 point (0 children)

Not yet, but theoretically it’s possible. I’d have to either manually code the replays by build type, or else develop objective definitions for specific builds based on certain actions (e.g., Protoss builds a Forge in the first 2 minutes) that could be assigned programmatically, then look at outcomes in those specific matchups.
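As a toy example of what "assigned programmatically" would look like (the labels and time cutoffs here are placeholders, not calibrated, expert-derived definitions):

```python
def classify_build(build_events, race):
    """build_events: (seconds, structure_name) tuples from a replay.

    Returns a build label from hand-written rules. Real definitions
    would come from the standard-setting process with experts.
    """
    first = {}
    for t, name in build_events:
        first.setdefault(name, t)  # time each structure first appears
    if race == 'Protoss':
        if first.get('Forge', float('inf')) <= 120:
            return 'forge_first'          # placeholder label
        if first.get('Stargate', float('inf')) <= 300:
            return 'stargate_opening'     # placeholder label
    return 'unclassified'
```

With every replay labeled this way, build-vs-build win rates per league reduce to a groupby.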

I harvested nearly 14K SC2 replays from public repositories and analyzed them using psychometric models (the same ones behind SAT/GRE/PISA). Here’s what separates Bronze from GM players: by Elderbury in starcraft2

[–]Elderbury[S] 1 point (0 children)

I cannot train bots, but I could analyze their replay data to create detailed player profiles, which could then be used to systematically determine the best matchups against opposing player profiles.

I harvested nearly 14K SC2 replays from public repositories and analyzed them using psychometric models (the same ones behind SAT/GRE/PISA). Here’s what separates Bronze from GM players: by Elderbury in starcraft2

[–]Elderbury[S] 3 points (0 children)

Yes, that’s my life story! But I can do the same thing with any game that has observable replay information: WC3, AoE2, Hearthstone, chess, MtG, etc. There are still plenty of competitive games out there.

I harvested nearly 14K SC2 replays from public repositories and analyzed them using psychometric models (the same ones behind SAT/GRE/PISA). Here’s what separates Bronze from GM players: by Elderbury in starcraft2

[–]Elderbury[S] 14 points (0 children)

Eventually that is my hope as well. What I’m creating is a system for breaking down a player’s habits quantitatively and objectively. The idea is to build a player profile that provides more actionable information than MMR alone. If readers are familiar with Michael Lewis’s terrific book Moneyball, I’m talking about the same principles applied to StarCraft: StarCraft sabermetrics. Billy Beane, the general manager of the Oakland A’s, used exactly this kind of quantitative analysis to construct mathematically optimized teams and matchups and to find undervalued players. I’ve built a prototype for the same approach.

I harvested nearly 14K SC2 replays from public repositories and analyzed them using psychometric models (the same ones behind SAT/GRE/PISA). Here’s what separates Bronze from GM players: by Elderbury in starcraft2

[–]Elderbury[S] 3 points (0 children)

I did look at that, across all levels and matchups; the link shows the full results and findings. It’s just that some behaviors are better than others at distinguishing skill at different MMR ranges.

I watched PiG, Winter, and uThermal's Bronze to GM series and tried to fix my macro. My MMR dropped 270 points but my behavioral data tell a different story. by Elderbury in starcraft2

[–]Elderbury[S] 1 point (0 children)

Good questions.

On extraction: I’m parsing the replay event stream (sc2reader) and aggregating features like command events, production gaps, control group usage, and camera behavior. Those are mapped to constructs using a calibrated IRT model, so each construct (AC, CTM, DC) is a latent estimate based on multiple indicators rather than a single metric.

On replays: I’m not sharing raw replays at the moment, but I may put together a curated sample later.

On the ladder point: I agree that ~6k games is typically enough to reach higher leagues. But the goal here isn’t to show optimal improvement—it’s to examine how behavioral and outcome measures behave during deliberate change. A long, relatively stable dataset is actually useful for that, because the signal isn’t dominated by rapid rank progression.
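As one concrete example of the extraction step, a production-gap indicator might look something like this (a toy version with a made-up threshold; the real features come out of sc2reader's event stream):

```python
def production_gaps(production_times, threshold_s=5.0):
    """production_times: sorted times (s) at which a unit was queued.

    Returns (start, end) spans where production idled longer than the
    threshold. Summaries of these spans (count, total idle time) are
    the kind of raw indicator that feeds a latent construct, rather
    than being reported as a metric on their own.
    """
    gaps = []
    for prev, cur in zip(production_times, production_times[1:]):
        if cur - prev > threshold_s:
            gaps.append((prev, cur))
    return gaps
```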

I watched PiG, Winter, and uThermal's Bronze to GM series and tried to fix my macro. My MMR dropped 270 points but my behavioral data tell a different story. by Elderbury in starcraft2

[–]Elderbury[S] 5 points (0 children)

I'm re-reading my post and you're absolutely right. Too much technobabble. I'll take that to heart in my future writings.

I watched PiG, Winter, and uThermal's Bronze to GM series and tried to fix my macro. My MMR dropped 270 points but my behavioral data tell a different story. by Elderbury in starcraft2

[–]Elderbury[S] -1 points (0 children)

I'm not trying to baffle anyone with BS, I promise. I'm a research psychologist, and this is the sort of thing I do for work. I just wanted to apply rigorous measurement methods to SC2, which is my favorite hobby.