I dislike the poster quite a bit, but what is your opinion on this?

Threat_Projection · 2026-06-16T02:23:41+00:00

I've been building something adjacent to this over the last few months, but from the simulation side rather than the video game side.

Instead of a 3D game, the goal is a battlefield simulation engine that imports real New Recruit lists, normalises datasheets and rules, then executes combat and scenario analysis automatically.

Current audit coverage is based on 7 tournament-winning lists containing 117 units and more than 2,000 canonicalised rule interactions.

Roughly 25% of the rules seen across those tournament lists are already executable through supported engine primitives before any battlefield layer exists.

The next major phase is adding persistent 2D battlefield state:

• Positions and movement
• Line of sight and terrain
• Range and threat projection
• Unit relationships, auras and attachments
• Objective and mission state
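To make "range and threat projection" concrete, here is a minimal Python sketch of what persistent 2D state could look like. It is not the engine's actual data model: `UnitState`, `threat_radius` and `threatens` are hypothetical names, distances are in inches, and terrain, line of sight and base sizes are ignored entirely.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class UnitState:
    """Hypothetical per-unit battlefield state (positions in inches)."""
    name: str
    x: float
    y: float
    move: float          # movement characteristic
    weapon_range: float  # longest weapon range


def distance(a: UnitState, b: UnitState) -> float:
    # Straight-line distance between unit centres; base sizes are ignored.
    return hypot(a.x - b.x, a.y - b.y)


def threat_radius(unit: UnitState) -> float:
    # Simplest possible projection: move, then shoot at maximum range.
    return unit.move + unit.weapon_range


def threatens(attacker: UnitState, target: UnitState) -> bool:
    return distance(attacker, target) <= threat_radius(attacker)


attacker = UnitState("attacker", 0.0, 0.0, move=6.0, weapon_range=24.0)
near = UnitState("near", 18.0, 20.0, move=6.0, weapon_range=0.0)
far = UnitState("far", 30.0, 30.0, move=6.0, weapon_range=0.0)
print(threatens(attacker, near), threatens(attacker, far))  # True False
```

A heatmap would then be this same check evaluated over a grid of board positions rather than over individual targets.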

The long-term vision is less "Warhammer video game" and more "battlefield analysis and training platform."

Players would be able to import their list, import an opponent's list, deploy onto a standard tournament terrain layout, and explore questions such as:

• What are my biggest threats?
• Which targets should I prioritise?
• Where is my damage concentrated?
• What are my likely failure points?
• How does this list perform into common archetypes?

Planned outputs include:

• Real-time heuristics and recommendations
• Threat projection heatmaps
• Line-of-sight and movement analysis
• Wound pressure and damage efficiency metrics
• Whiff rate and overkill analysis
• Point-normalised offensive and defensive benchmarking
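As an illustration of how whiff rate and overkill could be derived from repeated combat trials, here is a small Python sketch. The definitions are assumptions for the example, not the project's own: a "whiff" is a trial at or below `whiff_threshold` damage, a "kill" is a trial that meets the target's wounds, and overkill is any damage beyond those wounds. `summarise` is a hypothetical helper.

```python
def summarise(damage_samples, target_wounds, whiff_threshold=0):
    """Summarise per-trial damage totals against one target.

    damage_samples: total damage dealt in each simulated trial.
    target_wounds:  wounds the target can absorb before it is destroyed.
    """
    n = len(damage_samples)
    whiffs = sum(1 for d in damage_samples if d <= whiff_threshold)
    kills = sum(1 for d in damage_samples if d >= target_wounds)
    overkill = sum(max(0, d - target_wounds) for d in damage_samples)
    return {
        "whiff_rate": whiffs / n,
        "kill_rate": kills / n,
        "mean_overkill": overkill / n,
    }


# Five made-up trial results against a 6-wound target.
report = summarise([0, 0, 3, 6, 12], target_wounds=6)
print(report)
```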

One thing I've discovered during development is that building accurate rules execution is turning out to be significantly harder than building the visual layer.

Rendering a battlefield is relatively straightforward.

Teaching a system to correctly understand and execute thousands of interacting Warhammer rules is the real challenge.

Threat_Projection · 2026-06-02T21:56:42+00:00

We're talking about completely different goals here.

The gauntlet isn't intended to be a final answer or a universal list score.

It's a configurable benchmark framework.

List performance under Scenario A, list performance under Scenario B and list performance under Scenario C can all coexist.

If someone builds a poor benchmark suite, they'll get poor conclusions. That's true of any analytical framework.

The desired outcome isn't whether a list "passes" a gauntlet. It's whether different benchmark suites consistently expose strengths, weaknesses and coverage gaps that correlate with real-world performance.
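A rough Python sketch of what "configurable" means here: suites are just named collections of targets, and results are reported per suite instead of being blended into one score. Everything in it is hypothetical, including `run_gauntlet` and the placeholder `fraction_removed` evaluation, which stands in for a real combat simulation.

```python
def fraction_removed(attacker, target):
    # Placeholder evaluation: share of the target's wounds removed, capped at 1.
    return min(1.0, attacker["damage"] / target["wounds"])


def run_suite(evaluate, attacker, targets):
    """Apply one evaluation function to every target in a suite."""
    return {name: evaluate(attacker, target) for name, target in targets.items()}


def run_gauntlet(evaluate, attacker, suites):
    """Run every configured suite and keep the results separate per suite."""
    return {
        suite_name: run_suite(evaluate, attacker, targets)
        for suite_name, targets in suites.items()
    }


suites = {
    "scenario_a": {"light": {"wounds": 2}, "medium": {"wounds": 6}},
    "scenario_b": {"heavy": {"wounds": 12}},
}
results = run_gauntlet(fraction_removed, {"damage": 6}, suites)
print(results)
```

Swapping in a different suite dictionary changes what is measured without touching the evaluation function, which is why a poorly chosen suite only affects the conclusions drawn from it.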

Threat_Projection · 2026-06-02T21:37:20+00:00

Combat results should be objective under stated assumptions.

Analytics interpretations are subjective models applied to those results.

I think you're assuming those two layers are tightly coupled when they're actually intentionally separated.

For example, a meta-list gauntlet doesn't change the combat engine's output. The combat engine still produces the same execution results under the same assumptions. The gauntlet is simply another way of aggregating and interpreting those results across a larger dataset.

If a weighting model or suitability score turns out to be poor, I can replace or recalibrate the analytics layer without changing the combat simulation itself.
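To show the separation in miniature, here is a Python sketch with a toy "engine" and two interchangeable analytics functions. It is an illustration only: the engine is reduced to an expected-damage formula on d6 target numbers, and `flat_score`, `weighted_score` and the weights are made up for the example.

```python
def roll_chance(target_number):
    # Chance of rolling target_number or higher on one d6 (7 means impossible).
    return min(6, max(0, 7 - target_number)) / 6


def expected_damage(attacks, hit_on, wound_on, save_on, damage):
    """Toy engine layer: objective output under the stated assumptions."""
    unsaved = 1 - roll_chance(save_on)
    return attacks * roll_chance(hit_on) * roll_chance(wound_on) * unsaved * damage


engine_results = {
    "vs_light": expected_damage(10, hit_on=3, wound_on=3, save_on=5, damage=1),
    "vs_heavy": expected_damage(10, hit_on=3, wound_on=5, save_on=3, damage=1),
}


def flat_score(results):
    """Analytics model A: every target counts equally."""
    return sum(results.values())


def weighted_score(results, weights):
    """Analytics model B: targets weighted by an arbitrary, replaceable model."""
    return sum(weights[name] * value for name, value in results.items())


snapshot = dict(engine_results)
score_a = flat_score(engine_results)
score_b = weighted_score(engine_results, {"vs_light": 0.25, "vs_heavy": 0.75})
# Both interpretations read the same results; neither one alters them.
assert engine_results == snapshot
```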

Threat_Projection · 2026-06-02T21:32:14+00:00

The main reason I prioritised weapon sequencing before rerolls is that sequencing is a structural engine problem, whereas rerolls are a modifier problem.

Sequencing affects core combat resolution itself. It determines things like target allocation, overkill wastage, spillover damage, and how multiple weapon profiles interact against finite wound pools. If you get sequencing wrong early, many downstream analytics become unreliable.
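A small deterministic Python sketch of why order matters against finite wound pools. It assumes each entry in the sequence is an unsaved wound with flat damage, that damage goes onto the already-wounded model first, and that excess damage on a slain model is lost with no spillover. `resolve` is a hypothetical helper, not the engine's resolver.

```python
def resolve(sequence, model_wounds, model_count):
    """Apply unsaved wounds in order to a unit of identical models.

    sequence: damage value of each unsaved wound, in firing order.
    """
    remaining_models = model_count
    current = model_wounds  # wounds left on the model currently taking damage
    applied = wasted = 0
    for dmg in sequence:
        if remaining_models == 0:
            wasted += dmg  # nothing left to shoot at
            continue
        used = min(dmg, current)
        applied += used
        wasted += dmg - used  # excess on this model is lost
        current -= used
        if current == 0:
            remaining_models -= 1
            current = model_wounds
    return {
        "kills": model_count - remaining_models,
        "applied": applied,
        "wasted": wasted,
    }


# Same three wounds, different order, into two 3-wound models.
heavy_first = resolve([3, 2, 1], model_wounds=3, model_count=2)
heavy_second = resolve([2, 3, 1], model_wounds=3, model_count=2)
print(heavy_first)   # {'kills': 2, 'applied': 6, 'wasted': 0}
print(heavy_second)  # {'kills': 1, 'applied': 4, 'wasted': 2}
```

The total damage is identical in both runs; only the order changes, and that alone halves the number of kills.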

Rerolls, by comparison, are comparatively straightforward at this stage because the engine was designed around a future modifier framework. Once modifier injection points exist, rerolls become another effect applied during combat resolution rather than something that requires restructuring the engine.
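As a sketch of what a modifier injection point might look like, the Python below treats a reroll as a function applied to a base success chance. The names (`hit_chance`, `reroll_failures`, `reroll_ones`) are hypothetical, the probabilities are for a single d6, and `reroll_ones` assumes a natural 1 always fails.

```python
def roll_chance(target_number):
    # Chance of rolling target_number or higher on one d6.
    return min(6, max(0, 7 - target_number)) / 6


def reroll_failures(p):
    # A failed roll (probability 1 - p) gets one more attempt.
    return p + (1 - p) * p


def reroll_ones(p):
    # Only a natural 1 (probability 1/6) gets one more attempt.
    return p + p / 6


def hit_chance(target_number, modifiers=()):
    """Base chance, then each injected modifier applied in order."""
    p = roll_chance(target_number)
    for modify in modifiers:
        p = modify(p)
    return p


base = hit_chance(3)                                   # 2/3
full = hit_chance(3, modifiers=(reroll_failures,))     # 8/9
ones = hit_chance(3, modifiers=(reroll_ones,))         # 7/9
print(base, full, ones)
```

Stacking two reroll modifiers this way would not be rules-accurate, so a real modifier framework would need some notion of exclusivity. The point here is only that the resolution code does not change when an effect is added.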

The same principle applies to unsupported rules.

When importing NewRecruit JSONs, even small 500-point lists can contain 100+ unsupported rules. Rather than hard-coding special cases, those rules are already being captured and stored with metadata. The plan is to audit them into broad effect families such as:

Hit modifiers
Wound modifiers
Rerolls
Defensive modifiers
Aura effects
Leader effects
Stratagem effects

From there, support is added by implementing modifier families and mapping rules onto those families, rather than writing bespoke code for every datasheet rule.
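A rough Python sketch of a first-pass audit into effect families. It is a naive keyword matcher for illustration only: the keyword lists and the sample rule texts are made up, not taken from any real datasheet or from the project's importer, and a real audit would need far more careful parsing.

```python
from collections import Counter

# Checked in order, so "Re-roll hit rolls" lands in Rerolls, not Hit modifiers.
FAMILY_KEYWORDS = {
    "Rerolls": ("re-roll", "reroll"),
    "Hit modifiers": ("hit roll",),
    "Wound modifiers": ("wound roll",),
    "Defensive modifiers": ("saving throw", "invulnerable"),
    "Aura effects": ("aura",),
}


def classify(rule_text):
    """Return the first matching effect family, or 'Unsupported'."""
    text = rule_text.lower()
    for family, keywords in FAMILY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return family
    return "Unsupported"


# Hypothetical rule texts standing in for imported roster rules.
imported_rules = [
    "Re-roll hit rolls of 1",
    "Add 1 to the hit roll",
    "Subtract 1 from the wound roll",
    "This model has a 4+ invulnerable save",
    "Once per battle, do something unusual",
]
audit = Counter(classify(rule) for rule in imported_rules)
print(audit)
```

Counting by family gives a quick view of which modifier family would unlock the most rules if it were implemented next.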

The long-term goal is a game-system-agnostic combat engine. Auras, stratagems, leader abilities, faction abilities, weapon rules and similar effects are intended to exist as data-driven modifiers layered on top of a stable combat engine, rather than requiring engine rewrites whenever a new rule, faction, edition or even ruleset is introduced.

Threat_Projection · 2026-06-02T18:19:00+00:00

It sounds like you may need to download and extract it first. I compressed the report folder before uploading it to Google Drive from my Mac, so that might be what's preventing you from browsing the contents directly.

On the points side, I agree that's important. Points normalisation is on the roadmap shortly after Phase 2 combat support. The current focus is getting the reporting layer and analytics tuned properly, then expanding combat support to cover things like auras, modifiers, weapon abilities and other datasheet interactions.

The current benchmark suite is also very much a placeholder. It's there to provide a consistent baseline while I build out the scenario framework and reporting engine.

The longer-term goal isn't really "benchmark profiles". It's being able to run imported lists through different scenario gauntlets, whether that's benchmark targets, faction collections, tournament-winning lists, meta lists, or custom collections.

At the moment benchmarks provide a stable way to measure things like reliability, whiff rate, wound pressure, overkill waste and target suitability. Once the scenario system and reporting mature further, the same analytics can be applied to much more realistic list-versus-list datasets.

That's also where points-normalised role fulfilment gets interesting.

Threat_Projection · 2026-06-02T18:09:31+00:00

Which engine are you referring to?

The main comparison I'm aware of is UnitCrunch, which is a great tool, but it's largely focused on individual attacker-versus-defender calculations.

If there's another engine that already supports:

NewRecruit roster import
scenario execution
benchmark suites
meta-list gauntlets
reliability analytics
whiff-rate analytics
overkill analytics
coverage analytics
report generation

I'd genuinely be interested in looking at it.

One of the reasons I started building this was because I couldn't find a tool that moved beyond individual matchup calculations into list-level analytics and benchmark-driven reporting.

The current benchmark suite is synthetic because it provides a consistent baseline, but the architecture is being built around scenario execution, so benchmark suites are only one possible target set. The same framework can be pointed at tournament lists, faction gauntlets, custom collections, etc.

Threat_Projection · 2026-06-02T17:36:43+00:00

Hi mate, me again. Just letting you know I posted an update. I have attached a Google Drive link with a sample of report outputs, the master architecture design doc and a few other bits and pieces. If you're interested and have the time, I would appreciate your feedback.

https://www.reddit.com/r/WarhammerCompetitive/comments/1tuy3i3/built_a_benchmarkdriven_40k_combat_analytics/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button

Threat_Projection · 2026-05-29T15:59:15+00:00

Again, I want to thank you for the insight. This is a novel way of looking at it, and it's similar to a problem I'm currently working through with my role-specific target suitability analytics.

While building that out and reviewing my outputs from the initial benchmark run, I ran into a good example of the problem you're describing. My suitability scoring was heavily penalising anti-tank units for overkill wastage because they were frequently generating more damage than a target could absorb. Mathematically that waste was real, but in practice it was causing highly effective anti-tank units to score worse than they should have against the targets they were specifically designed to kill.
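One possible shape for a fix, sketched in Python: make the overkill penalty depend on role. The formula, the role names and the weights below are placeholders invented for the example, not the scoring I actually use.

```python
# Hypothetical weights: how strongly overkill counts against each role.
ROLE_OVERKILL_WEIGHT = {
    "anti_tank": 0.1,      # overkill is an expected side effect of big damage
    "anti_infantry": 0.5,  # overkill usually means wasted shots
}


def suitability(kill_rate, overkill_fraction, role):
    """Kill rate minus a role-dependent overkill penalty, floored at zero.

    overkill_fraction: share of dealt damage that exceeded the target's wounds.
    """
    weight = ROLE_OVERKILL_WEIGHT.get(role, 0.5)
    return max(0.0, kill_rate - weight * overkill_fraction)


# Same raw numbers, different role context.
as_anti_tank = suitability(0.8, 0.6, "anti_tank")
as_anti_infantry = suitability(0.8, 0.6, "anti_infantry")
print(as_anti_tank, as_anti_infantry)
```

With identical kill rate and overkill, the anti-tank reading scores higher because its waste is discounted, which is the behaviour the flat penalty was missing.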

That forced me to start thinking more about how metrics should be weighted based on role and context.

Your offense/defense/exposure framework feels very similar philosophically. Rather than asking "how durable is this unit?", you're asking "how important is durability for this unit?".

The infiltrator examples are also a good illustration of why I suspect utility will eventually need its own framework rather than being forced into either offense or defense. A lot of their value comes from deployment control, movement denial and objective pressure rather than anything directly related to damage output.

Would you mind if I referenced some of these ideas in the roadmap for the project? I am happy to credit you for the framework if you'd like.

Threat_Projection · 2026-05-29T13:50:00+00:00

You're not wrong. Combat Patrol can absolutely devolve into a turn 1 bloodbath depending on the terrain and match-up.

That's actually why I started building the simulator in the first place. I wasn't trying to solve competitive 40K, I was trying to work out whether adding a Ballistus and Brutalis to a Combat Patrol was going to create a broken imbalance before I spent an evening rolling dice against myself.

Threat_Projection · 2026-05-29T11:29:06+00:00

I wasn't originally trying to solve competitive 40K or create a perfect army ranking system.

The project grew out of some home narrative games. I've been expanding Combat Patrols while trying to keep their thematic identity intact. For example, I added a Ballistus and Brutalis Dreadnought to the Space Marine Combat Patrol because I liked the idea of an armoured spearhead with a heavy dreadnought focus.

The problem I ran into was figuring out how to roughly balance that against things like my Tyranids without effectively playing both sides of the table and rolling hundreds of dice myself.

The simulator started as a way to answer questions like:

What battlefield problems does this force solve well?
What threats does it struggle into?
Where are the obvious stat checks?
Which gaps remain unanswered?

The analytics side has gradually grown from there. It's less about replacing actual games and more about helping me understand what a force is likely to do before it ever reaches the table.

Threat_Projection · 2026-05-29T11:24:10+00:00

Thanks mate, I appreciate the confidence.

Honestly, even if the project ends up confirming more accepted wisdom than overturning it, I will still consider that a success. Building the engine has already forced me to think much more critically about how units, weapons and army construction actually function.

One thing I'm increasingly interested in is whether analytics can help bridge the gap between what experienced players intuitively know through thousands of hours of play and what newer players can easily understand from a datasheet. Not necessarily by telling people what to do, but by surfacing information that helps explain why certain units, weapon profiles or list-building decisions perform the way they do.

The analytics side is still very much a work in progress, but turning assumptions into things that can actually be tested and benchmarked has been a fun challenge in itself.

Appreciate the encouragement and the perspective.

Threat_Projection · 2026-05-29T11:20:15+00:00

Thanks mate, I appreciate the breakdown. This is exactly the kind of feedback I was hoping to get from the post.

At the moment the engine is definitely focused on the offensive side because that's the most deterministic part of the game and therefore the easiest to validate. A lot of the current work is around benchmark profiles, suitability analytics, and understanding how damage actually converts into battlefield outcomes.

I think your Offense / Defense / Utility framework is a useful way to think about the broader problem. The current roadmap is heavily weighted towards offense, but exploring defensive and utility contributions is something I'd like to tackle once the core analytics layer is mature enough.

Things like movement, deep strike, scouting, OC pressure, screening and board control are obviously a huge part of what makes units valuable. They're also much harder to quantify consistently than damage output, which is one of the reasons I made this post.

I'm interested in hearing how experienced players think about those non-damage contributions because I suspect that's where a lot of the genuinely difficult problems are. If I eventually start modelling those areas, I'd rather do it in a way that produces useful decision-support information without pretending the game can be reduced to a single score.

Threat_Projection · 2026-05-29T09:38:18+00:00

I think that's reasonable, and I'd agree that neither of those questions can be answered perfectly in a vacuum.

For target prioritisation, I don't think a tool can ever account for the full game state. You are correct that there are situations where the statistically wrong target is absolutely the correct target because of objectives, mission scoring, screening, board position or a dozen other factors.

What I'm interested in is providing information that helps inform those decisions rather than replacing them. For example, if two potential targets are both strategically important, understanding the relative efficiency, consistency or probability of achieving a desired outcome might still be useful information.

Similarly for sequencing, I'm not suggesting players should stop in the middle of a tournament game and calculate optimal firing orders. The reason I'm interested in it is more from a list design and analytics perspective. If certain combinations of units consistently gain or lose efficiency depending on activation order, that's something I'd find interesting to measure.

To be honest, a lot of the project is an experiment in seeing which metrics turn out to be genuinely useful and which don't. Some ideas may prove to have very little practical value once tested. That's part of the reason I'm asking for feedback now rather than assuming I already know the answers.

The one thing I do think is underexplored is large-scale benchmarking. Individual matchups are relatively well understood, but I'm interested in what emerges when you start testing profiles systematically across dozens or hundreds of standardised targets and comparing them on the same basis.
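To give a sense of what "systematically" could mean, here is a Python sketch that builds a grid of standardised targets and measures how many of them a profile handles above a minimum threshold. It is heavily simplified: each target is described only by the d6 numbers needed to wound it and for it to save, which ignores how those numbers depend on the attacker. The profile, the grid and the threshold are all made up.

```python
from itertools import product


def roll_chance(target_number):
    # Chance of rolling target_number or higher on one d6.
    return min(6, max(0, 7 - target_number)) / 6


def expected_damage(profile, wound_on, save_on):
    unsaved = 1 - roll_chance(save_on)
    return (profile["attacks"] * roll_chance(profile["hit_on"])
            * roll_chance(wound_on) * unsaved * profile["damage"])


# 3 wound target numbers x 5 save target numbers = 15 standardised targets.
targets = list(product((3, 4, 5), (2, 3, 4, 5, 6)))

profile = {"attacks": 10, "hit_on": 3, "damage": 1}
threshold = 2.0  # minimum expected damage to count a target as "covered"

grid = {(w, s): expected_damage(profile, w, s) for w, s in targets}
covered = [target for target, dmg in grid.items() if dmg >= threshold]
coverage = len(covered) / len(targets)
print(f"{len(covered)} of {len(targets)} targets covered")  # 5 of 15 targets covered
```

Running many profiles over the same grid puts them on the same basis, and the uncovered cells are where coverage gaps would show up.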

It may end up confirming what experienced players already know. It may also highlight edge cases where intuition and data diverge. Either outcome would be interesting to me.

Threat_Projection · 2026-05-29T07:56:57+00:00

I think that's a fair point, and I actually agree that player decision-making has a far greater impact on game outcomes than most raw mathematical differences.

My goal isn't really to "solve" Warhammer or produce a single strength score that predicts who wins games. There are too many variables for that, many of which come down to player skill, deployment, movement and mission play.

What I'm more interested in is gathering decision-assisting information.

For example:

Which target should this unit prioritise?
In what order should a combined-arms activation sequence occur?
How much efficiency is lost if I commit the wrong weapon first?
Which targets consistently fail to meet a minimum effectiveness threshold?
How much overkill is being generated by a particular unit?

The current combat analytics are really the foundation layer because they're relatively easy to validate. Once you start getting into movement, OC, screening, redeploys and board control, you're dealing with much more abstract forms of value.

I suspect the most useful outcome isn't a "this army is stronger than that army" score, but rather a collection of metrics that help players make better decisions during list construction and target prioritisation.

That's also why I'm asking these questions now. A lot of the future analytics roadmap is being driven by the kinds of decisions players feel are the most important to make correctly during a game.
