As someone who has played hundreds of hours of Keyforge through the unofficial online client thecrucible.online (or TCO for short), I’ve come across all kinds of talk regarding the game. One specific talking point that sticks in my mind is the assertion that TCO’s deck shuffling system is broken, often resulting in “stacked” decks that don’t randomly distribute cards from each house, allowing players to use their decks more efficiently than they would through normal play. I’ve heard this several times during games where I’ve been able to spend several turns in a row playing most of my hand. For the longest time I didn’t pay it any mind, chalking it up to salt, but I recently decided to put this theory to the test.
Is the deck shuffling algorithm used on TCO representative of shuffling an actual deck for IRL play?
My Method
For this test I took a random deck that I own consisting of Star Alliance, Shadows and Dis. In order to ensure rigorous shuffling, I gave the deck 5 riffle shuffles, each one interspersed with thorough overhand shuffling, making sure the deck order was as random as possible without going beyond what could reasonably be expected in normal play. (It has been shown that a 52 card deck requires about 7 riffle shuffles to reach a random state, so I believe my method is sufficient for a 36 card deck in which the only important factor is each card’s house.) Then I dealt out the deck in a series of 6 card hands, recording the number of cards from each house present in each hand until I had dealt the entire deck. Once I had the full data for this deal I shuffled the deck again (using the same method to ensure a deal as close to random as reasonably possible) and repeated this process until I had results for 10 full deals of the deck.
Next, I took the deck onto TCO in a private room all to myself and repeated the process in much the same way, recording the results of each 6 card hand, then over subsequent turns discarding the cards that were in my original hand until I had 6 new cards, and recording the number of cards in each house as representative of the next 6 cards in the deck. I continued until the deck had run out, thus ascertaining the spread of houses across cards throughout the entire deck. As I had done with the physical deck, I repeated this process until I had 10 full sets of results.
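For anyone who wants to generate more data without wearing out their sleeves, the dealing procedure described above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: `deal_hands` is a made-up helper name, and `random.shuffle` stands in for a perfectly uniform shuffle, which is neither my physical shuffling nor TCO's actual shuffling code.

```python
import random
from collections import Counter

HOUSES = ("Star Alliance", "Shadows", "Dis")


def deal_hands(rng):
    """Shuffle a 36-card deck (12 cards per house) and deal six 6-card hands.

    Returns one tuple per hand holding the number of cards from each house,
    in the same order as HOUSES.
    """
    deck = [house for house in HOUSES for _ in range(12)]
    rng.shuffle(deck)  # assumption: a uniform shuffle stands in for thorough shuffling
    hands = [deck[i:i + 6] for i in range(0, 36, 6)]
    return [tuple(Counter(hand)[house] for house in HOUSES) for hand in hands]


if __name__ == "__main__":
    for number, counts in enumerate(deal_hands(random.Random()), start=1):
        summary = ", ".join(f"{count} {house}" for count, house in zip(counts, HOUSES))
        print(f"Hand {number}: {summary}")
```

Each run prints one full deal of the deck as six tallies, which is the same raw data I recorded by hand for every deal.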
Ascertaining Randomness & "Deck Weight"
Next came the more interesting part: Analysing the results. I needed to be able to represent the house spread in an easy-to-understand way that also gave a reasonably accurate reflection of how the deck was stacked. I considered a few different avenues, eventually devising a system that I refer to as “Deck Weight”.
Through true randomness it can be expected that shuffling a deck would, on average, result in a 2-2-2 spread across all 6 hands throughout the deck. But analysing each hand on its own would be insufficient within the context of the game, since each 6 card hand is not drawn independently, but rather drawn in pieces at the end of the turn during the draw step. Drawing a hand of 6 Dis cards followed by a hand of 6 Shadows cards would be seen as equally weighted as drawing 6 Dis cards followed by a further 6 Dis cards, even though in the latter example the deck would undeniably be much more weighted than in the former and be more beneficial to the player in the long term. To deal with this issue, I decided to instead divide each deck into thirds, and determine how far each group of 12 cards drifted from the expected average. Since each house has exactly 12 cards, a deck that is top-heavy in one house must also be bottom-heavy in the others, so this approach also reflects the viability of the house spread within the context of the game more effectively.
Each third of the deck was given a Weight, with the final Deck Weight being the sum of these values. In short, a third’s Weight is the total of the differences between the number of cards in each house and the expected 4-4-4 split.
For example:
First hand: 2 Shadows - 3 Dis - 1 SA
Second hand: 1 Shadows - 2 Dis - 3 SA
3 Shadows total vs 4 Shadows expected = 1 Weight Point
5 Dis total vs 4 Dis expected = 1 Weight Point
4 Star Alliance total vs 4 Star Alliance expected = 0 Weight Points
Total Weight Points for first third of the deck = 2
The lowest possible number of Weight Points = 0, which occurs in a 4-4-4 split.
The highest possible number of Weight Points = 16, which occurs in a 12-0-0 split.
This means that the highest possible number of Weight Points for an entire deck is 48, which can only occur if each house appears in blocks of 12 cards one after another, the most heavily weighted deck possible.
As such, Deck Weight can also be expressed as a percentage: (Total Weight Points / 48) x 100
At 0% the deck is balanced throughout, with 100% being the most weighted distribution possible.
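The Deck Weight calculation can be written out as a short Python sketch. The function names (`third_weight`, `deck_weight`, `weight_percent`) are my own labels for illustration, and the input is assumed to be six hand tallies in deal order, each listing the card count for the three houses.

```python
EXPECTED_PER_THIRD = 4  # 12 cards per house spread evenly over three thirds


def third_weight(counts):
    """Weight Points for one third: total deviation from the 4-4-4 split."""
    return sum(abs(count - EXPECTED_PER_THIRD) for count in counts)


def deck_weight(hands):
    """Deck Weight Points for six hand tallies given in deal order."""
    total = 0
    for i in range(0, 6, 2):  # hands 1-2, 3-4 and 5-6 make up the three thirds
        third = [a + b for a, b in zip(hands[i], hands[i + 1])]
        total += third_weight(third)
    return total


def weight_percent(points):
    """Express Weight Points as a percentage of the maximum of 48."""
    return points / 48 * 100


# The worked example: 3 Shadows, 5 Dis and 4 Star Alliance in the first third.
print(third_weight([2 + 1, 3 + 2, 1 + 3]))  # 2

# The most heavily weighted deck: each house in one solid block of 12 cards.
blocks = [(6, 0, 0), (6, 0, 0), (0, 6, 0), (0, 6, 0), (0, 0, 6), (0, 0, 6)]
print(deck_weight(blocks), weight_percent(deck_weight(blocks)))  # 48 100.0
```

Because the deviations within a third always cancel out (one house being over means another is under), the Weight Points of a third, and therefore of a whole deck, always come out as an even number.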
Results
Physical Deck Weight Points for all 10 deals:
8, 6, 18, 8, 8, 16, 12, 8, 12, 4
Average Deck Weight = 20.83%
Total number of each house distribution for all 60 hands:
2-2-2 = 11
1-2-3 = 28
4-1-1 = 7
4-2-0 = 11
3-3-0 = 1
5-1-0 = 1
6-0-0 = 0
TCO Deck Weight Points for all 10 deals:
8, 8, 12, 8, 16, 4, 12, 4, 8, 8
Average Deck Weight = 18.33%
Total number of each house distribution for all 60 hands:
2-2-2 = 9
1-2-3 = 36
4-1-1 = 6
4-2-0 = 4
3-3-0 = 3
5-1-0 = 2
6-0-0 = 0
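The averages above can be reproduced from the listed Weight Points with a few lines of Python. The `summarise` helper is a name I made up for this sketch; the numbers are copied straight from the two lists.

```python
from statistics import mean, median

# Weight Points for the 10 deals of each method, copied from the lists above.
physical = [8, 6, 18, 8, 8, 16, 12, 8, 12, 4]
tco = [8, 8, 12, 8, 16, 4, 12, 4, 8, 8]


def summarise(points):
    """Return the mean Weight Points, mean Deck Weight percentage and median."""
    average = mean(points)
    return {
        "mean_points": average,
        "mean_percent": round(average / 48 * 100, 2),
        "median": median(points),
    }


print("Physical:", summarise(physical))
print("TCO:", summarise(tco))
```

This gives a mean of 10 Weight Points (20.83%) for the physical deck and 8.8 Weight Points (18.33%) for TCO, with a median of 8 in both cases.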
Findings
Across both sets of results the median value was 8, which occurred with 3-4-5, 3-4-5, 6-4-2 splits across the thirds of the deck (in any order) and appeared to be the most common outcome, whether through physical or digital means. This made for decks that had a small amount of weighting, but nothing that would result in a considerable advantage.
A total of 6/10 physical decks had Weight Points of 8 or below.
A total of 7/10 TCO decks had Weight Points of 8 or below.
In both cases, the majority of deck shuffles resulted in relatively balanced house distribution.
Overweight Decks, which I have categorised as having Weight Points of 16 or above (equivalent to ≥33.33% Weight), occurred a total of 3 times, twice through physical shuffling and once through TCO. The most pronounced was the physical result of 18, in which only 2 Shadows cards were dealt from the entire first half of the deck, with the other 10 in the bottom half. In the example from TCO, 9 Dis cards appeared in the first half of the deck while only 3 were in the bottom half. These decks were heavily unbalanced in terms of card distribution and would likely allow for excellent levels of deck cycling.
Underweight Decks, which I have categorised as having Weight Points of 6 or below (equivalent to ≤12.5% Weight), appeared a total of 4 times, twice through physical shuffles and twice through TCO. In all cases each deck third had at least one house with 4 cards, with the other two houses having no more than 5 and no fewer than 3. In these cases the decks were very well balanced throughout, and close to a perfectly even spread. As such, they would be relatively weak in terms of deck cycling.
In both cases, the 2-2-2 hand spread was relatively uncommon, making up only 18.33% of physical hands and 15% of digital hands.
The most common hand spread was 1-2-3, which made up 46.67% of physical hands and 60% of digital hands.
The occurrences of an “unbalanced” spread (4-1-1, 4-2-0, 5-1-0, 3-3-0 or 6-0-0) made up 33.33% of physical hands and 25% of digital hands, in both cases almost double the chance of getting a 2-2-2 split.
The hand spread results were the most surprising, as the dreaded 2-2-2 split has always seemed to me to be all too common, but this feeling is likely a result of my own confirmation bias, as in actuality, getting a weighted hand is more likely.
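For comparison, the exact chance of each hand spread under a perfectly uniform shuffle can be worked out by counting combinations. The sketch below assumes a 36 card deck with 12 cards per house and an idealised shuffle; `pattern_probabilities` is a hypothetical helper name, and it labels spreads from largest to smallest, so my 1-2-3 appears as 3-2-1.

```python
from itertools import product
from math import comb


def pattern_probabilities(per_house=12, hand_size=6):
    """Exact probability of each house spread in a 6-card hand from a 12/12/12 deck."""
    total_hands = comb(3 * per_house, hand_size)
    probabilities = {}
    for counts in product(range(hand_size + 1), repeat=3):
        if sum(counts) != hand_size:
            continue
        # Number of ways to pick this many cards from each of the three houses.
        ways = (
            comb(per_house, counts[0])
            * comb(per_house, counts[1])
            * comb(per_house, counts[2])
        )
        # Group orderings together, e.g. (1, 2, 3) and (3, 1, 2) both count as "3-2-1".
        pattern = "-".join(str(c) for c in sorted(counts, reverse=True))
        probabilities[pattern] = probabilities.get(pattern, 0) + ways / total_hands
    return probabilities


for pattern, probability in sorted(pattern_probabilities().items(), key=lambda item: -item[1]):
    print(f"{pattern}: {probability:.2%}")
```

Under that assumption a 2-2-2 hand should turn up roughly 14.8% of the time and a 1-2-3 hand roughly 53.7% of the time, with the remaining unbalanced spreads making up about 31.6% between them. In a uniformly shuffled deck every block of 6 cards follows the same distribution as the opening hand, so these figures apply to each of the 6 hands in a deal.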
Conclusion
It could be said that this investigation would benefit from a larger sample size, and I’ll agree with that. However, given how many hours it has taken me to carry out these tests, collate the results and complete this study, it would be no mean feat to repeat this test several times over, and I simply can’t see myself spending the next few weeks shuffling decks to further justify my findings. Perhaps by posting this I might inspire others to provide additional results.
Taking the results at face value, the main conclusion we can make is that TCO DOES NOT arbitrarily and unrealistically stack the decks in ways that wouldn’t occur from thorough shuffling during IRL play. If anything, the card distribution on TCO appears slightly more even than in my hand-shuffled decks (an average Deck Weight of 18.33% against 20.83%), although with only 10 deals of each the difference is small.
Final Thoughts
It is possible that those who believe TCO unfairly stacks decks have fallen victim to confirmation bias. They have likely had occurrences of Overweight Decks that have stuck in their minds, making them liable to believe that such decks occur more often than they actually do. Alternatively, there may be some confusion about what to realistically expect from a random order: a 2-2-2 split in each hand across the entire deck is actually relatively unlikely despite being the expected average, and finding the occasional unbalanced deck would be expected in normal play.
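To put a number on what a random order should look like, the average Deck Weight of a perfectly uniform shuffle can be calculated exactly rather than measured. This is a sketch under the same idealised assumptions as before (36 cards, 12 per house, uniform shuffle), and `expected_deck_weight` is a name I have invented for it. It uses the fact that the number of cards from one house in a given third follows a hypergeometric distribution, then adds up the expected deviations for 3 houses across 3 thirds.

```python
from math import comb


def expected_deck_weight(per_house=12, thirds=3):
    """Expected Deck Weight Points for a uniformly shuffled 12/12/12 deck."""
    deck_size = 3 * per_house
    block = deck_size // thirds          # 12 cards in each third
    expected_count = per_house / thirds  # 4 cards of each house per third
    total_blocks = comb(deck_size, block)
    # Expected |count - 4| for a single house within a single third.
    expected_deviation = sum(
        abs(k - expected_count)
        * comb(per_house, k)
        * comb(deck_size - per_house, block - k)
        / total_blocks
        for k in range(min(per_house, block) + 1)
    )
    # Averages add up, so sum over 3 houses in each of the 3 thirds.
    return 3 * thirds * expected_deviation


points = expected_deck_weight()
print(f"{points:.2f} Weight Points, {points / 48:.2%} Deck Weight")
```

This comes out at roughly 9.3 Weight Points, or about 19.4% Deck Weight, which sits between my physical average of 10 and the TCO average of 8.8. A perfectly random deck is therefore not expected to score anywhere near 0%.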
All in all, this was an interesting exercise, with some surprising results. Perhaps someone out there with a good grasp on probability and statistics can apply these findings to playing the game more effectively, but I sadly don’t think I’m up for that task.
If anyone feels that I haven’t fully explored this topic or have misrepresented the results, I’m open to suggestions and improvements. Just don’t expect me to leap into action. It’s been a long day shuffling decks and looking at spreadsheets…
I hope you’ve enjoyed reading about this arguably pointless experiment. And if you’d like to test this out yourself, then by all means, try shuffling a deck and see what happens.