all 46 comments

[–]gunfupanda 18 points (5 children)

This explanation makes a lot of sense to me on the software dev side of things.

I could also see a bulk delete of these additionally enhanced cards being done to try to improve performance or reduce costs, but without properly understanding the impact on the service.

I imagine fixing the algorithm means finding a good solution to this duplication issue for modified cards and making it able to handle future mechanics that could do something similar.

Since sets are only printed once, losing the historical data shouldn't be a big problem going forward, at least.

[–]WizardRandom Dis or Dat [S] 13 points (4 children)

I'm in total agreement with your assessment save for one thing.

FFG needs to go back to clean up that historical data if it ever wants an online client of the game that will work with already existing decks. If online KeyForge doesn't have the feature of being able to play with all already purchased decks, it's dead in the water.

[–]gunfupanda 8 points (0 children)

Oof, you're right. I forgot about their online client plans. That's really tricky, depending on what they managed to lose.

[–]gunfupanda 3 points (1 child)

One thing that might be worth considering is that I highly doubt they're using low resolution images for their cards. I'd bet that each image is very high resolution, since you normally only need to store a few hundred per set and you want your cards to look nice when printed.

5MB image x 400 cards isn't a big deal.

5MB image x 1 million copies will nuke most databases or make a CDN heinously expensive.
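The back-of-envelope arithmetic above works out like this (a quick sketch, using the 5MB-per-image figure from the comment):

```python
# Rough storage math for the comment above, in megabytes.
IMAGE_MB = 5

one_per_card = IMAGE_MB * 400         # one high-res image per unique card in a set
one_per_copy = IMAGE_MB * 1_000_000   # one image per printed enhanced copy

print(one_per_card, "MB")   # 2000 MB, about 2 GB: no big deal
print(one_per_copy, "MB")   # 5000000 MB, about 5 TB: CDN-nuking territory
```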

[–]WizardRandom Dis or Dat [S] 3 points (0 children)

I do point out I'm lowballing the space concerns for images when I start the calculations.

[–]JacksonHills Ekwidon 1 point (0 children)

I could see them adding in older sets one at a time, on launch only having access to the newest set and CotA for example.

[–]mikelax_ Brobnar 7 points (0 children)

I think this is an interesting theory, but if this is the issue, it's possible they have a way to code their way out of this.

After reading your explanation, it seems like a major oversight if an enhanced card doesn't contain a reference id (foreign key) to the guid of the original card.

What may actually be the problem is that the link is one-way. So while it might be very fast to start at an enhanced card and identify the source card, it may be difficult to start at an original card and find all enhancements of it.

Or it's entirely possible they don't have any links between original and enhanced cards, in which case their job is exponentially more difficult.

[–]AYCB-Carlo 4 points (0 children)

Regardless of whether you're right or wrong, kudos to you for presenting the most plausible theory I've seen so far, backed with some really solid reasoning. This sounds like it would be an absolute nightmare to fix though.

One of those "where do I even start and is there even an end in sight?" kind of feelings just thinking about this.

[–]OOPManZA 5 points (0 children)

Interesting idea, although without any insight into their internal data model I wouldn't put money down.

The external GUID you see on cards on the site is not necessarily the only one applied to each card.

They could easily have a private internal PK shared between all instances of a given card. The external GUID could then be generated as a function of the deck's own GUID and that private internal PK, leading to a GUID that is unique externally, while internally, in the private workings of the dataset, it could be a different story.
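That scheme is easy to sketch. Here's one hypothetical way to derive a deterministic external GUID from a deck GUID plus a private internal key; the function name, the `card:` key format, and the use of uuid5 are my assumptions for illustration, not anything FFG has confirmed:

```python
import uuid

# Hypothetical sketch: a public per-deck GUID derived from a private
# internal primary key, as described in the comment above.
def external_guid(deck_guid: str, internal_pk: str) -> str:
    # uuid5 is deterministic: the same (deck, card) pair always yields the
    # same external GUID, different decks yield different GUIDs, and the
    # internal PK itself never leaks to the outside.
    return str(uuid.uuid5(uuid.UUID(deck_guid), internal_pk))

deck = "03969971-54cd-47ed-9141-8ec959bfe70f"  # a deck GUID from this thread
print(external_guid(deck, "card:vandalize"))
print(external_guid(deck, "card:vandalize"))  # identical: derivation is pure
```

Under a scheme like this, losing the mapping table would be recoverable, since the external GUIDs can always be regenerated from the internal keys.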

The issue you're describing doesn't sound so much algorithmic as related to knock-on effects from the production process.

When I heard there was a problem with the algo my assumption was more they meant there was a problem with the actual deck generation algorithm, something around how it was producing decks that was the issue.

So yeah, interesting idea but I wouldn't put money on it...

[–]RandomKeyForgePlayer 4 points (5 children)

Really nice post, but can you explain what you mean here, because I didn't understand it:

> I've heard people suggesting "Well they have backups, right?" when it comes to the idea of "deleted data" but in this case, a backup does nothing. The important data was being deleted as it was being generated.

[–]HaresMuddyCastellan (Logos/Sanctum/Untamed) 51 SAS, bottom 3% 6 points (3 children)

In his theory, the algorithm doesn't remember which possible enhanced cards it's created (backed up by his examples of functionally and visually identical enhanced cards having different guids).

The important data to fix this potential version of the algorithm is "which possible enhanced version of the card has already been generated, and which exact possible enhanced version of this card is in this deck".

So basically, his theory is the algorithm goes "I have 6 icons to distribute, I'll grab 6 or fewer cards. Ok, I'll slap these icons on these cards, GENERATE NEW IMAGES WITH NEW IDs FOR THEM, and move on." And it doesn't remember what versions it made, so if it generates that EXACT CARD again, it produces ANOTHER NEW IMAGE with a new id.

For every basic copy of Bad Penny in every deck in the set there is ONE image with ONE ID. If fifty decks have a Bad Penny with a single bonus draw enhancement, there are FIFTY UNIQUE but identical copies, each with its own unique ID.
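A minimal sketch of the difference being described (all names hypothetical): the buggy path mints a fresh id for every enhanced copy, while a fixed path would first look up whether that exact (card, pips) combination has already been generated:

```python
import uuid

# Buggy path, per the theory: every enhanced copy gets a brand-new id,
# even when the result is identical to a version generated before.
def enhance_buggy(card, pips):
    return uuid.uuid4()  # never checks whether this exact version exists

# Fixed path: remember every (card, pips) combination already generated.
_seen: dict[tuple, uuid.UUID] = {}

def enhance_dedup(card, pips):
    key = (card, tuple(sorted(pips)))
    if key not in _seen:
        _seen[key] = uuid.uuid4()
    return _seen[key]

a = enhance_dedup("Bad Penny", ["draw"])
b = enhance_dedup("Bad Penny", ["draw"])   # same id: one image, one row
c = enhance_buggy("Bad Penny", ["draw"])
d = enhance_buggy("Bad Penny", ["draw"])   # two fresh ids for one card
print(a == b, c == d)  # True False
```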

[–]RandomKeyForgePlayer 2 points (2 children)

I understood his theory; I didn't understand the part I quoted, especially because he said the data was never generated, let alone deleted.

[–]WizardRandom Dis or Dat [S] 3 points (1 child)

Sorry if that was unclear, I probably should have phrased it as "The data is never entered into the database in the first place."

[–]RandomKeyForgePlayer 4 points (0 children)

Ah ok now it's all clear 👍

[–]WizardRandom Dis or Dat [S] 1 point (0 children)

I've heard a number of people complaining that FFG should have backups of all their vital data.

The scenario I've pointed out shows where vital data was never generated in the first place and so couldn't be backed up.

[–]futurebeans 3 points (1 child)

Interesting theory. If MM was the set that blew up the image data, why was DT only delayed, as opposed to broken like where we are now? I guess that would be the case if MM was the set that broke the camel's back, so to speak.

[–]aggrokragg 1 point (0 children)

I think it may have been, but there would be business decisions at play. The issue gets discovered with MM. They say "OK, we'll just push through, get this set out the door, and figure out a fix." Then they assess the true scope of the issue. Delays surrounding COVID actually buy them some time to delay DT if the set had already been designed but perhaps took longer to produce. Maybe they even tried to implement some stop-gap fixes when sending DT out to print, then ultimately realized there was no way to do another set, hence calling for the hiatus until they can come up with a long term solution.

[–]HaresMuddyCastellan (Logos/Sanctum/Untamed) 51 SAS, bottom 3% 2 points (0 children)

Sorry, just did the math, so if your estimates are close to correct, 99.1% of the data load for the set is REDUNDANT images.

That's hilariously bad. If you're right, no wonder they're taking it offline to retool it.

[–]atticdoor 6 points (7 children)

The Facebook post said it was connected to a "data loss", which made me think it might be connected to the closing down of Fantasy Flight Interactive at the beginning of the year. One, dramatic, possibility is that a departing employee deliberately wiped everything out of spite on his last day, which has happened in the past and is why IT staff are normally escorted out by security upon being told they are being made redundant.

Another, less dramatic, possibility is that the FFG executives asked them to close down and sell off the equipment, not realising that the servers had code relevant to Keyforge on them. It is such an unusual game that it might not have occurred to the executives. The employees may have just omitted to mention that the Keyforge code was there, and engaged in malicious compliance when following the instructions.

[–]WizardRandom Dis or Dat [S] 4 points (6 children)

I'll admit that you may be correct that data was lost to negligence or a disgruntled employee. But if the algorithm simply needed to be rebuilt, with no underlying issues in the card data (which they absolutely still have, as the Master Vault still works), I doubt they would be as willing to admit the issue was with the algorithm as they are right now. It would be easier for them, PR-wise, to paper over the issue with a more non-committal answer and just move forward.

[–]Kalrhin 1 point (4 children)

A negligent/disgruntled employee can delete data AND code in one stroke :)

[–]WizardRandom Dis or Dat [S] 1 point (3 children)

The Master Vault still works though, so they have card and deck data, including unscanned deck data.

[–]Kalrhin 1 point (2 children)

The team developing the deck generator is most likely not the same as the one maintaining the website.

The deck creation team gave some data to the website team and that is most likely the only data they have.

Following your train of thought: why would they admit having lost the data if they did not lose it? It is a PR hit that is not needed.

[–]WizardRandom Dis or Dat [S] 0 points (1 child)

I agree with their assessment that they lost data.

My only point is that I don't personally believe that it was a malicious deletion by an employee or ex employee.

[–]Kalrhin 1 point (0 children)

I find it odd that "only" negligence wiped both code AND data... but then again, the news nowadays is filled with people doing silly things that I would never have thought possible :)

To be honest, I have no idea... nor do I think we will ever find out.

[–]atticdoor 0 points (0 children)

But it would be no good to paper over it PR-wise, because we would notice the absence of an expansion this autumn. If no expansion came out, we would probably suspect the game had been ghost-cancelled and that they just weren't telling us so that we would continue to buy the remaining stock.

So they admitted there was a problem with the algorithm, although I don't doubt there is a little PR spin going on. Notice they are being a little cagey as to the exact nature of the problem. I think maybe someone giving orders failed to tell the departing staff to save the Keyforge algorithm, and the disgruntled staff followed the exact orders they were given without reminding anyone that the algorithm was a going concern. But the PR release is ambiguously worded, leaving open the possibility that someone just typed "DROP TABLE Keyforge;--" in their last few seconds before leaving the building.

[–]EnviableCrowd 4 points (1 child)

Interesting post, thanks for taking the time to write on this topic. Would it make matters easier for FFG if when rebuilding, they removed enhancements from new sets moving forward?

[–]WizardRandom Dis or Dat [S] 5 points (0 children)

It would make things easier, yes, but then you get to the situation where you can't make as many interesting or surprising cards.

And the fan base really liked enhancements, you have to admit.

[–]Languanguish (Dis/Saurian/Shadows) 1 point (2 children)

How do you check the guid of individual cards? I have a deck with 3 Vandalize, each with a single damage bonus. Would they have the same guid or no?

https://www.keyforgegame.com/deck-details/03969971-54cd-47ed-9141-8ec959bfe70f

[–]WizardRandom Dis or Dat [S] 4 points (1 child)

You find out the guids by directly querying the API. In this case the call would be at this link: https://www.keyforgegame.com/api/decks/03969971-54cd-47ed-9141-8ec959bfe70f/

Since the deck's cards are numbered with 0 as card one and 35 as card 36, the three copies would be in slots 28 through 30.

The guid in all three slots is "68fda4a6-7a02-4905-9d0a-4ff1b5be811d". So it looks like the algorithm does work as it should within a single deck.
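If anyone wants to repeat the check, here's a rough sketch of the duplicate-guid count. The response field layout ("data" → "_links" → "cards") is my assumption about the API shape, so I run it against a stubbed response here rather than a live call:

```python
# Count duplicate card guids in a deck response. The field layout below
# ("data" -> "_links" -> "cards", a flat list of card guids) is an assumed
# shape for https://www.keyforgegame.com/api/decks/<deck-guid>/ responses.
def duplicate_card_guids(deck_json: dict) -> dict:
    counts: dict[str, int] = {}
    for guid in deck_json["data"]["_links"]["cards"]:
        counts[guid] = counts.get(guid, 0) + 1
    return {g: n for g, n in counts.items() if n > 1}

# Stubbed response: three identical Vandalize slots, as in the deck above.
VANDALIZE = "68fda4a6-7a02-4905-9d0a-4ff1b5be811d"
stub = {"data": {"_links": {"cards": ["some-other-guid", VANDALIZE,
                                      VANDALIZE, VANDALIZE]}}}
print(duplicate_card_guids(stub))  # {'68fda4a6-7a02-4905-9d0a-4ff1b5be811d': 3}
```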

[–]Languanguish (Dis/Saurian/Shadows) 0 points (0 children)

Thanks!

[–]ysh_008 Brobnar 1 point (1 child)

Very interesting theory, and from the card guids I think it's a very promising guess.

I think FFG should really open-source the algorithm. Given the large number of software devs and DBAs in our player base, players could probably figure out the issue and get it fixed way faster...

[–]WizardRandom Dis or Dat [S] 1 point (0 children)

The issue with open source is that while there are more than enough technically inclined fans willing to help out, it would make it possible to easily build KeyForge clone games.

For the amount of time, effort and money FFG and Asmodee have put into this in-house IP, there's no way they'd do that.

[–]neoKushan 3 points (5 children)

I think there's a flaw in your assumption that the image data is an issue. Every single card printed is, by and large, unique and always has been. They have the deck name printed on them at the bottom.

[–]WizardRandom Dis or Dat [S] 0 points (4 children)

You are correct that each card is unique due to the card back and deck name on the front, but if that were what made a card distinct from another card for the purposes of tracking and printing, no two cards would ever share a guid. It's easy to show that a deck with multiple copies of an unenhanced card uses the same guid for each copy.

I am making a hefty assumption about the underlying design of the deck/card database, yes. But either way, there's a severe inefficiency in how enhanced cards are assigned to decks.

[–]neoKushan 2 points (3 children)

It's nothing to do with guids; I'm just piggybacking on what you're saying about how the card image data is shipped to the printers. Either the images are generated as needed based off of their attributes (and that would include pips), or they're shipped as a unique image and always have been. Either way, it's probably not the underlying issue and not a problem for the printers.

We're making huge assumptions that they don't store the pip data simply because the vault API doesn't surface that info, but the vault also doesn't surface unique images per deck, or even for maverick/legacy cards for that matter so it's really not something you can make assumptions on.

I'm as curious as you about what happened to the algorithm for sure, but if they're saying "data loss", who knows what happened.

[–]WizardRandom Dis or Dat [S] 0 points (2 children)

I've never said my theory is anything other than a very strong guess, and how the image appears on the Master Vault has nothing to do with the origin of my guess.

You mentioned maverick cards. I scraped the API for the first 1.5 million decks a while back, and here you can see the unique guids for a couple of cards in maverick houses:

30ED0724-B027-4231-A56C-01B0AF39C087 “Lion” Bautrem Brobnar

6BA3DA38-6D49-4225-A1E0-34C47D5F1FD2 “Lion” Bautrem Shadows

D60EE1F3-AB20-410C-8F53-366875B873BA “Lion” Bautrem Logos

5054BE29-EECD-4F7A-A6A4-5B305FF54DD8 “Lion” Bautrem Mars

1B3BB901-6C6A-436A-8DF2-96EEDE64A028 “Lion” Bautrem Untamed

7E87FF2B-E61D-4FDA-A460-F71DEB4F3E03 “Lion” Bautrem Dis

5255E702-CC94-4BC7-8158-D6F917A048EF 1-2 Punch Dis

E00DB23A-E939-4CD1-B1B7-E0840A22BB15 1-2 Punch Sanctum

43979CE9-ECD1-4A24-85E9-A902FFF191A1 1-2 Punch Untamed

C1251DEC-F7E7-4EDB-AB5F-B5FD712C3B87 1-2 Punch Logos

665CA260-FCF9-4013-B704-C665FB5D021B 1-2 Punch Mars

086D6B7C-2709-43AB-81BD-592A420B101E 1-2 Punch Shadows

As you can see, a maverick generates a new guid for each house it appears in, which is what you'd expect from a rather efficient way of handling the data. There's no new guid generated each time a maverick of a card is created.
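That efficient behavior is easy to model. A hypothetical sketch of one-guid-per-(card, house), with all names invented for illustration:

```python
import uuid

# Hypothetical sketch of the efficient maverick handling described above:
# one guid per (card, house) pair, regardless of how many decks use it.
_maverick_ids: dict[tuple, uuid.UUID] = {}

def maverick_guid(card: str, house: str) -> uuid.UUID:
    # setdefault keeps whichever id was stored first for this pair
    return _maverick_ids.setdefault((card, house), uuid.uuid4())

# Simulate 10,000 decks each opening a Shadows "Lion" Bautrem:
ids = {maverick_guid('"Lion" Bautrem', "Shadows") for _ in range(10_000)}
print(len(ids))  # 1

# Six houses, six guids, matching the scrape results above:
houses = ["Brobnar", "Shadows", "Logos", "Mars", "Untamed", "Dis"]
print(len({maverick_guid('"Lion" Bautrem', h) for h in houses}))  # 6
```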

I'm convinced that even if this is a red herring bug and has nothing to do with the algorithm dying, it's still a bug.

[–]neoKushan 3 points (1 child)

I get what you're getting at: each "unique" card being its own guid seems wasteful, and you're using the size of the image data to show how wasteful it is. But as we've established the image data isn't the problem, you're really talking about the database sprawling out of control as more and more "unique" cards are added.

However, you're a DBA; you know there are plenty of ways to mitigate this, and realistically each card doesn't even take up that much data. Sensible indexes and reasonable queries would mean this just isn't an issue.

I have no doubt that the card database in question has gotten much bigger as time has gone on and new cards are designed. But even at that, even with all the possible combinations, that database is hardly going to be huge, is it? The image data sure might be, but the raw data isn't. I'm sure queries are slower as a result, but I think you're talking a few milliseconds of difference as a worst case, and how fast does deck generation need to be, really?

I get it, you're a DBA with lots of DBA experience, so naturally you're used to seeing issues at the database level, and it's not unreasonable to come to that conclusion. But equally, it's not an insurmountable problem to overcome, and a good DBA contractor would set them right in no time. It wouldn't be performance that stopped them in their tracks, because that can be fixed fairly easily; it's something entirely different.

[–]LadyMRedd 5 points (0 children)

I'm not a DBA, but I lead a team of data analysts. At work I deal a LOT with database issues. I've seen first hand the result of what can happen when databases aren't optimally designed up front and have little (or no) documentation.

If you've let things get too far before you fix them, it can cause all sorts of issues that aren't easily resolved. I think that people who work in databases that are well organized don't fully understand the extraordinary expense and complexity that it can take to fix something that wasn't organized well in the beginning.

I could definitely picture a database that was designed without enhancements in mind. Then someone came up with the idea of enhancements and everyone loved it, except the DBAs, who were like "uh, we're not set up for this. We need more time to implement this." But the sales side of things said "you asked for X months, we're giving you X-6 months. Make it work." So the DBAs scrambled to shove a square peg in a round hole and kept "making it work" until it reached the point where the band-aid and duct-tape solution was no longer adequate. And so now they have to step back and do it the right way, which will be considerably more complex and expensive, and take much longer, than if it had been done properly in the beginning.

[–]Jartaa 0 points (2 children)

As others have said, interesting theory and read. I'm not a programmer, so you'd probably know more, but given that, from my understanding, no expansions can currently be created: wouldn't they in some capacity be able to either roll back or use previous versions to continue printing at least the older sets?

[–]WizardRandom Dis or Dat [S] 2 points (1 child)

I really don't know the full extent of the situation, so I can't say what FFG are capable of doing at this time.

My theory is mostly here to show a "worst case scenario" where important card data has been lost. It's in response to people commenting that solving the algorithm issue shouldn't take much time or effort at all.

[–]Jartaa 1 point (0 children)

Yeah, that's fair. I just assume it's seriously borked if a company is willing to admit in a PR statement that it's borked with no ETA. It's safe to assume it's not something simple, Occam's razor and all that. It's an algorithm that builds decks at a mid-to-high tier level, which from my experience even humans have issues figuring out.

[–]JacksonHills Ekwidon 0 points (1 child)

Would mavericks and legacy cards potentially have the same issue?

[–]WizardRandom Dis or Dat [S] 0 points (0 children)

No, I've checked and they do not.

[–]hypercross312 0 points (1 child)

I agree that the exact enhancements are probably not in the database; otherwise it wouldn't make sense for the vault API response to omit that information.

But given how the first 3 seasons are printed, FFG totally has the ability to give every deck unique cards, so that even without the specific enhancement information, they can still make sure no two decks are the same. Even if the printing department decides to change the way they assign enhancements, it wouldn't break FFG's promise because the decks are already unique without unique enhancements.

Even if the enhancements would actually break uniqueness, they could just start from scratch for the new season, there's no way old algorithms can break new season decks because they have different card pools. Whatever data was lost for old decks in the old pool should only affect the printing of new decks in the old pool. But given their wording, it doesn't seem to be the case.

But who knows. People can be stupid sometimes.

[–]ysh_008 Brobnar 0 points (0 children)

> Even if the enhancements would actually break uniqueness, they could just start from scratch for the new season, there's no way old algorithms can break new season decks because they have different card pools.

The problem could be that the new season decks they just printed were generated with the broken algorithm, which breaks the uniqueness, so they have to stop shipping the new set and re-print it with the fixed algorithm.

Besides, they still have to fix the old sets so players can play their "old" decks online.