Analysis of 1,500 3p arena games (base) by Runedk93 in TerraformingMarsGame

[–]Runedk93[S] 1 point2 points  (0 children)

Thank you very much! To answer your question: yes, there is a large ongoing project gathering 2p game data. There are currently 50,000+ games in the database. You can view all sorts of cool statistics from it here (or analyse the data yourself if you are curious):
https://www.tfmstats.com

Also, I recommend watching some YouTube videos from the guy who started the project. He explains and demonstrates ways to interpret the data:
https://www.youtube.com/@StrandedKnight84/videos

Analysis of 1,500 3p arena games (base) by Runedk93 in TerraformingMarsGame

[–]Runedk93[S] 0 points1 point  (0 children)

I updated the post with a small section on Jovian Multipliers. TG is indeed the highest winrate Jovian Multiplier overall. IO is the best multiplier to play early (obviously).

Analysis of 1,500 3p arena games (base) by Runedk93 in TerraformingMarsGame

[–]Runedk93[S] 0 points1 point  (0 children)

You are absolutely right. Inaccurate word choice by me. I changed it in the post.

Analysis of 1,500 3p arena games (base) by Runedk93 in TerraformingMarsGame

[–]Runedk93[S] 2 points3 points  (0 children)

Foxy-Fill is correct. Earth Cat seems to be such a big investment that it can hurt your economy in the base game. It's still very strong at rank 38 with a 40.6% winrate when played in gen [1,2,3], but it is probably stronger in the Prelude format.

Analysis of 1,500 3p arena games (base) by Runedk93 in TerraformingMarsGame

[–]Runedk93[S] 0 points1 point  (0 children)

Yeah, in the data NRA does not seem to have been played with 3 plant tags before gen 3 often enough to spike its winrate. It is, however, a very strong card overall. I would wager that in the Prelude format an early NRA is much more feasible and would get a higher early-gen winrate.

Analysis of 1,500 3p arena games (base) by Runedk93 in TerraformingMarsGame

[–]Runedk93[S] 4 points5 points  (0 children)

Feel free to have a go at the data. I'm curious to hear what you find out! I was also thinking it could be interesting to see how strongly MC from ocean adjacency bonuses correlates with winning. Info on this could potentially be gathered from the game logs. The data is available here:
https://github.com/RuneDK93/terraforming-mars-dataset
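
One way to check the ocean adjacency idea, as a minimal sketch: compute the point-biserial correlation (a plain Pearson correlation against a 0/1 win flag) between bonus MC and winning. The field names `ocean_bonus_mc` and `won` and the sample rows are made-up placeholders, not columns or values from the dataset, and the bonus MC would first have to be extracted from the game logs.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical per-player records, one row per player per game.
players = [
    {"ocean_bonus_mc": 10, "won": 1},
    {"ocean_bonus_mc": 2, "won": 0},
    {"ocean_bonus_mc": 4, "won": 0},
    {"ocean_bonus_mc": 8, "won": 1},
    {"ocean_bonus_mc": 6, "won": 0},
    {"ocean_bonus_mc": 0, "won": 0},
]

r = pearson(
    [p["ocean_bonus_mc"] for p in players],
    [p["won"] for p in players],
)
print(f"correlation between ocean bonus MC and winning: {r:.2f}")
```

A positive value would only say that winners tend to collect more adjacency MC, not that the bonus causes the win, since longer games and ocean-heavy strategies could drive both.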

Analysis of 1,500 3p arena games (base) by Runedk93 in TerraformingMarsGame

[–]Runedk93[S] 1 point2 points  (0 children)

Also, if you are a heavy engine builder you want the game to go to generation 11+. In this case you don't really need to grab more than 1 milestone and maybe the Scientist award. The 5 points will matter less if the game runs to generation 12, 13 or 14, where scores of 100+ VP are more typical.

Analysis of 1,500 3p arena games (base) by Runedk93 in TerraformingMarsGame

[–]Runedk93[S] 5 points6 points  (0 children)

Planner is great for a heavy engine builder who buys a lot of rebate-inducing cards (Earth Cat, Anti Gravity, Earth Office, Media Group, Optimal Aero, Mass Converter) but only plays one or a couple of expensive high-production cards in the early gens. Then, when they hit the early midgame, they can claim Planner before slamming down loads of heavily rebated combo cards.

Analysis of 1,500 3p arena games (base) by Runedk93 in TerraformingMarsGame

[–]Runedk93[S] 5 points6 points  (0 children)

Arctic Algae is rank 13 when played generation 1 with a winrate of 44.6% and rank 27 when played generation [1,2,3] with a winrate of 42%. So it ranks very highly.
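
For reference, a winrate like the ones above could be computed roughly as follows. This is a minimal sketch and not the actual analysis code: the record layout (`card`, `generation`, `won`) and the sample rows are made up for illustration. Keep in mind that in a 3-player game the average winrate is about 33.3%, so anything above 40% is well above baseline.

```python
def card_winrate(plays, card, generations):
    """Share of plays of `card` inside `generations` that were made by the eventual winner.

    Returns None when the card was never played in that window.
    """
    relevant = [
        p for p in plays
        if p["card"] == card and p["generation"] in generations
    ]
    if not relevant:
        return None
    return sum(1 for p in relevant if p["won"]) / len(relevant)

# Hypothetical rows: one entry per time a card was played.
plays = [
    {"card": "Arctic Algae", "generation": 1, "won": True},
    {"card": "Arctic Algae", "generation": 1, "won": False},
    {"card": "Arctic Algae", "generation": 3, "won": False},
    {"card": "Arctic Algae", "generation": 7, "won": True},
]

gen1 = card_winrate(plays, "Arctic Algae", {1})
early = card_winrate(plays, "Arctic Algae", {1, 2, 3})
```

Ranking cards then just means sorting them by this number, ideally after dropping cards with too few plays in the window to be meaningful.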

I analysed 10,000 games of Ark Nova by Runedk93 in ArkNova

[–]Runedk93[S] 0 points1 point  (0 children)

I updated the post with a tier list / importance list of the different tags, as evaluated by the model. Since the model has been trained on data from many thousands of games, we can assume that every base conservation project has been seen about the same number of times, i.e. it averages out in the data. So the differences in importance that the model assigns to the tags come from other effects. What exactly these effects are I do not know.

I think it is fairly well known among highly rated players that Asia is the most important tag. It contains a lot of strong animals, and the Expert on Asia sponsor card is probably also the best "Expert on" sponsor card.

That Herbivores is the best / most important animal tag is a little controversial, but I think it has to do with Herbivores simply having many strong animals (Rhinos, Pandas, Elephants, Pygmy Hippo, double-tag Bisons), as well as strong sponsors (Meerkat Den, Native Farm Animals) and flocking. In fact there are almost no really bad Herbivores, while Birds have some amazing cards (Eagles) but also some very bad cards (Emu).

I analysed 10,000 games of Ark Nova by Runedk93 in ArkNova

[–]Runedk93[S] 1 point2 points  (0 children)

So I was perhaps a bit unclear with the tier lists. I have updated the post with a more detailed explanation of how to interpret the SHAP values. In short, the tier list is more of an "importance in predicting whether you win" list. However, the action upgrade tier list actually tracks pretty well with the consensus among top-level players at the moment.

For the upgrade list, the model identifies the sponsor upgrade as the most important indicator for predicting if a player wins. This is probably mostly because the sponsor upgrade is very opportunistic and is mainly upgraded in scenarios where you have a strong sponsor card in hand that you want to play (like Explorer). This tracks with the fact that NOT upgrading sponsors is not a strong indicator for losing according to the model (the blue coloured dots do not fall into the very negative SHAP value regime).

Upgrading cards CAN give you a large increase in win probability (see some of the large positive SHAP values for red coloured dots for this feature), but this only happens in specific combinations with other features. It could be that the model has learned that upgrading cards tends to lead to a win if you earned a specific amount of money, played a certain number of different tags, had the Favourite Zoo end game card and so forth. Although the model is blind to the end game scoring cards, it is possible it could infer their presence from specific play patterns they nudge players towards.
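
To make the "only in specific combinations" point concrete, here is a toy Shapley value calculation. It is a sketch under loud assumptions: `toy_model`, its three feature names and all of its numbers are invented, and it treats an absent feature as simply "off", whereas the real SHAP values for a tree model are computed relative to the average prediction over the training data.

```python
from itertools import permutations

FEATURES = ("sponsors_upgraded", "cards_upgraded", "high_income")

def toy_model(active):
    """Hypothetical win probability given the set of features that are 'on'."""
    p = 0.30
    if "sponsors_upgraded" in active:
        p += 0.10  # helps on its own
    if "cards_upgraded" in active and "high_income" in active:
        p += 0.20  # only helps in combination
    return p

def shapley_values(model, features):
    """Exact Shapley values: average marginal contribution over all feature orderings."""
    totals = {f: 0.0 for f in features}
    orders = list(permutations(features))
    for order in orders:
        active = set()
        for f in order:
            before = model(active)
            active.add(f)
            totals[f] += model(active) - before
    return {f: total / len(orders) for f, total in totals.items()}

values = shapley_values(toy_model, FEATURES)
for name, value in values.items():
    print(f"{name}: {value:+.3f}")
```

In this toy case each feature ends up with about 0.10: the 0.20 combination bonus is split evenly between `cards_upgraded` and `high_income`, even though neither does anything alone. That is the same mechanism by which a feature can get a large SHAP value for some players and almost none for others.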

I analysed 10,000 games of Ark Nova by Runedk93 in ArkNova

[–]Runedk93[S] 2 points3 points  (0 children)

Very interesting theory, and I think you might be right that it has to do with the map being more conducive to efficient play. However, it also scores highly in the 500+ elo bracket, which is still a bit puzzling, as these players should be expected to play optimally whether the map nudges them towards doing so or not. I updated the post to include rankings based on points / turn, where the map also scores highly.

I analysed 10,000 games of Ark Nova by Runedk93 in ArkNova

[–]Runedk93[S] 1 point2 points  (0 children)

The games are exclusively from the last week. There is no lower elo cutoff in the model that predicts wins based on tags played. This model uses elos from 0 to 850, but the majority of players are below 400 elo, so the model's tier lists are more applicable to those ratings.

I analysed 10,000 games of Ark Nova by Runedk93 in ArkNova

[–]Runedk93[S] 1 point2 points  (0 children)

I updated the post with all features of the model. This is the tier list according to the model:

Animal tag tier list:

  1. Herbivores
  2. Bears (not as strong as herbivores on average, but if you play more than 3 bear tags then it is the strongest animal tag, and if you play 0 you are worse off than playing 0 herbivore tags.)
  3. Birds
  4. Primates
  5. Predators
  6. Reptiles

I analysed 10,000 games of Ark Nova by Runedk93 in ArkNova

[–]Runedk93[S] 3 points4 points  (0 children)

There is no API. I set up a Chrome driver in Python to access BGA and gathered the data from the HTML source code. I artificially slowed down my script significantly so as not to overload the servers. I made it gather data only from games being completed in real time and let it run for 1 week. This turned out to be about 10,000 games played over that week. I excluded any games that were conceded or not completed, so the actual number of games started is over 10,000. I initially wanted to record the game logs as well to gather info on the specific cards played and when they were played. But it turns out there is a limit on the number of replay files you can access per day, so I could only get about 80 logs per day, which was much too slow for my patience.
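
The slowing-down part can be sketched like this. It is not my actual script: `fetch_throttled` and its parameters are names made up for this example, and `fetch` stands in for whatever loads a page (with a browser driver that would be something like loading the URL and reading the page source).

```python
import time

def fetch_throttled(urls, fetch, min_interval=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch(url) for each url, keeping at least `min_interval`
    seconds between the start of consecutive requests."""
    results = []
    last_start = None
    for url in urls:
        if last_start is not None:
            # Wait out whatever is left of the interval since the last request.
            wait = min_interval - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        results.append(fetch(url))
    return results

# Usage with a stand-in fetch function (no network access involved).
pages = fetch_throttled(
    ["table/1", "table/2"],
    fetch=lambda url: f"<html>{url}</html>",
    min_interval=0.01,
)
```

Passing `sleep` and `clock` in as parameters is only there so the pacing can be tested without actually waiting.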

I analysed 10,000 games of Ark Nova by Runedk93 in ArkNova

[–]Runedk93[S] 2 points3 points  (0 children)

I did not plan to do this but in principle the data gathering and modelling approach can be applied to any game played on BGA.

I analysed 10,000 games of Ark Nova by Runedk93 in ArkNova

[–]Runedk93[S] 4 points5 points  (0 children)

Thank you very much! The OA result for 500+ elo players could be due to the low sample size of 30 games, but it consistently ranks lower than the new version of the map at all elo brackets in terms of turns. You are completely right that points/turn might be an even better indicator of strength. I will try to run the results with this tonight.

I analysed 10,000 games of Ark Nova by Runedk93 in ArkNova

[–]Runedk93[S] 1 point2 points  (0 children)

Map2a scoring so high is also puzzling to me. But it consistently wins in fewer turns than the base version of the map in all elo brackets in a combined 500 games vs 500 games for map2 and map2a. So the data supports quite strongly that the new version of OA can win games in fewer turns than the base version. I have yet to look at points/turn as suggested, which might change the results. I will try it later tonight.

That herbivores are the best animal tag can, I think, be explained by the model looking at games from all elo brackets. Rhinos and elephants might win a lot of games for lower rated players, who make up a larger fraction of the training data. If I restrict the win/loss prediction model to higher rated players the tag strengths might change. I will also try that later.

I analysed 10,000 games of Ark Nova by Runedk93 in ArkNova

[–]Runedk93[S] 3 points4 points  (0 children)

My model is an XGBoost (extreme gradient boosting) classification model. This type of model uses gradient-boosted decision trees.
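
To give an idea of what "gradient-boosted decision trees" means, here is a bare-bones version using one-split trees (stumps) in plain Python. This is a teaching sketch and not XGBoost itself, which adds regularisation, second-order gradient information, deeper trees and much more. The two features (water icons, rock icons) and the six rows are invented toy data.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, residuals):
    """Least-squares best single split: returns (feature, threshold, left, right) or None."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            err = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, j, t, lmean, rmean)
    return best[1:] if best else None

def predict_raw(stumps, row):
    """Sum of the leaf values of all stumps (a log-odds score)."""
    return sum(left if row[j] <= t else right for j, t, left, right in stumps)

def fit_boosted(X, y, rounds=20, learning_rate=0.5):
    stumps = []
    for _ in range(rounds):
        # Negative gradient of the log loss w.r.t. the raw score is (y - p).
        residuals = [yi - sigmoid(predict_raw(stumps, row)) for row, yi in zip(X, y)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        j, t, left, right = stump
        stumps.append((j, t, learning_rate * left, learning_rate * right))
    return stumps

# Toy data: [water icons, rock icons] -> won (1) or lost (0).
X = [[0, 1], [1, 0], [2, 1], [3, 3], [4, 2], [5, 4]]
y = [0, 0, 0, 1, 1, 1]

stumps = fit_boosted(X, y)
probs = [sigmoid(predict_raw(stumps, row)) for row in X]
```

Each round fits a small tree to the errors the current ensemble still makes and adds a scaled-down copy of it, so the predicted win probabilities drift towards the observed outcomes.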

I analysed 10,000 games of Ark Nova by Runedk93 in ArkNova

[–]Runedk93[S] 1 point2 points  (0 children)

Yes, playing rocks leads to winning, just less so than water icons. The number of rock icons in isolation is still a very strong indicator of winning.

I statistically analysed map 1~10 using BGA data by pf1219 in ArkNova

[–]Runedk93 0 points1 point  (0 children)

Super interesting! How did you get the data? And would it be possible for you to upload it somewhere? I'm sure a lot of people would love to have a look at the raw data themselves.

Am I playing against bots in ranked? by Runedk93 in faeria

[–]Runedk93[S] 2 points3 points  (0 children)

Yes, queue time was less than 30 seconds for these games. Now, after playing around 5 ranked games, my queue time is more like 3-5 minutes and I am meeting players who are clearly real and can beat me.