Age of Empires 2 Civilisation Performance Statistics

Introduction
Statistics
Methods
Frequently Asked Questions & Critiques

Introduction

This document provides win rate and play rate statistics for Age of Empires 2 DE across different game modes and map types. Credit to https://aoe2.net/ for providing the data used to generate these statistics.

Statistics

Links	Leaderboard	Map Classification	Lower Elo Inclusion Limit	Remove Single Civilisation Players?
Link	1v1 Random Map	Any	1200	False
Link	1v1 Random Map	Open	1200	False
Link	1v1 Random Map	Closed	1200	False
Link	Team Random Map	Any	2000	False
Link	Team Random Map	Open	2000	False
Link	Team Random Map	Closed	2000	False
Link	1v1 Empire Wars	Any	1100	False
Link	Team Empire Wars	Any	1100	False
Link	1v1 Random Map	Open	1200	True
Link	1v1 Random Map	Closed	1200	True
Link	1v1 Random Map	Open	1700	False
Link	1v1 Random Map	Closed	1700	False

A comparison of naive win rates across the different groups can be found here.

Methods

Confidence Intervals and why they are Important

A common critique you will hear when talking about any statistic is “you can’t trust that value, the sample size is too small!”. A natural question is then “well how big should the sample size be?” or the equivalent question of “how much should I trust this statistic given the sample size”. This is where confidence intervals come in.

A key thing to realise is that when we create statistics, like win rates, what we are creating are estimates of some true unknown value. Confidence intervals can thus be thought of as the range of values in which the true value is likely to be found, i.e. there is a 95% chance that the true win rate lies within this band. Throughout these documents the 95% confidence intervals are presented as error bars around the point estimates. Generally speaking, the wider the confidence interval is the less trust we should have in the estimate, whilst the narrower it is the more trust we should have in the estimate.

Please note that my above description of confidence intervals isn’t technically correct, in that if you repeated it to a statistician they would probably roll their eyes at you or lecture you. That being said, it is good enough to give an intuitive sense of what confidence intervals represent and how to interpret them. If you want a more accurate description please see here.
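
To give a feel for how sample size drives the width of the interval, here is a minimal sketch using the Wilson score interval for a simple proportion. Please note this is an assumption made for illustration only: the intervals in these documents come from the regression models described below, and the function name and the example counts are made up.

```python
import math

def wilson_interval(wins, games, z=1.96):
    """Approximate 95% Wilson score interval for a win rate (z=1.96)."""
    if games <= 0:
        raise ValueError("need at least one game")
    p_hat = wins / games
    denom = 1 + z**2 / games
    centre = (p_hat + z**2 / (2 * games)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / games + z**2 / (4 * games**2)) / denom
    )
    return centre - half_width, centre + half_width

# Same observed 55% win rate, very different amounts of evidence (made-up counts)
small_sample = wilson_interval(11, 20)
large_sample = wilson_interval(1100, 2000)
print(small_sample, large_sample)
```

Both intervals sit around the same point estimate, but the one built from 20 games is far wider than the one built from 2000 games, which is exactly the "how much should I trust this" information the error bars are meant to convey.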

Naive Win Rates

Whenever something is indicated as being a “Naive win rate” it means that it has been calculated by fitting a logistic regression model to each civ’s match data independently, i.e.:

\[ Y_{ij} \sim Bin(1, p_{ij}) \\ p_{ij} = \text{logistic}(\beta_i + \beta_d d_{j}) \]

Where:

\(Y_{ij}\) is 1 if civilisation \(i\) won its \(j\)’th match
\(\beta_i\) is civilisation \(i\)’s logit win rate
\(d_j\) is the difference in mean Elo between team 1 and team 2 in match \(j\)

All mirror matchups are excluded.

It is referred to as the “naive win rate” as it doesn’t take into account the civilisation play rates, and thus it mostly reflects the civilisation’s win rate against the most played civilisations.
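
As a rough illustration of the model above, the sketch below fits the two parameters \(\beta_i\) and \(\beta_d\) for a single civilisation using Newton's method in plain Python. This is not the code used to produce the statistics: the function names, the fitting routine and the toy match data are all assumptions for the sake of the example.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def fit_naive_win_rate(outcomes, elo_diffs, iterations=25):
    """Fit p_j = logistic(beta_i + beta_d * d_j) by Newton's method.

    outcomes:  1 if the civilisation won match j, else 0
    elo_diffs: Elo difference d_j for match j
    Assumes the data are not perfectly separable and the Elo
    differences are not all identical (otherwise det is 0).
    """
    beta_i, beta_d = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for y, d in zip(outcomes, elo_diffs):
            p = logistic(beta_i + beta_d * d)
            w = p * (1 - p)
            g0 += y - p            # gradient w.r.t. beta_i
            g1 += (y - p) * d      # gradient w.r.t. beta_d
            h00 += w               # entries of the 2x2 information matrix
            h01 += w * d
            h11 += w * d * d
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system by hand
        beta_i += (h11 * g0 - h01 * g1) / det
        beta_d += (h00 * g1 - h01 * g0) / det
    return beta_i, beta_d

# Toy data for one civilisation (made up for illustration)
elo_diffs = [-60, -40, -20, -10, 0, 10, 20, 40, 60, -30, 30, 5]
outcomes = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1]

beta_i, beta_d = fit_naive_win_rate(outcomes, elo_diffs)
naive_win_rate = logistic(beta_i)  # win rate at an Elo difference of 0
print(round(naive_win_rate, 3), beta_d)
```

The point of including \(d_j\) is that the reported win rate is the one expected when both sides have equal Elo, rather than the raw proportion of wins.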

Averaged Win Rates

Averaged win rates are calculated by taking the average across all civilisation v civilisation win rates, i.e. the Aztec win rate is calculated by taking the mean of their separate win rates vs Berbers, Britons, Bulgarians, etc. This statistic can be thought of as the win rate if your opponent was picking their civilisation at random.
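
A minimal sketch of the difference between the two ideas is below. The win rates and play rates are made-up numbers, the function names are hypothetical, and I am assuming the average is taken on the probability scale. The play-rate-weighted version is only there for contrast, as it is closer in spirit to what the naive win rate captures.

```python
def averaged_win_rate(matchup_win_rates):
    """Unweighted mean: every opposing civilisation counts equally."""
    return sum(matchup_win_rates.values()) / len(matchup_win_rates)

def play_rate_weighted_win_rate(matchup_win_rates, play_rates):
    """Weighted mean: commonly played opponents count for more."""
    total = sum(play_rates[civ] for civ in matchup_win_rates)
    return sum(
        rate * play_rates[civ] for civ, rate in matchup_win_rates.items()
    ) / total

# Hypothetical Aztec win rates against three opponents
aztecs_vs = {"Berbers": 0.52, "Britons": 0.47, "Bulgarians": 0.55}
# Hypothetical share of games in which each opponent is picked
play_rates = {"Berbers": 0.2, "Britons": 0.5, "Bulgarians": 0.3}

print(averaged_win_rate(aztecs_vs))
print(play_rate_weighted_win_rate(aztecs_vs, play_rates))
```

With these numbers the weighted figure is pulled towards the Britons match-up because Britons are picked most often, whereas the averaged figure treats all three opponents the same.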

For 1v1s each civilisation v civilisation win rate is calculated using the “naive” method mentioned above. For team games though the win rates are calculated by fitting a logistic regression that derives the probability of winning as the average across each pairwise civilisation match-up. For example, let’s say in match \(j\) that team 1 had civilisations A, B and C whilst team 2 had civilisations X, Y and Z. The model fitted would then be:

\[ Y_{j} \sim Bin(1, p_{j}) \\ p_{j} = \text{logistic}\left(\frac{\beta_{AX} + \beta_{AY} + \beta_{AZ} + \beta_{BX} + \beta_{BY} + \beta_{BZ} + \beta_{CX} + \beta_{CY} + \beta_{CZ}}{9} + \beta_d d_{j}\right) \]

Where:

\(Y_j\) = 1 if team 1 won or 0 if team 2 won
\(\beta_{mn}\) is civilisation \(m\)’s logit win rate against civilisation \(n\)
\(d_j\) is the difference in mean Elo between team 1 and team 2
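
To make the formula concrete, here is a small sketch of how the predicted probability for one team match would be computed from a set of already-fitted pairwise coefficients. The function name and all coefficient values are hypothetical; it only evaluates the formula and does not fit the model.

```python
import math
from itertools import product

def team_win_probability(team1, team2, beta, beta_d, elo_diff):
    """Probability that team 1 wins under the pairwise-average model.

    beta maps (m, n) to civilisation m's logit win rate against n.
    """
    pairs = list(product(team1, team2))  # 9 pairs for a 3v3
    mean_beta = sum(beta[(m, n)] for m, n in pairs) / len(pairs)
    return 1.0 / (1.0 + math.exp(-(mean_beta + beta_d * elo_diff)))

team1 = ["A", "B", "C"]
team2 = ["X", "Y", "Z"]

# Hypothetical coefficients: A is slightly favoured against everyone,
# B and C are even against everyone
beta = {(m, n): 0.0 for m, n in product(team1, team2)}
for n in team2:
    beta[("A", n)] = 0.3

p_team1 = team_win_probability(team1, team2, beta, beta_d=0.005, elo_diff=20)
print(round(p_team1, 3))
```

Because the coefficients are simply averaged, the contribution of each pairing is the same regardless of which other civilisations are on the team.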

Please note that a major limitation of this formulation is that it doesn’t allow for any interaction effects, i.e. it doesn’t account for the fact that some civilisation pairings are stronger together than if they were to be considered independently (think a team of all cavalry civilisations vs a team of both archer and cavalry civilisations).

In all cases, a small Laplace smoother was added to avoid issues associated with certainty bias from low civilisation v civilisation sample sizes (most notably in the Empire Wars data). This will mean that the confidence intervals are very marginally underestimated and biased towards 50%; realistically however this should be negligible.
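
The exact way the smoother enters the regression isn't spelled out above, so the sketch below shows only the simple counting version of the idea: add a small number of pseudo wins and pseudo losses to every match-up. The function name and the size of the pseudo count are assumptions.

```python
def smoothed_win_rate(wins, games, pseudo=1.0):
    """Win rate after adding `pseudo` fake wins and `pseudo` fake losses."""
    return (wins + pseudo) / (games + 2 * pseudo)

# A match-up seen only 3 times, all wins: the raw rate would be a "certain" 100%
print(smoothed_win_rate(3, 3))
# A well-sampled match-up is barely affected
print(smoothed_win_rate(550, 1000))
```

This shows both effects described above: tiny samples are pulled noticeably towards 50%, whilst large samples are essentially unchanged.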

Removing Single Civilisation Players

In some cohorts “single civilisation pickers” have been removed. In these cohorts single civilisation players are identified as those who have played at least 10 games and who have a play rate of >50% for any single civilisation. In these cases all matches where that player participated are removed.
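
The rule above could be sketched roughly as follows. The data layout (a list of player/civilisation picks and a list of matches with a "players" field) and all names are assumptions made for illustration; the thresholds are the ones stated in this section.

```python
from collections import Counter

def single_civ_players(picks, min_games=10, max_share=0.5):
    """Return players with >= min_games whose top civilisation exceeds max_share."""
    by_player = {}
    for player, civ in picks:
        by_player.setdefault(player, Counter())[civ] += 1
    flagged = set()
    for player, counts in by_player.items():
        games = sum(counts.values())
        if games >= min_games and max(counts.values()) / games > max_share:
            flagged.add(player)
    return flagged

def remove_matches_with(matches, flagged):
    """Drop every match in which any flagged player took part."""
    return [m for m in matches if not flagged & set(m["players"])]

# Made-up picks: alice mostly plays Franks, bob spreads his picks,
# carol only plays Goths but has too few games to be flagged
other_civs = ["Aztecs", "Berbers", "Britons", "Bulgarians", "Celts", "Chinese",
              "Cumans", "Huns", "Incas", "Japanese", "Khmer", "Mayans"]
picks = (
    [("alice", "Franks")] * 9 + [("alice", "Britons")] * 3
    + [("bob", civ) for civ in other_civs]
    + [("carol", "Goths")] * 5
)
matches = [
    {"id": 1, "players": ["alice", "bob"]},
    {"id": 2, "players": ["bob", "carol"]},
]

flagged = single_civ_players(picks)
kept = remove_matches_with(matches, flagged)
print(flagged, [m["id"] for m in kept])
```

Note that removing whole matches means the opponents of a flagged player also lose that game from their own records.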

Frequently Asked Questions & Critiques

I must add the disclaimer that the answers given to these questions are just my own personal opinions. I must also stress that I am not a game developer. Additionally my opinions are not set in stone; I am always open to hearing counter arguments and am happy to update my opinion in the face of convincing evidence.

1) You shouldn’t include games with players below an Elo of X as those players don’t know how to use the civilisation and make lots of basic mistakes which don’t reflect the civilisation’s overall ability / balance.

I strongly believe that the game should be balanced around all levels of game play and not just the Pros / “perfect play”. It doesn’t matter to me if the game is balanced for the Pros if it is an unbalanced mess at the level I play it at. I think a good example of this is quick-walling. I personally (like many others who play) don’t have the reaction speed nor dexterity to quick-wall like the pros do, so if a unit can be effectively countered by quick-walling should it be considered balanced? Thus my belief that more than just the Pro level needs to be considered when assessing balance. To this end I have used a cut-off of >1200 Elo (or the top ~25% of players) which I feel provides a nice balance between civilisations being played “correctly” whilst still representing a decent proportion of the player base.

2) Most people just play the meta and thus win rates are biased towards that and don’t actually represent the true ability of the civilisation if they were to use all of their available strategies

My current opinion is that observed win rates are a more useful metric than the true / theoretical win rates.

My logic is that if civilisations were truly played perfectly with no mistakes and machine-like macro/micro it is highly likely that the true win rates for all match-ups are either 0 or 100%. I believe it’s human error and the specific strategy selection that introduces uncertainty in the results. Because of this I believe that the theoretical “true win rates” don’t really matter; this isn’t some science where we are trying to evaluate the true natural effect, it’s a game that we are playing for fun and the observed win rates give information about how the game is actually being played. If a civilisation has a crazy high win rate it doesn’t (at least not to me) matter if the civilisation is hypothetically balanced when “played correctly” because the majority of people who are actually playing the game are experiencing imbalance.

3) You shouldn’t base balance changes on win rates

I 100% agree with this. Win rate statistics are one of the most macro level tools we have and shouldn’t be the sole reason nor justification for any balance change. All they can do is help identify general potential problem areas. To inform actual balance changes we need to combine expert opinion, scenario analysis & expected outcomes from the intended game design. This is to say I disagree with the approach of “civilisation X has a win rate of Y, therefore we should buff them”. Instead we should be asking: why is their win rate that, where do they struggle, what is our expectation for them, how does this impact the pro scene, etc.

4) You need to account for players who play the vast majority of their games as a single civilisation

This is a fair critique; play rates do have a large impact on the win rates. See some of my earlier work to get an idea of how they are impacted. That being said, the main impact appears to be a regression towards a neutral 50% win rate, so it is likely that the most played civilisations have their win rate underestimated (I’m looking at you Franks). For the time being I have attempted to partially address this by creating 2 cohorts in which I remove all players whose most played civilisation has a play rate > 70%. Again, this is a long way off solving the issue but should help to somewhat account for it.