After quite a bit of work from both xparru and myself, we have something of quality to show. I love meeting smart people, especially those interested in Overwatch. Our conversation began in the Winstonslab Discord channel, which I recommend to anyone with a good idea for analyzing Overwatch who wants to start a conversation.
With that, we set out to classify Overwatch League players from a fantasy perspective.
The first stage of Overwatch League season one has just finished, and we have been treated to superstar-level clutch plays, tight matches, and quality entertainment. Furthermore, over 2500 fantasy league contests have been created on the Winston’s Lab Overwatch fantasy league service. Since we have a steady stream of match data that we use for the fantasy league, we ran some in-depth statistical analysis on it to give you more insight into the stage 1 matches. Here we present our findings and methods, and explain some of the most interesting results in detail. After working through the example analyses, you should be able to interpret the results yourself and perhaps tune your fantasy league roster for stage two!
We used two exploratory statistical methods: Principal Component Analysis (PCA) and cluster analysis. Both give results that are easy to visualize and relatively straightforward to interpret, while packing more information into a single view than your average table of numbers. We model stage 1 using the numbers of kills, deaths, ultimates, first kills, and first deaths (weighted by time played), all obtained from Winston’s Lab, as variables. Our aim is to expose different playstyles, showcase outliers, and enable better decisions when fixing your roster. We also hope to bring more insight and fun to the Overwatch fantasy league and give you more tools to play with.
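If you want to try this kind of preprocessing yourself, here is a minimal Python sketch of what "weighted by time played" could look like. It assumes the weighting means dividing each total by minutes played, followed by standardising each variable so that no single stat dominates the PCA. The player names and numbers are made up for illustration; they are not Winston’s Lab data, and this is not necessarily the exact pipeline behind the figures.

```python
from statistics import mean, pstdev

# Hypothetical raw totals per player (made-up numbers, not Winston's Lab data).
raw = {
    "player_a": {"kills": 300, "deaths": 120, "minutes": 200.0},
    "player_b": {"kills": 150, "deaths": 90, "minutes": 100.0},
    "player_c": {"kills": 90, "deaths": 110, "minutes": 120.0},
}

VARIABLES = ["kills", "deaths"]


def per_ten_minutes(stats):
    """Convert raw totals into per-10-minute rates (assumed weighting)."""
    return {v: 10.0 * stats[v] / stats["minutes"] for v in VARIABLES}


def standardize(rows):
    """Z-score each variable across players (mean 0, population std 1)."""
    out = {name: {} for name in rows}
    for v in VARIABLES:
        values = [rows[name][v] for name in rows]
        mu, sigma = mean(values), pstdev(values)
        for name in rows:
            out[name][v] = (rows[name][v] - mu) / sigma
    return out


rates = {name: per_ten_minutes(stats) for name, stats in raw.items()}
z = standardize(rates)

for name in raw:
    print(name, rates[name], z[name])
```

Note how player_a and player_b end up with the same kill rate even though their totals differ by a factor of two; that is the effect time weighting has on the plots.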
Let’s take a look at the very first stage of Overwatch League. We all remember the highlight performances of JAKE, Fleta, and Pine in the first week, while Carpe dominated the killfeed in weeks two and three, not to mention the deadly Zenyatta plays by JJoNak and Bdosin. At the same time, some teams and players underperformed and did not meet expectations. Nevertheless, here is how stage 1 looks after PCA and cluster analysis. We will of course start with the DPS / Flex players:
What you should focus on in figure 1 above:
- The axes, principal component 1 (PC1) and principal component 2 (PC2), and how much of the variance they explain. Together PC1 and PC2 explain over 85 % of the variance here, which is considered extremely high. It means that this plot captures over 85 % of the variation in the data, which makes interpretation easy for you. Most importantly, it allows us to present the results in a single two-dimensional figure.
- The direction of a red arrow is the direction in which a player moves as that variable increases. For example, kill points mostly change along PC1 (x-axis), where Gido (far right) performs poorly and Pine (far left) extremely well.
- The length of a red arrow tells you how much that variable varies between players. For example, the variance in kill points is much higher than the variance in first kill points.
- Players inside the same ellipse resemble each other in terms of the variables used. The similarity / dissimilarity was determined via cluster analysis. You can invent your own names for the categories: for example, the left-most ellipse could be *some meme here*. The farther apart two players / ellipses are, the less they resemble each other. Note that a player is a so-called outlier if he belongs to a cluster but does not fit inside its ellipse; those players are doing something differently! (See the color coding legend.)
- The size and shape of the marker under each player name represent the reliability of the estimate: the smaller the marker, the less reliable the estimate. Players with the same marker belong to the same cluster.
- The question “what do the PC1 and PC2 axes represent?” is the one you should answer last. It is also the most difficult. Note that here PC1 alone already explains 75 % of the variance.
- PC1: Kills, first kills, and ultimate points seem to change in the same direction, while death points and first deaths change in the opposite direction with respect to the PC1 axis. Thus, the PC1 axis could be interpreted as “ability to generate points with 75 % ‘accuracy’”, where the high scorers are further to the left.
- PC2: there appears to be more change in deaths, first deaths, and ultimates than in first kills and kills (see the red arrow directions), where first deaths and deaths change in the same direction and ultimate points in the opposite direction. Thus, PC2 can be interpreted as “ability to lose points vs. gain ultimates with 9.8 % ‘accuracy’”, where point-losers are higher up (death points are a negative thing, resulting from being eliminated in the game).
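For readers who want to see where the "variance explained" percentages and the red arrows come from, the following is a rough, dependency-free Python sketch of PCA. It uses power iteration on the covariance matrix, which is a simple stand-in and not necessarily what produced the figures. The five rows of data are made up, and the names `covariance`, `top_eigenpair`, and `pca` are our own for this illustration.

```python
import math

# Hypothetical standardised stats for five players (made-up numbers);
# columns: kills, first kills, deaths.
rows = [
    [2.0, 1.5, -1.0],
    [1.0, 1.2, -0.2],
    [0.1, -0.3, 0.4],
    [-1.2, -0.9, 1.1],
    [-1.9, -1.5, -0.3],
]


def covariance(rows):
    """Return the sample covariance matrix and the mean-centred rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
           for i in range(d)]
    return cov, centred


def top_eigenpair(m, iters=500):
    """Power iteration: largest eigenvalue and unit eigenvector of symmetric m."""
    d = len(m)
    v = [1.0 / (i + 1) for i in range(d)]  # arbitrary non-zero start vector
    for _ in range(iters):
        w = [sum(m[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    value = sum(v[i] * m[i][j] * v[j] for i in range(d) for j in range(d))
    return value, v


def pca(rows, k=2):
    """Return explained-variance ratios, loadings, and player scores."""
    cov, centred = covariance(rows)
    d = len(cov)
    total_variance = sum(cov[i][i] for i in range(d))
    m = [row[:] for row in cov]
    ratios, components = [], []
    for _ in range(k):
        value, vec = top_eigenpair(m)
        ratios.append(value / total_variance)
        components.append(vec)
        # Deflate: remove the component just found before searching again.
        m = [[m[i][j] - value * vec[i] * vec[j] for j in range(d)]
             for i in range(d)]
    scores = [[sum(c[j] * vec[j] for j in range(d)) for vec in components]
              for c in centred]
    return ratios, components, scores


ratios, components, scores = pca(rows, k=2)
print("variance explained by PC1, PC2:", ratios)
print("loadings (one red arrow per variable):", components)
```

The explained-variance ratios correspond to the percentages on the axes, each column of the loadings corresponds to one red arrow, and `scores` gives each player's position in the plot. The sign of a component is arbitrary, which is why "good" can end up on the left of a figure.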
That’s it! At first it may feel like a difficult task, but we can almost guarantee that your initial intuitive interpretation was pretty much the same as the one formulated above. Next, we present the same plot, but instead of the player name each point is labelled with the corresponding player’s most-played hero:
Now we can compare Tracer players, for example, by finding a Tracer among the most-played-hero labels in figure 2 and checking the corresponding player name in figure 1. Let’s look at Tracer played by Carpe and Tracer played by clockwork: they are far away from each other and sit in different clusters (ellipses), which means that there is some difference in their performance / play style. Another example of contrast is McCree by Pine vs. McCree by uNdeAD. Remember that the points are weighted by time played, so the absolute point totals of Pine and uNdeAD are not that far apart (522 vs. 590), but when Mr. Pine is on the field, things start to go sour for the opposing team!
Now, about those outliers, meaning the points that do not fit inside an ellipse but are still counted as belonging to that cluster. One clear outlier is Gido with Zenyatta, who forms a cluster of his own. Why? It could simply be due to a short time played combined with a bad day on Zenyatta. Another interesting outlier is Profit on Tracer, who is clearly separate from his cluster (Pine et al.). Why? Let’s remind ourselves what PC2 represents: “ability to lose points vs. gain ultimates”. If you look at the statistics, it appears that Profit loses far fewer points than the other players in his cluster. Slippery Tracer!
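One way to make "does not fit inside the ellipse" concrete is the squared Mahalanobis distance from a point to its cluster. The sketch below assumes the ellipses are 95 % covariance ellipses in the PC1/PC2 plane, which is our assumption for illustration and not something the figures confirm. The cluster coordinates are made up, and `mahalanobis_sq` and `outside_ellipse` are hypothetical helper names.

```python
# 95 % quantile of the chi-square distribution with 2 degrees of freedom.
CHI2_95_2DOF = 5.991


def mahalanobis_sq(point, cluster):
    """Squared Mahalanobis distance of a 2-D point from a 2-D cluster."""
    n = len(cluster)
    mx = sum(p[0] for p in cluster) / n
    my = sum(p[1] for p in cluster) / n
    sxx = sum((p[0] - mx) ** 2 for p in cluster) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in cluster) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in cluster) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mx, point[1] - my
    # Quadratic form with the inverse of the 2x2 covariance matrix.
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def outside_ellipse(point, cluster, threshold=CHI2_95_2DOF):
    """True if the point lies outside the assumed 95 % ellipse."""
    return mahalanobis_sq(point, cluster) > threshold


# Hypothetical (PC1, PC2) scores for the members of one cluster.
cluster = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

print(outside_ellipse((0.5, 0.5), cluster))  # a point at the cluster centre
print(outside_ellipse((3.0, 3.0), cluster))  # a point far from the cluster
```

A player like Profit would be assigned to a cluster by the clustering step, and then flagged by a check like this as sitting outside that cluster's ellipse.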
With these few examples, you should now be able to make your own interpretations and comparisons from the two plots above. Furthermore, you should be able to do the same thing with similar plots generated for Tank and Support players as well. Let’s take a look at the Tanks next:
The red arrows in figure 3 represent the same variables as before (figures 1 and 2), players within the same ellipse / color code are again similar to each other, and so on. First, we note that the clusters are now more dispersed, indicating that there is more difference between tank players. Interestingly, the ultimate, death, and death point arrows now increase in the same direction, unlike for the offense players. Why, and what does it mean? This becomes more obvious if we switch the labels from player names to most-played hero:
We can immediately see that the plot in figure 4 is extremely polarised in terms of heroes: deep-diver Winston on the left, and D.Va on the right. If you look at the red arrows closely and think about the context, you realise that they tell an interesting story about mobility and survivability. Once Winston jumps and D.Va jets into a fight, followed by Mercy, Genji, and Zenyatta, while Tracer is hopefully buzzing somewhere in the opponent’s backline, the battle between the two teams begins. Should the dive team lose the fight, who is usually the sole survivor? Correct! It is D.Va, backing off with Defense Matrix on and eventually flying away on her boosters, while poor Winston is slammed to the ground after the famous “monkeymonkeymonkeymonkey” call. This is one story that the above figure tells; the rest we will leave for you to analyze! (Don’t forget to look at the Gamsu-Gesture-FaTe trio isolated in the bottom left corner. Hint: you should consider having one of these guys in your roster.)
Finally, the supports. After testing, trying, and analysing with different numbers of clusters, we found four to be the most appropriate. Before you look at the image, try to imagine how it will look. Which players and heroes are in the same cluster? How are they scattered with respect to the “red arrows”? Make an intuitive guess.
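As an aside on how one might compare different numbers of clusters: the sketch below runs a plain k-means for several values of k and prints the within-cluster sum of squares, so you can look for the "elbow" where adding clusters stops helping much. We do not claim k-means is the clustering method behind the figures; it is simply a common choice, used here as an assumption. The points are made-up 2-D coordinates, and the function names are ours.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_point_init(points, k):
    """Deterministic start: repeatedly add the point farthest from all centres."""
    centres = [points[0]]
    while len(centres) < k:
        centres.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centres))
        )
    return centres


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns cluster labels and within-cluster sum of squares."""
    centres = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda i: sq_dist(p, centres[i])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # keep the old centre if a cluster ends up empty
                centres[i] = tuple(sum(c) / len(members) for c in zip(*members))
    inertia = sum(sq_dist(p, centres[lab]) for p, lab in zip(points, labels))
    return labels, inertia


# Hypothetical (PC1, PC2) scores forming four loose groups (made-up numbers).
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 0.0), (10.0, 1.0), (11.0, 0.0),
    (0.0, 10.0), (1.0, 10.0), (0.0, 11.0),
    (10.0, 10.0), (11.0, 10.0), (10.0, 11.0),
]

for k in range(1, 6):
    labels, inertia = kmeans(points, k)
    print(k, round(inertia, 2))
```

In practice the elbow is rarely this clean, which is why picking the number of clusters also involves looking at the plots and asking whether the groups make sense for the game.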
You should be familiar with the formula already: 1) the amount of variance explained by PC1 and PC2, 2) the red arrows, 3) the clusters, 4) what PC1 and PC2 represent. You may have guessed that the figure is dominated by wonderful-and-caring Mercy and deadly Zenyatta hero-player pairs. Furthermore, the lethal trio JJoNak, Bdosin, and Neko are in their own cluster in the top right corner, implying that you should not mess with these guys. For a fantasy league venturer, it also means that you should have these players in your roster!
You could say that the above figure tells a story of healer vs. killer. Mercy’s ultimate points include her points generated from resurrections, too. Furthermore, ultimate points in general are an interesting variable: they are an excellent proxy measure for overall performance, since they are a consequence of damage dealt or healing done. Let’s lay our eyes on the outlier, Freefeel playing as Zenyatta, and why he is so far from his cluster. He has about the same amount of ultimate points as other Zenyatta players; however, his performance as a killer Zen is lacking, since his kill points are much lower than those of other Zenyattas.
This completes our results and analysis section. We encourage you to dig into these results yourself, and while you are doing it, it can be helpful to keep the Winston’s Lab fantasy player-stat page open alongside. Should you wish to discuss more or ask something, comment below, come by the Winston’s Lab Discord, or make a Reddit post or comment. We appreciate any feedback, and if you like this, we will be providing this type of analysis in the future as well!
We note that not all of the i.i.d. assumptions are perfectly met, since the variables are not fully independent. Furthermore, the principal component analysis is an approximation in this context rather than an exact representation. Importantly, the variables are weighted by time played, which anyone adjusting their roster according to these results should carefully consider. Readers should be aware that any decision concerning their fantasy league roster is their own responsibility.