Flow Motifs

This blog post explains the Flow Motifs tool as created by UnravelSports.com.

About a year ago I came across a research paper called “Who can replace Xavi? A passing motif analysis of football players” by Peña & Navarro. They applied the notion of network motifs to players in the Spanish and English top tier and found that Xavi was in a league of his own when it came to involvement in passing motifs.

Because I was really intrigued by the question posed by Peña and Navarro, I decided to try and answer it. This seemingly innocent question resulted in a project spanning multiple months, for which I had to learn a programming language (Python). I ended up writing a research paper (together with Dr. Dabadghao from the Eindhoven University of Technology) and getting the opportunity to present our findings at the MIT Sloan Sports Analytics Conference in Boston on the 3rd and 4th of March 2017, as part of their research paper competition.

In the rest of this blog post I will explain the general idea behind flow motifs and their implications for players and teams. We analyzed over 7,000,000 passes by 3532 unique players and 155 teams across a total of 8219 matches played in the 2012/13 to 2015/16 seasons of the Premier League, La Liga, Bundesliga, Serie A, Ligue 1 and Eredivisie.

Flow Motifs

We see flow motifs as the building blocks of both player (P) and team (T) passing behavior. We differentiate between two types:

  • Possession Motifs (PMs): a sequence of at least 3 passes a team/player creates that does not lead to a goal attempt. In Figure 1 we see ABAB, BABC and ABCD respectively.
Figure 1. ABAB, BABC, ABCD Motifs
  • Expected Goal Motifs (xGMs): a sequence of at least one pass that leads to a goal scoring opportunity with a certain expectation of being converted. In Figure 2 we see expected goal motif ABACG.
Figure 2. ABACG Expected Goal Motif

Consecutive passes are subject to a 5-second threshold: any transition with an interval time greater than this upper bound is not considered part of any motif.
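To make the definition a bit more concrete, here is a minimal Python sketch of how possession motifs could be pulled out of a list of passes. It is an illustration, not the code used for the paper. The `(time, passer, receiver)` tuple format, the function names, and the choice to slide a window of three passes over each uninterrupted chain are all assumptions made for this example.

```python
from collections import Counter

MAX_GAP_SECONDS = 5.0  # the 5-second threshold from the text
MOTIF_PASSES = 3       # three passes give four-letter labels such as ABAC


def motif_label(players):
    """Relabel players by order of first appearance: [7, 10, 7, 9] -> 'ABAC'."""
    letters = {}
    for player in players:
        if player not in letters:
            letters[player] = chr(ord("A") + len(letters))
    return "".join(letters[player] for player in players)


def split_into_chains(passes, max_gap=MAX_GAP_SECONDS):
    """Split (time, passer, receiver) tuples into uninterrupted pass chains.

    Assumption: a chain breaks when the gap between two passes exceeds
    max_gap, or when the next passer is not the previous receiver.
    """
    chains, current = [], []
    for current_pass in passes:
        if current:
            prev_time, _, prev_receiver = current[-1]
            time, passer, _ = current_pass
            if time - prev_time > max_gap or passer != prev_receiver:
                chains.append(current)
                current = []
        current.append(current_pass)
    if current:
        chains.append(current)
    return chains


def possession_motifs(passes):
    """Count three-pass motifs with a sliding window over every chain."""
    counts = Counter()
    for chain in split_into_chains(passes):
        # The first passer followed by every receiver in the chain.
        players = [chain[0][1]] + [receiver for _, _, receiver in chain]
        for i in range(len(players) - MOTIF_PASSES):
            counts[motif_label(players[i:i + MOTIF_PASSES + 1])] += 1
    return counts


# Hypothetical passes: (seconds, passer shirt number, receiver shirt number).
example = [
    (0.0, 7, 10), (2.0, 10, 7), (4.5, 7, 9), (6.0, 9, 4),
    (20.0, 4, 7), (21.0, 7, 4), (22.0, 4, 7),  # starts 14 seconds later
]
print(possession_motifs(example))
```

In this made-up example the 14-second pause splits the passes into two chains. The first chain yields ABAC and ABCD, the second yields ABAB.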

Ultimately there are 5 different Team Possession Motifs (TPM) and 8 Team Expected Goal Motifs (TxGM). Because a player can be considered to be A in multiple positions in a motif, we identify 15 Player Possession Motifs (PPM) and 22 Player Expected Goal Motifs (PxGM). For instance, motifs ABAB and BABA are considered the same motif within a team, but they are distinct motifs from a player's perspective.
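These counts can be checked with a small enumeration. The sketch below rests on two assumptions that are not spelled out above: possession motifs consist of exactly three passes, and expected goal motifs consist of one to three passes followed by a goal attempt (G). The function names are made up for this example. Under those assumptions the numbers 5, 8, 15 and 22 come out as expected.

```python
def team_motifs(length):
    """Canonical pass sequences over `length` player slots.

    Letters are handed out in order of first appearance and a player
    never passes to themselves, so 'ABA' is valid but 'AAB' is not.
    """
    found = []

    def extend(prefix):
        if len(prefix) == length:
            found.append(prefix)
            return
        used = len(set(prefix))
        # Reuse any earlier letter except the current ball holder,
        # or introduce the next unused letter.
        for k in range(used + 1):
            letter = chr(ord("A") + k)
            if letter != prefix[-1]:
                extend(prefix + letter)

    extend("A")
    return found


def player_views(motif):
    """Relabel a team motif once per distinct player, making that player 'A'."""
    views = []
    for focal in sorted(set(motif)):
        mapping = {focal: "A"}
        for letter in motif:
            if letter not in mapping:
                mapping[letter] = chr(ord("A") + len(mapping))
        views.append("".join(mapping[letter] for letter in motif))
    return views


# Assumption: PMs have 3 passes (4 slots), xGMs have 1 to 3 passes (2 to 4 slots).
team_pm = team_motifs(4)
team_xgm = [m + "G" for n in (2, 3, 4) for m in team_motifs(n)]
player_pm = [v for m in team_pm for v in player_views(m)]
player_xgm = [v + "G" for n in (2, 3, 4)
              for m in team_motifs(n) for v in player_views(m)]

print(team_pm)  # ['ABAB', 'ABAC', 'ABCA', 'ABCB', 'ABCD']
print(len(team_pm), len(team_xgm), len(player_pm), len(player_xgm))  # 5 8 15 22
```

For example, the team motif ABAC yields three player views: ABAC, BABC and BCBA, depending on which of the three players is treated as A.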

Unique Play Styles

By plotting every player's motif intensity (motifs per 90 minutes) against their individual use probability (how often a single player uses a motif as a percentage of all the motifs they use) and then coloring the nodes by each player's preferred position, we identify a clear link between player position and motif use. In Figure 3 and Figure 4 we show two examples, for motifs ABCD and ABCB respectively.

Figure 3. Involvement in possession motif ABCD as a percentage of all possible motifs, against the percentage of time it is used per 90 minutes. This figure includes all 3532 unique players.
Figure 4. Involvement in possession motif ABCB as a percentage of all possible motifs, against the percentage of time it is used per 90 minutes. This figure includes all 3532 unique players.
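The two axes in these plots can be computed along the following lines. This is only a sketch: the function name, the dictionary input and the numbers in the example are hypothetical, and it assumes motif counts and minutes played are already available per player.

```python
def motif_profile(motif_counts, minutes_played):
    """Return {motif: (intensity per 90 minutes, use probability)} for one player.

    intensity       = motif count scaled to 90 minutes of playing time
    use probability = motif count divided by the player's total motif count
    """
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    total = sum(motif_counts.values())
    profile = {}
    for motif, count in motif_counts.items():
        intensity = 90.0 * count / minutes_played
        use_probability = count / total if total else 0.0
        profile[motif] = (intensity, use_probability)
    return profile


# Hypothetical player: 1800 minutes played, 200 motifs in total.
counts = {"ABCD": 120, "ABCB": 40, "ABAB": 40}
for motif, (intensity, probability) in motif_profile(counts, 1800).items():
    print(motif, round(intensity, 2), round(probability, 2))
```

For this made-up player, ABCD has an intensity of 6 per 90 minutes and a use probability of 0.6, so ABCD would be plotted at (6, 0.6) in a chart like Figure 3.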

Unique Style Clusters

We now calculate 3532 vectors containing every individual player's involvement in every possible motif per 90 minutes played, for both PPMs and PxGMs. By subsequently applying Mean Shift clustering to these vectors we identify eight different player clusters. Table 1 shows the clusters with fewer than 100 members for Player Possession Motifs.

Size | Cluster Member(s) | Classification
1 | Iniesta | -
1 | Rafinha (Bayern) | -
1 | Denswil | -
3 | Benatia, Busquets, Xabi Alonso | Central/Defensive
6 | Kimmich, Weigl, Verratti, Thiago Alcantara, Thiago Motta, Xavi | Central Midfielders
Table 1. Mean Shift Clusters by PPMs (estimated bandwidth) with fewer than 100 cluster members.

Here we see that Iniesta, Rafinha and (somewhat surprisingly) Stefano Denswil (Ajax 2012/13) have unique play styles. Furthermore, we see that Benatia, Busquets and Xabi Alonso (all considered central/defensive players) form a cluster with a similar play style, as do Kimmich, Weigl, Verratti, Alcantara, Motta and Xavi (all central midfielders).
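For readers unfamiliar with Mean Shift, the idea is that every point repeatedly moves to the mean of its neighbours within a given bandwidth until it settles on a density peak, and points that settle on the same peak form a cluster. The sketch below is a simplified flat-kernel version in plain Python, written only to illustrate the idea. It is not the implementation behind the tables, the bandwidth is picked by hand here rather than estimated, and the two-dimensional "player vectors" are invented.

```python
import math


def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift. Returns (labels, centres)."""
    modes = []
    for start in points:
        centre = list(start)
        for _ in range(max_iter):
            # All points within `bandwidth` of the current centre.
            neighbours = [p for p in points if math.dist(p, centre) <= bandwidth]
            new_centre = [sum(axis) / len(neighbours) for axis in zip(*neighbours)]
            shift = math.dist(new_centre, centre)
            centre = new_centre
            if shift < tol:
                break
        modes.append(centre)

    # Points whose modes ended up within one bandwidth share a cluster.
    centres, labels = [], []
    for mode in modes:
        for index, existing in enumerate(centres):
            if math.dist(mode, existing) < bandwidth:
                labels.append(index)
                break
        else:
            centres.append(mode)
            labels.append(len(centres) - 1)
    return labels, centres


# Hypothetical 2-D motif vectors: two groups of players and one outlier.
vectors = [
    (1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
    (5.0, 5.0), (5.1, 4.9),
    (9.0, 1.0),
]
labels, centres = mean_shift(vectors, bandwidth=1.0)
print(labels)  # [0, 0, 0, 1, 1, 2]
```

The outlier ends up in a cluster of its own, which is the same kind of result as the single-member clusters in Table 1: a player whose motif vector is far from everyone else's. Because the number of clusters is not set in advance, it depends entirely on the bandwidth.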

In Table 2 we show the results from applying the same analysis to PxGMs. Here Mean Shift creates 13 different clusters; we show the clusters with fewer than 20 members.

Size | Cluster Member(s) | Classification
2 | Lewis Baker, Nicky Shorey | Central Attack
2 | Jacob Mulenga, Sloan Privat | Center Forwards
4 | Imbula, Weigl, Lanzini, Ibe | Defensive Midfielders (excl. Ibe)
9 | C. Ronaldo, Adrian Ramos, Mitrovic, Diafra Sakho, Dzeko, Uche, Coda, Lewandowski, Necid | Center Forwards
16 | Messi, Robben, Morata, Tevez, Sturridge, Lampard, Bale, Hernandez, Higuain, Luis Suarez, Benzema, Ibrahimovic, Depay, Vucinic, Muller | Wingers & Center Forwards
Table 2. Mean Shift Clusters by PxGMs (estimated bandwidth) with fewer than 20 cluster members.

Pep Guardiola’s Style Identified

The same analysis can be done for both team motif types (TPM and TxGM). From the TPM clusters in Table 3 we derive that Paris Saint-Germain has a unique passing style, whereas the passing styles at FC Barcelona and Bayern Munich are closely related. The latter is not surprising considering Pep Guardiola integrated his specific style at Barcelona when he coached them between 2008 and 2012, and then at Bayern Munich after he joined them as coach in 2013. In the third cluster, with 29 teams, we see teams that utilize intelligent possession-based strategies.

Size | Cluster Member(s)
1 | Paris Saint-Germain
2 | FC Barcelona, FC Bayern Munich
29 | Ajax, Arsenal, Borussia Dortmund, Borussia M’gladbach, Celta de Vigo, Chelsea, Empoli, Everton, Feyenoord, Internazionale, Juventus, Las Palmas, Lille, Liverpool, Lyon, Man. City, Man. United, Milan, Napoli, Nice, Real Madrid, AS Roma, Southampton, Swansea City, Tottenham Hotspur, VfL Wolfsburg, Vitesse, Wigan Athletic
123 | All other teams
Table 3. Mean Shift Clustered by TPMs (estimated bandwidth)

Even the more surprising teams in this cluster seem to fit in rather well: Empoli, managed by Maurizio Sarri in 2014/15, who is now head coach at Napoli; Nice, managed by Claude Puel during the past four seasons, who has been head coach of Southampton since the start of the 2016/17 season; Vitesse, coached by Peter Bosz during 2013-2016, who is currently head coach at Ajax; and Wigan Athletic, Swansea City and Everton, all coached by Roberto Martinez. Las Palmas, who finished 11th under Quique Setién in their first year back in the Spanish top tier since 2002, also show up in this cluster. This indicates that Quique Setién might be a suitable coach for other teams looking to play possession-based ‘attractive’ soccer.

The main difference between the first three clusters and the remaining 123 teams seems to be that the first three clusters have a lower probability of executing ABCD and ABCA motifs, and thus a higher probability of using ABAB, ABCB and ABAC. Furthermore, all three clusters have a higher overall average motif intensity per match.

The results for TxGMs are shown in Table 4.

Size | Cluster Member(s)
9 | Arsenal, Barcelona, Chelsea, Juventus, Manchester City, Napoli, Roma, Southampton, VfL Wolfsburg
12 | Ajax, Bayer Leverkusen, Borussia Dortmund, Bayern Munich, Schalke 04, Feyenoord, Liverpool, Man. City, PSG, PSV, Real Madrid, Vitesse
134 | All other teams
Table 4. Mean Shift Clustered by TxGMs (estimated bandwidth)

Conclusion

All in all, we can conclude that using flow motifs for style identification, and thus for player scouting, seems to be a very valuable tool.