Elo Rankings: An Intro to Our New System
- Updated: December 20, 2016
For the past five seasons, fans of American quidditch have had two main sources to turn to when trying to find out where a team stands in comparison to the rest of the league. The first is USQ’s Standings, which, despite being built on as good an algorithm as one could ask for in a quidditch ranking system, have suffered from a lack of frequent inter-regional play before US Quidditch Cup. This has historically led to significantly over-ranked teams and the general caveat that, until late April, these standings must be taken with a grain of salt. Fans can also check The Eighth Man’s Media Rankings, which provide a slightly more accurate view of the teams at the top. However, these rankings cover only the top of the league, and (purposefully) lack the pure objectivity that an algorithm like USQ’s can provide.
For the first time, The Eighth Man is pleased to present a unique ranking system that seeks to provide an answer to the issues presented by both sets of rankings: The Eighth Man Elo Ratings.
What is Elo?
Elo is a rating system invented by and named for Marquette University physics professor Arpad Elo. It was originally designed (and is still used today) to rate chess players, with every individual’s or team’s strength expressed as a single number. Essentially, the higher the number, the better the competitor or team. In general, a team whose Elo is 100 points higher than another team’s has a 64% chance of beating that team on an even playing field, and that probability increases as the Elo difference increases. After two teams play, their Elo ratings are readjusted. If the favored team wins, it gains points, but the higher its predicted chance of winning, the fewer points it gains. This means that good teams will have a hard time inflating their ratings by beating up on weaker teams. If the underdog wins, its rating jumps by significantly more points than the favorite’s would have; think of it as Elo correcting itself to be more in line with that result.
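The 64% figure matches the standard Elo expected-score formula. A minimal sketch, assuming the conventional 400-point scale (the article does not state which scale our model uses, and the function name is only illustrative):

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo win probability for team A against team B.

    The 400-point scale is the conventional chess value and is an
    assumption here, not a confirmed detail of this model.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# A team rated 100 points above its opponent
print(round(expected_score(1600, 1500), 2))  # 0.64
```

The two teams’ probabilities always add up to 1, so the underdog in this example would have roughly a 36% chance.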
Most importantly, Elo is a zero-sum game. If RPI (Elo rating: 1736) were to beat Quidditch Club Boston (Elo rating: 2022) tomorrow by a snitch grab, RPI would gain 20 points, and the same 20 points would be subtracted from Boston’s rating. Conversely, if Boston, as the favored team, beat RPI, Boston would add only four points to its Elo rating, and RPI would lose those four points.
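The RPI and Boston example can be reproduced with the textbook Elo update. This is a sketch under stated assumptions: the 400-point scale is the conventional one, and the effective K-factor of 24 was picked only because it roughly reproduces the figures above. The model’s actual K-factor and snitch-grab multiplier are not given here.

```python
def expected_score(rating_a, rating_b):
    # Conventional 400-point Elo scale (assumed)
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(winner, loser, k=24.0):
    """Return (new_winner, new_loser, points_exchanged) for one game.

    k is a hypothetical effective K-factor (K times any margin
    multiplier) chosen to match the example in the text.
    """
    delta = k * (1.0 - expected_score(winner, loser))
    # Zero-sum: whatever the winner gains, the loser gives up
    return winner + delta, loser - delta, delta


rpi, boston = 1736, 2022
_, _, upset_points = update(rpi, boston)     # RPI wins as the underdog
_, _, favored_points = update(boston, rpi)   # Boston wins as the favorite
print(round(upset_points), round(favored_points))  # 20 4
```

Because the same `delta` is added to one rating and subtracted from the other, the sum of the two ratings is unchanged by any game.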
Unlike chess, where a result is simply a win, a loss or a draw, a quidditch team can lose by 10 points or 110 points. Our Elo model accounts for this by applying a multiplier to the Elo change based on the margin by which the winning team won. This multiplier is based on the SWIM stat used in USQ’s ranking algorithm, so winning by an extra 30 points will help a team’s rating significantly if the score is tied, but will only marginally benefit a team that is already up by over 100.
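To illustrate the diminishing-returns idea only, here is a sketch that uses a logarithm as a stand-in. This is not USQ’s SWIM formula, which is not reproduced in this article. The function name and the divisor of 10 are hypothetical choices.

```python
import math


def margin_multiplier(point_diff: float) -> float:
    """Hypothetical stand-in for a SWIM-based multiplier.

    Grows with the margin of victory, but each extra point is worth
    less than the one before it. NOT the actual SWIM calculation.
    """
    return math.log1p(abs(point_diff) / 10.0)


# The same extra 30 points, in a close game and in a blowout
close_gain = margin_multiplier(40) - margin_multiplier(10)
blowout_gain = margin_multiplier(140) - margin_multiplier(110)
print(close_gain > blowout_gain)  # True
```

Any concave function would show the same pattern: the extra 30 points move the multiplier far more in a close game than in one that is already out of reach.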
Elo’s use in professional sports has been on the rise in recent years—first in international soccer, then later developed and popularized for the NFL, NBA and MLB by FiveThirtyEight. The Eighth Man’s new Elo rankings were in part inspired by these projects.
Elo’s biggest benefit is its consideration of a team’s historical record. Our model currently accounts for every official game going back to September 2012, minus a few whose results could not be found in any source. This means that a team’s Elo rating is influenced by every game it has ever played, with its recent games weighted more heavily. By not resetting every season, Elo is often able to spot quirks that USQ’s algorithm may have gotten wrong. While USQ’s rankings had the Rain City Raptors as a top 12 team going into US Quidditch Cup 9, Elo had their rating at 1581 before the event, which is above average but not top tier. To function at all, Elo must also take into account every team that plays officially, so, unlike The Eighth Man’s Media Rankings, Elo is able to give a relatively accurate ranking to every team in the league, and not just the Top 20.
Elo will still get some things wrong, specifically with newer teams. Right now, Elo has the BosNYan Bearsharks (No. 14 in USQ’s standings and No. 8 in The Eighth Man’s Media Rankings) as No. 48 in the country with an Elo of 1575. This is because our Elo rankings assign new teams a base Elo value of 1300, so a team needs to play numerous games before its Elo rating accurately reflects its skill. The Bearsharks are already the most-improved team this season, having gained 275 Elo points, and the more games they play, the more their Elo rating will reflect their actual talent.
Just like any other ranking system, Elo is not perfect. X factors such as a team’s strategy, injured players, roster changes, depth or experience can’t be accounted for when a team is boiled down to a single number. By no means will Elo replace the analysis, scouting and coverage that you’ve come to know The Eighth Man for, but in situations where less information is available, Elo will be there to fill in the gaps.
We hope to use Elo as a dynamic support for visually and analytically tracking the history of the league. To preview this new system, check out our graph that tracks the Elo rating of every team that has held a top 25 Elo this season. In the future, we hope to use this system to provide broader displays of the league and in-depth looks at the histories of individual teams. As with any new tool, we’re looking forward to working with, tweaking and improving our Elo ranking system as we begin to see it work in real time.
- Our Elo model uses all available official game data, dating back to 2012.
- Teams present in the 2012-13 season were assigned a starting Elo rating of 1500. From the 2013-14 season onward, new teams receive a starting Elo rating of 1300, so as to avoid artificially inflating the Elo ratings of teams they play early in the season.
- After each season, each team’s rating is pulled back toward a mean value of 1500 by one-third of its margin above or below that value. For example, a team with an ending Elo rating of 1800 will start the next season with an Elo rating of 1700. A team with an ending Elo rating of 1200 will start the next season with an Elo rating of 1300. This seeks to account for the inevitable turnover that happens from year to year by tightening the gaps between teams in the rankings. However, it will not reorder any teams from one season to the next.
- Games played during the regular season use a multiplier of the game’s SWIM x 0.8 to adjust Elo ratings. Games played at World Cups, US Quidditch Cups and Consolation Cup use a multiplier of the game’s SWIM x 1.2 to adjust Elo ratings.
- Crimson Elite’s scores during the first eight games of the 2015-16 season were counted as their original scores, not as 0-150* forfeits, in order to preserve the accuracy of the Elo ratings of the teams they played during those games. Otherwise, all ranked forfeits are counted in Elo as recorded.
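The starting ratings, between-season adjustment and event weighting in the notes above can be sketched as follows. The constants come from the notes. The function and variable names are illustrative, and `swim_multiplier` is a placeholder input, since the SWIM-based calculation itself is not spelled out here.

```python
NEW_TEAM_RATING = 1300.0  # teams joining from 2013-14 onward
MEAN_RATING = 1500.0      # also the 2012-13 starting rating


def regress_to_mean(rating: float) -> float:
    """Pull a rating one-third of the way back toward 1500."""
    return rating - (rating - MEAN_RATING) / 3.0


def game_weight(swim_multiplier: float, is_championship: bool) -> float:
    """Scale a SWIM-based multiplier by event type.

    is_championship covers World Cups, US Quidditch Cups and
    Consolation Cup; everything else is treated as regular season.
    """
    return swim_multiplier * (1.2 if is_championship else 0.8)


print(regress_to_mean(1800), regress_to_mean(1200))  # 1700.0 1300.0
```

Because every rating moves the same fraction of its distance from 1500, a team that ends one season ahead of another also starts the next season ahead of it, which is why the adjustment tightens the gaps without reordering anyone.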