I like Nate Silver. I read his political blog virtually every day and followed it religiously during the 2008 presidential election cycle here in the US. He’s a man who knows his statistics and also his sports: he writes for Baseball Prospectus and uses more decimals than I do. A lot more. So yeah, I like Nate Silver and when I learned that he was involved in ESPN’s Soccer Power Index (SPI), I was pretty excited. After all, Nate has a bigger brain than I do and maybe I would be able to learn a few things about approaching football/soccer statistically.
Nate’s article is here. The actual SPI rankings are here and you can find the full methodology here. You can read about how smart the guy is here.
I’m not here to bash the SPI or anything like that. I think it’s probably being bashed enough around the footballing world as just another American attempt to force the beautiful game into a box that is easily understood via a spreadsheet. And anyway, Nate himself writes that the “SPI is designed to serve as a general guideline” rather than a one-stop shop for who is going to win the next World Cup.
Basically, I can’t quibble with the numbers involved or how they’re calculated because I don’t have the background to truly evaluate them, but I do think some parts of this are worth discussing, even from a layman’s point of view. If I understand it correctly, the idea behind the SPI is to rank who is most likely to win the next World Cup or, at least, to determine which of two teams is more likely to win a given match. That might sound contradictory to my previous statement that it’s not designed to do that, but it’s in the semantics: “most likely” means it’s about predictive power rather than “Team A is the best in the world” (I hope that makes sense).
This is what the methodology says:
The SPI proceeds in four major steps:
Calculate competitiveness coefficients for all games in database
Derive match-based ratings for all international and club teams
Derive player-based ratings for all games in which detailed data is available
Combine team and player data into a composite rating based on current rosters; use to predict future results.
I added the emphasis. There are, of course, lots of numbers (and decimals!) involved in this, but I highlighted what I think might be the key problem. Nate writes, “…basically, we’ve taken results from every recent game in the four key European leagues (England, Germany, Italy and Spain), plus the Champions League, and assigned credit or blame to the individual players on the pitch based on the results of those matches.” This “credit or blame” is what’s bothering me.
If you’ve read this site for long enough (or its original iteration at The Offside), you’ll know that I worked for a little while on just such a system (though certainly with less brain power and fewer resources), so you’ll also know that I’m pretty open to these ideas. It’s just that after working on these things for a while, I realized, as some commenters had pointed out early on, that there really isn’t a good way to measure football statistically for each player. You can take teams and figure out a lot of what’s going on in the games through the total numbers, but assigning individual blame or credit is a bit misleading for several reasons. The first is that there are questions about the stats themselves. Someone must be subjectively judging whether a pass was on target, whether the player receiving it was in the right position to receive it, etc. There are so many variables that I find it basically impossible to judge how a simple Player A to Player B pass should be construed statistically. Once you add in the 9 other players on that team and the full 90 minutes of action, it’s basically impossible to figure out who did what and who was better by the numbers. That’s why watching the match and getting an understanding of the system and who did well within those particular 90 minutes is so important (and part of what makes Kevin’s reviews so important and difficult).
The SPI gets around this whole thing rather handily: “The first and foremost requirement is that the sum of the ratings for all individual players on a team must equal their team’s rating for that game; soccer is too much of a team sport to make any other assumption.” Fair enough, I suppose, but I still think it presupposes that the stats are being kept in regulated, methodical, and standardized ways. I’m not so sure they are. For instance, how do they know how far Xavi ran on a given day? How do they track that, exactly? And is it different in La Liga and the Champions League or in internationals? Are they like NBA stats keepers?
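To make that sum-to-the-team-rating requirement concrete, here is a minimal sketch in Python. It is not the SPI's actual formula (the methodology quoted above doesn't spell out how the split is done); the function name, the made-up raw scores, and the simple proportional split are all my own assumptions for illustration.

```python
def allocate_team_rating(team_rating, raw_scores):
    """Scale hypothetical per-player raw scores so they sum to the team rating.

    Assumption: credit is split in proportion to each player's raw score.
    The real SPI may weight players very differently.
    """
    total = sum(raw_scores.values())
    if total == 0:
        # Nothing to split on, so share the team rating equally.
        share = team_rating / len(raw_scores)
        return {player: share for player in raw_scores}
    return {
        player: team_rating * score / total
        for player, score in raw_scores.items()
    }


# Made-up numbers for three placeholder players in a single match.
raw = {"Player A": 3.0, "Player B": 1.0, "Player C": 1.0}
ratings = allocate_team_rating(2.5, raw)

for player, rating in ratings.items():
    print(f"{player}: {rating:.2f}")
print(f"Sum of player ratings: {sum(ratings.values()):.2f}")
```

Whatever the raw scores are, the player ratings add back up to the team's 2.5. Notice that this does nothing to settle whether the raw scores themselves were recorded consistently, which is the part I'm worried about.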
Given all that, what’s the point anyway? I’m not a big fan of “objective” polls (I hate the BCS with a burning passion) because I don’t think they’re any more objective than the subjective polls that at least admit to not being objective. I care about NCAA men’s basketball rankings only insofar as they get my Kansas Jayhawks (ranked #1 incidentally) into the tournament with a better seed, but I don’t like it and I can’t really get behind any Rating Percentage Index (RPI) for any sport (NBA springs to mind). RPI basically tells me what I already knew as a fan of the sport: what teams are good.
I know what teams are good in international soccer. If I were to blindly rank teams based purely on the speculative nonsense that runs around my head after a long weekend of watching internationals, I’d come up with a list that is basically the same as the SPI. It wouldn’t be exactly the same (I think Spain is better than Brazil right now, for instance), but it would certainly be within the realm of the understandable. After all, compare the FIFA rankings with the SPI’s. They are very similar (they have 8 of the same top 10) and yet also different (Italy is 4th according to FIFA and 13th according to SPI*). In fact, the SPI looks a lot more similar to the Elo rankings, but they’re all basically the same lists in slightly different orders.
So, I appreciate what the SPI is doing because I like the idea of being able to understand more about the game through statistics, but I’m not sure what it really does. I’d be okay if FIFA adopted this system instead of what they have for the seeding of teams in the World Cup, but I don’t suppose it really matters either way since both systems are going to be flawed in some manner and I don’t have the brainpower to compare the two outright.
Take what you will from the SPI, but I’ll be checking in on it as well as on the Castrol Rankings that Kevin discussed in a previous post. And yeah, I’ll hopefully be writing up something about those rankings later this week, but suffice it to say that I’m not a big fan already. Long story short: the SPI has a lot of decimals. I like that.
*One reason for that might be SPI’s updates compared to FIFA’s. November 20 will be the next time FIFA updates their rankings (after the WCQs are all wrapped up), so it might be better to compare the two then rather than now.