Put Down the Slide Rule

One of the under-reported effects of cheap and widely available personal computers is the increasing dorkification of sports.

I'm talking here about the rise in obsessive stat-geekery across the board, with the accompanying increase in "fantasy" sports. Those phenomena have hardly been ignored, but not many commentators put the blame where it belongs: on the computer industry.

Back in the day, stat-wanking was mostly confined to baseball, which is so ridiculously boring that calculus seems like a fun way to spice things up. As computers have become more common, though, it's become easier for sports geeks to crunch numbers, and the statistics mania has started to creep into football and even basketball. At this time of year, pseudo-objective college basketball ratings are as common as ill-advised jump shots. Dave has been tracking Tournament seeding projections for a while now, and now offers some stats that purport to predict tournament outcomes. But if you really want basketball stat-geekery, Ken Pomeroy is your man, tracking and promoting a dizzying array of statistics.

I'll admit that I'm somewhat torn about this. I am, after all, a professional nerd, and enjoy working with numbers, so I can see the appeal of quantitative data. And a lot of the regular statistics used in basketball are pretty crude measures, so I can understand trying to develop better statistics.

As a player and fan, though, I tend to think that a lot of this stuff is just crap. I'm more than a little dubious about the possibility of testing these measures-- the "log5" method claims that UNC has a 51% chance of winning the ACC tournament, but given that they only play the tournament once, it doesn't seem like you've got a way to assess the validity of the prediction. After all, either they win it or they don't, and one measurement doesn't tell you anything about the overall distribution. You can aggregate historical data to see if the method gives you consistent results, but that's not terribly convincing, given that the teams change over time.

Most of my problem with this, though, is that it seems so bloodless. I'm a fan of college basketball because I enjoy playing basketball, and reducing it to just manipulation of numbers sucks all the interest out of it. I'm not into the game to predict the outcomes, I'm into it for the joy and pain of the playing of the game, and while the final score matters, what really matters is what happens on the way to the final score.

And besides, it's a short step from excessive statisticulation to "fantasy" leagues, and those are an absolute blight on the national landscape. If you want to find an example of the widespread availability of powerful computers exerting a detrimental influence on the American character, forget about Internet pornography-- fantasy sports leagues are the real threat.

The irritating thing, I think, is that they appear to be making little effort to isolate important statistics or at least portray the more important ones as being more important. It's like someone took a 2 day course in writing database queries and was then encouraged to do whatever they felt like.

Darn. I thought this post was going to be about slide rules. :(

When I was heavy into fantasy baseball, it made me a much better baseball fan, and I got a lot more out of watching the games. And not because I wanted to follow "my" players, and not _just_ because I knew a lot more about a lot of the players. I found myself intrigued by the details of the game -- the details around pitch selection, or the tactical choices on when to call a hit-and-run.

If you were to treat the stats as a replacement for the qualitative side of the game, you'd lose a lot. But I never found myself tempted in that direction, any more than knowing about black-body spectra and Hubble's constant prevents me from enjoying the beauty of a cloudless night sky.

stats lie a lot, and can be abused.

For example, Dennis Johnson had inferior stats to Dominique Wilkins, and Wilkins is in the BBHOF. But who would you want on your team??????

You are talking teams, and it's hard there too. ACC is best, we did .500 in the ACC, so we're really a .725 type team, and such.

You can lie with statistics. I mean, a theory of many identical systems, in a limit of large numbers of systems, cannot be applied to 30 games played by 10 humans or more... You can quote stats all you want, they are based on a limited sample of non-identical events, and of course THAT'S WHY THEY PLAY THE GAMES as the reverend Tony and Michael would advise us. Stats are a spice, they supplement the meal, but too much is obnoxious.

Now it's time to go watch the NFL Combine, where they quote 40 yard dash times to a 100th of a second whilst ignoring factors such as reaction time, non-electric timing, no wind gauges, etc.....

Hey Chad, give us your take on the RPI please! Useful number or not?


As a player and fan, though, I tend to think that a lot of this stuff is just crap. I'm more than a little dubious about the possibility of testing these measures-- the "log5" method claims that UNC has a 51% chance of winning the ACC tournament, but given that they only play the tournament once, it doesn't seem like you've got a way to assess the validity of the prediction.

I was a little stunned to read this. How do you think meteorologists validate their predictions, which are also one-time events? You take all the 51% log5 predictions, and see if the teams did indeed win 51% of their games. Do this for all the other predictions. I'm taking away your nerd licence.

The log5 method has been validated in this way for MLB baseball, for EPL soccer, and for NBA basketball. I have no idea how well it does for NCAA basketball, but I'm assuming that it does slightly worse because of the amount of noise in NCAA data.
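To make the procedure concrete, here is a minimal Python sketch of that kind of check. The `log5` function uses the commonly cited Bill James formula for a single game; the thread does not say how a tournament-wide number like the 51% was built from it, so that step is left out. The helper name `calibration_table` and the choice of ten equal-width bins are assumptions made purely for illustration, not anyone's published method.

```python
from collections import defaultdict


def log5(p_a, p_b):
    """Log5 estimate that team A beats team B, given each team's winning percentage."""
    return (p_a - p_a * p_b) / (p_a + p_b - 2 * p_a * p_b)


def calibration_table(predictions, outcomes, n_bins=10):
    """Bin forecasts by probability and compare mean forecast with observed win rate.

    predictions: forecast win probabilities between 0 and 1
    outcomes:    1 if the forecast team won, 0 if it lost
    Returns a list of (bin_index, games, mean_forecast, win_rate) tuples.
    """
    bins = defaultdict(list)
    for p, won in zip(predictions, outcomes):
        # A forecast of exactly 1.0 goes into the top bin.
        index = min(int(p * n_bins), n_bins - 1)
        bins[index].append((p, won))

    table = []
    for index in sorted(bins):
        rows = bins[index]
        mean_forecast = sum(p for p, _ in rows) / len(rows)
        win_rate = sum(won for _, won in rows) / len(rows)
        table.append((index, len(rows), mean_forecast, win_rate))
    return table
```

If the forecasts are well calibrated, the mean forecast and the observed win rate in each bin should come out close to each other once the bin holds enough games. A single 51% prediction says nothing on its own; a few hundred of them do.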

Professional and college teams take higher-level stats seriously, even if you don't. I'm surprised by your dismissal of statsgeekery.

By igor eduardo kupfer (not verified) on 28 Feb 2007

I'm torn as well - on the one hand, tempo-free stats are great, much better for comparing teams and playing styles than the old method of just using raw numbers. And great sportswriting can come from them, as John Gassaway at Big Ten Wonk demonstrates. But the continual focus on the NCAA tournament and its seeding months in advance is ridiculous and saps a lot of the enjoyment out of the season if you let it. And the log5 stuff is just silly. (Even a lot of Ken Pomeroy's rankings and game predictions are worthless. He's been predicting UNC and about five other teams would win out the season since mid-January.) I'd much rather, you know, watch the games than map out future events, no matter how much math is involved.

OK first of all - the more stats the team has (that they can validate), the better off they will be. And they'll have an advantage over the "traditionalists" - ever read Moneyball? It wasn't just about stats vs. old scouting. It was using both to find undervalued players to save money. That's an important distinction. They weren't using stats in lieu of traditional methods, they were using new stats to gain an advantage. And that goes very strongly in baseball - more teams need to start paying attention to the new statistics. They're far more strongly correlated than the traditional ones.

As for teams changing over time, OK that's true. But the statistics (I hope) aren't based on just that team and what UNC does. They're based on the historical player pool and what a player of a certain makeup contributes to a team. Get a certain team makeup, look at teams with similar makeups, and you can make an intelligent prediction. It has nothing to do with guessing and making one measurement.

I personally enjoy sports more when there are increasingly complex stats involved. It helps me have a better understanding of the game and what's really going on. And as a player, it shows me strengths, weaknesses and holes to attack. The more detailed, the better. Playing a sport and ignoring the stats is like trying to play with one eye closed. You can do it, you can say you're doing just fine til the cows come home - but as soon as someone else opens their other eye, they'll have the advantage.

It's not a matter of number of stats, cephyn, it's picking the right stats. If you have the record, you can put them in a database (not difficult) and query them to produce any stats you want (also fairly easy). The trick is to be picking the right stats to look at and to weight their relative importance properly (both of which might broadly be described as 'knowing your stats').

I guess there's a few different uses of stats we're talking about here. The easiest one is analysis of individual performance. That's the Moneyball use.

The second is the use to 'rank' teams, like RPI or similar measures. This is probably the most problematic, as it has such a big influence on real-world events. You have to pick 65 tourney teams somehow. One way or another, you'll rank teams; the argument becomes a philosophical discussion about what's the best measure for doing so.

The third is to predict the outcome of games. This for me is an interesting but academic exercise. I suppose if you had a really good model you could make a tidy sum with your bookie. But in the end it's the real games that count. And there's just too much room for performance variation, tactical choices, and game-changing random-chance events to make prediction an exact science.

Perry: Hey Chad, give us your take on the RPI please! Useful number or not?

I'm not a big fan of the RPI. I understand the idea-- it's to avoid the human factor that keeps pedigree teams ranked much higher than they deserve (*cough* Duke *cough*). It's still a system that can be gamed, though, and I'm not really happy with a rating that penalizes teams for playing cellar-dwellers in their own conference.

cisko: The second is the use to 'rank' teams, like RPI or similar measures. This is probably the most problematic, as it has such a big influence on real-world events. You have to pick 65 tourney teams somehow.

I'm not that bothered by the "ranking" that takes place in determining the NCAA tournament field, because I think it's entirely reasonable to believe that you can sort out the thirty-odd best teams in the country. I'm not convinced that it's possible to rank-order those teams in a really sensible way, though.

The third is to predict the outcome of games. This for me is an interesting but academic exercise. I suppose if you had a really good model you could make a tidy sum with your bookie. But in the end it's the real games that count. And there's just too much room for performance variation, tactical choices, and game-changing random-chance events to make prediction an exact science.

This is the part that I don't quite get. I can't cleanly separate attempting to predict outcomes from rooting for teams to win, which is why I always end up filling out my pool sheets to have teams that I like winning it all. I can make some semi-objective assessments of teams, but in the end, sports don't really hold my attention unless I have a rooting interest of some sort. Even when I find myself watching small-conference tournaments, I end up picking one team to root for, because I need somebody to root for.

This is the thing that I really don't get about stat-geeking. When you take the rooting interest out of sports fandom, I just don't find it all that interesting.

I find it psychologically easier to back the teams that I don't want to win, thus hedging my emotional bets.

When I was at college, I remember betting ten quid* on Ireland to win the World Cup at 25 to one, not because I wanted them to win or because I thought that they could, but because I'd need 250 pounds worth of beer to dull the pain of the gloating paddies.

*Strictly speaking, I paid 11 to wager 10, because in those days there was a 10% betting tax which you could pay on the stake as I did, or on your winnings (if you won).

I'm not convinced that it's possible to rank-order those teams in a really sensible way, though.

I agree, but in today's system the NCAA still has to do it. I'd prefer a system that attached almost all tourney slots to conference standings. Make those conference standings really worth something. You could have just a few (say 8) at-large shots, but they would have to take the bottom seeds. You could even have playoffs between minor conference champions to determine who gets one slot -- kind of like the 64/65 game.

I can't cleanly separate attempting to predict outcomes from rooting for teams to win, which is why I always end up filling out my pool sheets to have teams that I like winning it all.

I would sort-of do the same thing, though I usually find a point at which I have no way to predict my team will win. When Cal is in the dance -- not this year, alas -- I'd usually pick them to win to the Sweet 16 or so. Then I'd have them running into a powerhouse that I couldn't picture them beating.

Which isn't to say I'd root against them at that point. No way.

This is the thing that I really don't get about stat-geeking. When you take the rooting interest out of sports fandom, I just don't find it all that interesting.

Wow! First, have you never watched a game as a neutral, just because it was an interesting game? Second, why on earth would stats-geeking be mutually exclusive with rooting interest?

why on earth would stats-geeking be mutually exclusive with rooting interest?

You run into this a lot among the innumerate, the idea that analysis precludes enjoyment, but you have to admit it's an odd statement to hear from a scientist.

By igor eduardo kupfer (not verified) on 01 Mar 2007

Re: picking teams I like to win: I would sort-of do the same thing, though I usually find a point at which I have no way to predict my team will win. When Cal is in the dance -- not this year, alas -- I'd usually pick them to win to the Sweet 16 or so. Then I'd have them running into a powerhouse that I couldn't picture them beating.

I've taken a lot of grief over the years for picking Syracuse or Maryland to go to the Final Four in NCAA pools, but it needs to be a really extreme case before I'll pick them to lose. I'll even put them down to win games that I'm sure they'll lose, because I don't want to be in a situation where it's in my financial interest to root against one of my teams.

Me: This is the thing that I really don't get about stat-geeking. When you take the rooting interest out of sports fandom, I just don't find it all that interesting.

cisko: Wow! First, have you never watched a game as a neutral, just because it was an interesting game? Second, why on earth would stats-geeking be mutually exclusive with rooting interest?

Oh, sure, I'll watch basketball just because it's basketball. But if I watch a game for more than a quarter or so, I end up picking a team to root for. If I can't decide who to root for, I usually start channel-surfing, or reading a book, or something.

I can't really separate rooting from watching. Sports are about winning and losing, and if I don't care who wins or loses, I don't care enough to keep watching.

As I see it, honest stat-geeking requires separating the numbers from the rooting interest. The numbers say what the numbers say, whether you like the team or not. There are people who will sort of cherry-pick statistics to try to show that their team is really the best team, but that's fundamentally kind of dishonest-- it's just a different form of guy-in-a-bar arguing.

I'm not interested in which team is "objectively" the best, according to some statistical measure, I'm invested in whether my team wins or loses. I truly and honestly do not care whether the statistics say that Maryland in 2002 or Syracuse in 2003 was the best team in the nation-- what matters is that they won the championship out on the court.

(Well, OK, I care enough to check for the purposes of this comment: Pomeroy has them at #2 and #5 respectively. But that and a dollar will get you a dollar.)

I don't pretend that this is rational, by the way. Rationality is my day job. Sports are for recreation.

I'm pretty much with you Chad, except for this:

This is the thing that I really don't get about stat-geeking. When you take the rooting interest out of sports fandom, I just don't find it all that interesting.

Well, in the case of fantasy sports, it often adds a rooting interest. I have a few teams I like, a few I don't, and the rest I couldn't care less...unless I have a player on their team. Fantasy football gave me a rooting interest in every game, which was great because I, like you, have to have a rooting interest or I lose interest.

But my interest in sports statistics is in watching the train wrecks that arise (like the NBA draft lottery, or the BCS). Take your point that ranking systems are flawed. This is one reason I don't find criticisms of the NCAA tourney convincing that claim it's flawed because so often "the best team" doesn't win. First off, talk of "the best team" assumes the teams can be ranked one dimensionally, which is clearly nonsense. Often Team A will beat Team B consistently, B will beat C, and yet C will beat A. So no matter how much data you have, "the best" will still be a somewhat subjective judgement.
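The A-beats-B, B-beats-C, C-beats-A situation can be checked by brute force. The short Python sketch below is only an illustration: the function name `consistent_rankings` and the three-team example are made up, and "consistent" is assumed to mean that every team is ranked above each team it has beaten.

```python
from itertools import permutations


def consistent_rankings(teams, beats):
    """Return every ordering in which each winner is ranked above the team it beat."""
    valid = []
    for order in permutations(teams):
        position = {team: i for i, team in enumerate(order)}
        if all(position[winner] < position[loser] for winner, loser in beats):
            valid.append(order)
    return valid


# Hypothetical results: each pair is (winner, loser).
cycle = [("A", "B"), ("B", "C"), ("C", "A")]
chain = [("A", "B"), ("B", "C"), ("A", "C")]

print(consistent_rankings("ABC", cycle))  # empty: no ordering honors all three results
print(consistent_rankings("ABC", chain))  # exactly one ordering: A, B, C
```

With the cycle, any one-dimensional ranking has to contradict at least one result on the court, so some tie-breaking rule must be brought in from outside the results themselves.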

Second, the point of playing sports is to play sports, to win games and championships. The focus should be on giving every team equal chance, within the rules, to do so, not on making sure "the best team" wins. This brings me to the NCAA tourney and the BCS. Having automatic NCAA bids for conference champs is great, makes the conference tourneys exciting. But for the BCS and the NCAA at large bids, they should be done 100% by computer, and the computer ratings used should be the ones that had the best back-tested results. Sure, it wouldn't be "perfect", whatever that really means (see "the best team"). However, it would give every team, from Duke to North Dakota State, the same competitive chance, which IMO is far more important.

And on that slightly off-topic note: how is it that no one is up in arms about the fact that by removing point spreads from the BCS computer ratings, they've made it virtually impossible for a team from a weak conference to make it to the title game? They can't make up for the weak schedule with large margins of victory any more, so even if they beat everyone 100-0 they wouldn't be able to get a high computer rating.

Well, in the case of fantasy sports, it often adds a rooting interest. I have a few teams I like, a few I don't, and the rest I couldn't care less...unless I have a player on their team. Fantasy football gave me a rooting interest in every game, which was great because I, like you, have to have a rooting interest or I lose interest.

See, in those leagues where I don't actually have a couple of decades worth of historical associations with the teams (there are very few teams in the NFL that I don't have some opinions about, for example), I'm perfectly happy to pick rooting interests on a completely arbitrary basis. I'll find myself rooting for the first team that does something I really like, or rooting against the first team to have a player strut and preen like a jackass. Fantasy sports aren't required, and this never puts me in the position of hoping that someone playing against one of my teams does well.

Having automatic NCAA bids for conference champs is great, makes the conference tourneys exciting. But for the BCS and the NCAA at large bids, they should be done 100% by computer, and the computer ratings used should be the ones that had the best back-tested results.

I'm not really wild about that, just because I doubt that any single algorithmic process can really do the job-- there are a lot of factors that are hard to account for in a computer model. Take the case of Cincinnati several years back, when Kenyon Martin shattered his leg a week before the tournament. By any computer-based measure, they were a #1 seed, but with their best player on the sidelines in a cast, they were nowhere near being a favorite to win a region, let alone the whole tournament. They were rightly dropped to a #2 seed, and lost in the second round, but I don't think you would've gotten that result from a computer.

There are a bunch of fuzzy decisions that I think are still better made by humans. I'm fine with having a committee do the NCAA seedings, and I think that by and large, they do a good job.

And on that slightly off-topic note: how is it that no one is up in arms about the fact that by removing point spreads from the BCS computer ratings, they've made it virtually impossible for a team from a weak conference to make it to the title game?

I figure that until they come to their senses and institute a playoff system, it really doesn't matter how they pick the BCS teams. They could look at sheep entrails for all I care-- without a playoff, it's not a real championship.

Yeah, we're pretty much in the same boat. I agree on the seedings in the NCAAs done by humans. I was calling for the computers for who makes it in. Still, there is the risk of the same kind of injury scenario you paint for a bubble team, but to me that would be worth the tradeoff to see teams in lesser known conferences get more of a shot than they do now. These things always come down to subjective tradeoffs.

This is completely orthogonal to the topic of the post, but not to the title of the post.

I happened to have had the privilege of being at university (major: engineering physics) in around 1970. I had mastered the use of the slide rule years before--in high school. HP introduced its first calculator, then going for the then princely sum of US$325 (or so) in around 1970--that was the calculator that had no equal, a pun with a double entendre.

The amazing thing is that, despite the fact that the HP calculator's price was some six or seven times the price of the K&E duplex/decitrig slide rules, within six months of the calculators' introduction, K&E slide rules were nowhere to be found in the university bookstores. To me, that was one of the most profound examples of one technology overtaking another that I have ever seen, even to this day.

I'm not sure that anyone is making the argument (as you imply in your post) that "people" are using stats to take the fun out of the game or that they believe the stats are seers of the future. I think many folks are using the stats to gain additional insight into the games themselves. I would argue that this is similar to using numerical models to simulate laboratory tests -- the models will never replace the physical tests, but can be used to gain additional insight into what is occurring (e.g., estimating parameters that are not easily measurable in a physical experiment). Think of it like BASF.

You don't think anyone would actually be using sports statistics to try and predict the outcome of games for the purposes of wagering, do you???

I pretty much disagree with everything you say here.

Using data to help you understand reality is a beautiful thing and should have no negative effect on your ability to enjoy the games you love. Knowing that Michigan State has their best FG% defense since the 1950's didn't keep me from getting chills up my spine and teary-eyed when they beat the #1 Wisconsin Badgers and the fans stormed the court.

Since it's impossible to actually watch every game, statistics fill you in on what you missed. And even if you could watch every game, intelligent analysis of the statistics would help you understand what you saw. If you're a player or a coach or a general manager, you'd be an absolute fool not to use the best, most sophisticated methods available to discover and exploit your opponents' tendencies and weaknesses.

Maybe it's not the data you find annoying, but the foolish misuse and misunderstanding of them?