Baseball's World Series is played over the best of seven games. The first
two games are played at the home field of one team (we will call
this one team A), the next three at the home field of team
B, and the last two at the home field of team A. Given that
teams are more likely to win games on their home fields, does
this give team A an advantage?

Every Series goes either four, five, six, or seven games. We
don't know at the outset how many games it will
go. Suppose it goes four games. Then there will have been two
games in each team's park. No advantage. Suppose it goes
five games. Then there will have been three games in team
B's park and two in team A's park. Advantage to team
B. Suppose it goes six games. Then there will have been three
games in each team's park. No advantage. Suppose it goes
seven games. Then there will have been four games in team
A's park and three in team B's park. Advantage to
team A.

Let's take stock. In two of the scenarios, there is no home-field
advantage. In one scenario, the team that begins at home has an
advantage. In one scenario, the team that begins on the road has
an advantage. It's a wash! Where's the overall advantage?

Is he right, or does team A have the advantage? The answer is below the fold.

Team A has the advantage. The easiest way to see this is to imagine
what would happen if all seven games were always played instead
of stopping when one team gets four wins. Then team A would
have the advantage since it plays one more game at home. But
once a team reaches four wins, playing the remaining games makes
no difference to which team wins the Series, so team A has the
same chance of victory in the "stop at four" and in the "play all seven"
formats. So team A has the advantage in the "stop at four"
format that is actually used.
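
The equivalence of the two formats can be checked by brute force. The sketch below is only an illustration: it enumerates all 128 possible sequences of game winners, assumes the AABBBAA schedule, independent games, and an illustrative 57% home win rate, and the helper names (`first_to_four`, `most_wins_of_seven`, `sequence_prob`) are invented for this example.

```python
from itertools import product

FORMAT = "AABBBAA"   # home park for each of the seven games
P_HOME = 0.57        # assumed chance that the home team wins any one game


def first_to_four(results):
    """Winner under the real rules: stop as soon as a team has four wins."""
    wins = {"A": 0, "B": 0}
    for winner in results:
        wins[winner] += 1
        if wins[winner] == 4:
            return winner
    raise ValueError("seven games always produce a winner")


def most_wins_of_seven(results):
    """Winner if all seven games are always played."""
    return "A" if results.count("A") >= 4 else "B"


def sequence_prob(results):
    """Probability of one particular sequence of seven game winners."""
    prob = 1.0
    for park, winner in zip(FORMAT, results):
        prob *= P_HOME if winner == park else 1.0 - P_HOME
    return prob


same_winner = True
p_a_wins = 0.0
for results in product("AB", repeat=7):
    if first_to_four(results) != most_wins_of_seven(results):
        same_winner = False
    if first_to_four(results) == "A":
        p_a_wins += sequence_prob(results)

print(same_winner)          # do the two formats always agree on the winner?
print(round(p_a_wins, 3))   # team A's chance of winning the Series
```

Because the two rules name the same winner for every sequence, team A's chance is the same in both formats whatever home win rate is assumed.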

A second way to see it is to consider what would happen if home field
advantage were absolute. Then each team wins all its home
games and team A wins the Series in game 7. So team A has the advantage.

For people who are not convinced, I have calculated the actual
probabilities in the table below (this requires that your browser
supports Javascript). The home team has won 57% of the games in
World Series play [1], so there is a 57% chance the Series score will be
1-0 after game one and a 43% chance it will be 0-1. The score will be
1-1 after two games if either B wins with the score 1-0 (43% of
57%) or if A wins with the score 0-1 (57% of 43%). The total is
49% (43% of 57% + 57% of 43%). Similarly you can work out the
probability of each Series score and add up the ones where A
wins to work out the chance of A winning the Series and see that A
has a 52% chance of winning the seven game Series. The
calculations assume that the results of the games are
independent---that there is no such thing as "momentum" where
the winner of a game is more likely to win the next game.
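
The step-by-step calculation described above can be sketched in a few lines. This is not the code behind the interactive table, just a minimal illustration under the same assumptions (independent games, 57% home win rate); the function names `score_probs` and `a_wins_series` are made up for the example.

```python
def score_probs(fmt="AABBBAA", p_a_home=0.57, p_b_home=0.57):
    """Map each reachable Series score (A wins, B wins) to its probability."""
    reached = {(0, 0): 1.0}
    live = {(0, 0): 1.0}          # scores where the Series is still going
    for park in fmt:
        # The chance that A wins this game depends on whose park it is.
        p_a = p_a_home if park == "A" else 1.0 - p_b_home
        after = {}
        for (a, b), prob in live.items():
            after[(a + 1, b)] = after.get((a + 1, b), 0.0) + prob * p_a
            after[(a, b + 1)] = after.get((a, b + 1), 0.0) + prob * (1.0 - p_a)
        reached.update(after)
        # The Series stops once either team has four wins.
        live = {s: p for s, p in after.items() if 4 not in s}
    return reached


def a_wins_series(fmt="AABBBAA", p_a_home=0.57, p_b_home=0.57):
    """Add up the probabilities of the final scores where A has four wins."""
    probs = score_probs(fmt, p_a_home, p_b_home)
    return sum(p for (a, b), p in probs.items() if a == 4)


probs = score_probs()
print(round(probs[(1, 0)], 2), round(probs[(1, 1)], 2))  # after one and two games
print(round(a_wins_series(), 2))                         # A's chance in the Series
```

Other schedules or percentages can be tried by passing a different `fmt` string or different home win rates.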

You can experiment by entering different percentages for home field
advantage or different formats for the sequence of home and away
games. (Press 'Enter' after changing values to have the table
update.) Notice that the format makes no difference to A's
chance of winning the Series---it's the same with BBBAAAA as it
is with AAAABBB.

Percentage of games that A wins at home
Percentage of games that B wins at home
Home/Away Format

How well do the calculated probabilities match with what happened in
the 78 World Series with the AABBBAA format? The table below shows
the actual results. The entry for "2-0", for example, says that in 28
of the 78 Series (36%)
the score was 2-0 after two games. It also shows that in 21 of those
28 Series, team A went on to win. The percentages agree fairly well
with the calculated ones in the table above. [2]

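Each entry shows the Series score (A's wins first), the percentage of the 78 Series that reached that score, and how many of those Series team A went on to win.

| B wins \ A wins | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 0-0: 100%, 45/78 | 1-0: 62%, 30/48 | 2-0: 36%, 21/28 | 3-0: 13%, 10/10 | 4-0: 12%, 9/9 |
| 1 | 0-1: 39%, 15/30 | 1-1: 49%, 21/38 | 2-1: 47%, 24/36 | 3-1: 21%, 13/16 | 4-1: 12%, 9/9 |
| 2 | 0-2: 16%, 3/12 | 1-2: 32%, 11/25 | 2-2: 44%, 21/34 | 3-2: 27%, 15/21 | 4-2: 13%, 10/10 |
| 3 | 0-3: 9%, 0/7 | 1-3: 17%, 2/13 | 2-3: 36%, 12/28 | 3-3: 42%, 17/32 | 4-3: 22%, 17/17 |
| 4 | 0-4: 8%, 0/6 | 1-4: 6%, 0/5 | 2-4: 9%, 0/7 | 3-4: 19%, 0/15 | |

A wins 58%, B wins 42%.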

Another comparison you can make is between the expected distribution
of Series lengths and the actual numbers.
These are shown in the table below.
For example, there is a 6% chance of a 4-0 win by A and a 6% chance of
a 4-0 win by B, so you would expect (6%+6%)*78 Series that end in four
games.

[Table: observed and expected numbers of Series, by games played (four, five, six, seven)]
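
The expected numbers can be worked out by summing, for each team, the chance of clinching in exactly four, five, six, or seven games. The sketch below is one way to do that under the same assumptions as before (independent games, 57% home win rate, AABBBAA schedule); `clinch_prob` is a hypothetical helper name, not something from the original page.

```python
from itertools import product

SERIES_COUNT = 78  # Series played in the AABBBAA format, per the text


def clinch_prob(team, games, fmt="AABBBAA", p_home=0.57):
    """Probability that `team` wins its fourth game in game number `games`."""
    total = 0.0
    for results in product("AB", repeat=games):
        # The Series ends here only if `team` wins the last game played
        # and that win is exactly its fourth.
        if results[-1] != team or results.count(team) != 4:
            continue
        prob = 1.0
        for park, winner in zip(fmt, results):
            prob *= p_home if winner == park else 1.0 - p_home
        total += prob
    return total


length_prob = {
    n: clinch_prob("A", n) + clinch_prob("B", n) for n in (4, 5, 6, 7)
}
expected = {n: SERIES_COUNT * p for n, p in length_prob.items()}

for n in (4, 5, 6, 7):
    print(n, round(length_prob[n], 3), round(expected[n], 1))
```

The four-game entry reproduces the (6%+6%)*78 example from the text, and the four probabilities add up to one.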

What is surprising when you compare the Series lengths with the
expected numbers is that there are a fair bit more four and seven game
Series than you would expect. Of course, it could just be chance, so
we should test to see if the difference is statistically
significant. The appropriate test for this is called the Chi-square
test (http://www.unc.edu/~preacher/chisq/chisq.htm). You can see the results of the test below---the p
value is the important number and tells us how likely it is that such a large
difference could arise by chance. A p
value of less than 0.05 is generally considered statistically
significant, so this difference is statistically significant.

[Results: Chi-square, df, p value]
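
For readers who want to redo the test by hand, here is a rough sketch. The observed counts are read off the results table (four-game Series are the 4-0 and 0-4 entries, and so on). The model probabilities are rounded values computed for a 57% home win rate with independent games, so they are an assumption of this sketch rather than figures quoted from the page. The closed-form tail formula used here is only valid for three degrees of freedom.

```python
import math

# Observed Series lengths from the results table:
# four games = 4-0 plus 0-4, five games = 4-1 plus 1-4, and so on.
observed = {4: 9 + 6, 5: 9 + 5, 6: 10 + 7, 7: 17 + 15}

# Assumed model probabilities for each length (57% home win rate,
# independent games), rounded to four decimal places.
model_prob = {4: 0.1201, 5: 0.2499, 6: 0.3137, 7: 0.3162}

total = sum(observed.values())
expected = {n: total * p for n, p in model_prob.items()}

# Pearson's statistic: squared differences scaled by the expected counts.
chi_square = sum(
    (observed[n] - expected[n]) ** 2 / expected[n] for n in observed
)
df = len(observed) - 1


def chi_square_p_value_df3(x):
    """Upper-tail probability of a chi-square variable with df = 3 only."""
    t = x / 2.0
    return math.erfc(math.sqrt(t)) + 2.0 * math.sqrt(t / math.pi) * math.exp(-t)


p_value = chi_square_p_value_df3(chi_square)
print(round(chi_square, 2), df, round(p_value, 3))
```

With these inputs the p value comes out below 0.05, in line with the conclusion in the text.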

Now there are some possible explanations for the extra four-game Series.
For example, I assumed that the teams are evenly matched except for
home field advantage, but if one
team is better than the other, that increases the chance of a four
game Series. (Try entering 67 and 47 as the percentages that A and B
win at home.) Trouble is, that decreases the chance of a
seven game Series. (Try it---the p value even goes down.)
Similarly, if there is "momentum" and winning one game makes a team
more likely to win the next game, that makes four-game Series more
likely but seven-game Series less likely.

The only way that there can be a sixth game is if the score is 3-2
after five games. For the Series to go on to seven games, the team
that is behind must win the sixth game. Remarkably, that has happened
32 out of 49 times or 65% of the time instead of the 50% you would
expect. I can't think of a good reason why this would happen.

Update 30 Oct: Included result of 2004 Series.

[1] The results of all the
games are available from this site (http://baseball-almanac.com/ws/wsmenu.shtml).
I collected the results for the 78 World Series that used the AABBBAA
format for games (the Series from 1924 to 2004 except for 1943 and
1945) and put them in this file so you can do your own
calculations if you are so inclined.

[2] Alan Abramowitz has an article (http://www.emory.edu/EMORY_REPORT/erarchive/2003/October/October20/10_20_03firstperson.html)
on home field advantage. While he finds that the team with home field
wins more often, he argues that this is not because they get to play
the seventh game at home:

playing game seven at home does not appear to be a significant
advantage in the World Series. ... Moreover, since the 2-3-2 format was
introduced in 1924,
the home team has won only 16 of 31 seventh games (52 percent), far
below the 57 percent success rate of the home team in all World Series
games.

I don't think 52% is "far below" 57%. In fact, if the home team had
won just two more of those seventh games the success rate would have
been more than 57%. If you do a statistical test (the Fisher Exact Test)
you will find that the difference between 52% and 57%
is not even close to being statistically significant, so it is wrong
for Abramowitz to reject the notion that the advantage comes from the
extra home game. (He also miscounts the number of wins for the home
team in game seven---it is actually 17 of 32, or 53%.)

Burgess-Jackson's argument is an interesting illustration of the pitfalls of the principle of insufficient reason. He implicitly assumes that seven-game and five-game series are equally likely (actually, if this were true, the setup would be biased in favor of B since a 3-2 edge in home games is better than 4-3).

Given equally matched teams, there's at least a 35 per cent probability that the first six games will split 3-3 (this probability is greater, the greater the home team advantage).

The probability of a 4-1 series is about 20 per cent. I think this decreases with home team advantage, but I'm not sure.

This was really interesting, especially your game 6 data. That prompted me to break it down a little more. What I seemed to see is that the HFA is at its strongest when a team behind in the series returns to its home stadium (i.e. in games 3 and 6). The URL I entered goes to the BaseballThinkFactory discussion of your article, where I posted my data. You should take a look; I think you'd find it interesting.

Great job, this was fun to look at.

The biggest question is if the change of the rules as regards the DH affects series probability significantly.

While a seven game series is more likely than a five game series, that isn't the main error in Burgess-Jackson's argument. Suppose that the maximum number of games was six (so if it ends 3-3 the title is shared). Then B-J's argument would lead you to conclude that team B has the advantage because B gets home-field advantage in a five game series and neither team gets the advantage in four and six game series. But that conclusion is incorrect. B is more likely to win in a five game series than A (14% to 11%), but A is more likely to win in a six game series than B by exactly the same margin (17% to 14%). So, actually neither team gets an advantage in this best-of-six format. The reason is the same as the one I gave earlier -- if you always played six games the same team wins as when you stop at four wins, and neither team has home-field advantage after six games.
You can experiment with formats like BBBAAAA and AAAABBB and you will see that it makes no difference to the probability that A wins.

Of course, the team that wins game 1 wins the Series 60 some odd percent of the time, so Burgess-Jackson's argument falls apart with just that statistic. That means the team playing at home in Game 1 has an empirically demonstrated advantage. He'd know that if he'd watched Game 1 of this year's series (though his reasoning capabilities may not be sufficient to put 2 and 2 together there).

One potential reason why the trailing team wins more than their share of Game 6's is that managers adopt different strategies. The trailing manager is desperate and may bring in their Game 7 starter, overuse their closer, or play an injured player. One way to test this would be to see whether the Game 6 winner lost more than their share of Game 7s (controlling for home field advantage). A second (partial) reason could be home field advantage. In what fraction of Game 6's is the trailing team the home team? There also may be a psychological factor (back to the wall, but still hope, whereas down 3-0, no hope (unless you're the Red Sox)).

Marc, an interesting idea. Of the 32 7-game series, 18 were won by the winner of game six, so the game six winner won more than their share. When the trailing team was playing at home, they won 21 out of 28 game sixes, which is way more than you would expect from home field advantage.

With all of the talk about Game 5 and Game 6, I'd like to point out that this series only went four games. Of course this doesn't bear on your very interesting discussion, but I've been waiting for decades to gloat a little.

Another factor worthy of consideration in the results of games 6 and 7: perhaps home field is more likely to decide the result when the teams are closely matched than when they are not. Think of it not as a co-factor but a tiebreaker. Thus, the more likely a series is extended, the more likely it's decided by the home field. How to measure this? Well, were series of more than 5 games demonstrative of a larger home field advantage?

By Eric Ingman (not verified) on 30 Oct 2004

What if we had 3-3, then the seventh game (if there is one), played at a field where neither team has the HFA? Kinda like always playing the FA Cup Final at Wembley.