Back to Back Statistics

Two fans in Dodger Stadium caught back-to-back fouls during a Mets game (and, almost as importantly, the Dodgers lost, woohoo!).

From the article:

But USC mathematics professor Kenneth Alexander used Wednesday's Dodger Stadium crowd size and game statistics -- 40,696 in attendance and a foul ball count of 48 -- to postulate the odds against Walker and Castro catching back-to-back fouls.

Calculating that only one of 18 pitches were fouled into the stands (other fouls stayed on the playing field) and factoring in the six fans sitting close by the pair, Alexander fielded the problem.

"One in 10,000 is the probability" that the pair would catch two fouls in a row, he concluded. He cautioned, however, that his specialty is probability theory in mathematical physics.

I assume that these are the odds per game? If so, this should happen about once every four years? Anyone think this number is in the right ballpark?
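Here is a quick back-of-the-envelope check of the "once every four years" figure. It is only a sketch: it assumes the 1-in-10,000 number really is a per-game probability, and it assumes a 30-team league playing a 162-game schedule (neither number comes from the article).

```python
# Rough check: how often would a 1-in-10,000-per-game event turn up?
P_PER_GAME = 1 / 10_000   # Alexander's figure, read as a per-game probability (assumption)
TEAMS = 30                # assumed number of major-league teams
GAMES_PER_TEAM = 162      # assumed regular-season schedule length

# Every game involves two teams, so divide the team-games by two.
games_per_season = TEAMS * GAMES_PER_TEAM // 2

expected_per_season = P_PER_GAME * games_per_season
years_between = 1 / expected_per_season

print(f"games per season: {games_per_season}")
print(f"expected events per season: {expected_per_season:.3f}")
print(f"roughly one event every {years_between:.1f} years")
```

Under those assumptions the expected wait comes out at a little over four seasons, which is where the "once every four years" guess comes from.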


Well the 40000 at the game is mostly meaningless - you have to look at the distribution of location of the fouls - that is what determines the probability that two go almost right next to each other. Once every 4 years seems in the ballpark....oh yeah... I would guess this happens every few years from anecdotes of one guy getting 2 fouls - which happens pretty often. For example, if a pitcher is throwing outside fastballs and the batter isn't getting around, he could hit several to almost the same place.

Actually, I suspect that the likelihood of two adjacent fans catching back to back fouls is greater than the likelihood of the same two fans both catching fouls but in separate innings. Both the batter and the pitcher have more influence on this than pure probability.

chezjake: I certainly think the probability of two fans catching both fouls is higher, per pair of attempts, for the same batter/pitcher combo. But is that enough to overcome the fact that there are many more pairs when taken over all plate appearances?

A few years ago Dawkins and Gould both published books at the same time. So Evolution asked them to review the other's book. Dawkins spent the start of his review complaining that Gould's book was incomprehensible to most people outside the US, because it was written in baseball jargonese. I get the same feeling here too.

One comment I would make is that I think it's very unlikely that everyone has an equal chance of catching the ball. My guess is that the people at the back of the stadium are probably less likely to have the ball fly in their direction. So, even with changes in pitchers and batters, the probability won't be uniform.

All these analyses fail because of implicit assumptions of independence and linearity.

The sub-game of importance here is the duel between a specific pitcher and a specific batter.

Some batters are good at hitting high and inside pitches, for instance, and some at low and outside pitches. Some batters tend to swing at the first pitch. Some batters can get "a piece of the ball" and hit many fouls in a row, extending the mean length of the at-bat. Different pitchers have different repertoires of pitches: fastballs, curves, change-ups, sliders, knuckleballs, splitters, and so forth.

For the string theory of balls and strikes, see:

A136407 Valid strings, in lexicographic order, of Balls ("1") and Strikes ("2") in a Baseball at bat. Numbers that contain only 1's and 2's never exceeding 3 total 2's or 4 total 1's, whichever comes first.
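The description above can be sketched as a short enumeration. This is one reading of the sentence as quoted, not checked against the OEIS entry itself: a string is extended with "1" or "2" until it holds 4 balls or 3 strikes, and every non-empty string reached along the way counts as valid. Fouls are ignored, and the function name `at_bat_strings` is made up for illustration.

```python
def at_bat_strings(prefix="", balls=0, strikes=0):
    """Yield the non-empty strings of '1' (ball) and '2' (strike) that can
    occur in an at-bat, in lexicographic order.

    A string stops being extended once it holds 4 balls or 3 strikes.
    Fouls are ignored (an assumption; the quoted description has none).
    """
    if prefix:
        yield prefix
    if balls == 4 or strikes == 3:
        return  # walk or strikeout: the at-bat is over
    # Trying '1' before '2' makes the pre-order walk lexicographic.
    yield from at_bat_strings(prefix + "1", balls + 1, strikes)
    yield from at_bat_strings(prefix + "2", balls, strikes + 1)


strings = list(at_bat_strings())
# Completed at-bats are the strings that reached 4 balls or 3 strikes.
finished = [s for s in strings if s.count("1") == 4 or s.count("2") == 3]

print(strings[:8])
print(len(strings), "valid strings,", len(finished), "of them completed at-bats")
```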

Each at bat is a 2-person zero sum game between batter and pitcher, but is in the context of whether or not runners are in scoring position, the score, the inning, and dozens of other factors, many of which are statistically tracked by each team.

Here's one slice of that:

"The Cognitive Psychology of Baseball!
http://scienceblogs.com/mixingmemory/2007/06/the_cognitive_psychology_o…
Category: Cognitive Psychology
Posted on: June 30, 2007 9:54 AM, by Chris

Ah, yes, a real game (kidding, Scrabble people). If you've watched many baseball games or baseball movies, you know that one of the things that makes for a successful hitter is the ability to predict what the next pitch will be. Is it going to be inside or outside? Will it be a fastball or a breaking ball? If you're expecting a fastball and get a slow, breaking curveball, it's unlikely you'll get anywhere near it."

"So cognitive processing is an important part of being a good hitter. At least, that's what a hitting coach would tell you. And according to a 2002 paper by Rob Gray in Psychological Science, they'd be right."

"Basically, Gray had college baseball players stand in front of a screen with a simulated baseball diamond, and swing at a simulated pitch. This setup led to one of the coolest method sections ever, if you're a baseball fan and a geek (like me):"

"Mounted on the end of the bat (Louisville Slugger Tee Ball bat; 63.5 cm long) was a sensor from a Fastrak (Polhemus, Colchester, Vermont) position tracker. The x, y, z position of the end of the bat was recorded at a rate of 120 Hz."

"The pitch simulation was based on that used by Bahill and Karnavas (1993). Balls were launched horizontally (i.e., 0°) from a simulated distance of 59.5 ft (18.5 m); that is, the pitcher released the ball 1 ft in front of the pitching rubber. The only force affecting the flight of the ball was gravity."

"The height of the simulated pitch at time t, Z(t), was changed according to

Z(t) = -1/2 * g * t^2,

"where g is the acceleration of gravity, 32 ft/s^2 (9.8 m/s^2)."
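To get a feel for the formula, here is a small sketch that evaluates Z(t) at the moment the ball reaches the plate. It takes the 59.5 ft distance and g = 32 ft/s^2 from the excerpt, and uses the 85 mph and 70 mph pitch speeds quoted further on in the same excerpt. The assumption that the horizontal speed stays constant (so travel time is just distance over speed) follows from gravity being the only force, but the helper names are invented for illustration.

```python
G_FT_S2 = 32.0               # acceleration of gravity from the excerpt, ft/s^2
DISTANCE_FT = 59.5           # simulated release distance from the excerpt, ft
MPH_TO_FT_S = 5280 / 3600    # feet per second in one mile per hour


def height_change(t, g=G_FT_S2):
    """Z(t) = -1/2 * g * t^2, in feet, for a horizontally launched ball."""
    return -0.5 * g * t ** 2


def drop_at_plate(speed_mph, distance_ft=DISTANCE_FT):
    """Height change on reaching the plate, assuming constant horizontal speed."""
    t = distance_ft / (speed_mph * MPH_TO_FT_S)  # travel time in seconds
    return height_change(t)


for mph in (85, 70):
    print(f"{mph} mph pitch: {drop_at_plate(mph):.2f} ft")
```

With these inputs the slower pitch falls noticeably farther than the fast one, simply because it spends more time in the air.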

"I know you're supposed to include descriptions of your equipment in method sections, but I can't get over the inclusion of the fact that the bat was a Louisville Slugger. I'm sorry, I'm a baseball fan."

"Anyway, Gray included two kinds of pitches: slow and fast. The fast pitches were simulated at 85 +/- 1.5 mph, and the slow ones at 70 +/- 1.5 mph. Whether the pitch was fast or slow depended on the pitch count. For 0-0, 1-0, 0-1, 1-1, 2-1, 2-2, and 3-2 counts (that's balls-strikes, for those of you who don't know baseball), the probabilities for fast and slow pitches were 50-50. For pitcher's counts (0-2 and 1-2), the slow balls were more likely (0.65), and for hitter's counts (2-0, 3-0, and 3-1), fast pitches were more likely (0.65). The pitch count was displayed on the screen so the hitters could keep track. There were three different horizontal positions for the pitches: strike, outside ball and inside ball. The strikes crossed the plate at 0 +/- 1 inch from the center of the plate, the outside balls at 12 +/- 1 inch away from the center of the plate (in the direction away from the batter, that is), and the inside balls at 12 +/- 1 inch from the center in the direction towards the batter. Whether the pitch was a ball or a strike was randomly chosen. Each hitter took 25 swings per block for 10 blocks, with rest (a lot, I hope, 'cause 250 swings is crazy) in between."
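The count-dependent pitch selection described in that paragraph can be sketched roughly as follows. The fast/slow probabilities per count and the speed ranges are taken from the excerpt; everything else is an assumption made for illustration, in particular the uniform spread within +/- 1.5 mph, the equal chance for each of the three locations, and the name `sample_pitch`. This is not Gray's actual code.

```python
import random

# P(fast pitch) keyed by (balls, strikes), as given in the excerpt.
FAST_PROB = {}
for count in [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (2, 2), (3, 2)]:
    FAST_PROB[count] = 0.50            # even counts: 50-50
for count in [(0, 2), (1, 2)]:
    FAST_PROB[count] = 0.35            # pitcher's counts: slow is 0.65
for count in [(2, 0), (3, 0), (3, 1)]:
    FAST_PROB[count] = 0.65            # hitter's counts: fast is 0.65


def sample_pitch(balls, strikes, rng=random):
    """Draw one simulated pitch as (speed label, mph, location)."""
    is_fast = rng.random() < FAST_PROB[(balls, strikes)]
    base_mph = 85.0 if is_fast else 70.0
    # Assumption: the +/- 1.5 mph spread is uniform; the excerpt does not say.
    mph = base_mph + rng.uniform(-1.5, 1.5)
    # Assumption: the three locations are equally likely.
    location = rng.choice(["strike", "outside", "inside"])
    return ("fast" if is_fast else "slow"), mph, location


rng = random.Random(2002)  # fixed seed so a run can be repeated
for balls, strikes in [(0, 0), (0, 2), (3, 1)]:
    print((balls, strikes), sample_pitch(balls, strikes, rng))
```

A batter who knows the table can do better than chance on speed at 0-2, 1-2, 2-0, 3-0 and 3-1, but has nothing to go on for location.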

"... So here's what the hitters had to predict: whether the ball would be fast or slow, and whether it would be inside, outside, or down the middle. Certain types of pitches (e.g., slow breaking balls) are associated with pitcher's counts, and others (fastballs, mostly) are associated with hitter's counts. Since Gray used these associations to determine pitch probabilities, the batters had some basis for predicting pitch speeds. Since pitch locations were random, the batters just had to guess these...."

Gray, R. (2002). "Markov at the Bat": A model of cognitive processing in baseball batters. Psychological Science, 13, 543-548.

Two great baseball players engaged once, it is said, in a very sophisticated Zero Sum Game of Mathematical Disinformation Theory.

The story is well-known; the analysis by Jonathan Vos Post is original.

Yogi Berra [catcher] to Hank Aaron [batter]:
"The label's on top." [Translation: it is widely believed that the location of the label of the bat with respect to the bat-ball impact point affects the probability that the bat will break on contact, which is negatively correlated with the probability of a home run, due to momentum conservation considerations; hence I offer to you that you drop your model of this at-bat and replace it by one with an additional variable to take into account; if you do, as I hope that you will, I expect your performance to decline by the replacement cost of model-switching]

Hank Aaron [batter] vs Yogi Berra [catcher]:
"I didn't come here to hit and read at the same time."
[Translation: I'm on my way to the record number of home runs hit in a lifetime, surpassing Babe Ruth's record, and likely to stand until at least 2007 with Barry Bonds and Alex Rodriguez perhaps 12 to 14 years later; I decline to make my analysis more complex, as I assert that I measure myself as being on the manifold rather near the global maximum of my performance in the zero-sum game between pitcher and batter, and believe that I would do worse if I changed my eigenvector in the predicted fast-ball/curve-ball/slider probability distribution; while you are destined to be known in 2007 primarily through a cryptic ad for Aflac, hence do not divert me from my optimal allocation of resources].

The branch-and-bound Decision Analysis corollary by Yogi Berra, at another time:
"When you come to a fork in the road, take it!"

Baseball. America's Game. The America of John Forbes Nash, Jr., anyway.

-- Prof. Jonathan Vos Post

A few years ago Dawkins and Gould both published books at the same time.

What were the odds of that? :)

Don't forget you have to subtract out of the 40,000 those fans sitting in the outfield seats. Anything they catch won't be a foul ball, it will be a home run.

Interestingly, I haven't been able to find a chart showing the probability distribution of foul balls in a stadium. Anyone see any such data or know a good way to get a rough estimate of ball-bat scattering?

The charts showing where a ball is hit are called "spray charts" generally if that helps, but I can't remember seeing one which would have where the foul landed... left and right, out and in and back. I looked and didn't find anything much. I suspect the pro teams have all that but I doubt it is public.