(Original title: Some Thoughts . . .
Later title: Some Bold, Original Thoughts. . .
But heck, if you want attention, you've got to ask for it!)
(For newbies -- $H is the percentage of balls in play that go for hits rather
than outs. This varies surprisingly widely from year to year for any given
pitcher, leading some to question whether the ability to get guys to hit easy
grounders and fly balls is a real ability or not. Voros McCracken has explored
this notion in detail as Defense Independent Pitching Statistics (at least I
think that's what the acronym stands for <g>)).
1) Much of the reason why $H varies so much from year to year, where K% and BB%
don't, is *because the sample size is smaller*. Too small. Note that HR%,
which has a very similar sample size, also has a large variance from year to
year. The additional lack of correlation in $H can be attributed to larger
team defense and park effects.
It's trivial to simulate multiple seasons of $H in Microsoft Excel for a pitcher
with any theorized innate level of $H. The results are eye-opening.
Put this formula in cell A1:
=IF(RAND()>0.271,0,1)
where the 0.271 is whatever $H rate you want to simulate (that happens to be an
estimate for Maddux).
Now copy the formula down through cell A650, for 650 cells in all (650 being a
typical number of balls in play for a pitcher with 150+ IP).
In cell B1, put:
=SUM(A1:A650)/650
Now hit F9 repeatedly to regenerate the random numbers. Cell B1 will report
each season's $H number. You will see quite a bit of variation. I'm doing
this in real time:
.280, .308, .298, .271, .278, .255, .275, .263, .274, .314, .257, .263, .266,
.286, .263.
Obviously this CyberGreg learned to pitch after those first three mediocre
seasons! And he must have come up with some adjustment after the .314 to go
back to .257, which started another string of vintage years.
(In case you miss the point, the previous paragraph is total nonsense, although
total nonsense of the sort our brain is hard-wired to come up with.)
(Lowering the sample size just a bit will increase the variance noticeably.)
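
For anyone without Excel, here is a rough Python sketch of the same experiment.
It is only an illustration of the spreadsheet recipe above: the function names
(simulate_season, simulate_career, expected_sd) and the fixed random seed are
assumptions made for this example. It also prints the season-to-season standard
deviation you would expect from binomial luck alone, sqrt(p*(1-p)/n), which
comes to about .017 for a .271 pitcher over 650 balls in play, and shows how
that grows when the sample shrinks.

```python
import math
import random
import statistics


def simulate_season(true_h, balls_in_play, rng):
    """Return one simulated season's $H: hits divided by balls in play."""
    # Each ball in play is a hit with probability true_h, like the IF(RAND()...) cell.
    hits = sum(1 for _ in range(balls_in_play) if rng.random() < true_h)
    return hits / balls_in_play


def simulate_career(true_h, balls_in_play, seasons, seed=0):
    """Simulate several independent seasons at a fixed innate $H."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    return [simulate_season(true_h, balls_in_play, rng) for _ in range(seasons)]


def expected_sd(true_h, balls_in_play):
    """Season-to-season SD of $H that binomial luck alone would produce."""
    return math.sqrt(true_h * (1 - true_h) / balls_in_play)


if __name__ == "__main__":
    career = simulate_career(0.271, 650, 15)
    print(" ".join(f"{h:.3f}" for h in career))
    print(f"observed SD over 15 seasons: {statistics.stdev(career):.3f}")
    print(f"expected SD at 650 BIP: {expected_sd(0.271, 650):.3f}")
    print(f"expected SD at 300 BIP: {expected_sd(0.271, 300):.3f}")
```

Change the seed (or pass seed=None) to get a fresh "career" each run, which is
the equivalent of hitting F9.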
Let's do Glavine at a theorized .282:
.252, .288, .292, .317, .258, .288, .298, .266, .285, .280, .285, .280, .297,
.298, .305.
He was "better" than CyberGreg in years 1, 2, 3, 5, and 10.
Any trained scientist would dismiss the notion that there was any difference
between these two pitchers. Over 15 seasons, CyberGreg averaged .277 +/- .018,
CyberTom .286 +/- .017. That has a 16% chance of happening at random. And yet
we *know* the pitchers *are* different!
(OTOH, the scientist would also dismiss the *obvious* fact -- fact! -- look,
it's obvious in the numbers! -- that CyberTom was the better pitcher for the
first 5 years, until CyberGreg developed his mastery. And the scientist would
be correct: CyberGreg was just *unlucky* for 5 years.)
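
The CyberGreg / CyberTom comparison can be checked with a few lines of Python.
The post doesn't say which test produced the 16% figure, so treat this as a
sketch under assumptions: it computes the means, the sample standard deviations,
and a two-sample t statistic from the 15 simulated seasons listed above, and
then, instead of a t-table lookup, uses a permutation test (a swapped-in
technique, chosen because it needs only the standard library) to estimate how
often a gap this large shows up when the labels are shuffled at random. The
function names are made up for this example. The answer should land in the
same neighborhood as the 16% quoted above.

```python
import math
import random
import statistics

# The 15 simulated seasons listed in the post.
cyber_greg = [.280, .308, .298, .271, .278, .255, .275, .263, .274, .314,
              .257, .263, .266, .286, .263]
cyber_tom = [.252, .288, .292, .317, .258, .288, .298, .266, .285, .280,
             .285, .280, .297, .298, .305]


def two_sample_t(a, b):
    """t statistic for mean(b) - mean(a), without assuming equal variances."""
    se = math.sqrt(statistics.variance(a) / len(a)
                   + statistics.variance(b) / len(b))
    return (statistics.mean(b) - statistics.mean(a)) / se


def permutation_p(a, b, shuffles=20000, seed=0):
    """Two-sided p-value: share of random relabelings of the seasons whose
    gap in means is at least as large as the observed gap."""
    rng = random.Random(seed)
    n = len(a)
    observed = abs(sum(b) / len(b) - sum(a) / n)
    pooled = list(a) + list(b)
    at_least_as_big = 0
    for _ in range(shuffles):
        rng.shuffle(pooled)
        gap = abs(sum(pooled[n:]) / len(b) - sum(pooled[:n]) / n)
        if gap >= observed - 1e-12:  # small tolerance for float round-off
            at_least_as_big += 1
    return at_least_as_big / shuffles


if __name__ == "__main__":
    for name, seasons in (("CyberGreg", cyber_greg), ("CyberTom", cyber_tom)):
        print(f"{name}: {statistics.mean(seasons):.3f} "
              f"+/- {statistics.stdev(seasons):.3f}")
    print(f"t statistic: {two_sample_t(cyber_greg, cyber_tom):.2f}")
    print(f"permutation p-value: {permutation_p(cyber_greg, cyber_tom):.3f}")
```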
In reality, of course, the different levels of team defense behind the pitcher
cause an even greater fluctuation from year-to-year. Two years ago the real
Greg and Tom had their $H's go up hugely, to .331 and .318. If you include '99
and ask, what are the odds that this string of $H numbers could happen by
chance, you basically get No for an answer (2 and 4% chance, respectively).
Removing '99 returns the odds on Maddux to 72%, and Glavine to 20%. IOW, with
the exception of one season, you don't have to invoke *anything* beyond random
variation to explain the fluctuation in their $H numbers in their career
together, and in Maddux's case the variation is particularly trivial.
So: because of small sample sizes, differences in pitcher $H are *very* hard to
detect statistically. But that doesn't mean they're non-existent or even
trivial in size. The apparent difference between Maddux and Glavine over their
joint career in Atlanta, in terms of ERA, is about 0.20. That's not trivial.
Give me that every day and I'll finish 3 games ahead of you in the standings.
My rough guess / gut feeling is that there are pitchers who shave up to 0.25 off
their ERA and others who lose about 0.25, because of their innate $H ability.
Maybe even more.
2) I believe that my study of pitchers who changed teams may be the first to
show conclusively that there *is* an effect. I originally compared a pitcher's
new $H to his old $H, and his old and new teams' $H with him included in the
team totals. That was messy. Subtracting him out from the team totals (this
spreadsheet is 68 columns wide and growing!) gives us this formula for a traded
player's $H, once everything has been converted to z-scores (minimum 450 balls
in play, which gives us 443 pairs of seasons in the study):
New $H = (.201 * old $H) + (.290 * new team's $H) - .076
The significance of the old $H factor is immense -- there is just 1 chance in
684,050 of it arising at random!
Combined, the two factors account for just 22.4% of his new $H.
So, this says his new $H is based 13% on his new team $H, and 9% on his old $H.
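
To make the formula concrete, here is a minimal Python sketch that just plugs
numbers into it. Everything is in z-scores, as in the study; the function name
and the example inputs (a pitcher one standard deviation below average in $H
who moves to a club half a standard deviation above average) are hypothetical,
picked only to show the arithmetic.

```python
# Coefficients copied from the formula above (all quantities are z-scores).
OLD_H_WEIGHT = 0.201
NEW_TEAM_H_WEIGHT = 0.290
TEAM_CHANGE_OFFSET = -0.076


def predict_new_h_z(old_h_z, new_team_h_z):
    """Predicted z-score of a traded pitcher's $H with his new club."""
    return (OLD_H_WEIGHT * old_h_z
            + NEW_TEAM_H_WEIGHT * new_team_h_z
            + TEAM_CHANGE_OFFSET)


if __name__ == "__main__":
    # Hypothetical pitcher: old $H 1.0 SD below average, new team 0.5 SD above.
    print(round(predict_new_h_z(-1.0, 0.5), 3))
    # A perfectly average pitcher joining a perfectly average team.
    print(round(predict_new_h_z(0.0, 0.0), 3))
```

Note how strongly the prediction is pulled back toward average: a full standard
deviation of old $H moves the forecast by only about a fifth of one.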
What determines his new team's $H? It's the average of the innate $H of all the
other pitchers -- which will *tend* to be average -- plus the effect of the
defense per se, plus the manager, plus the park effect. Now, the quality of the
other pitchers on the team doesn't actually affect him, while the other factors
do. The fact that there must be good and bad staffs for $H adds noise to the
relationship, and that noise weakens the observable relationship of new $H to
new team $H. Thus, the true factor for team $H
contribution is larger than 13% -- it's something that reduces to 13% when you
add the noise effect. It may be possible later to estimate the size of this
effect. I'm guessing 15 or 16%.
The last term above, the apparent -.076 fudge factor (that's .076 of a standard
deviation, about .0017 in raw $H terms) is very interesting. Pitchers who
change teams *unquestionably* have a reduction in $H that is independent of
everything else, from an average of .280 to .276. It will be very interesting
to see if this varies by era. I have only one weak idea why this happens:
change-of-scene psychology? I'm not convinced.
3) So how should DIPS work? We shouldn't throw out $H entirely. We can do two
things: we can compare $H to team $H without that pitcher, and we can look at
lifetime $H. Combining both approaches, we can look at lifetime $H relative to
team. This can then be modified by park and manager effects (when known) to get
a true $H estimate. This might be very valuable for predicting future pitcher
performance.
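
As a rough sketch of the bookkeeping this would involve (not a finished method),
the Python below subtracts a pitcher out of his team's totals, takes his $H
relative to the rest of the staff, and averages that over a career weighted by
his balls in play. The function names, the weighting choice, and the season
numbers in the example are all assumptions invented for illustration; park and
manager adjustments are left out entirely.

```python
def h_rate(hits_in_play, balls_in_play):
    """$H: hits on balls in play divided by balls in play."""
    return hits_in_play / balls_in_play


def team_h_without(team_hits, team_bip, pitcher_hits, pitcher_bip):
    """Team $H with one pitcher's balls in play subtracted out."""
    return h_rate(team_hits - pitcher_hits, team_bip - pitcher_bip)


def relative_h(team_hits, team_bip, pitcher_hits, pitcher_bip):
    """Pitcher $H minus the rest of the staff's $H (negative = fewer hits)."""
    return (h_rate(pitcher_hits, pitcher_bip)
            - team_h_without(team_hits, team_bip, pitcher_hits, pitcher_bip))


def lifetime_relative_h(seasons):
    """Average relative $H over a career, weighted by the pitcher's balls in play.

    seasons: iterable of (team_hits, team_bip, pitcher_hits, pitcher_bip).
    """
    seasons = list(seasons)
    total_bip = sum(s[3] for s in seasons)
    return sum(relative_h(*s) * s[3] for s in seasons) / total_bip


if __name__ == "__main__":
    # Made-up seasons, for illustration only.
    career = [(1300, 4500, 170, 650), (1280, 4400, 180, 640)]
    for season in career:
        print(f"relative $H: {relative_h(*season):+.3f}")
    print(f"lifetime relative $H: {lifetime_relative_h(career):+.3f}")
```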
Until we do this, we won't really know how large the range of innate $H skill
is, and we won't know how much trouble to go to in order to figure it out!
(And I can think of a heaping mess of trouble. Ask me about it if you dare.)
4) None of this diminishes the importance of Voros' work with DIPS. The
realization that there are very large utterly random fluctuations in ERA, and
the understanding of where exactly that comes from, so that we can say whether a
change in ERA was luck or skill -- that remains revelatory.
5) Pedro Martinez went from a true $H of .325 in 1999 to one of .237 in 2000.
His HR rate, however, nearly doubled. The odds of this all happening by chance
are about 1 in 286. Conclusion: if anyone could refine their pitching approach
in a way that reduced how often batters hit the ball hard, at the expense of
some extra homers, it's Pedro. I think his true $H actually did go down last
year.
--
----
Eric M. Van
em...@post.harvard.edu
". . . from that day forward she lived happily ever after. Except for the dying
at the end. And the heartbreak in between." - Lucius Shepard.