Defense in Scoresheet
Greg Hardy -- The Annie Savoy Memorial American League
Updated 17 Feb 99
Baseball HQ's recent experiment in Scoresheet
strategies got me thinking, once again, about the most important factors in building a
winning Scoresheet baseball team. While we can debate draft strategies at length, I
decided to take a look at stats after the fact. I wanted to know which stats were
most important in determining whether a Scoresheet team was successful or
unsuccessful.
I looked at 20 seasons of Scoresheet results, from
1990 to 1998. I had on hand nine seasons in which I have participated, and I took the rest
from leagues that posted their results on the Internet. Thanks to all the webguys who took
the time to make their results available -- the NLTAL, NL114, NL No Glory, NL Over the
Wire, NL Mad Chatters, NL113, AL Sloane, AIL 1998, AL NAIL 98, 1998 Big Hurt, and
AL-Worrall. Many of the leagues were one-year leagues, but there were some perpetual ones
as well. They ranged in size from 8 to 12 teams. All told, I looked at 8 NLs and 12 ALs.
I went through the final standings and
statistics for each season. I picked out the top three and bottom three teams in terms of
won-loss records and looked over specific stats for each team in relation to the rest of
the league.
The question I asked of each league was: How many
of the three most successful teams finished in their league's top three in terms of
statistic X, and how many of the losing teams finished in the bottom three of statistic X?
The higher the correlation of each -- for example, if the top three teams had the three
best ERAs in the league, and the bottom three teams had the three worst ERAs in the league
-- the more likely it would be that the statistic in question was important to building a
successful Scoresheet team.
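The counting rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `top_bottom_overlap`, the dictionary layout, and the eight-team league are all invented for the example, and the stat is assumed to be one where a higher number is better. Ties are broken by sort order here, which is a simplification.

```python
def top_bottom_overlap(wins, stat, k=3):
    """Count how many of the k best (and k worst) teams by wins are also
    in the k best (and k worst) by `stat`. Higher stat is assumed better."""
    by_wins = sorted(wins, key=wins.get, reverse=True)
    by_stat = sorted(stat, key=stat.get, reverse=True)
    best = len(set(by_wins[:k]) & set(by_stat[:k]))
    worst = len(set(by_wins[-k:]) & set(by_stat[-k:]))
    return best, worst

# Hypothetical eight-team league (invented numbers, not from the study).
wins = {"A": 101, "B": 95, "C": 90, "D": 84, "E": 80, "F": 74, "G": 66, "H": 58}
runs = {"A": 880, "B": 790, "C": 845, "D": 850, "E": 760, "F": 700, "G": 720, "H": 650}

best, worst = top_bottom_overlap(wins, runs)
print(f"best three: {best} of 3, worst three: {worst} of 3")
```

Summing those counts over all 20 seasons and dividing by the number of teams examined would give percentages of the kind reported in the tables.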
Or at least that's what I'm assuming; I'm no
statistician. Not that this fact will stop me from making some broad generalizations about
the numbers that crop up. But I wanted to see how these factors compared to each other
in a relative sense and draw conclusions from there.
This is not an effort to decide which specific
players should be drafted. Rather, it should show which statistics are most decisive in
building a successful team. Using this information, one can then develop draft lists
focused on which players will contribute positively to the key statistics that result in
winning teams. This is like using real data to make your draft selections!
In terms of methodology, this study yielded
more than 60 "winning" teams and more than 60 "losing" teams -- there
were some seasons in which two teams tied for the third-best or third-worst record. Some
of the "winning" records were not all that great; 84 wins were sometimes enough
to qualify a team as one of the top three in its league. Also, some of the
"losing" teams were not so bad; two of the four "losing" teams from
the 1998 ASMAL season actually went 82-80.
The most successful team in the study went 112-50
in a regular 162-game season. The least successful team went through a ghastly 39-123
campaign.
Finally, I dealt in absolutes. Teams were ranked
against their peers, and it did not matter to me if only a single percentage point
separated Team Y's on-base percentage from that of Team Z. The top three were the top
three, no matter how slim the margin. The same went for the bottom three.
So what happened?
Well, Scoresheet says that Run Differential
is usually a good indicator of a team's success, and there are many studies that support
that theory in real-life baseball, so I figured out the run differentials for the 20
seasons. How often did the three winningest (is that a word?) teams enjoy the league's top
three run differentials? 75 percent of the time. And how often did the worst three teams
suffer through the worst run differentials? 80 percent of the time. Run Differential
successfully predicted the top and bottom three teams a combined 78 percent of the time.
Well, sure. We all know that to win, you
have to outscore your opponent. But which individual Scoresheet stats most often
determined the success or failure of a ballclub?
I broke the study down into three main
areas: pitching, offense, and defense.
Under pitching, I grouped ERA, complete games
(CG), shutouts (Sho), saves (Sv), and WHIP, or Walks plus Hits allowed per Inning Pitched.
I was curious to see if these numbers would give us an idea as to whether starting
pitching or relief pitching was more important. I don't think anything conclusive came
out of this, but the info is there to absorb.
Under defense, I looked at Scoresheet's
Outstanding Plays (OP), which is tied directly to range factors, and Errors (E).
Scoresheet started including OP in 1992, so those numbers reflect results from 1992-1998.
Under offense, I looked at Runs Scored (RS),
on-base percentage (OBA), slugging percentage (Slg), OPS (OBA plus Slg), home runs (HR),
and total stolen bases (SB). I realize that SB success rate may have been a better
measure, but didn't want to take the time to hem and haw over that.
One more quick note on methodology: Since I
was pulling three teams for each category from leagues ranging in size from 8 to 12 teams,
I'm working under the assumption that results ranging from 25% (3 of 12) to 38% (3 of 8)
indicate a correlation that is no better than random selection; that is, it's a stat that
has no indicative value in terms of success or failure. I could be wrong on this, and if
anyone can explain it to me without using the word "coefficient" more than once,
please do. Regardless of whether this assumption is correct or not, we will still have a
relative ranking of factors.
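One way to sanity-check the 25%-38% assumption is to work out the chance baseline directly. If a stat had nothing to do with winning, each of the top three teams would have a 3-in-n chance of also landing in that stat's top three, so the expected overlap is 3/n. The sketch below computes that value and confirms it by simulation. The function names are mine, and the simulation simply ranks teams at random.

```python
import random

def chance_overlap(n_teams, k=3):
    """Expected share of the k best teams that land in a random stat's top k."""
    return k / n_teams

def simulated_overlap(n_teams, k=3, trials=20000, seed=1):
    """Estimate the same share by ranking teams at random many times."""
    rng = random.Random(seed)
    teams = list(range(n_teams))
    top_by_wins = set(teams[:k])          # treat teams 0..k-1 as the winners
    hits = 0
    for _ in range(trials):
        shuffled = teams[:]
        rng.shuffle(shuffled)             # a stat unrelated to winning
        hits += len(top_by_wins & set(shuffled[:k]))
    return hits / (k * trials)

for n in (8, 10, 12):
    print(n, round(chance_overlap(n), 3), round(simulated_overlap(n), 3))
```

Under that reading, the 25% to 38% band matches what random selection would produce in 12-team and 8-team leagues, so the assumption in the text looks reasonable.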
So here, in terms of percentage, is how
often the best three teams posted the best stats in each category, and the worst three
teams posted the worst stats in each category. I then combined the two numbers to come up
with an overall percentage and ranked each stat according to the apparent influence it has
on building a successful team, relative to the other stats in the study:
Scoresheet Success Stats
| Statistic Measured | Best 3 Pct | Worst 3 Pct | Combined Pct | Rank |
|---|---|---|---|---|
| Run Differential | 75 | 80 | 78 | 1 |
| Pitching | | | | |
| ERA | 60 | 68 | 64 | 3 |
| Complete Games | 44 | 50 | 47 | 11 |
| Shutouts | 55 | 52 | 53 | 9 |
| Saves | 56 | 57 | 56 | 8 |
| WHIP | 61 | 61 | 61 | 4T |
| Defense | | | | |
| Outstanding Plays | 38 | 47 | 43 | 12 |
| Errors | 33 | 43 | 38 | 14 |
| Offense | | | | |
| On-Base Pct | 57 | 63 | 60 | 6 |
| Slugging Pct | 54 | 62 | 58 | 7 |
| Home Runs | 48 | 54 | 51 | 10 |
| Total Stolen Bases | 38 | 43 | 41 | 13 |
| Runs Scored | 62 | 74 | 68 | 2 |
| OPS | 57 | 64 | 61 | 4T |
Let's put those stats in order of relative
importance, or how they rank against each other:
Scoresheet Success Stats
| Statistic Measured | Best 3 Pct | Worst 3 Pct | Combined Pct | Rank |
|---|---|---|---|---|
| Run Differential | 75 | 80 | 78 | 1 |
| Runs Scored | 62 | 74 | 68 | 2 |
| ERA | 60 | 68 | 64 | 3 |
| OPS | 57 | 64 | 61 | 4T |
| WHIP | 61 | 61 | 61 | 4T |
| On-Base Pct | 57 | 63 | 60 | 6 |
| Slugging Pct | 54 | 62 | 58 | 7 |
| Saves | 56 | 57 | 56 | 8 |
| Shutouts | 55 | 52 | 53 | 9 |
| Home Runs | 48 | 54 | 51 | 10 |
| Complete Games | 44 | 50 | 47 | 11 |
| Outstanding Plays | 38 | 47 | 43 | 12 |
| Total Stolen Bases | 38 | 43 | 41 | 13 |
| Errors | 33 | 43 | 38 | 14 |
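For anyone who wants to reproduce the ordering, here is a minimal sketch of the ranking step, including the shared "4T" rank for tied stats. The combined percentages are copied from the table above. The function name `rank_with_ties` and the tie convention (tied stats share the first position they occupy) are my own assumptions about how the ranks were assigned.

```python
def rank_with_ties(combined):
    """Rank stats by combined percentage, highest first; tied stats share
    a rank and get a trailing 'T' (for example '4T')."""
    ordered = sorted(combined.items(), key=lambda kv: kv[1], reverse=True)
    values = [v for _, v in ordered]
    ranks = {}
    for name, value in ordered:
        place = values.index(value) + 1       # first position holding this value
        tied = values.count(value) > 1
        ranks[name] = f"{place}T" if tied else str(place)
    return ranks

# Combined percentages copied from the table above.
combined = {
    "Run Differential": 78, "Runs Scored": 68, "ERA": 64, "OPS": 61,
    "WHIP": 61, "On-Base Pct": 60, "Slugging Pct": 58, "Saves": 56,
    "Shutouts": 53, "Home Runs": 51, "Complete Games": 47,
    "Outstanding Plays": 43, "Total Stolen Bases": 41, "Errors": 38,
}

for name, rank in rank_with_ties(combined).items():
    print(f"{rank:>3}  {name}")
```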
Conclusions
Assuming my methodology is legitimate, one can
draw some conclusions from this exercise.
-
Of the statistics measured, Run Differential is
indeed the best predictor of a team's success or lack thereof. Not surprising. However,
this is a stat that grows out of other stats, and it's not like you can draft a guy who
provides "good run differential."
-
In my first survey of just 10 leagues, ERA was the
most accurate determinant of a team's success or lack thereof in terms of individual
stats. However, in this larger sample, Runs Scored is now ranked higher. Note that there
is a solid balance of offensive and pitching stats at the top of the chart. This segues
directly into the discussion on how much more difficult it is to predict how pitchers will
perform than how hitters will perform. Of course, I picked Frank Thomas -- a consistently
great hitter -- with the third overall pick in 1998 to form the cornerstone of my AL
continuing league team, and he promptly went into the tank, relatively speaking. I'll stop
whining now.
-
I thought Saves would rank much higher on the list,
since more wins would mean more save opportunities and therefore more saves. But this stat
finished in the middle of the pack.
-
The numbers for the worst three teams were higher
than those of the best three teams in every category except Shutouts (and WHIP, where the
two were equal). I guess this means that there was more competition and/or balance at the
top of the standings than at the bottom, where the three worst teams were clearly inferior
to their league counterparts, but I welcome any thoughts on the matter.
-
The two defensive statistics, Outstanding Plays
(Range) and Errors, ranked 12th and 14th out of 14 stats measured. If the methodology is
correct, then one could reach the conclusion that fielding a good or bad defensive team
seems to have little effect on one's chances of putting together a winning season. As I
understand it, Baseball HQ's 1998 Scoresheet Exhibition also underscored the apparent lack
of effect of Scoresheet defensive range numbers. Scoresheet has responded by prohibiting
some out-of-position moves and plans to make their range factors more drastic in both
directions, giving good defensive players even greater range and poor players even less.
It will be interesting to see if these moves have any effect. I beat the defensive numbers
drum even more below.
-
Total Stolen Bases came in 13th on the list, just a
couple of points above randomville. So, um, if defense and stolen bases are the least
indicative of success, do Brian Hunter, Rey Ordonez, Deivi Cruz, and their ilk have any
value at all in Scoresheet? I'll continue this discussion on defense below.
In view of the fact that Run Differential is
clearly the best indicator of success, I decided to take a look at which stats have the
greatest effect on Runs Scored and Runs Allowed. I had some time on my hands one weekend
-- a rarity -- so I went through the same 20 Scoresheet leagues and used the same
methodology. Instead of picking out the three teams with the most wins and the three teams
with the most losses in each league, I looked at the three teams that scored the most
runs, the three teams that scored the fewest runs, the three teams that gave up the fewest
runs, and the three teams that gave up the most runs.
In terms of Runs Scored, I checked out
correlations with HR, BA, Slg, OBA, OPS, BB, strikeouts (K), Sac Hits, GIDP, SB, and LOB.
I looked at the positive and negative sides of each stat. For example, a team that scored
a lot of runs would be expected to be in the top three in HR, BA, Slg, OBA, OPS, BB, Sac
Hits, and SB, and should be in the bottom three in terms of K, GIDP, and LOB. On the other
hand, the three teams per league that scored the fewest runs should be expected to finish
in the bottom three in HR, BA, Slg, OBA, OPS, BB, Sac Hits, and SB, but would be in the
top three in terms of Ks, GIDP, and LOB.
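The direction of each stat matters when picking the "best" three, so here is a rough sketch of how that could be handled. The set `LOWER_IS_BETTER` reflects the expectations stated above (K, GIDP, and LOB are assumed to be bad for an offense). The function name and the league totals are invented for illustration.

```python
# Stats where a LOW total is the expected good outcome for an offense.
LOWER_IS_BETTER = {"K", "GIDP", "LOB"}

def best_and_worst(stat_name, values, k=3):
    """Return (expected-best k teams, expected-worst k teams) for one stat."""
    high_first = stat_name not in LOWER_IS_BETTER
    ordered = sorted(values, key=values.get, reverse=high_first)
    return set(ordered[:k]), set(ordered[-k:])

# Hypothetical eight-team league; the totals are invented.
hr = {"A": 210, "B": 190, "C": 185, "D": 170, "E": 160, "F": 150, "G": 140, "H": 120}
gidp = {"A": 110, "B": 150, "C": 120, "D": 135, "E": 128, "F": 142, "G": 125, "H": 160}

print(best_and_worst("HR", hr))      # most home runs count as best
print(best_and_worst("GIDP", gidp))  # fewest double plays count as best
```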
I knew ahead of time that some of these
stats were worthless indicators, but I ran all of them anyway to get some kind of a sanity
check on the numbers that should mean something. Happily, many of my assumptions turned
out to be correct.
Here's how the numbers crunched, in order of
highest correlation with Runs Scored:
Runs Scored
| Statistic Measured | Best 3 Pct | Worst 3 Pct | Combined Pct | Rank |
|---|---|---|---|---|
| OPS | 75 | 79 | 77 | 1 |
| OBA | 69 | 76 (1) | 73 | 2 |
| Slg | 68 | 70 | 69 | 3 |
| BA | 70 | 66 | 68 | 4 |
| HR | 54 | 61 | 57 | 5 |
| BB | 45 | 62 (1) | 53 | 6 |
| Total SB | 35 | 48 | 41 (3) | 7 |
| GIDP | 37 | 31 | 34 (2) | 8 |
| Sacrifice Hits | 30 | 36 | 33 (2,3) | 9 |
| Strikeouts | 32 | 27 | 29 (2) | 10 |
| Left on Base | 15 | 15 | 15 (4) | 11 |
(1)
There seems to be a huge difference between the
"best" and "worst" numbers for OBA and BB. The inability to get on
base through hits or walks seems to be a real killer.
(2)
GIDP, Ks, and Sacs seem to fall into the category
of statistically insignificant. Not surprising for the first two, but since a sacrifice is
designed specifically to produce a run, one would think that Sac Hits would score higher
than 33%. See note 3 for further discussion on this.
(3)
Curious about NL vs. AL statistical differences? I
was, so I separated the 8 NLs from the 12 ALs and then took a look at Stolen Bases and Sac
Hits. Here's what came up:
NL vs. AL
| Statistic | NL Best 3 Pct | NL Worst 3 Pct | AL Best 3 Pct | AL Worst 3 Pct |
|---|---|---|---|---|
| Stolen Bases | 41 | 54 | 30 | 42 |
| Sacrifice Hits | 41 | 50 | 21 | 24 |
It seems that the ability -- or lack thereof -- to
steal bases and lay down sacrifices has some effect on Runs Scored in the NL, or at least
a much greater effect than in the AL. Note also that the AL sacrifice numbers fall below
the random 25-38%. I take this as a counter-indicator; the use of the sacrifice in the AL
may actually lead to fewer Runs Scored rather than to more. Food for thought, certainly.
(4)
Speaking of counter-indicators, the LOB numbers
jump out too. For the sake of the study I assumed that high LOB numbers were a bad thing,
a failure to score runs. In reality, however, higher-scoring teams seem to leave more men
on base than their counterparts, a by-product of their higher OBA, I guess.
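The reading of these numbers against the random band can be written down as a simple rule. The 25 and 38 thresholds are the chance band assumed earlier in this article. The three labels and the function name `classify` are my own shorthand, not anything defined by Scoresheet.

```python
RANDOM_LOW, RANDOM_HIGH = 25, 38   # the chance band assumed in the text

def classify(pct):
    """Label a percentage relative to the assumed random band."""
    if pct < RANDOM_LOW:
        return "counter-indicator"
    if pct <= RANDOM_HIGH:
        return "no better than random"
    return "indicator"

# Combined percentages from the Runs Scored table, plus the AL sacrifice figure.
samples = {"OPS": 77, "Total SB": 41, "GIDP": 34, "Strikeouts": 29,
           "Left on Base": 15, "AL Sac Hits (best 3)": 21}

for name, pct in samples.items():
    print(f"{name}: {pct}% -> {classify(pct)}")
```

By this rule LOB and the AL sacrifice numbers both land below the band, which is the sense in which they look like counter-indicators rather than merely uninformative stats.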
But wait, there's more!
The other half of getting a good Run Differential
number is preventing runs from scoring. So I looked at the three teams from each league
that gave up the fewest runs and the three teams that gave up the most runs. How many of
the three most successful pitching staffs and defenses finished in their league's top
three in terms of statistic X, and how many of the three least successful finished in the bottom three of
statistic X? Which stats had the highest correlation with success or failure at preventing
runs?
On the pitching side, I looked at ERA, WHIP, CG,
Shutouts, Hits allowed, BA against, BB allowed, and Ks. Theoretically, teams that allow
fewer runs should have lower ERAs, WHIPs, Hits Allowed, BA Against, and BB allowed, and
high numbers of CG, Shutouts, and Ks. Teams that were in the bottom tier of runs allowed
should have higher ERAs, WHIP, Hits Allowed, BA against, and BB allowed, and lower CGs,
Shutouts, and Ks. You can see how often that happened below.
On the defensive side, I looked at Outstanding
Plays, Errors Committed, and the Opponent Caught Stealing (OCS) numbers. I threw the
latter in to get an idea of whether poor-hitting catchers with strong arms (Charles
Johnson comes to mind) had any usefulness in Scoresheet. In real life a poor-hitting
catcher can have other useful qualities -- calling a good game, handling the pitching
staff, those intangible leadership merits -- but in Scoresheet, the numbers tell the whole
story. Anyway, you can make your own judgment.
Here are the numbers in rank order:
Runs Allowed
| Statistic Measured | Best 3 Pct | Worst 3 Pct | Combined Pct | Rank |
|---|---|---|---|---|
| ERA | 90 | 92 | 91 | 1 |
| BA Against | 79 | 81 | 80 | 2 |
| Hits Allowed | 75 | 84 | 79 | 3 |
| WHIP | 80 | 76 | 78 | 4 |
| Shutouts | 58 | 58 | 58 | 5 |
| BB Allowed | 62 | 52 | 57 | 6 |
| Strikeouts | 58 | 50 | 54 | 7 |
| Complete Games | 48 | 56 | 52 | 8 |
| Outstanding Plays | 36 | 44 | 40 | 9 |
| Errors Committed | 33 | 39 | 36 | 10 |
| OCS | 27 | 20 | 23 | 11 |
A couple of points to make on this last chart.
First, note the dramatic dropoff in correlation numbers after WHIP -- from 78 to 58%. All
the highest-ranked numbers have to do with Hits Allowed, to a much greater extent than BB
Allowed. So often we hear announcers chide a pitcher after a leadoff walk, and they point
out that "a leadoff walk comes around to score X percent of the time" -- I can
never remember the exact number. But I'm thinking -- and this study seems to support --
that a leadoff single is just as bad, and a leadoff extra-base hit is even worse. I
desperately avoid pitchers who walk more than an average number of batters, but I may have
to reexamine that philosophy.
Finally, another note on Scoresheet defense. Once
again, defensive statistics ended up at the bottom of the chart, waaay below even Ks and
CG. So now we can see that range and errors seem to have little effect on building a
winning Scoresheet team, and they seem to have little effect on preventing runs, at least
in comparison to most other pitching/defensive statistics.
Is this a surprise? It probably shouldn't
be. Scoresheet clearly states in its draft packet that "a difference of .10 in range
is equal to .1 (a tenth) of a hit per 9 innings." Well, it takes a couple fairly
rangy players to achieve that .10 range advantage, especially since most teams have some
subpar range players at other positions.
Just for fun, I went through the 1999 AL
Player List and picked out the rangiest guys at all eight field positions. If you somehow
managed to draft all eight of these guys -- essentially including three CF types -- you
would enjoy a range rating of +.56. If I'm doing the math right, that means this squad of
defensive whizzes would save your pitching staff .56 hits per game, or about 91 hits per
season (about two weeks of work for Tim Belcher). I assume that the .56-per-game advantage would
hold up evenly over the course of the season.
However, in my Scoresheet AL experience, a
great fielding team may manage a range of +.25. That's still pretty high, but we can use
it for the sake of comparison. This defense will save the pitching staff .25 hits per
game, or about 41 hits per season. Is that a significant number? Is it worth carrying a
couple of weak bats?
Well, teams usually give up between 1300 and
1700 hits in a 162-game season, depending on how good their pitchers are. Saving 41 hits
for a team of great pitchers -- which would give up 1300 hits in the season -- represents
a 3.2% improvement. Saving 41 hits for a bad pitching staff -- the 1700-hit group --
represents a 2.4% improvement. Saving 41 hits for an average staff -- 1500 hits allowed --
is a 2.7% improvement. The improvement numbers for the top-ranked defensive team that
saves 91 hits per season would be 7.0 (1300), 5.4 (1700), and 6.1 (1500) percent.
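The arithmetic in the last few paragraphs can be checked with a short script. It assumes, as the text does, that a range advantage converts directly to hits saved per nine-inning game over a 162-game season. The helper names are mine, and `round_half_up` is there only because Python's built-in `round` would turn 40.5 into 40 rather than the 41 used above.

```python
import math

GAMES = 162  # regular-season games, as in the text

def round_half_up(x):
    """Round to the nearest whole number, with .5 going up (40.5 -> 41)."""
    return math.floor(x + 0.5)

def hits_saved(range_advantage, games=GAMES):
    """Scoresheet's rule of thumb: +.10 range is .1 hit per nine innings."""
    return round_half_up(range_advantage * games)

def pct_improvement(saved, hits_allowed):
    """Hits saved as a percentage of a staff's season hits allowed."""
    return round(100 * saved / hits_allowed, 1)

for advantage in (0.25, 0.56):
    saved = hits_saved(advantage)
    row = [pct_improvement(saved, h) for h in (1300, 1500, 1700)]
    print(advantage, saved, row)
```

The two printed rows should match the figures quoted above: 41 hits (3.2, 2.7, and 2.4 percent) and 91 hits (7.0, 6.1, and 5.4 percent) for staffs allowing 1300, 1500, and 1700 hits.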
For the sake of comparison: A survey of the
20 leagues I used to create this page indicates that the teams that gave up the fewest
hits in each league beat their nearest competitor in Hits Allowed by an average of 50 hits
per season. The average gap between the team that gave up the fewest hits and the team in
the league that gave up the most hits was 267.
Let's look at a couple of positions. In a one-year
league, if you have the choice of Mike Bordick (4.77) or Mike Caruso (4.72), who do you
take? Caruso makes a lot of errors, but that doesn't seem to have a great effect on
winning or losing or even preventing runs. Bordick's range advantage of .05 translates to
him saving you 8 hits per year. Well, I'm pretty sure Caruso will make up those 8 hits
with the stick, despite the fact that he starts hacking when he steps off the team bus.
A more extreme example would be choosing
between Jose Canseco (2.01) and Brian Hunter (2.21) to play a corner OF position. Hunter
will theoretically get to 32 more balls than Canseco in the OF, but then you have to watch
him bat. This is about the most extreme example I could think of without delving into
playing guys out of position, and it boils down to 32 hits over the course of the season.
Since Canseco will probably produce 75 more RUNS than Hunter over the same season, I think
you gotta let the big guy wear a glove. And maybe a hard hat.
Oh, there is a more extreme example.
Scoresheet points out that one position at which range makes a great difference is CF,
which is 1.6 times more important than the other OF spots. I wonder if that means that
playing Canseco in CF instead of Hunter would mean an extra 51 (32x1.6) hits falling in?
That seems like a lot, despite the 75 extra runs produced. Plus your pitchers might get
irritated.
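The player comparisons above follow the same formula, so here is a small sketch for trying other pairs. The range figures are the ones quoted in the text. The function name `extra_hits` is mine, and applying the 1.6 center field weight as a straight multiplier is an assumption, which is exactly the open question raised above.

```python
GAMES = 162        # regular-season games
CF_WEIGHT = 1.6    # Scoresheet's stated weight for center field range

def extra_hits(better_range, worse_range, weight=1.0, games=GAMES):
    """Season hits allowed by playing the weaker fielder instead."""
    return (better_range - worse_range) * games * weight

print(round(extra_hits(4.77, 4.72), 1))              # Bordick vs. Caruso
print(round(extra_hits(2.21, 2.01), 1))              # Hunter vs. Canseco, corner OF
print(round(extra_hits(2.21, 2.01, CF_WEIGHT), 1))   # same pair in CF
```

Run as written, the three lines should print roughly 8.1, 32.4, and 51.8. The 51 in the text comes from rounding to 32 hits first and then multiplying by 1.6; without that intermediate rounding the figure is closer to 52.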
Does defense matter? I think it can
marginally improve your pitching. A great fielding team may even be able to cancel out the
offensive "All-Star factor" that balloons real-life ERAs up by .25 runs or so.
But even a simply good team defensive range -- +.25 -- is only going to reduce your hits
allowed by 2 to 3 percent, and the correlations between Range and Wins/Losses and Range
and Preventing Runs are quite low. Is it worth the accompanying loss of offense that these
trade-offs often require?
In the past I've always drafted as if defense
mattered, but those differences of .03 or .04 that I've worried about boil
down to 5 or 6 hits prevented per season. It's not like Scoresheet has concealed this
fact; it's all right there in the packet. In the future, I'll draft the better hitter and
hope to lead the league in Runs Scored.
If for some reason you would like to comment on
this study or the methodology, please send me a note at this address. I hope you
found it at least somewhat interesting.