My thoughts
This article was published in our July 2023 Chapter Newsletter. It has been lightly re-edited and expanded. To read the original article click here.
I recently completed reading A Fan’s Guide to Baseball Analytics by Anthony Castrovince. In an early chapter he introduced OPS (on-base plus slugging) as an easy and convenient way to evaluate hitters that is better than batting average alone. Of course it is, but that convenience perhaps distorts the true value in two areas. It is biased and it is an apples-to-oranges amalgamation.
First, the apples-to-oranges issue. Batting average is hits divided by at-bats. On-base percentage is on-base events (hits, walks, hit by pitches) divided by a number that approximates plate appearances (it includes sac flies but not sac bunts, according to this text). Why does an event like a sac fly, which by definition has led directly to a run scored, get added in to penalize OBP? Is that a worse event than a 2-out, bases-empty walk that has such a small chance of scoring that it is almost like an empty calorie? It’s not significant, but we should recognize that it is likely not a complete measure of a hitter’s value, just a convenient, easy-to-calculate one.
The second issue is far more significant in my opinion: the double counting that leads to bias in the calculation. I recognized it immediately when I gave the formula thought; then, a page later, Castrovince contradicted my hypothesis when he stated, “OPS is probably guilty of overstating power hitters and understating high OBP hitters.” (pg. 59) No! While it may understate hitters that walk a great deal (in my estimation, good hitters, or really let’s say “feared” hitters, should have an OBP at least 10%, or .100, higher than their batting average), it overrates singles hitters.
Let’s look at a hypothetical first. Batter A and Batter B have the same number of at-bats, hits, and walks, and we’ll leave out the minor events (HBPs, sacs). The difference will be in slugging. Batter A only hits singles; Batter B’s total bases average out as doubles. So:
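To make the denominator question concrete, here is a minimal Python sketch of the two formulas, using the standard OBP definition (hits, walks, and hit by pitches over at-bats, walks, hit by pitches, and sac flies). The function names and the sample counts are illustrative assumptions, not figures from the book.

```python
def batting_average(hits: int, at_bats: int) -> float:
    """AVG = H / AB."""
    return hits / at_bats


def on_base_percentage(hits: int, walks: int, hbp: int,
                       at_bats: int, sac_flies: int) -> float:
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF); sac bunts are excluded."""
    return (hits + walks + hbp) / (at_bats + walks + hbp + sac_flies)


# Hypothetical hitter: 150 hits and 50 walks in 450 at-bats, no HBP.
obp_without_sf = on_base_percentage(150, 50, 0, 450, 0)
# Same hitter with 5 sac flies: the denominator grows, the numerator does not.
obp_with_sf = on_base_percentage(150, 50, 0, 450, 5)

print(f"AVG {batting_average(150, 450):.3f}")
print(f"OBP without sac flies {obp_without_sf:.3f}")
print(f"OBP with 5 sac flies  {obp_with_sf:.3f}")
```

In this made-up case, five run-scoring sac flies cost the hitter about four points of OBP, which is the small penalty the paragraph above is questioning.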
Batter | AB | H | AVG | BB | PA | OBP | TB | SLG | OPS |
A | 450 | 150 | .333 | 50 | 500 | .400 | 150 | .333 | .733 |
B | 450 | 150 | .333 | 50 | 500 | .400 | 300 | .667 | 1.067 |
Everything is straightforward here, nothing to see. But wait! Even Branch Rickey saw this prior to 1954 (pg. 137): hits, and the single in particular, are counted in both the batting average component of OBP and the one-base component of slugging. So, in essence:
A single counts as 1 hit and 1 base, its value is 2 for 1 base, let’s say 2.00
A double counts as 1 hit and 2 bases, its value is 3 for 2 bases, let’s say 1.50
A triple counts as 1 hit and 3 bases, its value is 4 for 3 bases, let’s say 1.33
A homer counts as 1 hit and 4 bases, its value is 5 for 4 bases, let’s say 1.25
Hopefully, this shows that singles carry a higher weight in this calculation.
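The list above can be reproduced with a few lines of Python. This is only a sketch of the simplified weighting used here (one credit for the hit plus one credit per base); it deliberately ignores that OBP and SLG use different denominators, and the names are illustrative.

```python
# Bases gained by each hit type.
BASES = {"single": 1, "double": 2, "triple": 3, "homer": 4}


def credit_per_base(bases: int) -> float:
    """Simplified weight: (1 hit credit + bases) spread over the bases gained."""
    return (1 + bases) / bases


for hit_type, bases in BASES.items():
    credits = 1 + bases
    print(f"{hit_type}: {credits} credits for {bases} base(s) "
          f"= {credit_per_base(bases):.2f} per base")
```

The per-base credit shrinks as the hit gets bigger, which is the sense in which a single carries the heaviest weight.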
The fix? If we subtract the batting average from slugging, we get another metric called ISO (isolated power), which removes the redundancy of hits from the calculation. Not as simple, but not hard either. I’ll call the result IO (isolated power plus on-base). There are two ways to calculate it, depending on what works for you. You can use:
IO = OBP + ISO, or, IO = OPS - BAvg.
Now let’s look at Batter A vs. Batter B:
Batter A .333 Avg. .400 OBP .333 Slg. .733 OPS .000 ISO .400 IO
Batter B .333 Avg. .400 OBP .667 Slg. 1.067 OPS .333 ISO .733 IO
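For anyone who wants to check the arithmetic, the sketch below recomputes every rate for the two hypothetical batters from their raw counts. The class and attribute names are assumptions made for illustration; "IO" is this article's own label, and HBP and sacrifices are left out as in the hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BatterLine:
    at_bats: int
    hits: int
    walks: int
    total_bases: int

    @property
    def avg(self) -> float:
        return self.hits / self.at_bats

    @property
    def obp(self) -> float:
        # No HBP or sac flies in this simplified hypothetical.
        return (self.hits + self.walks) / (self.at_bats + self.walks)

    @property
    def slg(self) -> float:
        return self.total_bases / self.at_bats

    @property
    def ops(self) -> float:
        return self.obp + self.slg

    @property
    def iso(self) -> float:
        return self.slg - self.avg

    @property
    def io(self) -> float:
        # OBP + ISO, which is the same quantity as OPS - AVG.
        return self.obp + self.iso


batter_a = BatterLine(at_bats=450, hits=150, walks=50, total_bases=150)
batter_b = BatterLine(at_bats=450, hits=150, walks=50, total_bases=300)

for name, b in (("A", batter_a), ("B", batter_b)):
    print(f"Batter {name}: AVG {b.avg:.3f} OBP {b.obp:.3f} SLG {b.slg:.3f} "
          f"OPS {b.ops:.3f} ISO {b.iso:.3f} IO {b.io:.3f}")
```

Because both routes to IO are algebraically identical, either one can be used on a stat line that already lists OPS and batting average.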
This, of course, is a non-example, as no variables were changed and I subtracted the same batting average from both. This logic makes it easy to say that OPS is efficient and relatively accurate. So, I need a real-life sample to make my case. How about Mike Trout? I will post his top 5 OPS years next to his top 5 IO years. Now we can look at the variables and how they affect his top years’ numbers.
Year | OPS | Year | IO |
2021 | 1.090 | 2019 | .792 |
2018 | 1.088 | 2018 | .776 |
2019 | 1.083 | 2017 | .768 |
2017 | 1.071 | 2021 | .757 |
2022 | .999 | 2022 | .716 |
As you can see, his top 5 years were still his top 5 years; however, why did 2021 fall from 1st in OPS to 4th in IO? And does the same explanation apply to the minor shifts in the order of the other years? In 2021 Trout posted his smallest sample size (117 ABs). 2021 also featured Trout’s career-high batting average (.333) and, at least among these seasons, his lowest HR rate (about 1 every 15 ABs). Meanwhile, in 2019 he posted only a .291 average but hit 45 HRs (almost a 1-in-10 AB rate). That’s evidence that OPS is biased towards singles hitters, because singles are double counted.
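The re-ranking can be checked with a short Python sketch built only from the OPS and IO figures listed in the table. Since IO = OPS - AVG, the difference between the two columns gives the batting average implied for each season; the table values are rounded, so an implied average could be off by a point. The variable names are illustrative.

```python
# (OPS, IO) per season, copied from the table in this article.
seasons = {
    2021: (1.090, 0.757),
    2018: (1.088, 0.776),
    2019: (1.083, 0.792),
    2017: (1.071, 0.768),
    2022: (0.999, 0.716),
}

# Rank the seasons from best to worst under each metric.
by_ops = sorted(seasons, key=lambda year: seasons[year][0], reverse=True)
by_io = sorted(seasons, key=lambda year: seasons[year][1], reverse=True)

for year in by_ops:
    ops, io = seasons[year]
    implied_avg = ops - io  # the part of OPS that IO strips back out
    print(f"{year}: OPS rank {by_ops.index(year) + 1}, "
          f"IO rank {by_io.index(year) + 1}, implied AVG {implied_avg:.3f}")
```

The season with the largest implied average is the one that drops the furthest when batting average is removed, which matches the 2021 versus 2019 comparison.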