Ambiguity, Interpretations, and Assumptions in Sabermetrics

An MMO Fan Shot by DerpyMets

Two seasons ago I set forth on a somewhat insane, ambitious project to calculate all of the Mets pitching stats by hand for an entire season. This included watching every game, carefully marking down everything that happened (balls, strikes, fly balls, ground balls, line drives, bunts, blah blah blah, every little aspect of everything) for every Met pitcher. After every game I would painstakingly input this data into a spreadsheet, then double, triple, and quadruple check it. This was an obnoxiously tedious process, but it gave me an appreciation for a lot of aspects of baseball that a lot of fans take for granted.

For instance, when you’re calculating these stats, it is vitally important to mark down ground balls, fly balls, line drives, and pop ups. Sit back and think about what that entails. What is a fly ball? What is a line drive? At what point does a line drive turn into a fly ball? Where is the line between a pop up and a fly ball?

So I looked at the stats that would end up using these numbers for clues. Let’s look at xFIP as a starting point. A lot of people quote xFIP pretty regularly, and I’m sure most of you know what it is: e(x)pected (F)ielding (I)ndependent (P)itching. Fielding independent means fielding isn’t counted, which is a subtle point I missed the first time I heard this jargon. I would have called it Pitching independent fielding, but maybe that would be confusing in the other direction. Whatever.

Here is the formula for calculating xFIP (and I’ll explain it):

xFIP = ( ( (13(Flyballs * lgHR/FB%)) + (3(BB+HBP)) – (2K) ) /(IP) ) + FIP constant

Breaking that down a little, the numerator is made up of three parts.

First, you have 13 (a constant) multiplied by the total number of fly balls times the league average home run to fly ball rate. This estimates how many home runs an average pitcher would have given up in an average park to an average batter given this total number of fly balls surrendered.

Second, you have 3 (a constant) multiplied by the total number of free baserunners surrendered, aka walks plus hit by pitch.

Third, you have 2 (a constant) multiplied by the total number of strikeouts. Unlike the first two parts, this one is subtracted, since strikeouts are good for the pitcher.

The denominator is total innings pitched.

Then you add on the FIP constant at the end. To get the FIP constant, you compute the previous fraction for the entire league, then subtract that from the league ERA. Essentially, the FIP constant puts these numbers on the same scale as ERA. You generally use the same constant for FIP and xFIP, but you can manually calculate separate FIP and xFIP constants if you want ever so slightly more accuracy (which I will be doing for the numbers coming up).
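If you would rather read that as code, here is a rough Python sketch of the formula and the constant. The function names (`xfip_core`, `fip_constant`, `xfip`) are my own labels, and the league totals are made-up round numbers for illustration only, not real data. I am also assuming innings are entered as plain decimals rather than the 199.1-style box score notation.

```python
def xfip_core(fly_balls, bb, hbp, k, ip, lg_hr_per_fb):
    """The fraction in the xFIP formula, before the constant is added."""
    # Home runs a league-average pitcher would allow on this many fly balls.
    expected_hr = fly_balls * lg_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / ip


def fip_constant(lg_era, **league_totals):
    """League ERA minus the same fraction computed on league-wide totals."""
    return lg_era - xfip_core(**league_totals)


def xfip(fly_balls, bb, hbp, k, ip, lg_hr_per_fb, constant):
    """Full xFIP: the fraction plus the constant."""
    return xfip_core(fly_balls, bb, hbp, k, ip, lg_hr_per_fb) + constant


# Made-up league totals, for illustration only (NOT real data).
toy_league = dict(fly_balls=1000, bb=300, hbp=30, k=700, ip=900, lg_hr_per_fb=0.10)
toy_era = 4.00

constant = fip_constant(toy_era, **toy_league)

# Feeding the league's own totals back in should return the league ERA,
# which is the whole point of the constant.
league_xfip = xfip(constant=constant, **toy_league)
print(f"constant = {constant:.3f}, league xFIP = {league_xfip:.2f}")
```

The check at the bottom is the useful part: whatever you decide counts as a fly ball, the constant is built so that the league as a whole lands on the league ERA.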

OKAY! Most of this seems straightforward, right? Not much interpretation. Walks, hit by pitch, strikeouts, innings pitched, that’s all easy. Even the part about subtracting from the league ERA is pretty easy. But wait a second, that fly ball part, what’s that about? And what is a home run to fly ball ratio? What is a fly ball in the first place?

Well, in the case of xFIP, you’re trying to predict the total number of home runs. So, I presume the fly balls we’re talking about are fly balls that have a chance of becoming a home run, right? Is an infield pop up a fly ball? Well, there is a small amount of data to suggest that inducing a pop up is a pitching skill that may actually negatively correlate with home runs. So eliminating those from your fly ball total might make sense, or it might not. It depends who you talk to. What about infield line drives? Do those count? Line drives turn into home runs, but an infield line drive can’t possibly do that. But what if it is hit really, really hard, but at the wrong angle to leave the park? Maybe those infield line drives should count, but the really soft infield line drives shouldn’t. But what about really hard hit ground balls? Those are pretty much the same thing as hard hit infield line drives, but hit on an even more downward trajectory. Surely, if the batter had only hit the ball a tiny fraction of an inch lower, he could have elevated it into a line drive to the outfield, right? He hammered the pitch, so the pitch was hittable, he just barely missed it.

There are so many factors to consider, where do you draw the line?

Alright, that is all well and good, but maybe if you calculate the league HR/FB% using the same interpretations as the pitcher FB total it will all just sort itself out in the end.

Let’s look at an example:

Dillon Gee’s 2013 stats: 172 FB, 139 LD, 42 PU, 47 BB, 7 HBP, 142 K, 199 IP.

I’ll note at this point that the league average xFIP for this particular season is, by definition, 3.76.

For Fly balls only, HR/FB = .1013, xFIP Constant = 3.01

Fly balls + Line drives, HR/FB = .0615, xFIP Constant = 3.54

Fly balls + Line Drives + Pop ups, HR/FB = .0583, xFIP Constant = 3.58

You can do the math yourself. Gee’s xFIPs, respectively, are: 3.54, 4.18, 4.33.

Fly ball definition	Lg HR/FB	xFIP Constant	xFIP
Fly balls only	.1013	3.01	3.54
Fly balls + Line drives	.0615	3.54	4.18
Fly balls + Line drives + Pop ups	.0583	3.58	4.33

That is a rather large range. From this data, all we can determine is that Gee was anywhere from above average to significantly below average, which is not exactly helpful.
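For anyone who wants to follow along, here is a small Python sketch that plugs Gee’s line into the formula under each of the three fly ball definitions. The numbers are the ones quoted above; the function and variable names are mine, and I am assuming 199 means 199 full innings. Because the quoted rates and constants are rounded, the output can land a hundredth or two away from the figures in the text.

```python
def xfip(batted_balls, bb, hbp, k, ip, lg_rate, constant):
    """xFIP with 'batted_balls' standing in for whatever we call a fly ball."""
    expected_hr = batted_balls * lg_rate
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Dillon Gee, 2013, as hand-tracked in this post.
FB, LD, PU = 172, 139, 42
BB, HBP, K, IP = 47, 7, 142, 199  # assumption: 199 full innings

# definition -> (batted balls counted, league HR rate, xFIP constant)
definitions = {
    "fly balls only": (FB, 0.1013, 3.01),
    "fly balls + line drives": (FB + LD, 0.0615, 3.54),
    "fly balls + line drives + pop ups": (FB + LD + PU, 0.0583, 3.58),
}

results = {
    name: xfip(count, BB, HBP, K, IP, rate, const)
    for name, (count, rate, const) in definitions.items()
}

for name, value in results.items():
    print(f"{name}: {value:.2f}")
```

Nothing about Gee changes between the three rows. The only thing that moves is the bookkeeping decision about what gets counted as a fly ball.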

Clearly each interpretation of flyball is giving us a different number. From now on, I’m going to refer to them in the following way:

flyballs only = xFIPf

flyballs + line drives = xFIPfl

flyballs + line drives + pop ups = xFIPflp

Okay, so those numbers come in a broad range. What does FanGraphs say? 4.07.

Admittedly, I used different FB, LD, and PU numbers, since I recorded all this stuff by hand, using my own judgment while watching the games with my own eyes. Let me plug in FanGraphs’ own values, and surely they must line up with that 4.07 number, right? Respectively: 3.98, 4.38, 4.40.

Wait, what?

So this leaves us with a question: where did the 4.07 that FanGraphs listed as Gee’s xFIP come from? I used FanGraphs’ own stats and their own equation, using three different methods, and none of them match up.

Looking closer into FanGraphs, specifically, into their guts, I see a listed FIP constant for 2013: 3.048, so plugging that number into the formula I get:

My Gee Stats: xFIPf = 3.57, xFIPfl = 3.69, xFIPflp = 3.79

FanGraphs Gee Stats: xFIPf = 4.01, xFIPfl = 3.89, xFIPflp = 3.87

Data source	xFIPf	xFIPfl	xFIPflp
My Gee Stats	3.57	3.69	3.79
FanGraphs Gee Stats	4.01	3.89	3.87

Note how none of these numbers are 4.07. Clearly FanGraphs is doing something behind the scenes that they aren’t telling us about. They are changing the formulas slightly, using different stats, weighting stats differently, or using a different constant.
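To get a feel for how much the constant alone moves things, here is a rough Python sketch that runs my hand-tracked Gee numbers twice: once with a separate constant for each fly ball definition, and once with the single 3.048 constant from FanGraphs. The names are my own, the inputs are the rounded figures quoted in this post, and the results may differ from the quoted values by a hundredth or so because of that rounding.

```python
SHARED_CONSTANT = 3.048  # the 2013 FIP constant listed by FanGraphs

# Dillon Gee, 2013, hand-tracked (assumption: 199 full innings).
BB, HBP, K, IP = 47, 7, 142, 199

# variant -> (batted balls counted, league HR rate, per-definition constant)
variants = {
    "xFIPf": (172, 0.1013, 3.01),
    "xFIPfl": (172 + 139, 0.0615, 3.54),
    "xFIPflp": (172 + 139 + 42, 0.0583, 3.58),
}


def core(batted_balls, lg_rate):
    """The xFIP fraction before any constant is added."""
    return (13 * batted_balls * lg_rate + 3 * (BB + HBP) - 2 * K) / IP


own = {name: core(n, rate) + const for name, (n, rate, const) in variants.items()}
shared = {name: core(n, rate) + SHARED_CONSTANT for name, (n, rate, _) in variants.items()}

# Gap between the highest and lowest version under each approach.
spread_own = max(own.values()) - min(own.values())
spread_shared = max(shared.values()) - min(shared.values())

for name in variants:
    print(f"{name}: own constant {own[name]:.2f}, shared constant {shared[name]:.2f}")
print(f"spread: {spread_own:.2f} (own) vs {spread_shared:.2f} (shared)")
```

If this sketch is right, most of the gap between the three versions comes from which constant you bolt on, not from the fraction itself, which is one more hidden choice to be aware of.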

Alright, I admit this may be a bit of a nerdy, rambling sort of example, but I really want all baseball fans to understand this one thing: advanced stats have inherent assumptions and interpretations that can dramatically change the look and feel of the stats. You have to question these underlying assumptions, you have to dig deeper into the stats, and, above all else, you must always explicitly state which version of the stat you’re quoting. I hope I have shown you at least four different versions of xFIP by now: xFIPf, xFIPfl, xFIPflp, and xFIPfg (the number FanGraphs actually publishes). Each of these gives you a different result, and you have to make sure to always compare like results to like results.

At the moment, FanGraphs largely holds a monopoly on advanced statistics consumed by average fans like you and me, but that will not always be the case, and more importantly, you should recognize their stats come with these inherent assumptions that may dramatically color your perception of certain types of players. Keep an open mind, and always question the numbers.

* * * * * * * *

This Fan Shot was contributed by DerpyMets.