Back in 2004, the New York Mets hired their first full-time statistical analyst after the release of the New York Times bestselling book, Moneyball. The Mets were looking to invest in someone who could break down the numbers, analyze data for potential free agent signings and trades, and offer evaluations on minor league players.
What the club received from Dr. Ben Baumer was even more than they could’ve expected.
Baumer, 41, realized that the best way to store stats and better visualize the data that the club needed was to devise an internal server. He then went about setting up the team’s internal infrastructure, or their statistical portal. Baumer admits that he self-imposed that undertaking, but understood that it would ultimately make his job easier.
Hired by former Mets General Manager Jim Duquette, Baumer worked mainly under Omar Minaya and Sandy Alderson during his tenure with the club. Baumer counseled the GM in various aspects of player development and potential transactions. He worked with various Mets’ coaches on providing data and analysis for specific research, including in-depth work with former Mets pitching coach, Rick Peterson.
While working for the Mets, Baumer went to graduate school and earned his Ph.D. in Mathematics at the Graduate Center of the City University of New York. In the summer of 2012, Baumer decided to leave the Mets in pursuit of a career in academia. He started teaching at Smith College, located in Northampton, Massachusetts, and is now an assistant professor of Statistical & Data Sciences. The SDS major was created in 2016, and according to The Sophian – Smith’s independent newspaper – it’s the first women’s college to offer a data science major.
Along with teaching courses at Smith, Baumer has kept busy, co-authoring a book with economist Andrew Zimbalist released in 2014 called The Sabermetric Revolution: Assessing the Growth of Analytics in Baseball. In it, the pair debunk several misconceptions about Michael Lewis’ Moneyball, explore how the game has evolved with the use of sabermetrics, and highlight the effectiveness of the metrics.
Along with the book, Baumer and two colleagues – Shane T. Jensen and Gregory J. Matthews – won the Contemporary Baseball Analysis Award from the Society for American Baseball Research (SABR) in 2016 for their paper on openWAR, which is based on public data that offers greater transparency than other WAR metrics from sites like FanGraphs and Baseball Prospectus.
I had the pleasure of speaking with Dr. Baumer about his work in the Mets front office, his relationship with scouts and coaches, and what advice he’d give to young people looking to break into the data science industry.
MMO: You started working for the Mets in 2004 as a statistical analyst. Can you talk a bit about how you came to secure that position?
Baumer: Things were very different back then. I was very lucky to get that job; it was very much being a person in the right place at the right time. This was right after Moneyball had come out and I got word through a family friend that Jim Duquette was looking to hire someone to do statistical analysis. I sent in a resume and a couple of months later I had an interview. I had a good meeting with Jim and a couple of other people in the front office. I ended up doing a couple of projects for the Mets over the last few months of 2003 and then I got the call that they were going to hire me in December of 2003. I started January of 2004.
MMO: Were you always interested in data science and analyzing numbers, particularly in baseball? Were you a reader of Bill James’ work growing up?
Baumer: Yeah, I played baseball and I paid attention to the numbers from my collection of baseball cards and I read some of Bill James’ stuff. At that time data science wasn’t a term that anybody used and even sabermetrics was not a term that I was familiar with. I’d say it was more that my brain was always thinking like that but there wasn’t a field to latch onto, so it wasn’t a conscious kind of position in that sense.
I was interested in math and I ended up going to grad school in math before I even found out about this opportunity or even read Moneyball and realized that this was a career path.
MMO: Upon being hired, did the Mets give you a lot of freedom in terms of what data to analyze, or, did they lay out a specific course of action for your analysis?
Baumer: It was very free in the sense that there wasn’t really anyone who had that position before. I kind of got the sense that they knew they wanted to be doing more with what we would call baseball analytics but they didn’t really have a firm sense of what that was. When I got there, there were no computers – infrastructure – for baseball analytics. There were no databases and no server. I started building out some of that infrastructure as I got more familiar with the kinds of things that Jim wanted from me. Jim would come to me with questions like, here’s this list of free agent relief pitchers that are available, what do you think of these guys? Can you analyze them?
In order to do that work I needed data to analyze and so I set about building that infrastructure.
MMO: I spoke with your former colleague, Adam Fisher, who said that you built version one of the Mets’ statistical portal. He mentioned that you built it without any formal training and taught yourself to code, is that right?
Baumer: Yeah, that’s mostly right. I had taken two computer science classes in college and, before I went to UCSD (University of California San Diego) for a master’s and before working for the Mets, I had worked for a company, and that was my first experience building dynamically generated web pages. It was a different technology than the one that we used with the Mets and I only had a little bit of experience with it. But it was a taste and enough that I knew it could be done and what Adam said was basically true. I had never used PHP or MySQL before but I knew that they could work together and I just kind of sat down and got to work on it.
MMO: Can you describe what your day-to-day work life was with the Mets?
Baumer: That changed throughout the years but the way I like to think about this job was there were kind of two parts to it. The sexier part is advising the general manager on player evaluations: free agent signings, trades, minor league players, etc. There was that aspect of it. There were a lot of other people doing that like Adam (Fisher) and all the scouts were involved in that process.
The other part of the job – which was sort of more mundane – was the building up of the database and having the web front-end; there was a lot of work that went into that. That probably took up more than fifty percent of my time and that was I would say largely self-imposed in the sense that nobody told me to do that. I just decided to do that because I knew that it was the best way to do my job more efficiently and help other people do their jobs more efficiently.
MMO: How often would you interact with the players and coaches in disseminating the information you analyzed?
Baumer: I didn’t interact much with the players but I did interact with the coaches on a regular basis. The philosophy at the time was very much that it’s the manager’s and coaches’ jobs to translate the information from the front office to the players. They really didn’t want front office people talking to the players directly because there is such a perceived gap in experience and knowledge and context. A little of that has changed, you see more and more teams – including the Mets – having this sort of embedded data analyst in the clubhouse, looking at video and talking to players. It was not that way for me when I was there.
MMO: I know you worked closely with then Mets pitching coach Rick Peterson, who was a big proponent of analytics. Were there specific numbers he asked for on a consistent basis that you could share?
Baumer: There was. I probably shouldn’t go into too many specifics about that but Rick was someone who came in and knew exactly what he wanted to know. He told me what data points he wanted to see and I helped produce reports that would give him that information. Then we would work together on those reports. He was interested in things like what were players’ batting averages in different locations in the strike zone and stuff like that.
MMO: During your tenure with the Mets, how would you describe the club’s reliance on analytics and how it evolved over the years?
Baumer: I think by the end of 2004 – which was my first year there and Adam’s first year as a full-time person – he and I got more and more into the conversations that Jim (Duquette) was having. By the end of that year we started to feel like we were really integrated into the decision making process.
When Omar (Minaya) came on I think we were both worried that it was going to end because there was stuff out there in the public sphere about Omar sort of being dismissive of analytics. And that absolutely did not happen. I remember having conversations with Omar on his first day in the office. From day one he just made it clear to everyone involved that analytics were going to be a part of his decision making process and that Adam and me and other people that were providing that kind of perspective were important and were going to be listened to. I would say our influence only increased through the Omar years and then with Sandy (Alderson) it was more obvious that he was interested in that kind of stuff and so we continued to be involved.
MMO: That’s something I’ve heard from several front office people over the years, that Minaya wasn’t just devoted to the player development and scouting aspects, he wanted the analytic information as well to make his final judgment.
Baumer: Yeah, I think it’s really a difference in the backgrounds, right? Omar’s background is in scouting and player development and at the end of the day whatever decisions he makes are going to be heavily predicated on his own perception of the player. Omar wasn’t doing analytics but he recognized the value of analytics.
The other thing about Omar’s decision making process is that he would listen to anyone and he always wanted to gather everyone’s opinion and then make his decision. The analytical point of view was always part of that decision making process. He was very good, I thought, at soliciting that point of view and all that.
With Sandy it was very different because Sandy does not have a player development background or scouting background. For Sandy, his decision making process is much more inherently analytic in itself. So I think for him he sort of understood the analytics’ arguments in a much deeper way but that doesn’t necessarily mean that it had a heavier influence on his ultimate decision-making.
MMO: Early on in your tenure, were there specific stats or metrics that you would lean more heavily on than others?
Baumer: I don’t want to get too specific but we did not focus on a single statistic. Actually part of what I would say I was trying to achieve was a greater knowledge of sabermetrics in general for everyone in the front office. To me, it seemed much more important that people sort of understood the basic arguments that sabermetrics was making as opposed to having this proprietary statistic and wanting to show how great it was so we can all believe in it and worship it.
The insight about batting average on balls in play (BABIP) and pitchers not being able to control that as much as people thought, that was an idea that at that time we talked about a lot. And again, part of what Adam and I were trying to get through is just to help everybody understand that idea and let it permeate our thinking about players in a much more general way as opposed to a very specific way.
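For readers unfamiliar with the statistic, BABIP is commonly computed as (H − HR) / (AB − K − HR + SF): hits that stayed in the park, divided by the at-bats that ended with a ball in play. The short Python sketch below is illustrative only; the function name and the sample stat line are hypothetical and do not come from Dr. Baumer or the Mets.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    # Balls in play exclude strikeouts and home runs, and add back sacrifice flies.
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play for this stat line")
    return (hits - home_runs) / balls_in_play


if __name__ == "__main__":
    # Hypothetical season allowed by a pitcher (made-up numbers).
    value = babip(hits=180, home_runs=20, at_bats=750, strikeouts=170, sac_flies=5)
    print(f"BABIP allowed: {value:.3f}")
```

Because strikeouts and home runs are removed from both sides of the ratio, what remains depends heavily on defense and luck, which is the idea Baumer describes here.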
MMO: After eight years with the Mets you moved into academia at Smith College in Northampton, Massachusetts. Was that a move you had thought about making for quite some time?
Baumer: Not really. Teaching is something I’ve always thought about but it wasn’t the plan. Obviously I was in graduate school for a long time while I was working for the Mets, so it was a potential outcome for sure. But it wasn’t something that I decided that I needed to do.
By the time I got close to finishing my Ph.D. in 2012 my circumstances were different. I had been doing things with the Mets for a long time and I needed to explore what was out there, and this job at Smith College turned out to be a great fit.
MMO: Would you ever consider returning to a major league front office down the line?
Baumer: Yes. I’m coming up for tenure next year so I’ve got to take care of that first but it’s something that I would consider. From where I am personally in my family life I think it would be very hard for me to leave this area, but in the right situation I would certainly consider it. Certainly on a short-term basis if not a permanent position.
MMO: You and two of your colleagues developed the concept of openWAR. Can you talk a bit about what made you interested in further investigating the current WAR models out there from Baseball Prospectus, FanGraphs and Baseball Reference, and how openWAR differs in that regard?
Baumer: WAR had been gaining currency as an idea to go back to our previous conversation about a single statistic. When I sat down to think about it I just got increasingly frustrated at how unclear it all was. You had all these different organizations with their own implementation of it and yes, there are descriptions of those implementations online but nobody has released the source code or anything like that. It really wasn’t something you could verify that everything was working as it was supposed to. I think for most people if you’re just consuming the numbers that’s fine; you might not care about that. But for me it was like I’m not going to trust this until I’ve really understood how exactly these numbers are being computed.
That was sort of the impetus for that project of just being frustrated at the current state of affairs. I worked on that with Greg Matthews and Shane Jensen and what we were trying to do was not so much create a better version of WAR as much as creating one that was fully transparent and that everyone could agree on. You don’t have to like it but we could all agree on what it is and how the numbers are computed.
That was really the goal and as we went through that process there were a couple of things that we thought made more sense to do in a different way. The way that we handled replacement players is different, the way that we do some of the fielding stuff is different, but at the end of the day you should get more or less similar numbers. It wasn’t really about saying that the numbers that were out there were really far off and we need to make them correct. Again, it was just more about making it all transparent.
MMO: You and your colleagues released the paper on openWAR and users can go online and play around with the numbers via an R software program. Where do you envision openWAR going in the public sphere? Are you and your colleagues working on any other open source metrics? Did you hope that it would’ve taken off in the same way that Baseball Reference’s bWAR and FanGraphs’ fWAR are utilized?
Baumer: I sort of did hope for that (laughs) but that hasn’t really happened. And that’s okay. Greg and I have talked about revisiting that and coming out with a 2.0 version and trying to iron out some of the things that some people were critical of in the first version. We’re teachers and I don’t have the time or the platform to set it up the way that they have it on FanGraphs or Baseball Prospectus. The website that I built for the Mets is not unlike FanGraphs or Baseball Prospectus, but that’s not where I’m at right now in my career.
I do think though that we did achieve some success in getting the public to talk about the specific version of WAR that they’re referencing more often. It used to be that when you read ESPN or the New York Times and they talked about WAR, it was just WAR. I do think that more people are referencing bWAR or fWAR but they’re referencing the specific source and I think that’s a consolation prize.
MMO: You co-authored a book with Andrew Zimbalist called The Sabermetric Revolution back in 2014. For those who haven’t had a chance to read it yet, can you talk a little about what readers should expect to learn from this book?
Baumer: The original title of the book was Moneyball Revisited and the concept was ten years after Moneyball and we wanted to do a retrospective on not just Moneyball itself but how the game has changed and how these ideas have changed. We weren’t able to use that title due to the publisher having some legal concerns but that’s what the book is about, it’s sort of this retrospective on how the sabermetric industry has changed both inside and outside Major League Baseball.
One of the things that I liked that we were able to do was actually try to measure whether we think analytics has worked or not. That is, did the teams that I perceived as being more analytically driven tend to win more? I won’t say that we answered that question but we addressed that question and the short answer was yes, it did seem that those teams were more successful.
MMO: During your time in the front office with the Mets, was there a certain aspect of the game – defense, pitching, hitting, running – that was harder for you to assess? Clearly public defensive metrics have been improving with the advent of Statcast, would you assert that defensive metrics were the most challenging for you?
Baumer: Certainly defense would be one of those things. Catcher framing is another one that in 2004 we had no way to measure. And it wasn’t that we didn’t think of the idea, scouts and player development people were talking about that idea. We just didn’t have the data to measure it, but that changed.
I think the biggest thing there was in 2004 we had play-by-play data that we purchased from Stats Inc. that had locations of pitches in the strike zone but they were eyeballed by people watching the game on TV. We also had locations of where the balls landed in the field but they were also eyeballed by people watching on TV. To go from that to what they have now is just a big change in terms of the quality in the data.
MMO: One of the myths you debunk in the book you co-wrote with Zimbalist was the idea that the scouts and data analysts clashed. How would you describe your relationship and dealings with scouts during your time with the Mets?
Baumer: Overwhelmingly positive. In the very early days there were a couple of people and a couple of moments where scouts weren’t that interested in what I had to say. But again, I think Omar made it very clear when he came on that that wasn’t going to be a winning philosophy.
I had a really good relationship with scouts. Al Goldis and Bill Lindsey when they were there, they would talk to us for hours, and I learned a lot of baseball from them. I went on the road with Brian Lamb a bunch of times and learned a ton of things from him on scouting and other things. Overwhelmingly positive was my relationships with scouts.
MMO: For those interested in pursuing a career in data science – specifically for baseball – what would you recommend they do in order to put themselves in the best position to succeed in that field?
Baumer: Publish. To that end I have a book out with Jim Albert and Max Marchi, it’s the second edition of Analyzing Baseball Data with R. I wasn’t a part of the first edition but was with the second and the book is a sort of tutorial on how to use R to access baseball data and to analyze it. It covers many of the key concepts of sabermetrics.
To go back to where we started this conversation, things are very different now than they were back in 2004. The good news is there are a lot more teams that have a lot more jobs, the bad news is there’s a lot more people paying attention to that and the competition is probably even tighter to get those jobs. The expectations are higher in terms of what type of degrees and skill set they expect you to have.
I think for younger people who are trying to get into this, if you want to be an analyst you have to be able to code and that can mean R, Python, SQL, or some combination of all three. But you have to be able to do that in order to do the work and I think the best way to get that work noticed is to try to publish it somewhere online. Any one of these sites whether you have a class project or something that you’re working on independently, if you can get that out there to get people’s eyes on it, then it gives you something to talk about in an interview that shows people what you can do.
MMO: I can’t thank you enough for taking the time to speak with me today, Dr. Baumer. Thank you for sharing your insights.
Baumer: I appreciate that, thank you.
Follow Dr. Baumer on Twitter, @BaumerBen