My new favorite stat
If you’ve read this blog, you know that I support the sabermetric approach to analysis and am not averse to citing numbers like WAR (wins above replacement) and WPA (win probability added). Nevertheless, many of the numbers available at sites like fangraphs.com and baseball-reference.com remain a sort of alphabet soup to me. So it’s fun to discover that one of the numbers I hadn’t paid much attention to can really tell me a lot.
The number I’m excited about right now is called “RE24.” For evaluating relief pitchers, it works a lot better than ERA (we know that ERA can often be pretty meaningless for relievers). And the statistic also works for batters, giving us information about situational hitting and clutch performance that is omitted from popular statistics like WAR and wOBA.
What is RE24? It’s similar to WPA, except that it measures runs instead of wins. Let me explain.
The calculation of WPA begins by assuming that each team, on average, begins a game with a probability of winning of 50%. (Yes, in any individual game the probabilities will be more or less than 50%, but if we take the average across all games and teams it has to be 50%.) Each event during the game then raises or lowers the win probability, and WPA simply assigns the change in win probability of these events to the batter (offense) or pitcher (defense) and then adds them up. At the end of the game, the total WPA for the players on the winning team always equals +.50 (their probability of winning has gone from 0.50 to 1.00) and the total WPA for the players on the losing team always equals –.50.
RE24 applies the same type of calculation to the changes in expected runs. Major League Baseball teams are averaging 4.3 runs per game, or about .48 runs per inning. So each inning starts with +.48 runs expected and ends with zero runs expected (after the third out is recorded).
As an example, let’s look at the third inning of Thursday’s game against Philadelphia. Rick Ankiel led off with a double. With a runner on second and no outs, a team can expect to score about 1.10 runs; the increase in expected runs (.62 = 1.10 – .48) was credited to Rick Ankiel as his RE24. Next, Jesús Flores struck out; there were .44 fewer runs expected with a runner on second and one out than with nobody out, so his RE24 was –.44. Brad Peacock then grounded out, advancing Ankiel to third. There were .31 fewer expected runs with a runner on third and two outs than with a runner on second and one out, so Peacock was charged with –.31 RE24. Ian Desmond doubled, scoring Ankiel, and Desmond was credited with .96 RE24 (1.00 for the run scored minus .04 because he ended up on second, whereas Ankiel had been on third). Roger Bernadina singled, scoring Desmond (+.91 RE24). Bernadina then stole second, giving him an additional +.09 RE24. At this point, with a runner on second and two outs, there were .32 expected runs. Finally, Ryan Zimmerman flied out to end the inning and was charged with –.32 RE24.
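The bookkeeping for that inning can be sketched in a few lines of Python. The run-expectancy values are the ones quoted in, or implied by, the walk-through above. The values for a runner on third with two outs and a runner on first with two outs are back-calculated from the RE24 figures, so treat them as approximations. The state labels and the `re24` helper are illustrative names, not anything published by FanGraphs.

```python
# Expected runs for the rest of the inning, keyed by (runner position, outs).
# Values come from the inning described above; those marked "implied" are
# back-calculated from the quoted RE24 figures and are only approximate.
RUN_EXPECTANCY = {
    ("empty", 0): 0.48,
    ("2nd", 0): 1.10,
    ("2nd", 1): 0.66,  # implied: 1.10 - 0.44
    ("3rd", 2): 0.35,  # implied: 0.66 - 0.31
    ("2nd", 2): 0.32,
    ("1st", 2): 0.23,  # implied by Bernadina's +.91 and +.09
    ("end", 3): 0.00,
}


def re24(before, after, runs_scored=0):
    """Change in expected runs, plus any runs that scored on the play."""
    return RUN_EXPECTANCY[after] - RUN_EXPECTANCY[before] + runs_scored


# (player, state before, state after, runs scored on the play)
INNING = [
    ("Ankiel", ("empty", 0), ("2nd", 0), 0),
    ("Flores", ("2nd", 0), ("2nd", 1), 0),
    ("Peacock", ("2nd", 1), ("3rd", 2), 0),
    ("Desmond", ("3rd", 2), ("2nd", 2), 1),
    ("Bernadina", ("2nd", 2), ("1st", 2), 1),
    ("Bernadina (SB)", ("1st", 2), ("2nd", 2), 0),
    ("Zimmerman", ("2nd", 2), ("end", 3), 0),
]

credits = [(name, re24(b, a, r)) for name, b, a, r in INNING]
total = sum(value for _, value in credits)

for name, value in credits:
    print(f"{name:15s} {value:+.2f}")
print(f"{'Inning total':15s} {total:+.2f}")
```

Because each play’s ending state is the next play’s starting state, the individual credits always sum to the runs scored (2) minus the expectancy at the start of the inning (.48), or 1.52 here. With these rounded inputs, Desmond’s double comes out at .97 rather than the .96 quoted above; the difference is just rounding.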
RE24 versus WPA
RE24 sounds quite a bit like WPA, so what does it tell us that WPA doesn’t? The key difference is that RE24 treats all innings and runs the same, whereas WPA gives more weight to events occurring in “high leverage” situations, such as the late innings of close games. That helps WPA as a “story” statistic, since it can immediately tell you whether a hit or an out came with the game on the line. But it can also make WPA a poor gauge of an event’s value, because we can’t really tell how important a run will turn out to be at the time that it is being scored (or prevented).
For example, if a team wins 4–3, scoring three runs in the second inning and the tie-breaking winning run in the ninth, which run was most valuable? I’d argue that they were all equally valuable because ultimately the team needed all four runs to win. At the time it was scored, the last run felt especially valuable because time was running out, but the reality is that all four runs were equally necessary. When Zimmerman hit his walk-off grand slam against the Phillies, capping a 6-run ninth inning, the Nats’ relief pitchers (Tom Gorzelanny, Sean Burnett, and Todd Coffey) were the unsung heroes; if they hadn’t kept the game within reach, the offense’s ninth inning heroics might have been for naught. But WPA doesn’t recognize the ultimate importance of their low leverage innings, whereas RE24 gives them full credit for holding the Phillies scoreless. (The movie Moneyball highlights another great game where runs that didn’t seem important at the time turned out to be critical.)
So my take is that while both RE24 and WPA have their uses, RE24 generally does better at measuring the value of a player’s overall situational performance, whereas WPA reveals something about how a player performs in situations where the game is known to be on the line.
RE24 for pitchers
For pitchers, ERA already captures quite a bit of situational performance. For example, if Ross Detwiler loads the bases with no outs and manages to work his way out of the jam without giving up a run, his ERA credits him with the scoreless inning.
But ERA sometimes breaks down. The biggest problem is with relief pitchers. One of the most important responsibilities of a relief pitcher is to prevent inherited runners from scoring, yet a relief pitcher’s ERA is unaffected by whether he prevents or allows inherited runners to score. Ironically, when a pitcher leaves the game with runners aboard, his performance is judged by events that occur after he’s in the dugout. Because RE24 allows partial credit for preventing or allowing runs, it can appropriately split the responsibility for inherited runners between the pitcher who left them on and the pitcher who allows them to score (or prevents them from doing so).
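To see how that split works, here is a rough sketch of a hypothetical inning. It reuses the play sequence and run-expectancy values from the third-inning example, but viewed from the pitching side and with an invented pitching change after the strikeout. The pitcher labels, the `pitcher_re24` helper, and the scenario itself are assumptions for illustration only.

```python
# Expected runs by (runner position, outs), taken from the third-inning
# example earlier in the post (the 0.35 value is implied, not quoted).
RUN_EXPECTANCY = {
    ("empty", 0): 0.48,
    ("2nd", 0): 1.10,
    ("2nd", 1): 0.66,
    ("3rd", 2): 0.35,
    ("2nd", 2): 0.32,
    ("end", 3): 0.00,
}


def pitcher_re24(before, after, runs_allowed=0):
    """Runs prevented relative to expectation (positive is good for a pitcher)."""
    return RUN_EXPECTANCY[before] - RUN_EXPECTANCY[after] - runs_allowed


# Hypothetical inning: (pitcher, state before, state after, runs allowed)
PLAYS = [
    ("starter", ("empty", 0), ("2nd", 0), 0),  # leadoff double
    ("starter", ("2nd", 0), ("2nd", 1), 0),    # strikeout, then a pitching change
    ("reliever", ("2nd", 1), ("3rd", 2), 0),   # groundout moves the runner to third
    ("reliever", ("3rd", 2), ("2nd", 2), 1),   # double scores the inherited runner
    ("reliever", ("2nd", 2), ("end", 3), 0),   # flyout ends the inning
]

totals = {}
for pitcher, before, after, runs in PLAYS:
    totals[pitcher] = totals.get(pitcher, 0.0) + pitcher_re24(before, after, runs)

for pitcher, value in totals.items():
    print(f"{pitcher:9s} {value:+.2f}")
```

In this made-up inning, ERA charges the run entirely to the starter and leaves the reliever’s ERA untouched. RE24 instead charges the starter about –.18 for leaving a runner on second with one out and the reliever about –.34 for letting him score.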
The other difference is that RE24 tracks all runs, not just earned runs. Sabermetricians have long complained that ERA, by removing unearned runs, doesn’t really adjust for fielding and can give a misleading measure of pitcher performance, since unearned run rates vary systematically among pitchers. (Groundball pitchers tend to allow more unearned runs, whereas strikeout or flyball pitchers tend to allow fewer. So ERA tends to overstate the performance of groundball pitchers like Chien-Ming Wang and John Lannan relative to simple R/9.)
RE24 is measured as positive or negative deviations from MLB average, so to compare it directly to ERA, we need to add back in the MLB-average runs. For Stephen Strasburg, for example, multiply his innings pitched (18) by the league-average runs per inning (4.28/9) to get the runs an average pitcher would have allowed (8.56), then subtract his RE24 (3.52) to give a number (5.04) that’s directly comparable to his runs allowed (5). But this is on the scale of runs allowed rather than earned runs, so to convert it to an ERA scale we multiply by the MLB ratio of ERA to R/9 (3.94/4.28). Finally, we divide by Strasburg’s IP and multiply by 9, giving us an “RE24 average” of 2.32, which can be compared directly to his ERA of 2.00.
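For anyone who wants to repeat this conversion for other pitchers, the arithmetic can be sketched as a small function. The function name and parameter names are my own, and the league figures (4.28 R/9 and 3.94 ERA) are just the values quoted above, so they would need updating for any other point in the season.

```python
def re24_average(innings_pitched, re24, league_r9=4.28, league_era=3.94):
    """Convert a pitcher's RE24 to an ERA-scaled 'RE24 average'.

    innings_pitched must be a true number of innings (18 1/3 innings is
    18.333..., not the box-score shorthand 18.1).
    """
    # Runs an average pitcher would be expected to allow in these innings.
    league_runs = innings_pitched * league_r9 / 9
    # RE24 is runs saved versus average, so subtract it to get a runs-allowed equivalent.
    runs_equivalent = league_runs - re24
    # Rescale from all runs to earned runs using the league ERA-to-R/9 ratio.
    earned_equivalent = runs_equivalent * league_era / league_r9
    # Express per nine innings, like ERA.
    return earned_equivalent / innings_pitched * 9


# Strasburg's figures from the paragraph above.
strasburg = re24_average(18, 3.52)
print(f"{strasburg:.2f}")
```

Keeping the league rates as parameters makes the dependence on the run environment explicit; the same RE24 total translates to a different average in a higher- or lower-scoring league.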
Comparing the RE24 averages of the Nats pitchers to their ERAs, we see that some pitchers do quite a bit better (Tyler Clippard’s RE24 average is 1.19, compared to his ERA of 1.88), while others do much worse (Doug Slaten’s RE24 average is 9.56, compared to his ERA of 4.02). Burnett and Henry Rodríguez also fare worse by RE24, while Coffey fares better. In general, these differences reflect the pitcher’s performance with inherited runners. Among starters, Wang does especially poorly (4.68 versus his ERA of 4.04), reflecting a large number of unearned runs allowed.
RE24 for batters
For batters, measures like wOBA and WAR are based on average values for various types of batting events (singles, doubles, triples, home runs, walks, etc.) and generally aren’t affected by situational hitting, such as batting performance with runners in scoring position or grounding into double plays. For a measure of player value, you might want to include situational performance, but there are so many situations to consider that doing so systematically may seem daunting.
Fortunately, RE24 does these calculations automatically for all possible base-out situations. To see whether a batter performs better or worse than expected based on the situations he faces, we can simply compare his RE24 to his WRAA. (WRAA, or “weighted runs above average,” converts a batter’s wOBA to runs above average, so it’s on the same scale as RE24.)
For example, Wilson Ramos has an RE24 (–7.01 through Saturday’s game) that’s 12½ runs below his WRAA, which reflects his overall poor splits with runners on base or in scoring position. Jayson Werth also has a poor RE24 relative to his WRAA, while, on the positive side of the ledger, Danny Espinosa, Laynce Nix, Iván Rodríguez, and Ankiel all have RE24 stats that are stronger than their WRAA.
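The comparison itself is a simple subtraction, sketched below. The helper name is illustrative. Ramos’s WRAA isn’t quoted in this post, so the value used here is only the one implied by the rounded 12½-run gap, not a figure taken from FanGraphs.

```python
def situational_runs(re24, wraa):
    """RE24 minus WRAA: runs gained or lost relative to context-neutral hitting."""
    return re24 - wraa


ramos_re24 = -7.01
ramos_gap = -12.5  # "12½ runs below his WRAA" (rounded in the post)

# WRAA implied by the two figures above; an approximation, not a quoted stat.
ramos_wraa_implied = ramos_re24 - ramos_gap

print(f"implied WRAA: {ramos_wraa_implied:+.2f}")
print(f"RE24 - WRAA:  {situational_runs(ramos_re24, ramos_wraa_implied):+.2f}")
```

A negative result means the batter produced fewer runs than his context-neutral line would suggest, and a positive result means more.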
The main caveat with RE24 is that it always attributes everything that happens with an in-play event to the batter and the pitcher. Of course, fielders and baserunners often matter a lot to what happens, so for individual plays RE24 can be a fairly crude measure. As a consequence, I generally wouldn’t take the full difference between RE24 and WRAA, for example, as a measure of a batter’s situational performance, since some of the difference may also be attributable to baserunners or fielders.
Even with this caution, I’m finding that RE24 is a very powerful tool for looking at questions such as post-season awards, where I’d like to take account of situational performance in a systematic way.