OBR – MLB at the Break

Ranking the putatively best hitters in the National and American Leagues is likely a tradition among those with a few too many idle minutes on the clock, which means this blog is going to the rodeo. I previously built a hitter ranking from the top 40 players by batting average: their individual stats were aggregated into an average and standard deviation, which then yielded a z-score for each player. A more refined approach uses each team’s total statistics instead, producing an average and standard deviation that are more comprehensive, and less skewed toward high performers, than those drawn from the top 40 players alone.

In choosing which stats to compare, I tried to eliminate those with significant correlations with other stats. For instance, OPS and ISOP were eliminated as standalone stats because of their high correlation with the more widely quoted slugging percentage. Eventually the following set of hitting stats, derived from ESPN data, was used:

  • Batting Average
  • On-Base Percentage [(H + BB + HBP)/(AB + BB + HBP + SF)]
  • Slugging Percentage
  • Runs Created (calculated with the more complex formula described by Wikipedia rather than ESPN’s, though the two sets of numbers are fairly close. Neither the situational-hitting terms nor the team adjustment described by Wikipedia was used, for lack of volition and data. Note that runs created correlates highly with BA, OBP and SLG, at 0.80 or above, but I included it as a proxy, and hopefully a more descriptive one, for the more common RBIs. For the average and standard deviation based on team statistics, I simply used a team’s total runs divided by an assumed 11 players)
  • Secondary Average (a way to look at a player’s extra bases gained, independent of batting average: (TB - H + BB + SB - CS) / AB)
  • HR/AB (used rather than the more oft-quoted AB/HR because a player with no home runs simply scores zero instead of triggering a divide-by-zero error)
  • BB/K
  • K/AB
  • ISOP/(K/AB) (this combination examines power statistics versus strikeouts–the tradeoff weakness for some power hitters)
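
The rate stats in the list above can be derived from a single raw counting line. The following is only a sketch: the function name, argument names, and the sample numbers are invented for illustration, ISOP is taken as SLG minus AVG (its usual definition), and Runs Created is left out because the exact formula used is not reproduced in this post.

```python
def rate_stats(ab, h, doubles, triples, hr, bb, hbp, sf, so, sb, cs):
    """Derive the rate stats listed above from one raw counting line."""
    def ratio(num, den):
        # Return 0.0 rather than raising when the denominator is zero.
        return num / den if den else 0.0

    singles = h - doubles - triples - hr
    tb = singles + 2 * doubles + 3 * triples + 4 * hr  # total bases
    avg = ratio(h, ab)
    slg = ratio(tb, ab)
    isop = slg - avg  # isolated power (assumed definition)
    k_per_ab = ratio(so, ab)
    return {
        "AVG": avg,
        "OBP": ratio(h + bb + hbp, ab + bb + hbp + sf),
        "SLG": slg,
        "SecA": ratio(tb - h + bb + sb - cs, ab),
        "HR/AB": ratio(hr, ab),
        "BB/K": ratio(bb, so),
        "K/AB": k_per_ab,
        "ISOP/(K/AB)": ratio(isop, k_per_ab),
    }

# Invented half-season line, for illustration only.
line = rate_stats(ab=300, h=90, doubles=20, triples=2, hr=15,
                  bb=40, hbp=3, sf=2, so=60, sb=5, cs=2)
for name, value in line.items():
    print(f"{name:12s} {value:.3f}")
```

Returning zero on an empty denominator mirrors the reasoning given for HR/AB, but it treats a hitter who never strikes out as a zero in BB/K and ISOP/(K/AB), which understates him; a real ranking would need a better convention for that case.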

I am trying, and poorly at that, to reinvent the wheel already built by an even more comprehensive statistic: Wins Above Replacement. Regardless, I’ve always been intrigued by the z-score process, namely the number of standard deviations by which a player’s statistic sits above the average player’s. This of course assumes that hitting stats conform to a normal distribution.
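
As a rough sketch of the z-score process, the snippet below scores one player against a baseline built from team-level values. Several things here are assumptions rather than the method actually used for the rankings: the baseline numbers are made up, the population standard deviation is used, the per-stat z-scores are simply summed into one composite, and K/AB has its sign flipped because fewer strikeouts are better.

```python
from statistics import mean, pstdev

def z_score(value, baseline):
    """Standard deviations by which value sits above the baseline mean."""
    return (value - mean(baseline)) / pstdev(baseline)

def composite(player, baselines, lower_is_better=("K/AB",)):
    """Sum of per-stat z-scores, flipping the sign where lower is better."""
    total = 0.0
    for stat, value in player.items():
        z = z_score(value, baselines[stat])
        total += -z if stat in lower_is_better else z
    return total

# Made-up team-level values standing in for the league baseline.
baselines = {
    "AVG": [0.240, 0.250, 0.260, 0.270],
    "K/AB": [0.18, 0.20, 0.22, 0.24],
}
player = {"AVG": 0.300, "K/AB": 0.15}  # hypothetical hitter

score = composite(player, baselines)
print(f"composite z-score: {score:.2f}")
```

Because a baseline of team averages is much tighter than a spread of individual players, a strong hitter can land several standard deviations above the mean on a single stat, so composite scores on this basis run large.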

For the National League, the highly paid Joey Votto appeared to be earning his salary by the All-Star break:

For the American League, designated hitter David Ortiz took the top spot by a wide margin over Edwin Encarnacion:

Combining the statistics of both leagues (by using all MLB teams) is somewhat misleading, not only because of the designated hitter rule but also because each league faces different pitching. On that basis, however, Ortiz beats out Votto by less than three points:


For all the World Series wins, Joe DiMaggio is best remembered for his improbable 56-game hitting streak, if not his tumultuous marriage to icon Marilyn Monroe, which brought a very private man into a highly public relationship. However, a lesser-known DiMaggio stat is equally, if not more, impressive. DiMaggio, a member of the 300 home run club, drove 361 balls out of the park over his 13-year career (in which he missed the prime years of ages 28-30 due to the Second World War) but struck out just 369 times, for a HR/K ratio of 0.978. That is, DiMaggio, a formidable power hitter, had almost as many home runs as strikeouts, typically the bane of power hitters. The top five home run hitters (Barry Bonds, Hank Aaron, Babe Ruth, Willie Mays and Alex Rodriguez, who all have solid career OBPs) don’t go any higher than 0.546 (Aaron). For those with over 300 home runs, the top ten in HR/K:

  1. Joe DiMaggio (NYY): 0.978 (361 HR)
  2. Yogi Berra (NYY): 0.865 (358)
  3. Ted Williams (BOS): 0.735 (521)
  4. Johnny Mize (STL/NYG/NYY): 0.685 (359)
  5. Stan Musial (STL): 0.682 (475)
  6. Lou Gehrig (NYY): 0.624 (493)
  7. Albert Pujols (STL/LAA): 0.618 (459)
  8. Chuck Klein (PHI/CHC): 0.576 (300)
  9. Mel Ott (NYG): 0.570 (511)
  10. Hank Aaron (ATL): 0.546 (755)
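
The ratio itself is a one-line calculation. As a small sketch (the function name is made up), here it is applied to the DiMaggio totals quoted above, the only entry for which both home runs and strikeouts are given in this post:

```python
def hr_per_k(home_runs, strikeouts):
    """Career home runs per strikeout, rounded to three decimals."""
    return round(home_runs / strikeouts, 3)

# DiMaggio's career totals as quoted in the text: 361 HR, 369 K.
dimaggio = hr_per_k(361, 369)
print(dimaggio)  # 0.978
```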

For baseball, the home run and the strikeout represent the extremes of volatility in hitting achievement–scoring a run single-handedly and causing an out without even putting the ball in play (or even really connecting with the ball, at least on the final strike). This excludes hitting into a double (or even triple) play, but that is subject to situational factors not to mention the opposing team’s defensive prowess.

In that spirit of capturing those who were able to balance the extremes favorably, I replaced the traditional triple crown of AVG/HR/RBI with the somewhat more in-depth trio of OPS/(HR/K)/Runs Created. For this year, in the NL, only Andrew McCutchen (2nd OPS, 2nd RC, 5th HR/K) and Ryan Braun (5th OPS, 4th RC, 3rd HR/K) appear in serious contention at the break. In the AL, perhaps unsurprisingly, David Ortiz is in clear pole position at 2nd (by a thin margin) in OPS, 1st in RC, and 1st in HR/K. Josh Hamilton, Robinson Cano and Edwin Encarnacion are honorable mentions, finishing in the top five in two categories and the top eight in the third.
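
One way to make "serious contention" concrete is to require a top-five rank in all three categories and then compare rank sums. This is only an illustrative sketch: the cutoff of five and the rank-sum tiebreak are my assumptions, and the ranks are the ones quoted in the paragraph above.

```python
# League ranks at the break, as quoted above.
ranks = {
    "Andrew McCutchen (NL)": {"OPS": 2, "RC": 2, "HR/K": 5},
    "Ryan Braun (NL)": {"OPS": 5, "RC": 4, "HR/K": 3},
    "David Ortiz (AL)": {"OPS": 2, "RC": 1, "HR/K": 1},
}

def in_contention(player_ranks, cutoff=5):
    """True if the player ranks within the cutoff in every category."""
    return all(rank <= cutoff for rank in player_ranks.values())

# Lower rank sum means closer to sweeping all three categories.
rank_sums = {name: sum(r.values()) for name, r in ranks.items()}

for name in sorted(rank_sums, key=rank_sums.get):
    print(f"{name:24s} rank sum {rank_sums[name]:2d} "
          f"contender: {in_contention(ranks[name])}")
```

A true triple crown would need a rank sum of three, so Ortiz at four is one narrow OPS gap away from it.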