5 Hitters Bound to Regress Based on Expected BABIP

Several high-profile players are posting unsustainably high batted-ball stats this year. Who's knocking on regression's door?

"Protect yo neck 'cuz they comin' for respect, hit yo onion, Then chop yo deck; your head tumblin' like gymnastics 'Cuz ignorance is bliss."

If we analyze the above lyrics from Kendrick Lamar's "Ignorance is Bliss," I think we can make several safe assumptions about the context under which Mr. Lamar recorded the song. First of all, it's pretty obvious that when Mr. Lamar says, "they comin'," the "they" he refers to is the devious, trickster demons of regression. "Yo onion" is those dope stats you've accumulated so far this season. "Chop yo deck" means these pesky regression party-poopers are going to make those dope stats closer to their true level of stankiness. Who knew Kendrick was a Sabermetrician?

The reason this stat-driven rap is titled "Ignorance is Bliss" is that sometimes it's better to turn your head away from the signs that regression is coming and enjoy the success. Now is not one of those times. While letting y'all know who's likely to regress may not save their precious onions, maybe it can lessen the sadness associated with the realization of this truth.

In order to predict whose beautiful onion stats are in trouble, I've gotten in the habit of comparing a batter's batting average on balls in play (BABIP) to his line-drive percentage. If you read our piece from earlier this week about which guys were bound to improve based on their expected BABIP, you'll already have seen the methodology behind this. If not, I'll run back through it here quickly.

This winter, I looked at the correlation between a hitter's BABIP and his line-drive percentage for each of the batters from 2013 who had at least 300 plate appearances. The correlation coefficient of the data was .455, meaning that a statistically significant relationship existed between the two statistics.
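For anyone who wants to try the correlation step at home, here is a minimal sketch of a Pearson correlation coefficient in plain Python. The function name `pearson_r` and the six data points are made up for illustration; they are not the 2013 data set, so don't expect them to spit out .455.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-movement of the two stats around their means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical (line-drive %, BABIP) pairs, NOT the real 2013 sample
line_drive_pct = [11.0, 13.5, 15.0, 16.5, 18.0, 20.0]
babip = [0.285, 0.310, 0.290, 0.320, 0.300, 0.335]

r = pearson_r(line_drive_pct, babip)
print(f"correlation: {r:.3f}")
```

A value near 1 means the two stats rise together almost in lockstep, a value near 0 means they barely move together, and anything in between (like the .455 above) is a real but noisy relationship.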

Once I saw that it was significant, I charted the two statistics in order to generate a line-of-best-fit for the data. In essence, what this gave me was an expected BABIP based solely on a hitter's line-drive percentage.
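The actual equation for that line isn't printed in this piece, but the idea can be sketched with ordinary least squares. As a rough stand-in for the full data set, the snippet below fits a line to the five (line-drive percentage, expected BABIP) pairs listed for the hitters later in this article. The names `fit_line` and `expected_babip` are hypothetical, and the slope and intercept it produces are only an approximation rebuilt from rounded numbers, not the original equation.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# (line-drive %, expected BABIP) as listed for the five hitters in this article
published = {
    "A.J. Pollock": (11.5, 0.247),
    "Yasiel Puig": (15.5, 0.270),
    "Justin Upton": (18.2, 0.284),
    "Alex Gordon": (13.1, 0.256),
    "Adam Jones": (15.3, 0.268),
}

ld_pcts = [ld for ld, _ in published.values()]
x_babips = [xb for _, xb in published.values()]
slope, intercept = fit_line(ld_pcts, x_babips)

def expected_babip(ld_pct):
    """Approximate expected BABIP from line-drive % alone (an assumption-laden sketch)."""
    return slope * ld_pct + intercept

for name, (ld, listed) in published.items():
    print(f"{name}: listed {listed:.3f}, refit {expected_babip(ld):.3f}")
```

Since the five listed values all came off the same line, the refit should land within a point or two of each listed number, with the small gaps coming from rounding.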

Now, as I mentioned in the other article, this is a pretty rudimentary method because it omits other factors such as ground-ball, fly-ball and infield-fly-ball percentages, as well as speed. If you want an expected BABIP calculator that is far more sophisticated and far more awesome, check out the one that Derek Carty created for The Hardball Times back in 2009.

Below is a list of five guys whose BABIP is too high for their line-drive percentage. With each hitter, you'll see what I'll be calling their "expected BABIP." This comes from the line-of-best-fit equation and uses only their line-drive percentage. Ready? Sweet. Let's see who needs to protect their onions (stop giggling, children) from the regression demons.

1. A.J. Pollock, Arizona Diamondbacks

BABIP: .355 | Line-Drive Percentage: 11.5% | Expected BABIP: .247

To put this simply, Pollock darn near broke my spreadsheets. For each hitter, I divided his BABIP by his line-drive percentage (expressed as a decimal) to create a ratio of the two stats. The average ratio was 1.55. The second-highest ratio was 2.60. Pollock was at 3.09.
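That ratio is easy to reproduce. Here is a quick sketch using the BABIP and line-drive numbers listed for the five hitters in this article; the helper name `babip_to_ld_ratio` is made up for illustration, and line-drive percentage is written as a decimal (11.5% becomes 0.115).

```python
# (BABIP, line-drive rate as a decimal) from the stat lines in this article
hitters = {
    "A.J. Pollock": (0.355, 0.115),
    "Yasiel Puig": (0.397, 0.155),
    "Justin Upton": (0.393, 0.182),
    "Alex Gordon": (0.309, 0.131),
    "Adam Jones": (0.339, 0.153),
}

def babip_to_ld_ratio(babip, ld_rate):
    """Ratio of BABIP to line-drive rate; higher means more regression risk."""
    return babip / ld_rate

ratios = {name: babip_to_ld_ratio(*stats) for name, stats in hitters.items()}

# Print from most to least inflated
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {ratio:.2f}")
```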

If Pollock's expected BABIP were to be his true BABIP, his batting average would be .221; instead, it was at .301 entering play Thursday. How is an 80-point difference even possible? Well, it's largely because most of Pollock's batted ball stats are ridiculously unsustainable.
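How the batting-average conversion was done isn't spelled out here, but one plausible way is to run the standard BABIP formula, (H - HR) / (AB - K - HR + SF), in reverse. The sketch below does that with a hypothetical stat line; the counting stats are invented for illustration and are not Pollock's actual numbers, and the function name is an assumption.

```python
def batting_average_at_babip(babip, at_bats, strikeouts, home_runs, sac_flies):
    """Batting average a hitter would post if his BABIP were `babip`.

    Inverts BABIP = (H - HR) / (AB - K - HR + SF) to solve for hits.
    """
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    # Home runs are hits but not balls in play, so add them back afterward
    hits = babip * balls_in_play + home_runs
    return hits / at_bats

# Hypothetical counting stats, NOT any real player's line
line = dict(at_bats=200, strikeouts=40, home_runs=5, sac_flies=2)

actual_avg = batting_average_at_babip(0.355, **line)    # at the current BABIP
expected_avg = batting_average_at_babip(0.247, **line)  # at the expected BABIP

print(f"at .355 BABIP: {actual_avg:.3f}")
print(f"at .247 BABIP: {expected_avg:.3f}")
```

The takeaway is that strikeouts and home runs stay fixed in this setup, so every point of BABIP lost gets passed along to the batting average in proportion to how many balls the hitter puts in play.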

According to ESPN's Tristan Cockcroft in this 2013 post about BABIP, the average big-leaguer hits .226 on ground balls, .132 on fly balls and .714 on line drives. Pollock this season is hitting .328, .189 and .759 in those three categories, respectively. He's more than 100 points above league average on ground balls, 57 on fly balls and 45 on line drives. Reality may be harsh the next few weeks.
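The same comparison works for any hitter on this list. A small sketch, with averages stored as whole points (thousandths) to keep the subtraction clean; the league figures are the ones quoted above, and the function name `points_above_average` is hypothetical.

```python
# League-average batting average by batted-ball type, in points (thousandths),
# as quoted above from the 2013 ESPN post
LEAGUE_AVG = {"ground balls": 226, "fly balls": 132, "line drives": 714}

# Pollock's splits this season, as quoted above, also in points
pollock = {"ground balls": 328, "fly balls": 189, "line drives": 759}

def points_above_average(player, league=LEAGUE_AVG):
    """Gap between a hitter's split and the league average, in points."""
    return {kind: player[kind] - league[kind] for kind in league}

gaps = points_above_average(pollock)
for kind, gap in gaps.items():
    print(f"{kind}: {gap:+d} points vs. league average")
```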

2. Yasiel Puig, Los Angeles Dodgers

BABIP: .397 | Line-Drive Percentage: 15.5% | Expected BABIP: .270

Putting Pollock on this list was a no-brainer for me because his batted-ball numbers were fairly regular last year. Puig is a completely different story. He made our regression list from the off-season, too, because he did this exact same thing last year. I was tempted to leave him off this list completely because he clearly has a vendetta against the status quo, but nah - the world needs to know.

Previously, I mentioned Pollock's ratio of BABIP to line-drive percentage was 3.09, the highest in the league. The higher the ratio, the more likely your stats are to deflate. Puig led the entire league in this ratio last year, and he's third this year with a 2.56 mark. The man makes no sense. Absolutely none.

On the season, Puig is batting .386 on ground balls, .316 on fly balls and .703 on line drives. While his line-drive average is pretty normal, the other two are the deformed, three-headed liger of batted-ball stats. A part of it is probably that not all ground balls are created equal, but still, the splits between Puig's numbers and the average should not be this grotesque.

If there's one guy who could buck the norms of traditional baseball statistics, it's Puig. He has defied logic ever since his call-up (and his bat-flips have defied physics). I'm hoping Puig continues to prove this theory wrong because brudduh is the illest man in the game.

3. Justin Upton, Atlanta Braves

BABIP: .393 | Line-Drive Percentage: 18.2% | Expected BABIP: .284

Upton is having a spectacular season so far. A year after being labeled a bit of a disappointment in his first season with the Braves, Upton is hitting .308/.387/.595 with a .421 weighted on-base average (wOBA) and 13 home runs. Unfortunately, it doesn't look like it's going to last.

Upton's career-high BABIP is .360, set back in his second full season with Arizona. Last year, that number was .321. This year, it has spiked all the way to .393, a number that is unsustainable with his batted-ball numbers.

Last year, Upton hit .239 on ground balls, .254 on fly balls and .768 on line drives. His fly-ball numbers were pretty high, but that's not too abnormal for a guy with some decent pop. This year, those averages are up to .383, .325 and .722, respectively. Unless Upton has magically developed telekinesis or has secret mind-control powers over the defense, dude is probably going to come tumbling down a significant amount. Although I wouldn't count out the telekinesis thing too quickly.

4. Alex Gordon, Kansas City Royals

BABIP: .309 | Line-Drive Percentage: 13.1% | Expected BABIP: .256

The Royals have the worst slugging percentage in the entire league at .352. Edwin Encarnacion has almost as many home runs this month (16) as the Royals have all year (22). It just seems borderline mean to say that one of their best offensive players is going to regress, but the spreadsheets don't lie, homie.

If Gordon's actual BABIP were equal to his expected BABIP, he'd be hitting .230 instead of his current .279. Sadly, that would still be 78 points higher than Mike Moustakas' .152 mark prior to his demotion.

This projected regression actually follows a pretty troubling trend for him over the last few years. Ever since his true break-out season in 2011, his slugging percentage has gone down every season (from .502 to .455 to .422 to .393). Over the past three years, his line-drive percentage has fallen to 13.9 percent from 20.3 percent last year and 25.0 percent in 2012. Gordon, now in his age-30 season, may be past his prime.

Unfortunately, his decline comes at the same time that Lorenzo Cain is really coming into his own. There are a bunch of young bucks on this club, and if Gordon's stats do fall as this expected BABIP would suggest, he could have some serious job security issues when 2015 rolls around.

5. Adam Jones, Baltimore Orioles

BABIP: .339 | Line-Drive Percentage: 15.3% | Expected BABIP: .268

Thus far in 2014, Jones is posting the highest BABIP of his career while also recording his lowest line-drive percentage. Seem odd? That's because it is. It probably won't stay that way for long.

On the ground balls that Jones has hit this year, his batting average is a Puig-esque .374. Last year, that was only .276 for him. Again, this points toward a very serious regression for Jones, and when you have a walk rate of 2.2 percent, a reduction in batting average is all the more damaging.

If Jones were performing at his expected BABIP, his batting average would be .230 instead of .289. This would also bring his on-base percentage down to somewhere in the .250s. With that said, I don't believe Jones will fall all the way down to his expected BABIP because he has had a tendency to outperform that by a bit throughout his career. That doesn't mean his rate stats won't come down a bit, though. Don't insult the spreadsheets by even entertaining that thought. They're sensitive.

Other Candidates: Jonathan Schoop, Ian Desmond, Eric Hosmer, Howie Kendrick.