Thursday, March 7, 2013

PECOTA Is Stupid, Random Guessing Is Apparently Not Stupid

(Thanks to Cory for emailing me these links. I always appreciate links to bad journalism during the cold, long months after the NFL season is over and baseball has yet to begin.)

Many elderly or not-so-elderly baseball sportswriters have written about the scourge that is advanced statistics and their use to evaluate baseball players outside of the normal ways to measure a player like hustle, grit, intangibles (which can't be measured anywhere but in the mind), RBIs and wins. One of the biggest scourges of advanced statistics is PECOTA or any advanced projection designed to reasonably predict a team or player's performance over the season. Today two writers criticize PECOTA, and of course Nate Silver since he has become the Billy Beane of statistical projections, and claim that the PECOTA projections are wrong because they aren't based entirely on guesswork. Well, that's the conclusion I come to. These writers say PECOTA is wrong because it is a projection based on numbers and who wants to crunch numbers other than nerds and people who live in their mom's basement? Michael Tomaso also indicates Nate Silver has a bias against the White Sox because he is a Cubs fan. The ridiculousness and inaccuracy of this statement have blown my mind.

First, we'll start with the unenlightened George Ofman. George looks kind in his picture, but his smile hides a deep distrust of statistics and a strong urge to let us know he isn't a geek. George is a sports anchor and does talk radio. He's all about the human element, which apparently he is very, very, very afraid PECOTA wants to eliminate from baseball altogether. Many of these writers who don't like the idea of using advanced statistics remind me of the same people who didn't want minorities having equal rights in society decades ago. In both cases they are afraid of change trying to overwhelm the status quo and that's just dangerous. If we allow minorities to integrate in society with equal rights they will teach all of our children to be like them and it will ruin the current utopia of ignorance, just like if we allow the use of advanced statistics in sports then the human element will forever be removed from the game of baseball until the sport resembles the video game "Base Wars." Everyone is entitled to their opinion, but much of the opinion is based on fear and lack of understanding more than anything else.

I have an important question to ask: Who is Pecota or is that PECOTA?

If you don't know, don't knock it. PECOTA isn't a stupid idea simply because you lack the knowledge or the desire to understand it.

Is it an island off the Pacific or a cough suppressant? Did you have it as a side dish with dinner last night or is it some elaborate test chemists take before working for a government lab? Can you do the Pecota or does the Pecota do something to you? Imagine being Pecotaed!

39 years. That's how long George's standup talents have been wasted doing talk radio. George Ofman could have been the next Jerry Seinfeld, but we'll never know. All we are left with is comedy gold like this passage.

Pecota actually is a name. Bill Pecota was a journeyman big leaguer who played in 698 games over 9 seasons from 1986 through 1994. He was a lifetime .249 hitter. Average, hence the typical Pecota entry. It was developed by Nate Silver, the resident genius statistician who now plies his trade forecasting elections.

And forecasting elections pretty well by the way. That part was left out by Mr. Ofman.

Pecota actually stands for Player Empirical Comparison and Optimization Test Algorithm. In other words it’s a sabermetrician’s orgasm. Then again, anything involving sabermetrics get a rise out of baseball junkies, nerds, or someone who can’t get enough of numbers.

This is as opposed to how the love of intangibles and distrust of numbers is enough to get a rise out of old school baseball writers. They just can't get enough of measuring those things, like intangibles, which can't be measured.

It’s fascinating, engrossing and …inaccurate.

No one said these statistics can predict the future. Therein lies a major problem with Sabermetrics bashing. The statistics used and quoted by Sabermetricians aren't supposed to show the truth or what exactly will happen, but these are simply projections for what is expected to happen. It's like when George Ofman provides a win-loss prediction for the Cubs and White Sox at the end of this very column, except the PECOTA projections are based on statistical data not guesswork. PECOTA isn't trying to be your new daddy and take the place of your real father. PECOTA just wants to marry your mother and have you consider it to be a part of your life.

but when a computer spits out what athletes and teams are expected to achieve, it can’t judge the most important tool: the human element.

PECOTA isn't supposed to measure the human element or injuries a team may incur over the baseball season. That's why they are called "projections" and not "final statistics that show you what will happen this year so there is no need to even play the baseball season."

This is a fantasy product and there are millions of fantasy players in baseball,...I imagine Silver would project I have a better chance of being Bill Pecota than I do cashing in a big ticket. I wonder if I fed him my statistics what the results would be.

The results would be that you are taking shots at advanced statistics because you don't understand them and you fear their use. The title of this article is "PECOTA Likely Wrong About Cubs, Sox Predictions," so who exactly is trying to predict the future again? It seems to me like writing an entire column about how the use of advanced statistics to try to predict the future is dumb, and then claiming that the PECOTA predictions are wrong anyway while making his own prediction, is George Ofman's own little way of trying to predict the future that he so strongly believes can't be predicted.

Pecota is great because everyone talks and writes about it. It’s become the preseason standard.  This year it projects the Cubs and White Sox each to win 77 games.  Mighty interesting, this Pecota. If it holds true, the Cubs will gain 16 games and the Sox will lose 8.

George's basic point of view is that PECOTA tries to predict the future win-loss record for the Cubs and White Sox and that's impossible to do. George may as well have his future read by a palm reader as read PECOTA projections. Plus, George's predictions about the future win-loss record for the Cubs and White Sox are more accurate anyway. So George believes PECOTA fails in trying to accurately predict the future at all, which is odd since George gladly tries to predict the future himself.

Actually, since 2005 Pecota has underrated the Sox by an average of 7.125 games per season.

The human element! George doesn't tell us how much he has overrated or underrated the White Sox record since 2005. I guess it is safe to assume he has guessed the White Sox record perfectly every year.
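
For anyone who wants to check the arithmetic in the numbers George quotes, here is a minimal sketch. It uses only the figures from his column (77 projected wins, a 16-game gain, an 8-game loss, a 7.125-game average miss). The eight-season count is my assumption, taken from "since 2005" through last season, and the function name is made up for illustration.

```python
# Figures quoted from Ofman's column; nothing here comes from PECOTA itself.
PROJECTED_WINS = 77      # PECOTA projection for both Chicago teams
CUBS_GAIN = 16           # "the Cubs will gain 16 games"
SOX_LOSS = 8             # "the Sox will lose 8"
AVG_UNDERRATED = 7.125   # average wins PECOTA was low on the Sox
SEASONS = 8              # ASSUMPTION: 2005 through 2012 inclusive


def implied_previous_wins(projected, change):
    """Back out last season's win total from a projection and the stated change."""
    return projected - change


cubs_2012 = implied_previous_wins(PROJECTED_WINS, CUBS_GAIN)   # a gain, so subtract it
sox_2012 = implied_previous_wins(PROJECTED_WINS, -SOX_LOSS)    # a loss, so add it back
total_underrated = AVG_UNDERRATED * SEASONS

print(f"Implied 2012 wins: Cubs {cubs_2012}, Sox {sox_2012}")
print(f"Total wins underrated over {SEASONS} seasons: {total_underrated}")
```

The per-season projections and results aren't quoted in the column, so this only confirms the quoted numbers are consistent with each other, not that they are right.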

How could this be? Let’s remember Sports Illustrated pegged the Sox for 95 losses last season.  That’s LOSSES!  Nobody’s perfect, not even me, though I had the Sox winning 82, better than S.I. and Pecota.

Sports Illustrated was way off last year in their projection of the White Sox record. Why isn't this entire article about how Sports Illustrated sucks? Why doesn't George compare Sports Illustrated's projections to palm reading or visiting a psychic? Is it because PECOTA bases their projections on something while Sports Illustrated is just guessing? How is it worse to base projections on something/anything rather than just guess? 

But Pecota still can’t predict a player’s heart or injuries, drug use, squabbles with managers, coaches, spouses or girlfriends (or boyfriends).

And it isn't supposed to predict these things. Here's a super secret, though. No one can predict these things, so all projections or predictions for the 2013 season are useless if these are the only criteria used to criticize PECOTA.

It can’t tell you what you’re thinking or fantasizing about unless you’re fantasizing about stats.

Or getting off sexually in thinking about stats. A lot of stat nerds do this of course. It's science.

It’s just a massive amount of statistics fed into a computer. Geeks can’t get enough of it. I am not a geek. A little weird yes, but not a geek.

So again, George Ofman's simple little prediction that is based on nothing but pulling two sets of numbers out of thin air representing the White Sox and Cubs win-loss record is much more accurate and feasible than projections based on data. It's better to guess than try to make an educated guess.

Just a few weeks ago I predicted the final score of the Superbowl would be 35-31. I was off by just one number. Of course, I had the 49ers winning, but that’s so trivial.  So here I go just like the rest of you:

The cubs will go 73 and 89.  The Sox will finish 82 and 80.

What is this, a palm reader? I guess George Ofman should start playing the lottery since he is now participating in the fantasy game of predicting a team's win-loss record. Guessing is a much better system than PECOTA.

Now Michael Tomaso of the Huffington Post chimes in about how the Baseball Prospectus White Sox 2013 season win-loss prediction is going to be wrong again.

It's the time of year where Baseball Prospectus uses its inside knowledge of baseball to predict the records of every team in the league for the upcoming season.

Baseball Prospectus never claims to have inside knowledge of baseball. The idea they claim to have inside knowledge about baseball is nothing but a lie.

This year they've predicted the White Sox to win 77 games. This should be great news for White Sox fans because Baseball Prospectus has consistently predicted the Sox to be worse than they eventually end up. 

Well, great then. Just assume Baseball Prospectus is wrong and move on with your life. There's no need to begrudge what they do simply because you choose not to understand it.

Chicago Tribune's White Sox reporter Mark Gonzales did a great job analyzing the last eight years of Baseball Prospectus' predictions.

In the last eight years, the White Sox have been worse than predicted only twice and have averaged over seven more wins a season.

That's because Baseball Prospectus can't measure heart, grit, or the human element of playing 162 games in a year. Of course no prediction can account for all of these factors (like injuries, etc.) with any sort of complete accuracy. Baseball Prospectus is simply attempting to use a consistent method to predict each team's win-loss record and the performance of the players on each team.

If you follow the link where Mark Gonzales covered eight years of Baseball Prospectus' predictions then you can see BP hasn't done an excellent job of accurately predicting the White Sox record. Baseball Prospectus is simply trying to give the reader an idea of a team's record with their PECOTA projections and I think there is reason to expect the prediction won't be perfect given the factors no prediction model can account for (injuries, etc.). In addition, I'm not sure Baseball Prospectus' predictions can be totally dismissed simply because over an eight-year period they weren't spot-on concerning 1 of the 30 MLB teams.
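
To put a rough number on the "1 of the 30 MLB teams" point, here is a back-of-the-envelope sketch. It is not how PECOTA works and it uses no Baseball Prospectus data. I am assuming every projection is unbiased, that misses are independent and normally distributed, and that a typical single-season miss has a spread of 9 wins. That 9 is a made-up placeholder, so swap in whatever you find believable.

```python
import math

# ASSUMPTIONS (hypothetical, not from PECOTA or Baseball Prospectus):
ERROR_SD = 9.0          # assumed spread of a single-season projection miss, in wins
SEASONS = 8             # seasons in the Gonzales comparison
TEAMS = 30              # MLB teams
OBSERVED_MISS = 7.125   # average White Sox miss quoted in Ofman's column


def normal_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def chance_some_team_looks_biased(miss, sd, seasons, teams):
    """Chance that at least one team shows an average miss this large,
    in either direction, when every projection is actually unbiased."""
    standard_error = sd / math.sqrt(seasons)      # spread of an 8-season average
    p_one_team = 2 * normal_tail(miss / standard_error)
    return 1 - (1 - p_one_team) ** teams


p = chance_some_team_looks_biased(OBSERVED_MISS, ERROR_SD, SEASONS, TEAMS)
print(f"Chance at least one of {TEAMS} teams looks this 'biased': {p:.2f}")
```

The design choice here is to ask about any of the 30 teams rather than the White Sox specifically, because nobody picked the White Sox in advance. They got picked after the misses were already on the board.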

What really stands out in that chart as totally absurd is the 2006 season projection. 2006 was the year the Sox were coming off their first World Series championship in 88 years. As spring training began that year, the Sox fielded virtually the same team, plus the additions of Jim Thome and inserting Javier Vasquez as their fifth starter. Yet, Baseball Prospectus looked at that team and declared they were average at best.

Baseball Prospectus was eight games off for the 2006 MLB season. Baseball Prospectus had them at 82 wins and they ended up winning 90 games. How dare Baseball Prospectus look at the White Sox as if they were average! What Michael Tomaso leaves out is the White Sox finished 3rd in a division of 5 teams. So in terms of the AL Central, they were average in 2006.

To be fair, it's easy for someone like me to criticize isolated projections because no one really knows what's going to happen.

But Michael Tomaso has no interest in being fair. Sure, he admits it's easy to criticize isolated projections, but he wants to ignore this and use these projections to show how stupid Baseball Prospectus and PECOTA are. When you have an agenda, it's best to push that agenda whenever possible, even in situations where you admit your criticism probably isn't fair.

You can't factor in injuries or the talent level of a team's opponents, or what trades a team will make during the season to improve. 

No, you can't. Criticizing Baseball Prospectus for the accuracy of their projections while also knowing their projections can't be 100% accurate due to factors that can't be accounted for in the projection is simply insane.

However, when it comes to the White Sox, Baseball Prospectus' methodology not only undervalues the Sox's potential, but they consistently predict the Sox to be .500 or worse. 

Logic would dictate perhaps the Baseball Prospectus methodology has had a difficult time taking injuries to White Sox position players or pitchers into account or players on the White Sox team have unexpectedly under/overachieved. It's not anything Baseball Prospectus does in order to tweak the White Sox team specifically. That much is for sure. Only a crazy person could admit there are factors that can't be included in a projection and then act surprised when a projection doesn't end up 100% correct.

What can be easily dismissed as isolated projection misses, has become a consistent pattern. 

PECOTA hates the White Sox. No one denies this!

To answer that question, I decided to do research on who writes these inaccuracies year after year. What I found shocked and disturbed me.

It's Nate Silver. 

Perhaps Michael should have done some more research to find out Nate Silver hasn't had anything to do with the PECOTA projections since 2009. He still writes for Baseball Prospectus but has nothing to do with the projections anymore.

Also, the projections aren't "inaccuracies" any more than any prediction can be considered inaccurate. That's harsh language from a writer who claims to understand Baseball Prospectus can't accurately factor in variables such as the human element and injuries. I also find it interesting Michael Tomaso calls the Baseball Prospectus information "inaccuracies" when he is inaccurate on who writes these projections.

My whole world of reality collapsed at that moment. 

Feel free to get your reality back immediately because Nate Silver was only there for part of the White Sox win-loss record "inaccuracies."

How could it be the guy I religiously read for pinpoint accuracy in politics? How could it be that Silver is an accuracy genius in politics, but yet when it comes to the White Sox he transforms into the accuracy of a Republican pollster?

Well, clearly there is a bias in PECOTA that Nate Silver insisted Baseball Prospectus keep in the projections even after Silver no longer worked on the projections. Because a person who makes a living off making his projections as accurate as possible will surely want to include a bias into the projections that will make his projections less accurate.

Who cares that logic dictates a person who makes a living off accurate predictions wouldn't want to mess up his predictions to make them less accurate? Nate Silver is one of those cantankerous nerds who has an anti-White Sox bias and doesn't care if his projections reflect that. This would be like if Silver had projected the 2012 Presidential election strongly in favor of one party over another (due to his party affiliation) fully knowing his numbers wouldn't be accurate. It makes no sense.

After composing myself, I discovered a possible reason. Silver lived in Chicago for many years near Wrigleyville and is rumored to be a Cubs fan. 

I like how Michael Tomaso does research, but not enough research to get real answers. Nate Silver is a Detroit Tigers fan. It's pretty well documented on the Internet. Of course since Michael Tomaso is afraid of the Interwebs and those crazy nerds who use it, his research might have consisted of not actually doing research and just making assumptions that because Silver lived in Chicago he is a Cubs fan.

Maybe being a Cubs fan is a weighted bias even Silver's methodology can't overcome.

What a strong end to this article. I'm riveted and on the edge of my seat for more. Unfortunately, there is no more.

If Michael Tomaso wants to know why the White Sox Baseball Prospectus projections haven't been incredibly accurate perhaps he can look at those factors no type of projection or prediction can factor into the equation. Maybe factors like injuries (or lack thereof), overperformance, and underperformance are better explanations for why the PECOTA projections aren't accurate for the White Sox. Nah, that makes too much sense. Most likely Nate Silver, who doesn't even do the PECOTA projections anymore, has inserted an anti-White Sox factor into the projections. Nerds who love statistics would do something like that, wouldn't they?

4 comments:

  1. I don't really see "stats guys" ruling out scouting and things like that. I think we all realize that scouting is important, but with hundreds of teams in the minor leagues, you can't scout every player in person enough to really know how good they are. My first baseball game came when I was 13. I saw the Yankees beat the Rangers. Mel Hall went 3 for 4 with a triple. Little Eric thought Mel Hall was awesome. Unfortunately, one game does not reflect true ability, and Hall was in Japan a year later.

    As Bill James said, the difference between a .275 hitter and a .300 hitter is one hit every 40 at bats. If someone can see that with their eye, I give them a lot of credit. I certainly can't.

    We know scouting is important to player evaluation. However, you can use those statistics to decide who to focus on. Maybe you plan on making a trade with a team, and you are looking for a lottery ticket as a throw in. Stats show a guy hitting .222, but with incredible strike zone control and good defensive statistics. You may send a scout out to check him out or watch film on the player. It is a tool. Just like both Ofman and Tomaso.

  2. But Pecota still can’t predict a player’s heart or injuries, drug use, squabbles with managers, coaches, spouses or girlfriends (or boyfriends).

    doesn't PECOTA actually do this? I'm pretty sure it accounts for guys with dodgy health histories. maybe not freak injuries like Rivera blowing out his knee last year, but no one does.

    I also don't entirely think that PECOTA is trying to predict that, or "You can't factor in injuries or the talent level of a team's opponents, or what trades a team will make during the season to improve." I'm pretty sure it tries to reasonably project how players will do based on what they've shown in the past, then says, "well, if these guys perform like this, their team will probably win somewhere around X amount of games."

    I think what bothers me the most about anti-stat stuff are the elaborate strawmen that get built. if a writer goes after WAR, they argue against a hypothetical person who thinks WAR is literally the only useful metric for evaluating players, and if someone goes after PECOTA they accuse it of claiming to be a flawless, 100%-certain prediction model. whatever.

    it doesn't predict the White Sox well because the White Sox are a pretty confounding team - since they won the World Series they've won 90, 72, 89, 79, 88, 79, and 85 games and haven't consistently done any one thing well aside from hitting home runs. the 2006 prediction was based primarily on the idea that there was no way their pitching would be as good as it was in 2005, and sure enough literally every pitcher they brought back was markedly worse in 2006. they didn't account for Dye and Crede having career years, but I think few people did.

    one thing you don't need advanced stats to know - White Sox fans are whiney, insecure pussies.

  3. ivn, I love that last part of your comment. There is no fan base in the world with a greater inferiority complex than the White Sox.

    Not sure if you have had the misfortune of listening to Mike Mulligan on 670 The Score but he is the quintessential Sox fan you speak of. Hearing him bitch and moan every time the White Sox farm system gets blasted for having no prospects brings a smile to my face. "But they had 8 minor league pitchers throw in the big leagues last year." Uh that's because of injuries, and most of them sucked.

  4. Eric, I think the point you are getting at is the point that some of these writers seem to consistently miss. Using PECOTA and other advanced statistics is just another way of gathering information. It's too time-intensive to watch every player play, so statistics can help bridge this gap in knowledge.

    Ivn, that's part of the thing also. PECOTA isn't saying "here is exactly what will happen." It is a tool saying, "here is what we project will happen based on injuries, progression/regression, other factors." It's not expected to be 100% correct, though that is obviously the goal.

    I know two White Sox fans. One of them isn't whiny and understands the team pretty well. The other...well, your point stands. I hear a lot of complaining about attention being placed on the Cubs.
