Wednesday, September 19, 2007

Corrections to the Corrections?

Mike Fast made an interesting comment in my last post.
One other thing I had been wanting to ask you about...when you calculated the corrections to the x0, z0 initial point, did you assume that each park had a single correction factor that did not vary with time? I noticed with Papelbon that his data was quite different between two different Boston homestands, and Dr. Nathan mentioned to me that the PITCHf/x system is typically recalibrated between homestands.

I haven't known Mike for long (and really, how much do you ever know someone from reading blogs?) but I do know that when he says something, it is worthwhile to look into. I had assumed that things in each home park stayed pretty much the same. Pretty much everywhere I look, people have been making plots combining all home data. This is something I probably should have looked at earlier, but better late than never. So the question is, do we need to add a daily (or home stand) correction to home parks?

To start, let's look at a few pitchers' initial release points from game to game and see how things look. Even though Mike mentioned Papelbon, I am going to start by looking at Jake Peavy. The PITCHf/x system was installed from day one in San Diego and Peavy has been a workhorse for them. Here is Peavy's vertical release point by date.
I have added Peavy's road starts just to give an idea of what kind of error you can expect from park to park. It looks like there is some variation in Peavy's release point over time in his home starts. That variation is smaller than the variation you see in the road parks, but it is there. That said, you can see the wide in-game spread of his release point, and that spread is larger than the game-to-game difference. By the way, I have removed pitches with speed less than 60 mph as Joe P. Sheehan suggested, but I still see some pathological points. I am not quite sure what to do to remove these right now. The plot looked much worse before I made the cut on speed, though, so I do believe that is at least helping. What about his horizontal release?
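The speed cut and the per-game view described above can be sketched roughly as follows. This is only a minimal illustration: the dictionary keys (`game_date`, `start_speed`, `z0`) and the sample pitches are assumptions made up for the example, not my actual parsing code or real Peavy data.

```python
from collections import defaultdict
from statistics import mean, pstdev

MIN_SPEED_MPH = 60.0  # the cut on pitch speed mentioned in the post


def per_game_release(pitches, min_speed=MIN_SPEED_MPH):
    """Return {game_date: (mean_z0, spread_z0, n_pitches)} after the speed cut."""
    by_game = defaultdict(list)
    for p in pitches:
        # Drop anything slower than the cut; these are mostly bad tracks.
        if p["start_speed"] >= min_speed:
            by_game[p["game_date"]].append(p["z0"])
    return {d: (mean(z), pstdev(z), len(z)) for d, z in sorted(by_game.items())}


# Made-up sample pitches (z0 in feet), purely for illustration.
pitches = [
    {"game_date": "2007-04-02", "start_speed": 93.1, "z0": 5.9},
    {"game_date": "2007-04-02", "start_speed": 84.0, "z0": 6.1},
    {"game_date": "2007-04-02", "start_speed": 41.5, "z0": 2.0},  # pathological
    {"game_date": "2007-04-07", "start_speed": 92.4, "z0": 6.2},
]
summary = per_game_release(pitches)
```

Comparing the per-game spread to the differences between per-game means is what tells you whether a game-to-game shift is bigger than the in-game scatter.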
This looks maybe a little worse than the vertical release. Even though Peavy is a right-hander, I changed the sign to report positive numbers here. Maybe there is some trend towards bringing his release point in closer to his body? Is that an adjustment, is that from PITCHf/x getting recalibrated, or is that just random noise? With Peavy not really providing the answers, let's turn our attention to Papelbon.
Papelbon, being a reliever, has stretches of getting into multiple games back to back. The last two games on the far right are Sept 12th and Sept 14th, and the blob just left of that was a three game stretch from Sept 2nd to the 4th. PITCHf/x wasn't installed in Fenway or many AL East stadiums until recently, so we don't have a ton of data to work with. Fenway is also noted as having one of the worst calibrated PITCHf/x systems, which is kind of strange because it was installed relatively late. You can see Fenway tends to be lower than his road appearances, which the correction factor also finds, and maybe the four home days on the left are lower than the four on the right. Could this be Sportvision realizing Fenway was messed up and recalibrating? Let's take a look at his horizontal release point.
Ugh, this is all over the place. That nice three game stretch we noted seemed to have a very consistent vertical release point, but on the last day here it appears Papelbon's release was much closer to his body (mechanics breakdown from pitching three straight days?), or maybe he was a step left on the mound from where he normally stands, or maybe the system was recalibrated mid-series. If that was the case, maybe we would need to do a daily correction to the release point like we have done with the acceleration.
Well, from looking at these plots it doesn't appear we have any definitive answers. One pitcher just isn't enough; we need to look at the whole staff. We can't just plot every release point from the home team every day, though, because some pitchers have very different release points. We need to find each player's average release point and then subtract that from each pitch. This gives the difference from average for each pitcher, which puts them all on the same level and makes them easy to compare. Everything up until now I have been measuring in feet because these release points are far away from the origin. These differences are going to be much smaller, though, so I am going to move to inches for the difference plots. Also, because the horizontal direction seems to be worse, I will be using that to compare. Let's start with Fenway.
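Here is a rough sketch of that subtraction, assuming each pitch is a dictionary with a `pitcher` name and a horizontal release point `x0` in feet (both key names are placeholders for this example). The average is taken over whatever pitches are passed in, which is itself an assumption about how the averaging would be done.

```python
from collections import defaultdict
from statistics import mean

INCHES_PER_FOOT = 12.0


def release_residuals_inches(pitches):
    """Add 'dx_in': each pitch's x0 minus that pitcher's average x0, in inches."""
    by_pitcher = defaultdict(list)
    for p in pitches:
        by_pitcher[p["pitcher"]].append(p["x0"])
    avg = {name: mean(vals) for name, vals in by_pitcher.items()}
    return [
        {**p, "dx_in": (p["x0"] - avg[p["pitcher"]]) * INCHES_PER_FOOT}
        for p in pitches
    ]


# Made-up pitches for two hypothetical pitchers with very different releases.
sample = [
    {"pitcher": "A", "x0": 1.0},
    {"pitcher": "A", "x0": 2.0},
    {"pitcher": "B", "x0": -2.0},
    {"pitcher": "B", "x0": -2.5},
]
residuals = release_residuals_inches(sample)
```

After the subtraction both pitchers are centered on zero, so their pitches can share one plot even though their raw release points are feet apart.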
You can now clearly see the Red Sox home stands and what the differences were for each pitch thrown on each day. Again, notice how large the in-game spread is. It appears that the Boston pitchers, at least, are varying their release point by almost a foot during each game. I've added a grid to make it easier to see how each of these home stands compares with the others and with zero. If the system was getting recalibrated and that was changing the horizontal release points being measured, you would expect to find some home stands higher than zero and some lower than zero. If you look very closely you can see that maybe the first few home stands are high by an inch and maybe the last two home stands are low by an inch, but it is hard to tell. Maybe looking at a park like Petco, which was around from the start, would show some more variation.
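Rather than eyeballing the grid, the same comparison could be made numerically. The sketch below groups home dates into home stands wherever there is a gap of more than a few days (the `max_gap_days=3` threshold is just a heuristic assumed here) and averages the differences within each stand. The dates echo the Papelbon stretches mentioned earlier, but the inch values are invented.

```python
from datetime import date
from statistics import mean


def split_home_stands(game_dates, max_gap_days=3):
    """Group home dates into stands; a gap longer than max_gap_days starts a new one."""
    stands = []
    for d in sorted(set(game_dates)):
        if stands and (d - stands[-1][-1]).days <= max_gap_days:
            stands[-1].append(d)
        else:
            stands.append([d])
    return stands


def stand_offsets(residuals, max_gap_days=3):
    """residuals: list of (game_date, dx_inches). Returns [(first, last, mean_dx)]."""
    stands = split_home_stands([d for d, _ in residuals], max_gap_days)
    out = []
    for stand in stands:
        days = set(stand)
        vals = [dx for d, dx in residuals if d in days]
        out.append((stand[0], stand[-1], mean(vals)))
    return out


# Invented differences (inches) on two stretches of home dates.
data = [
    (date(2007, 9, 2), 1.0),
    (date(2007, 9, 3), 2.0),
    (date(2007, 9, 4), 0.0),
    (date(2007, 9, 12), -1.0),
    (date(2007, 9, 14), -3.0),
]
offsets = stand_offsets(data)
```

A stand whose mean sits well away from zero, relative to the in-game spread, is the signature you would expect from a recalibration.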
The Petco data looks pretty consistent to me. Again, maybe the home stands in the middle and the one on the far right show a slight increase and the others a slight decrease, but that appears to be very small. Interestingly, their second home stand, which was very short, seemed to have a few pitchers throwing with an increased difference. That is countered by a single pitcher who was almost a foot below average, though. A few parks had the system installed for one day while ESPN was in town, only to have their camera removed and then added again at a later date. Coors Field is one of those, and we have seen that its system seems to be pretty bad as well, so let's look at that data next.
Even that first day, almost 80 days before their camera was installed full time, shows remarkable agreement with the rest of the data. Maybe that day is a little high and maybe the last home stand is as well, but that again isn't anything larger than two inches at most. I have looked through every stadium and through several variables and have seen the same story in each one. The only case that really shows a recalibration changing the data is the horizontal release point at Chase Field.
This is what I would have expected to see from the other parks if the recalibration was really changing the data. The first two home stands appear to be about four inches above zero. The next home stand is maybe about two inches above zero. The last three home stands appear to be two or three inches below zero. Interestingly, the vertical change appears to be much smaller than the horizontal.
So what can we conclude? Well, it does appear that Sportvision is recalibrating their PITCHf/x systems between home stands but, in general, those corrections are relatively small. Chase Field does appear to be an exception, though. My correction factor seems to think that, overall, Chase Field is moving the horizontal release point about four inches to the left (as the catcher sees it). But it appears that the difference within Diamondbacks home games alone is about four inches because of recalibration.

So what should we do about this? I probably could just adjust Chase Field "by hand" and be done with it, but what if a recalibration in another park messes up the data in these last few weeks or even next year? It sure would be nice to have that automated. So what I am planning on doing is writing a first correction algorithm that will sit in between my code that parses the data and the code that currently does the corrections. This code will do a home stand by home stand intra-park correction and then feed the results to the regular code that will handle the inter-park corrections. Unfortunately, this will push back the player cards to probably this weekend. I know, I am such a tease, but hopefully making this last correction will really nail things down. I'd like to thank Mike again for pointing this out. If you have any comments or concerns with the PITCHf/x data, please comment below or email me.
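To make the plan concrete, here is one way the two stages might fit together. This is a guess at the shape of the thing, not the finished algorithm: the intra-park step shifts each home stand so that its mean difference matches the park-wide mean, and `inter_park_correct` is only a stand-in for the existing park-to-park code. All keys (`park`, `stand`, `x0`, `dx`) and numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def intra_park_correct(pitches):
    """Shift each home stand so its mean dx equals the park-wide mean dx.

    Each pitch is a dict with hypothetical keys: park, stand, x0 (feet) and
    dx (feet, release point minus that pitcher's average).
    """
    by_park = defaultdict(list)
    by_stand = defaultdict(list)
    for p in pitches:
        by_park[p["park"]].append(p["dx"])
        by_stand[(p["park"], p["stand"])].append(p["dx"])
    park_mean = {k: mean(v) for k, v in by_park.items()}
    shift = {k: mean(v) - park_mean[k[0]] for k, v in by_stand.items()}
    corrected = []
    for p in pitches:
        s = shift[(p["park"], p["stand"])]
        corrected.append({**p, "x0": p["x0"] - s, "dx": p["dx"] - s})
    return corrected


def inter_park_correct(pitches, park_offsets):
    """Stand-in for the existing park-to-park step: one offset per park."""
    return [{**p, "x0": p["x0"] - park_offsets.get(p["park"], 0.0)} for p in pitches]


# Invented data: one park, two home stands sitting half a foot apart.
raw = [
    {"park": "ARI", "stand": 1, "x0": 2.25, "dx": 0.25},
    {"park": "ARI", "stand": 1, "x0": 2.25, "dx": 0.25},
    {"park": "ARI", "stand": 2, "x0": 1.75, "dx": -0.25},
    {"park": "ARI", "stand": 2, "x0": 1.75, "dx": -0.25},
]
stage1 = intra_park_correct(raw)
stage2 = inter_park_correct(stage1, {"ARI": 0.5})
```

Keeping the home stand step separate means the park-to-park code does not have to change; it simply receives data in which each park already agrees with itself.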

2 Comments:

At September 20, 2007 11:05 AM , Blogger Mike Fast said...

Great work as always, Josh.

Have you seen Harry's latest post over at Cubs F/X? He mentions that he sees a wind effect on pitch speed in Tom Shearn's starts.
http://cubsfx.blogspot.com/2007/09/shearn-speed-griffey-hurt-sorianos-bomb.html

One of the projects on my long list of stuff I want to do was to study the effect of wind on run scoring under the assumption it would affect batted ball speed, but it never occurred to me to connect it directly to pitch speed as Harry did.

I'm curious if this is a repeatable effect for multiple pitchers. I may get around to looking at this in the next week or two, but I thought you might be interested as well.

 
At September 20, 2007 2:15 PM , Blogger Josh Kalk said...

Yeah, I am definitely interested. The problem is knowing what the actual wind is between the pitcher's mound and home plate. Where is the official reading taken, and how much of that wind is actually affecting the pitches? In a stadium like Wrigley I am sure it has a big effect. Miller Park, though, records a wind reading when the roof is open but I doubt much (any?) of it is noticeable at the mound. If you go back to Dr. Nathan's drag and spin equations, the velocity really should be the velocity of the ball relative to the air, which combines the velocity of the ball with the velocity of the wind. If you can get constant wind info, though, it can be corrected for.
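As a rough illustration of that last point, a simple quadratic drag model would use the air-relative velocity in place of the ball velocity. With the wind written as a vector in the same coordinates, the relative velocity is ball minus wind, so a headwind blowing back at the pitcher adds to the relative speed. The drag constant `k` and the sign convention below are assumptions for the sketch, not values fitted from PITCHf/x data.

```python
import math


def drag_accel(v_ball, v_wind, k=0.002):
    """Drag acceleration (ft/s^2) from air-relative velocity: a = -k * |v_rel| * v_rel.

    k (1/ft) is a hypothetical lumped drag constant used only for illustration.
    """
    v_rel = tuple(b - w for b, w in zip(v_ball, v_wind))
    speed = math.sqrt(sum(c * c for c in v_rel))
    return tuple(-k * speed * c for c in v_rel)


# Assumed convention: y points from the plate toward the mound, so a pitch
# has negative y velocity and a headwind has positive y velocity.
pitch = (0.0, -130.0, 0.0)   # ft/s
calm = drag_accel(pitch, (0.0, 0.0, 0.0))
head = drag_accel(pitch, (0.0, 10.0, 0.0))
```

In this toy case the 10 ft/s headwind raises the air-relative speed from 130 to 140 ft/s and the drag grows accordingly, which is the kind of effect on measured pitch speed a wind study would be looking for.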

 
