Friday 31 March 2017

From #baseballsavant to #Tableau; Part II of a step-by-step for Batters #baseball #visualization #pitchFx

 

In the previous post I showed how to visualize batted ball data from the Baseball Savant website in Tableau, from a side view. This second part shows the same data viewed from above (a spray chart). For this we need to look at what “coordinate system” Baseball Savant uses for storing this data. The fields hc_x and hc_y contain the locations where batted balls ended up. Give or take a few inches, the batted balls all start at the same point, above home plate. Generally the hc_x values range from 0 to 250 (0 being left field, and 250 being right field), and the hc_y values run from 250 down to 0 (where values near 250 are close to home plate and values near 0 are close to the outfield). It’s not clear to me why they record the data this way, but they do. Using the hit_location field on the Color mark (it stores which field position the ball was hit to, or which position made the out) this is what it looks like:

 

Simply changing the Y-axis scale (Edit Axis > Scale > reversed) shows the data in a more typical orientation.

 

Unfortunately I have not been able to find stadium data (like home plate, pitcher’s mound, base and wall locations) that would allow me to align the batted ball data exactly to the stadium extents, so we’re going to have to guess a little here. Assuming that home plate is centred horizontally, it would have an X value of 250 / 2 = 125. Based on the data, and looking at batted ball descriptions like “popped straight up to catcher X”, I found that the Y value for home plate is around 220. This will likely vary per park as park dimensions vary, but it is a decent start.
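To make the guessed origin concrete, here is a small Python sketch (outside Tableau, purely for illustration) that shifts hc_x / hc_y so home plate sits at (0, 0) and the outfield points up. The function name and the constants PLATE_X and PLATE_Y are my own labels for the estimates of 125 and 220 above, not anything Baseball Savant provides, and the values would need adjusting per park.

```python
# Assumed home plate location in hc_x / hc_y units (estimates from the text,
# not official values; they will likely differ per park).
PLATE_X = 125.0
PLATE_Y = 220.0

def to_plate_relative(hc_x, hc_y, plate_x=PLATE_X, plate_y=PLATE_Y):
    """Return (dx, dy) with home plate at the origin and the outfield at positive dy."""
    dx = hc_x - plate_x   # negative = left field side, positive = right field side
    dy = plate_y - hc_y   # hc_y shrinks toward the outfield, so flip the sign
    return dx, dy

print(to_plate_relative(125.0, 220.0))  # home plate itself
print(to_plate_relative(150.0, 100.0))  # a ball hit toward right-centre
```

Flipping the sign of the Y difference does the same job as reversing the Y-axis in Tableau, just applied to the data instead of the axis.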

 

To calculate the start and end points of the batted balls, we use the additional records we created in Part I of this series, taking the first (Type = 0) and last (Type = 6) record per sv_id in the following calculated fields:

For the X values

IF [Type] = "0" THEN 125

ELSEIF [Type] = "6" THEN [hc_x]

END

 

For the Y values

IF [Type] = "0" THEN 220

ELSEIF [Type] = "6" THEN [hc_y]

END
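For readers who want to check the logic outside Tableau, the two calculated fields can be sketched as one Python function. This is only an illustration under the assumptions in this post: home plate at roughly (125, 220), and Type stored as a string as in the calculations above. The function and constant names are hypothetical.

```python
# Assumed home plate position (estimates from this post, not official values).
HOME_PLATE_X = 125.0
HOME_PLATE_Y = 220.0

def line_point(record_type, hc_x, hc_y):
    """Mirror the X and Y calculated fields for one record.

    Type "0" is the start of the line (home plate), Type "6" is the end
    (where the ball landed). Any other Type yields None, which matches
    Tableau returning NULL when no IF branch applies.
    """
    if record_type == "0":
        return (HOME_PLATE_X, HOME_PLATE_Y)
    if record_type == "6":
        return (hc_x, hc_y)
    return None

# One batted ball, expanded into the records created in Part I.
for record_type in ["0", "3", "6"]:
    print(record_type, line_point(record_type, 80.5, 95.2))
```

Only the Type 0 and Type 6 records produce a point, so each sv_id ends up with exactly two points to connect with a line.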

 

Now we drag the calculated field for X values to the Columns shelf and the calculated field for Y values to the Rows shelf, and set the Marks type to Line. All that’s left is to separate the batted balls by sv_id, so drag that field onto Detail. It now looks like this:

 

This can then be cleaned up and filtered or colored (I have barrels in orange below), and a background image can be aligned to give it a nice look:

 

The next step, in Part III of the series, will be to combine the graphs that look at batted balls from the side and from above.

 



