Create your own PitchFx database
Let me start this post with a brief preface. If you are not a baseball fan who is interested in baseball statistics, this post will probably not hold much interest for you. If you are, however, you may be familiar with the MLB Gameday application and its data, which is a great source of richly detailed MLB statistics. In a previous post I showed how to use the Gameday API to take that data from the MLB server and put it on your local machine for easier use. In this post, I will show how you can create a local MySQL database to hold the data.
Included with the Gameday data is what is called pitchfx data. This data is collected by special equipment, installed in all major league ballparks, that takes precise measurements of the physics of every pitch thrown in a game. Pitchfx data includes things like pitch speed, pitch movement, release point, and much more. It is a very rich source of data for analyzing a pitcher’s performance.
Many baseball researchers like to have the pitchfx data locally in a database so that they can process it in interesting ways, creating plots, graphs, and other visualizations of the data. Using a Ruby library that I created called the Gameday API, it is very easy to create your own pitchfx database.
This post assumes you know enough about Ruby to understand what the IRB console is. IRB is an easy-to-use interactive Ruby shell. For those who are not familiar with IRB, here is a tutorial that should get you started.
The first thing you’ll need to do is make sure that you have the Gameday API downloaded. You can download the Gameday API from GitHub.
You can import the pitchfx data either from files you have stored locally, such as those that have been downloaded using Gameday API as shown in this post, or directly from gd2.mlb.com. The latter method combines the download and database import steps. Edit the gameday_fetcher.rb file located in the lib directory of Gameday API to set the preferred data location. Look for the block of code that looks like this:
To fetch the files from the gameday server, make sure the GamedayRemoteFetcher line is uncommented (remove the # symbol). To use locally saved data, uncomment the GamedayLocalFetcher line and comment the GamedayRemoteFetcher line.
The next thing required is a database set up to hold the data that you will be importing. Inside the db directory of Gameday API is a SQL file named db_structure.sql. From the command line or using a database tool, create a MySQL database and import the schema contained in that file.
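If you prefer the command line, here is a minimal sketch of that step in Ruby. It only builds the two mysql client commands and prints them so you can review them before running anything. The helper name schema_commands, the database name pitchfx, and the root user are my assumptions for illustration; only the db/db_structure.sql path comes from Gameday API.

```ruby
require 'shellwords'

# Hypothetical helper (not part of Gameday API): builds the two shell commands
# that create an empty MySQL database and load a schema file into it.
def schema_commands(db_name, user, schema_path)
  # Keep the database name to letters, digits, and underscores so it is
  # safe to place directly inside the CREATE DATABASE statement.
  unless db_name =~ /\A\w+\z/
    raise ArgumentError, "unexpected database name: #{db_name.inspect}"
  end

  create_sql = "CREATE DATABASE IF NOT EXISTS #{db_name}"
  [
    # -p with no value makes the mysql client prompt for the password
    "mysql -u #{Shellwords.escape(user)} -p -e #{Shellwords.escape(create_sql)}",
    "mysql -u #{Shellwords.escape(user)} -p #{Shellwords.escape(db_name)} < #{Shellwords.escape(schema_path)}"
  ]
end

# Assumed values: adjust the database name, user, and path to your setup.
commands = schema_commands('pitchfx', 'root', 'db/db_structure.sql')
commands.each { |cmd| puts cmd }

# To run them from Ruby instead of copying them into a terminal:
# commands.each { |cmd| system(cmd) or abort("failed: #{cmd}") }
```

Run it from the top-level directory of Gameday API so the relative path to the schema file resolves. Printing the commands first is a deliberate choice, since creating a database is something you will want to look over before executing.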
To get started with importing the pitchfx data, open up an IRB shell inside of the lib directory of the Gameday API. At the IRB prompt type this:
require 'db_importer'
db = DbImporter.new('localhost','root','password','pitchfx')
db.import_for_month('2010','4')
And that’s it! The first line imports the required Ruby class into your IRB session. The second line creates an instance of the DbImporter class for you to use. In this line, you pass four string parameters to the DbImporter.new method: the host name of your database, your database username, your database password, and the name of your database. The third line calls the import_for_month method, which imports all of the pitchfx data for the specified month. So in this example, you’d end up with the pitchfx data for April 2010 in a local database. This will not happen instantly; it takes some time to download and import an entire month of pitchfx data.
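If you want more than one month, you can call import_for_month in a loop. The following is a rough sketch rather than part of Gameday API: import_months is a hypothetical helper of my own, and it assumes that a failed import raises an exception, which I have not confirmed for every failure case. It relies only on the import_for_month(year, month) call shown above.

```ruby
# Hypothetical helper (not part of Gameday API): calls import_for_month once
# per month and collects the months that failed, so one bad month does not
# stop the rest of the run.
def import_months(importer, year, months)
  failed = []
  months.each do |month|
    begin
      # import_for_month takes the year and month as strings, as in the example above
      importer.import_for_month(year.to_s, month.to_s)
    rescue StandardError => e
      warn "import failed for #{year}-#{month}: #{e.message}"
      failed << month
    end
  end
  failed
end

# Usage inside IRB, with the db object created above
# (April through September is an assumed range, adjust as needed):
# failed = import_months(db, 2010, 4..9)
```

The helper returns the list of months that failed, so you can retry just those later instead of importing everything again.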
If you do a Google search, you will find that several others have posted methods for creating pitchfx databases, but I think you will find this is the easiest method for anyone to use. As always, if I can help you out in any way, don’t hesitate to contact me. The best way to contact me is probably through Twitter, @tfisher.
