I’m going to start digging into baseball statistics. A couple of R packages make it easier to collect data from MLB. I’ll be using mlbgameday, mostly because I saw that it was recently updated.
Getting some data
First we load the package, along with a couple of other packages that we’ll need.
library(mlbgameday)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
library(magrittr)
Now we’ll grab some data. I’m picking the date 8/22/18 for no specific reason except that it was fairly recent. This chunk can be a little slow since it has to download the data from the web.
dat <- get_payload(start = "2018-08-22", end = "2018-08-22")
## Gathering Gameday data, please be patient...
## Warning: executing %dopar% sequentially: no parallel backend registered
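The %dopar% warning above comes from the foreach package: no parallel backend was registered, so the loop ran sequentially. Here is a minimal sketch of registering a backend with doParallel, assuming that package is installed. The worker count of 2 and the toy loop are placeholders for illustration only, and I haven't measured whether this actually speeds up get_payload().

```r
library(doParallel)  # also attaches foreach and parallel

# Start a small cluster and register it as the %dopar% backend.
# Two workers is an arbitrary choice for this sketch.
cl <- makeCluster(2)
registerDoParallel(cl)

# Toy loop standing in for real work; with a backend registered,
# %dopar% no longer emits the "executing sequentially" warning.
squares <- foreach(i = 1:4, .combine = c) %dopar% i^2

# Shut the workers down and fall back to the sequential backend.
stopCluster(cl)
registerDoSEQ()
```

The idea would be to do the registration before calling get_payload() and the cleanup afterwards.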
dat %>% str
## List of 5
## $ atbat :'data.frame': 1108 obs. of 33 variables:
## ..$ pitcher : num [1:1108] 623381 623381 623381 623381 623381 ...
## ..$ batter : num [1:1108] 656775 542340 641820 430945 542921 ...
## ..$ num : num [1:1108] 1 2 3 7 8 9 10 14 15 16 ...
## ..$ b : num [1:1108] 3 1 2 2 4 3 1 1 2 0 ...
## ..$ s : num [1:1108] 2 2 1 2 2 2 2 3 3 0 ...
## ..$ o : num [1:1108] 1 1 3 1 1 2 3 1 2 3 ...
## ..$ start_tfs : chr [1:1108] "163853" "164115" "164321" "165312" ...
## ..$ start_tfs_zulu: chr [1:1108] "2018-08-22T16:38:53Z" "2018-08-22T16:41:15Z" "2018-08-22T16:43:21Z" "2018-08-22T16:53:12Z" ...
## ..$ stand : chr [1:1108] "R" "R" "R" "R" ...
## ..$ b_height : chr [1:1108] "5-8" "6-1" "6-4" "6-2" ...
## ..$ p_throws : chr [1:1108] "L" "L" "L" "L" ...
## ..$ atbat_des : chr [1:1108] "Cedric Mullins flies out to left fielder Teoscar Hernandez. " "Jonathan Villar hit by pitch. " "Trey Mancini grounds out, shortstop Richard Urena to first baseman Justin Smoak. " "Adam Jones pops out to catcher Danny Jansen in foul territory. " ...
## ..$ atbat_des_es : chr [1:1108] "Cedric Mullins batea elevado de out a jardinero izquierdo Teoscar Hernandez. " "Jonathan Villar golpeado por lanzamiento. " "Trey Mancini batea rodado de out, campo corto Richard Urena a primera base Justin Smoak. " "Adam Jones batea elevadito de out a receptor Danny Jansen en territorio de foul. " ...
## ..$ event : chr [1:1108] "Flyout" "Hit By Pitch" "Groundout" "Pop Out" ...
## ..$ score : chr [1:1108] NA NA NA NA ...
## ..$ home_team_runs: chr [1:1108] "0" "0" "0" "0" ...
## ..$ away_team_runs: chr [1:1108] "0" "0" "0" "0" ...
## ..$ url : chr [1:1108] "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" ...
## ..$ inning_side : chr [1:1108] "top" "top" "top" "top" ...
## ..$ inning : num [1:1108] 1 1 1 2 2 2 2 3 3 3 ...
## ..$ next_ : chr [1:1108] "Y" "Y" "Y" "Y" ...
## ..$ event2 : chr [1:1108] NA NA NA NA ...
## ..$ event3 : logi [1:1108] NA NA NA NA NA NA ...
## ..$ batter_name : chr [1:1108] "Cedric Mullins" "Jonathan Villar" "Trey Mancini" "Adam Jones" ...
## ..$ pitcher_name : chr [1:1108] NA NA NA NA ...
## ..$ event4 : logi [1:1108] NA NA NA NA NA NA ...
## ..$ gameday_link : chr [1:1108] "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" ...
## ..$ date : Factor w/ 1 level "2018-08-22": 1 1 1 1 1 1 1 1 1 1 ...
## ..$ end_tfs_zulu : chr [1:1108] "2018-08-22T16:41:04Z" "2018-08-22T16:42:44Z" "2018-08-22T16:45:44Z" "2018-08-22T16:55:25Z" ...
## ..$ event_num : chr [1:1108] "12" "18" "31" "60" ...
## ..$ event_es : chr [1:1108] "Elevado de Out" "Pelotazo" "Roletazo de Out" "Elevado de Out" ...
## ..$ play_guid : chr [1:1108] "f522593e-268d-4c6c-9b6e-75b81410ed23" "b1d0ffae-b850-4dcb-a7a0-005edbe7aeef" "12c9f256-8a4d-432d-b5fe-ef85aeb492c3" "c998db59-166e-403e-86e7-f92a3a69a8df" ...
## ..$ event2_es : chr [1:1108] NA NA NA NA ...
## $ action:'data.frame': 312 obs. of 24 variables:
## ..$ b : num [1:312] 0 0 0 3 2 0 0 0 3 0 ...
## ..$ s : num [1:312] 1 1 0 2 3 0 0 0 2 1 ...
## ..$ o : num [1:312] 2 0 0 1 2 0 2 0 0 0 ...
## ..$ des : chr [1:312] "With Trey Mancini batting, Thomas Pannone picks off Jonathan Villar at 1st on throw to Justin Smoak. " "Mound Visit. " "Pitching Change: Ryan Tepera replaces Thomas Pannone. " "With Cedric Mullins batting, John Andreoli steals (1) 2nd base. " ...
## ..$ des_es : chr [1:312] "Con Trey Mancini bateando, Thomas Pannone sorprende a Jonathan Villar en la 1ra con disparo a Justin Smoak. " "Visita a la Lomita" "Cambio de Lanzador: Ryan Tepera reemplaza a Thomas Pannone. " "Con Cedric Mullins bateando, John Andreoli se roba (1) la 2da base. " ...
## ..$ event : chr [1:312] "Pickoff 1B" "Game Advisory" "Pitching Substitution" "Stolen Base 2B" ...
## ..$ tfs : chr [1:312] "164442" "180711" "182648" "183654" ...
## ..$ tfs_zulu : chr [1:312] "2018-08-22T16:44:42Z" "2018-08-22T18:07:11Z" "2018-08-22T18:26:48Z" "2018-08-22T18:36:54Z" ...
## ..$ player : num [1:312] 542340 430945 572193 607430 542340 ...
## ..$ pitch : num [1:312] 1 2 5 6 5 4 1 4 6 1 ...
## ..$ url : chr [1:312] "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" ...
## ..$ inning_side : chr [1:312] "top" "top" "top" "top" ...
## ..$ inning : num [1:312] 1 7 8 8 8 9 7 8 8 8 ...
## ..$ next_ : chr [1:312] "Y" "Y" "Y" "Y" ...
## ..$ num : chr [1:312] "4" "42" "8" "53" ...
## ..$ score : chr [1:312] NA NA NA NA ...
## ..$ home_team_runs: chr [1:312] "0" "0" "1" "1" ...
## ..$ away_team_runs: chr [1:312] "0" "0" "0" "0" ...
## ..$ event2 : chr [1:312] NA NA NA NA ...
## ..$ gameday_link : chr [1:312] "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" ...
## ..$ event_es : chr [1:312] "Out en Viraje a 1B" "Aviso en el Juego" "Cambio de Lanzador" "Base Robada 2B" ...
## ..$ event_num : chr [1:312] "26" "313" "379" "408" ...
## ..$ play_guid : chr [1:312] "fadb5c9a-b418-4171-9cd2-4eafd557bdce" NA NA "f8ae4795-b78c-4e51-bd44-04bce1bfe93b" ...
## ..$ event2_es : chr [1:312] NA NA NA NA ...
## $ pitch :'data.frame': 4413 obs. of 50 variables:
## ..$ des : chr [1:4413] "Ball" "Ball" "Called Strike" "Called Strike" ...
## ..$ des_es : chr [1:4413] "Bola mala" "Bola mala" "Strike cantado" "Strike cantado" ...
## ..$ id : num [1:4413] 3 4 5 6 7 8 9 10 14 15 ...
## ..$ type : chr [1:4413] "B" "B" "S" "S" ...
## ..$ tfs : chr [1:4413] "163856" "163910" "163923" "163938" ...
## ..$ tfs_zulu : chr [1:4413] "2018-08-22T16:38:56Z" "2018-08-22T16:39:10Z" "2018-08-22T16:39:23Z" "2018-08-22T16:39:38Z" ...
## ..$ x : num [1:4413] 80.4 62.1 93.8 79.4 83.4 ...
## ..$ y : num [1:4413] 175 154 162 174 166 ...
## ..$ sv_id : chr [1:4413] "180822_163901" "180822_163915" "180822_163928" "180822_163942" ...
## ..$ start_speed : num [1:4413] 89.3 89.5 88.2 88.4 90 72.1 89.3 89.6 90 83.8 ...
## ..$ end_speed : num [1:4413] 79.8 79.7 79.1 79.4 81 65.5 79.7 81.1 81 75.8 ...
## ..$ sz_top : num [1:4413] 3.26 3.28 3.26 3.12 3.26 ...
## ..$ sz_bot : num [1:4413] 1.48 1.5 1.48 1.39 1.48 ...
## ..$ pfx_x : num [1:4413] 5.52 4.58 5.71 6.05 6.15 ...
## ..$ pfx_z : num [1:4413] 10.4 10 10.7 11.4 11 ...
## ..$ px : num [1:4413] 0.959 1.44 0.609 0.986 0.881 ...
## ..$ pz : num [1:4413] 2.37 3.15 2.83 2.4 2.7 ...
## ..$ x0 : num [1:4413] 1.27 1.34 1.32 1.16 1.09 ...
## ..$ y0 : num [1:4413] 50 50 50 50 50 ...
## ..$ z0 : num [1:4413] 4.95 5.01 5.01 4.93 4.94 ...
## ..$ vx0 : num [1:4413] -2.62 -1.26 -3.66 -2.41 -2.59 ...
## ..$ vy0 : num [1:4413] -130 -130 -128 -129 -131 ...
## ..$ vz0 : num [1:4413] -3.67 -1.74 -2.59 -3.77 -3.17 ...
## ..$ ax : num [1:4413] 9.27 7.71 9.4 10.02 10.52 ...
## ..$ ay : num [1:4413] 31.4 32.2 29.7 29.5 31.3 ...
## ..$ az : chr [1:4413] "-14.7786707194148" "-15.2884019002941" "-14.6331034350153" "-13.2149882308981" ...
## ..$ break_y : chr [1:4413] "23.7" "23.7" "23.7" "23.7" ...
## ..$ break_angle : chr [1:4413] "-28.9" "-25.2" "-29.4" "-34.0" ...
## ..$ break_length : num [1:4413] 4.3 4.2 4.3 4.2 4.1 15.1 4.7 4.6 4 8.8 ...
## ..$ pitch_type : chr [1:4413] "FF" "FF" "FF" "FF" ...
## ..$ type_confidence: num [1:4413] 2 2 2 2 2 2 2 2 2 2 ...
## ..$ zone : num [1:4413] 12 12 3 12 12 13 7 8 8 13 ...
## ..$ nasty : num [1:4413] 59 28 43 46 50 22 29 25 47 45 ...
## ..$ spin_dir : num [1:4413] 152 155 152 152 151 ...
## ..$ spin_rate : num [1:4413] 2202 2069 2251 2419 2397 ...
## ..$ cc : chr [1:4413] "" "" "" "" ...
## ..$ mt : chr [1:4413] "" "" "" "" ...
## ..$ url : chr [1:4413] "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" ...
## ..$ inning_side : chr [1:4413] "top" "top" "top" "top" ...
## ..$ inning : num [1:4413] 1 1 1 1 1 1 1 1 1 1 ...
## ..$ next_ : chr [1:4413] "Y" "Y" "Y" "Y" ...
## ..$ num : num [1:4413] 1 1 1 1 1 1 1 1 2 2 ...
## ..$ on_1b : num [1:4413] NA NA NA NA NA NA NA NA NA NA ...
## ..$ on_2b : num [1:4413] NA NA NA NA NA NA NA NA NA NA ...
## ..$ on_3b : num [1:4413] NA NA NA NA NA NA NA NA NA NA ...
## ..$ count : Factor w/ 12 levels "0-0","0-1","0-2",..: 1 4 7 8 9 9 12 12 1 2 ...
## ..$ gameday_link : chr [1:4413] "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" ...
## ..$ code : chr [1:4413] "B" "B" "C" "C" ...
## ..$ event_num : chr [1:4413] "3" "4" "5" "6" ...
## ..$ play_guid : chr [1:4413] "6289c5d3-bcb1-4fbb-a972-b1f0ae888a19" "3c970495-dc61-4249-b36e-1b29f1bec006" "50cdbe92-3a4b-4c2d-b213-66e6406eee6b" "d0dc2bd6-31c2-4c85-8ad4-71c11cf2d517" ...
## $ runner:'data.frame': 880 obs. of 14 variables:
## ..$ id : num [1:880] 542340 542340 542921 542921 542921 ...
## ..$ start : chr [1:880] "" "1B" "" "1B" ...
## ..$ end : chr [1:880] "1B" "" "1B" "2B" ...
## ..$ event : chr [1:880] "Hit By Pitch" "Pickoff 1B" "Walk" "Flyout" ...
## ..$ score : chr [1:880] NA NA NA NA ...
## ..$ rbi : chr [1:880] NA NA NA NA ...
## ..$ earned : chr [1:880] NA NA NA NA ...
## ..$ url : chr [1:880] "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" ...
## ..$ inning_side : chr [1:880] "top" "top" "top" "top" ...
## ..$ inning : num [1:880] 1 1 2 2 2 4 4 7 7 7 ...
## ..$ next_ : chr [1:880] "Y" "Y" "Y" "Y" ...
## ..$ num : num [1:880] 2 3 8 9 10 22 23 40 41 41 ...
## ..$ gameday_link: chr [1:880] "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" ...
## ..$ event_num : chr [1:880] "18" "26" "69" "79" ...
## $ po :'data.frame': 134 obs. of 11 variables:
## ..$ des : chr [1:134] "Pickoff Attempt 1B" "Pickoff Attempt 1B" "Pickoff Attempt 1B" "Pickoff 1B" ...
## ..$ url : chr [1:134] "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" ...
## ..$ inning_side : chr [1:134] "top" "top" "top" "top" ...
## ..$ inning : num [1:134] 1 1 1 1 8 8 8 8 8 8 ...
## ..$ next_ : chr [1:134] "Y" "Y" "Y" "Y" ...
## ..$ num : num [1:134] 3 3 3 3 51 51 51 51 51 52 ...
## ..$ gameday_link: chr [1:134] "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" ...
## ..$ des_es : chr [1:134] "Viraje a 1B" "Viraje a 1B" "Viraje a 1B" "Out en Viraje a 1B" ...
## ..$ event_num : chr [1:134] "20" "22" "23" "24" ...
## ..$ play_guid : chr [1:134] "8c70f191-42d8-464c-a98a-46fa10c31008" "dd6f0afe-7b89-4968-95ec-780d39b6a99e" "fadb5c9a-b418-4171-9cd2-4eafd557bdce" NA ...
## ..$ catcher : chr [1:134] NA NA NA NA ...
## - attr(*, "class")= chr "list_inning_all"
The returned data is a lot to take in, but we can see that it is a list with a few main categories.
dat %>% names
## [1] "atbat" "action" "pitch" "runner" "po"
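For a list of data frames like this one, a shorter overview than the full str dump is possible. The sketch below uses a toy list (made up here so the example runs without downloading anything) with str's max.level argument and a row count per element.

```r
# Toy stand-in for dat: a named list of small data frames.
toy <- list(
  atbat = data.frame(num = c(1, 2, 3),
                     event = c("Flyout", "Hit By Pitch", "Groundout"),
                     stringsAsFactors = FALSE),
  pitch = data.frame(num = c(1, 1),
                     type = c("B", "B"),
                     stringsAsFactors = FALSE)
)

# Show only the top level of the list, not every column.
str(toy, max.level = 1)

# Number of rows in each element, as a named integer vector.
rows <- vapply(toy, nrow, integer(1))
rows
```

On the real data the same two calls would be str(dat, max.level = 1) and vapply(dat, nrow, integer(1)).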
What’s in atbat?
dat$atbat %>% str
## 'data.frame': 1108 obs. of 33 variables:
## $ pitcher : num 623381 623381 623381 623381 623381 ...
## $ batter : num 656775 542340 641820 430945 542921 ...
## $ num : num 1 2 3 7 8 9 10 14 15 16 ...
## $ b : num 3 1 2 2 4 3 1 1 2 0 ...
## $ s : num 2 2 1 2 2 2 2 3 3 0 ...
## $ o : num 1 1 3 1 1 2 3 1 2 3 ...
## $ start_tfs : chr "163853" "164115" "164321" "165312" ...
## $ start_tfs_zulu: chr "2018-08-22T16:38:53Z" "2018-08-22T16:41:15Z" "2018-08-22T16:43:21Z" "2018-08-22T16:53:12Z" ...
## $ stand : chr "R" "R" "R" "R" ...
## $ b_height : chr "5-8" "6-1" "6-4" "6-2" ...
## $ p_throws : chr "L" "L" "L" "L" ...
## $ atbat_des : chr "Cedric Mullins flies out to left fielder Teoscar Hernandez. " "Jonathan Villar hit by pitch. " "Trey Mancini grounds out, shortstop Richard Urena to first baseman Justin Smoak. " "Adam Jones pops out to catcher Danny Jansen in foul territory. " ...
## $ atbat_des_es : chr "Cedric Mullins batea elevado de out a jardinero izquierdo Teoscar Hernandez. " "Jonathan Villar golpeado por lanzamiento. " "Trey Mancini batea rodado de out, campo corto Richard Urena a primera base Justin Smoak. " "Adam Jones batea elevadito de out a receptor Danny Jansen en territorio de foul. " ...
## $ event : chr "Flyout" "Hit By Pitch" "Groundout" "Pop Out" ...
## $ score : chr NA NA NA NA ...
## $ home_team_runs: chr "0" "0" "0" "0" ...
## $ away_team_runs: chr "0" "0" "0" "0" ...
## $ url : chr "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" ...
## $ inning_side : chr "top" "top" "top" "top" ...
## $ inning : num 1 1 1 2 2 2 2 3 3 3 ...
## $ next_ : chr "Y" "Y" "Y" "Y" ...
## $ event2 : chr NA NA NA NA ...
## $ event3 : logi NA NA NA NA NA NA ...
## $ batter_name : chr "Cedric Mullins" "Jonathan Villar" "Trey Mancini" "Adam Jones" ...
## $ pitcher_name : chr NA NA NA NA ...
## $ event4 : logi NA NA NA NA NA NA ...
## $ gameday_link : chr "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" ...
## $ date : Factor w/ 1 level "2018-08-22": 1 1 1 1 1 1 1 1 1 1 ...
## $ end_tfs_zulu : chr "2018-08-22T16:41:04Z" "2018-08-22T16:42:44Z" "2018-08-22T16:45:44Z" "2018-08-22T16:55:25Z" ...
## $ event_num : chr "12" "18" "31" "60" ...
## $ event_es : chr "Elevado de Out" "Pelotazo" "Roletazo de Out" "Elevado de Out" ...
## $ play_guid : chr "f522593e-268d-4c6c-9b6e-75b81410ed23" "b1d0ffae-b850-4dcb-a7a0-005edbe7aeef" "12c9f256-8a4d-432d-b5fe-ef85aeb492c3" "c998db59-166e-403e-86e7-f92a3a69a8df" ...
## $ event2_es : chr NA NA NA NA ...
Looks like a description of every at bat, unsurprisingly.
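A quick way to get a feel for the at bats is to tabulate the event column. This is a small sketch on a toy data frame built from the first four rows printed above. On the real data, atbat would be dat$atbat.

```r
library(dplyr)

# Toy stand-in for dat$atbat, using the first four rows shown above.
atbat <- data.frame(
  batter_name = c("Cedric Mullins", "Jonathan Villar",
                  "Trey Mancini", "Adam Jones"),
  event = c("Flyout", "Hit By Pitch", "Groundout", "Pop Out"),
  stringsAsFactors = FALSE
)

# Count how often each at-bat outcome occurs, most frequent first.
event_counts <- atbat %>% count(event, sort = TRUE)
event_counts
```

With only four rows every event shows up once. The table gets more interesting on a full day of games.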
action?
dat$action %>% str
## 'data.frame': 312 obs. of 24 variables:
## $ b : num 0 0 0 3 2 0 0 0 3 0 ...
## $ s : num 1 1 0 2 3 0 0 0 2 1 ...
## $ o : num 2 0 0 1 2 0 2 0 0 0 ...
## $ des : chr "With Trey Mancini batting, Thomas Pannone picks off Jonathan Villar at 1st on throw to Justin Smoak. " "Mound Visit. " "Pitching Change: Ryan Tepera replaces Thomas Pannone. " "With Cedric Mullins batting, John Andreoli steals (1) 2nd base. " ...
## $ des_es : chr "Con Trey Mancini bateando, Thomas Pannone sorprende a Jonathan Villar en la 1ra con disparo a Justin Smoak. " "Visita a la Lomita" "Cambio de Lanzador: Ryan Tepera reemplaza a Thomas Pannone. " "Con Cedric Mullins bateando, John Andreoli se roba (1) la 2da base. " ...
## $ event : chr "Pickoff 1B" "Game Advisory" "Pitching Substitution" "Stolen Base 2B" ...
## $ tfs : chr "164442" "180711" "182648" "183654" ...
## $ tfs_zulu : chr "2018-08-22T16:44:42Z" "2018-08-22T18:07:11Z" "2018-08-22T18:26:48Z" "2018-08-22T18:36:54Z" ...
## $ player : num 542340 430945 572193 607430 542340 ...
## $ pitch : num 1 2 5 6 5 4 1 4 6 1 ...
## $ url : chr "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" ...
## $ inning_side : chr "top" "top" "top" "top" ...
## $ inning : num 1 7 8 8 8 9 7 8 8 8 ...
## $ next_ : chr "Y" "Y" "Y" "Y" ...
## $ num : chr "4" "42" "8" "53" ...
## $ score : chr NA NA NA NA ...
## $ home_team_runs: chr "0" "0" "1" "1" ...
## $ away_team_runs: chr "0" "0" "0" "0" ...
## $ event2 : chr NA NA NA NA ...
## $ gameday_link : chr "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" ...
## $ event_es : chr "Out en Viraje a 1B" "Aviso en el Juego" "Cambio de Lanzador" "Base Robada 2B" ...
## $ event_num : chr "26" "313" "379" "408" ...
## $ play_guid : chr "fadb5c9a-b418-4171-9cd2-4eafd557bdce" NA NA "f8ae4795-b78c-4e51-bd44-04bce1bfe93b" ...
## $ event2_es : chr NA NA NA NA ...
Looks like this has actions that are not the result of an at bat, such as a mound visit or pitching change.
pitch?
dat$pitch %>% str
## 'data.frame': 4413 obs. of 50 variables:
## $ des : chr "Ball" "Ball" "Called Strike" "Called Strike" ...
## $ des_es : chr "Bola mala" "Bola mala" "Strike cantado" "Strike cantado" ...
## $ id : num 3 4 5 6 7 8 9 10 14 15 ...
## $ type : chr "B" "B" "S" "S" ...
## $ tfs : chr "163856" "163910" "163923" "163938" ...
## $ tfs_zulu : chr "2018-08-22T16:38:56Z" "2018-08-22T16:39:10Z" "2018-08-22T16:39:23Z" "2018-08-22T16:39:38Z" ...
## $ x : num 80.4 62.1 93.8 79.4 83.4 ...
## $ y : num 175 154 162 174 166 ...
## $ sv_id : chr "180822_163901" "180822_163915" "180822_163928" "180822_163942" ...
## $ start_speed : num 89.3 89.5 88.2 88.4 90 72.1 89.3 89.6 90 83.8 ...
## $ end_speed : num 79.8 79.7 79.1 79.4 81 65.5 79.7 81.1 81 75.8 ...
## $ sz_top : num 3.26 3.28 3.26 3.12 3.26 ...
## $ sz_bot : num 1.48 1.5 1.48 1.39 1.48 ...
## $ pfx_x : num 5.52 4.58 5.71 6.05 6.15 ...
## $ pfx_z : num 10.4 10 10.7 11.4 11 ...
## $ px : num 0.959 1.44 0.609 0.986 0.881 ...
## $ pz : num 2.37 3.15 2.83 2.4 2.7 ...
## $ x0 : num 1.27 1.34 1.32 1.16 1.09 ...
## $ y0 : num 50 50 50 50 50 ...
## $ z0 : num 4.95 5.01 5.01 4.93 4.94 ...
## $ vx0 : num -2.62 -1.26 -3.66 -2.41 -2.59 ...
## $ vy0 : num -130 -130 -128 -129 -131 ...
## $ vz0 : num -3.67 -1.74 -2.59 -3.77 -3.17 ...
## $ ax : num 9.27 7.71 9.4 10.02 10.52 ...
## $ ay : num 31.4 32.2 29.7 29.5 31.3 ...
## $ az : chr "-14.7786707194148" "-15.2884019002941" "-14.6331034350153" "-13.2149882308981" ...
## $ break_y : chr "23.7" "23.7" "23.7" "23.7" ...
## $ break_angle : chr "-28.9" "-25.2" "-29.4" "-34.0" ...
## $ break_length : num 4.3 4.2 4.3 4.2 4.1 15.1 4.7 4.6 4 8.8 ...
## $ pitch_type : chr "FF" "FF" "FF" "FF" ...
## $ type_confidence: num 2 2 2 2 2 2 2 2 2 2 ...
## $ zone : num 12 12 3 12 12 13 7 8 8 13 ...
## $ nasty : num 59 28 43 46 50 22 29 25 47 45 ...
## $ spin_dir : num 152 155 152 152 151 ...
## $ spin_rate : num 2202 2069 2251 2419 2397 ...
## $ cc : chr "" "" "" "" ...
## $ mt : chr "" "" "" "" ...
## $ url : chr "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" ...
## $ inning_side : chr "top" "top" "top" "top" ...
## $ inning : num 1 1 1 1 1 1 1 1 1 1 ...
## $ next_ : chr "Y" "Y" "Y" "Y" ...
## $ num : num 1 1 1 1 1 1 1 1 2 2 ...
## $ on_1b : num NA NA NA NA NA NA NA NA NA NA ...
## $ on_2b : num NA NA NA NA NA NA NA NA NA NA ...
## $ on_3b : num NA NA NA NA NA NA NA NA NA NA ...
## $ count : Factor w/ 12 levels "0-0","0-1","0-2",..: 1 4 7 8 9 9 12 12 1 2 ...
## $ gameday_link : chr "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" ...
## $ code : chr "B" "B" "C" "C" ...
## $ event_num : chr "3" "4" "5" "6" ...
## $ play_guid : chr "6289c5d3-bcb1-4fbb-a972-b1f0ae888a19" "3c970495-dc61-4249-b36e-1b29f1bec006" "50cdbe92-3a4b-4c2d-b213-66e6406eee6b" "d0dc2bd6-31c2-4c85-8ad4-71c11cf2d517" ...
This has the result of every pitch. This is where most of the fun will be, such as looking at balls and strikes.
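One thing to watch for in the str output: az, break_y and break_angle come back as character, not numeric. Below is a small sketch of converting them and tallying balls and strikes, on a toy data frame built from the first four rows printed above. It uses across(), which assumes dplyr 1.0 or later.

```r
library(dplyr)

# Toy stand-in for dat$pitch, using the first four rows shown above.
pitch <- data.frame(
  type        = c("B", "B", "S", "S"),
  az          = c("-14.7786707194148", "-15.2884019002941",
                  "-14.6331034350153", "-13.2149882308981"),
  break_y     = c("23.7", "23.7", "23.7", "23.7"),
  break_angle = c("-28.9", "-25.2", "-29.4", "-34.0"),
  stringsAsFactors = FALSE
)

# Convert the character columns that hold numbers to numeric.
pitch_num <- pitch %>%
  mutate(across(c(az, break_y, break_angle), as.numeric))

# Tally pitches by type (B = ball, S = strike in the rows above).
type_counts <- pitch_num %>% count(type)
type_counts
```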
runner?
dat$runner %>% str
## 'data.frame': 880 obs. of 14 variables:
## $ id : num 542340 542340 542921 542921 542921 ...
## $ start : chr "" "1B" "" "1B" ...
## $ end : chr "1B" "" "1B" "2B" ...
## $ event : chr "Hit By Pitch" "Pickoff 1B" "Walk" "Flyout" ...
## $ score : chr NA NA NA NA ...
## $ rbi : chr NA NA NA NA ...
## $ earned : chr NA NA NA NA ...
## $ url : chr "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" ...
## $ inning_side : chr "top" "top" "top" "top" ...
## $ inning : num 1 1 2 2 2 4 4 7 7 7 ...
## $ next_ : chr "Y" "Y" "Y" "Y" ...
## $ num : num 2 3 8 9 10 22 23 40 41 41 ...
## $ gameday_link: chr "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" ...
## $ event_num : chr "18" "26" "69" "79" ...
Looks like this shows the movement of runners on the bases.
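The runner table only has player ids, not names. In the output above, runner id 542340 matches the batter id for Jonathan Villar, so I'm assuming that runner$id and atbat$batter use the same ids. Under that assumption, a sketch of attaching names with a join, on toy data copied from the rows printed above:

```r
library(dplyr)

# Toy stand-ins for dat$atbat and dat$runner, from the rows shown above.
atbat <- data.frame(
  batter      = c(656775, 542340, 641820, 430945),
  batter_name = c("Cedric Mullins", "Jonathan Villar",
                  "Trey Mancini", "Adam Jones"),
  stringsAsFactors = FALSE
)
runner <- data.frame(
  id    = c(542340, 542340),
  start = c("", "1B"),
  end   = c("1B", ""),
  event = c("Hit By Pitch", "Pickoff 1B"),
  stringsAsFactors = FALSE
)

# One row per player, then look up each runner's name by id.
# Assumption: runner$id is the same player id as atbat$batter.
players <- atbat %>% distinct(batter, batter_name)
runner_named <- runner %>% left_join(players, by = c("id" = "batter"))
runner_named
```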
po?
dat$po %>% str
## 'data.frame': 134 obs. of 11 variables:
## $ des : chr "Pickoff Attempt 1B" "Pickoff Attempt 1B" "Pickoff Attempt 1B" "Pickoff 1B" ...
## $ url : chr "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" "http://gd2.mlb.com/components/game/mlb/year_2018/month_08/day_22/gid_2018_08_22_balmlb_tormlb_1/inning/inning_all.xml" ...
## $ inning_side : chr "top" "top" "top" "top" ...
## $ inning : num 1 1 1 1 8 8 8 8 8 8 ...
## $ next_ : chr "Y" "Y" "Y" "Y" ...
## $ num : num 3 3 3 3 51 51 51 51 51 52 ...
## $ gameday_link: chr "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" "gid_2018_08_22_balmlb_tormlb_1" ...
## $ des_es : chr "Viraje a 1B" "Viraje a 1B" "Viraje a 1B" "Out en Viraje a 1B" ...
## $ event_num : chr "20" "22" "23" "24" ...
## $ play_guid : chr "8c70f191-42d8-464c-a98a-46fa10c31008" "dd6f0afe-7b89-4968-95ec-780d39b6a99e" "fadb5c9a-b418-4171-9cd2-4eafd557bdce" NA ...
## $ catcher : chr NA NA NA NA ...
Looks like this just has information about pickoff attempts. Not very exciting.
Conclusion
This has been a very short dive into what kind of data you can get from the R package mlbgameday. I’m going to try to do some actual data analysis next to see what I can dig up.