Last time I looked into BABIP to see whether pitchers could control it. I want to check a couple more things.
I’m getting the same data as last time.
library(mlbgameday)
library(magrittr)
library(ggplot2)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
# Takes hours to get all the data for the year
dat <- get_payload(start = "2018-01-01", end = "2018-12-31")
Get pitcher names from the at bat data. play_guid is included in the join since we only want the final pitch of each at bat, not every pitch.
d2 <- inner_join(dat$pitch, dat$atbat, by=c("num", "url", "play_guid"))
Confidence intervals
So my idea is that I should check the confidence interval for each pitcher’s BABIP to see how much they overlap.
inplayhits <- c("Double", "Single", "Triple")
inplayouts <- c("Bunt Groundout", "Bunt Lineout", "Bunt Pop Out", "Double Play", "Field Error", "Fielders Choice", "Fielders Choice Out", "Flyout", "Forceout", "Grounded Into DP", "Groundout", "Lineout", "Pop Out", "Triple Play")
BABIPbypitcher <- d2 %>% group_by(pitcher_name) %>% summarize(ipo=sum(event %in% inplayouts), iph=sum(event %in% inplayhits), N=iph+ipo, BABIP=iph/(iph+ipo)) %>% filter(N>100) %>% arrange(BABIP) %>% filter(!is.na(pitcher_name))
BABIPbypitcher %<>% mutate(BABIP_low=BABIP - 2*sqrt(ipo*iph/N^3), BABIP_high=BABIP + 2*sqrt(ipo*iph/N^3))
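The `2*sqrt(ipo*iph/N^3)` term is the usual normal-approximation (Wald) standard error for a proportion, rewritten in terms of the counts. As a short derivation, treating each ball in play as an independent trial (an assumption of the approximation, not something the data guarantees):

```latex
\begin{aligned}
\hat{p} &= \frac{\mathrm{iph}}{N}, \qquad 1 - \hat{p} = \frac{\mathrm{ipo}}{N}, \qquad N = \mathrm{iph} + \mathrm{ipo} \\
\mathrm{SE}(\hat{p}) &= \sqrt{\frac{\hat{p}\,(1 - \hat{p})}{N}}
  = \sqrt{\frac{\mathrm{iph} \cdot \mathrm{ipo}}{N^{3}}} \\
\text{interval} &= \hat{p} \pm 2\,\mathrm{SE}(\hat{p})
\end{aligned}
```

The multiplier 2 is a round stand-in for 1.96, so these are roughly 95% intervals.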
Let’s plot it.
ggplot(data=BABIPbypitcher %>% mutate(i=1:n()), mapping=aes(BABIP, i)) + geom_line(aes(value, i, group=i), reshape2::melt(BABIPbypitcher %>% mutate(i=1:n()), id.vars="i", measure.vars=c("BABIP_low", "BABIP_high"))) + geom_point(color="yellow") + geom_vline(xintercept = sum(BABIPbypitcher$iph)/sum(BABIPbypitcher$N))
Okay, so each pitcher is a horizontal line and a yellow dot. The yellow dot is their measured BABIP, and the black line is their approximate 95% confidence interval. The vertical line is the league average BABIP, and almost all of the intervals cover it. We can calculate what percentage of them include the league average. If every pitcher had the same true BABIP, we’d expect about 95% of them to include it.
leagueavgBABIP <- sum(BABIPbypitcher$iph)/sum(BABIPbypitcher$N)
BABIPbypitcher %<>% mutate(containsleagueavg = leagueavgBABIP > BABIP_low & leagueavgBABIP < BABIP_high)
BABIPbypitcher$containsleagueavg %>% table %>% print
## .
## FALSE TRUE
## 32 389
(sum(BABIPbypitcher$containsleagueavg) / nrow(BABIPbypitcher)) %>% print
## [1] 0.9239905
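We get about 92.4%, which is close to, though a little below, the 95% we would expect if every pitcher shared the same underlying BABIP. Real differences between pitchers should pull this number down, so we would not expect it to land much above 95% in any case. This is more convincing evidence that pitchers have minimal control over BABIP than what I checked before.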
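To get a feel for what coverage looks like when pitchers truly have no control, here is a minimal simulation sketch. Everything in it is an assumption for illustration: the common true BABIP of 0.295, the range of balls in play per pitcher, and the seed. Only the interval formula and the pitcher count of 421 come from the analysis above.

```r
set.seed(1)

n_pitchers <- 421      # same number of pitchers as in the coverage table above
p_true     <- 0.295    # assumed common true BABIP (hypothetical value)

# Hypothetical balls-in-play counts, all above the N > 100 cutoff
N   <- sample(101:600, n_pitchers, replace = TRUE)
iph <- rbinom(n_pitchers, size = N, prob = p_true)   # simulated in-play hits

babip      <- iph / N
half_width <- 2 * sqrt(babip * (1 - babip) / N)      # same 2-standard-error interval
league     <- sum(iph) / sum(N)                      # pooled "league average"

covers <- league > babip - half_width & league < babip + half_width
mean(covers)   # share of intervals that contain the league average
```

Re-running with different seeds shows how much the coverage moves around from chance alone, which gives some scale for judging the gap between 92.4% and 95%.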
Test for proportions
What we are asking is whether a set of proportions are all equal or if there is a difference. Instead of making silly plots, I should try running a statistical test. This is really easy with prop.test.
prop.test(BABIPbypitcher$iph, BABIPbypitcher$N)
##
## 421-sample test for equality of proportions without continuity
## correction
##
## data: BABIPbypitcher$iph out of BABIPbypitcher$N
## X-squared = 517.8, df = 420, p-value = 0.0007733
## alternative hypothesis: two.sided
## sample estimates:
## prop 1 prop 2 prop 3 prop 4 prop 5 prop 6 prop 7
## 0.1925926 0.2051282 0.2066116 0.2133333 0.2155689 0.2164948 0.2167488
## prop 8 prop 9 prop 10 prop 11 prop 12 prop 13 prop 14
## 0.2183099 0.2215686 0.2229730 0.2238806 0.2242991 0.2253521 0.2285714
## prop 15 prop 16 prop 17 prop 18 prop 19 prop 20 prop 21
## 0.2297297 0.2302158 0.2330097 0.2345679 0.2352941 0.2365591 0.2371795
## prop 22 prop 23 prop 24 prop 25 prop 26 prop 27 prop 28
## 0.2377049 0.2396694 0.2396694 0.2419355 0.2422360 0.2432432 0.2440476
## prop 29 prop 30 prop 31 prop 32 prop 33 prop 34 prop 35
## 0.2447917 0.2453704 0.2457143 0.2465278 0.2465753 0.2470588 0.2478448
## prop 36 prop 37 prop 38 prop 39 prop 40 prop 41 prop 42
## 0.2480620 0.2500000 0.2504931 0.2505800 0.2514620 0.2519841 0.2539683
## prop 43 prop 44 prop 45 prop 46 prop 47 prop 48 prop 49
## 0.2554348 0.2556391 0.2564103 0.2566372 0.2583333 0.2587719 0.2589928
## prop 50 prop 51 prop 52 prop 53 prop 54 prop 55 prop 56
## 0.2592593 0.2599338 0.2601156 0.2601626 0.2610294 0.2622642 0.2628205
## prop 57 prop 58 prop 59 prop 60 prop 61 prop 62 prop 63
## 0.2628676 0.2633452 0.2634561 0.2637168 0.2640000 0.2645051 0.2649573
## prop 64 prop 65 prop 66 prop 67 prop 68 prop 69 prop 70
## 0.2650602 0.2653061 0.2657343 0.2660256 0.2661871 0.2664360 0.2666667
## prop 71 prop 72 prop 73 prop 74 prop 75 prop 76 prop 77
## 0.2666667 0.2666667 0.2673267 0.2673469 0.2674419 0.2685185 0.2696391
## prop 78 prop 79 prop 80 prop 81 prop 82 prop 83 prop 84
## 0.2699822 0.2702703 0.2709677 0.2713178 0.2713348 0.2714536 0.2714777
## prop 85 prop 86 prop 87 prop 88 prop 89 prop 90 prop 91
## 0.2714932 0.2715232 0.2725528 0.2727273 0.2728732 0.2728785 0.2732342
## prop 92 prop 93 prop 94 prop 95 prop 96 prop 97 prop 98
## 0.2736842 0.2738589 0.2738739 0.2741433 0.2741935 0.2744063 0.2745098
## prop 99 prop 100 prop 101 prop 102 prop 103 prop 104 prop 105
## 0.2749004 0.2750000 0.2756410 0.2758621 0.2764706 0.2765957 0.2767857
## prop 106 prop 107 prop 108 prop 109 prop 110 prop 111 prop 112
## 0.2768080 0.2774566 0.2782609 0.2784091 0.2786070 0.2794118 0.2794411
## prop 113 prop 114 prop 115 prop 116 prop 117 prop 118 prop 119
## 0.2795389 0.2802548 0.2803030 0.2804878 0.2806122 0.2806324 0.2809917
## prop 120 prop 121 prop 122 prop 123 prop 124 prop 125 prop 126
## 0.2811245 0.2812500 0.2816901 0.2819383 0.2822967 0.2824074 0.2826087
## prop 127 prop 128 prop 129 prop 130 prop 131 prop 132 prop 133
## 0.2829736 0.2832370 0.2835821 0.2837838 0.2840909 0.2842809 0.2844828
## prop 134 prop 135 prop 136 prop 137 prop 138 prop 139 prop 140
## 0.2845528 0.2847059 0.2847682 0.2847682 0.2848233 0.2849162 0.2854077
## prop 141 prop 142 prop 143 prop 144 prop 145 prop 146 prop 147
## 0.2854442 0.2857143 0.2857143 0.2857143 0.2861789 0.2863850 0.2870370
## prop 148 prop 149 prop 150 prop 151 prop 152 prop 153 prop 154
## 0.2870813 0.2871287 0.2872928 0.2883117 0.2883721 0.2887029 0.2888350
## prop 155 prop 156 prop 157 prop 158 prop 159 prop 160 prop 161
## 0.2891089 0.2895377 0.2895753 0.2896996 0.2897727 0.2897898 0.2901554
## prop 162 prop 163 prop 164 prop 165 prop 166 prop 167 prop 168
## 0.2905028 0.2905983 0.2910053 0.2910448 0.2914286 0.2914286 0.2919708
## prop 169 prop 170 prop 171 prop 172 prop 173 prop 174 prop 175
## 0.2920518 0.2922078 0.2923077 0.2923077 0.2925046 0.2926829 0.2927308
## prop 176 prop 177 prop 178 prop 179 prop 180 prop 181 prop 182
## 0.2929936 0.2932331 0.2933333 0.2934028 0.2934363 0.2938596 0.2939189
## prop 183 prop 184 prop 185 prop 186 prop 187 prop 188 prop 189
## 0.2939633 0.2944345 0.2944785 0.2945205 0.2946429 0.2946429 0.2946708
## prop 190 prop 191 prop 192 prop 193 prop 194 prop 195 prop 196
## 0.2947598 0.2949853 0.2951168 0.2952381 0.2954545 0.2955801 0.2956811
## prop 197 prop 198 prop 199 prop 200 prop 201 prop 202 prop 203
## 0.2960000 0.2962963 0.2962963 0.2962963 0.2970451 0.2971429 0.2971888
## prop 204 prop 205 prop 206 prop 207 prop 208 prop 209 prop 210
## 0.2972973 0.2975709 0.2976879 0.2977941 0.2978723 0.2987013 0.2987698
## prop 211 prop 212 prop 213 prop 214 prop 215 prop 216 prop 217
## 0.2988506 0.2992701 0.2993631 0.2994012 0.2995392 0.3000000 0.3009119
## prop 218 prop 219 prop 220 prop 221 prop 222 prop 223 prop 224
## 0.3011811 0.3012048 0.3012259 0.3012821 0.3015267 0.3016529 0.3017241
## prop 225 prop 226 prop 227 prop 228 prop 229 prop 230 prop 231
## 0.3020833 0.3021390 0.3031161 0.3032258 0.3033708 0.3035019 0.3035714
## prop 232 prop 233 prop 234 prop 235 prop 236 prop 237 prop 238
## 0.3035714 0.3037190 0.3040541 0.3043478 0.3044776 0.3047138 0.3049096
## prop 239 prop 240 prop 241 prop 242 prop 243 prop 244 prop 245
## 0.3049327 0.3053435 0.3054187 0.3055556 0.3058824 0.3060429 0.3060498
## prop 246 prop 247 prop 248 prop 249 prop 250 prop 251 prop 252
## 0.3062016 0.3064516 0.3070175 0.3072626 0.3075314 0.3076923 0.3076923
## prop 253 prop 254 prop 255 prop 256 prop 257 prop 258 prop 259
## 0.3076923 0.3083333 0.3084416 0.3086420 0.3088803 0.3088803 0.3089005
## prop 260 prop 261 prop 262 prop 263 prop 264 prop 265 prop 266
## 0.3091787 0.3092105 0.3093923 0.3095238 0.3095238 0.3097166 0.3098107
## prop 267 prop 268 prop 269 prop 270 prop 271 prop 272 prop 273
## 0.3100437 0.3100775 0.3103448 0.3103448 0.3103448 0.3105802 0.3111111
## prop 274 prop 275 prop 276 prop 277 prop 278 prop 279 prop 280
## 0.3113208 0.3118908 0.3119266 0.3120567 0.3121272 0.3121387 0.3121693
## prop 281 prop 282 prop 283 prop 284 prop 285 prop 286 prop 287
## 0.3125000 0.3125000 0.3132530 0.3132530 0.3134328 0.3137255 0.3137755
## prop 288 prop 289 prop 290 prop 291 prop 292 prop 293 prop 294
## 0.3139535 0.3139535 0.3141593 0.3145161 0.3148148 0.3149171 0.3149171
## prop 295 prop 296 prop 297 prop 298 prop 299 prop 300 prop 301
## 0.3149606 0.3160920 0.3161765 0.3163636 0.3168724 0.3169014 0.3173077
## prop 302 prop 303 prop 304 prop 305 prop 306 prop 307 prop 308
## 0.3176895 0.3181818 0.3181818 0.3181818 0.3187500 0.3188976 0.3190184
## prop 309 prop 310 prop 311 prop 312 prop 313 prop 314 prop 315
## 0.3191489 0.3192771 0.3194103 0.3196721 0.3197279 0.3200000 0.3200758
## prop 316 prop 317 prop 318 prop 319 prop 320 prop 321 prop 322
## 0.3203125 0.3205128 0.3206522 0.3208333 0.3209877 0.3214286 0.3218391
## prop 323 prop 324 prop 325 prop 326 prop 327 prop 328 prop 329
## 0.3221757 0.3223684 0.3225806 0.3227513 0.3232044 0.3234568 0.3245033
## prop 330 prop 331 prop 332 prop 333 prop 334 prop 335 prop 336
## 0.3245779 0.3249551 0.3252033 0.3255814 0.3257576 0.3262411 0.3262411
## prop 337 prop 338 prop 339 prop 340 prop 341 prop 342 prop 343
## 0.3263598 0.3266332 0.3273138 0.3274854 0.3275862 0.3276231 0.3277778
## prop 344 prop 345 prop 346 prop 347 prop 348 prop 349 prop 350
## 0.3277778 0.3284314 0.3285372 0.3291139 0.3295455 0.3297872 0.3301205
## prop 351 prop 352 prop 353 prop 354 prop 355 prop 356 prop 357
## 0.3305085 0.3305785 0.3311688 0.3311688 0.3312883 0.3317972 0.3333333
## prop 358 prop 359 prop 360 prop 361 prop 362 prop 363 prop 364
## 0.3333333 0.3333333 0.3333333 0.3333333 0.3350000 0.3359375 0.3359375
## prop 365 prop 366 prop 367 prop 368 prop 369 prop 370 prop 371
## 0.3359375 0.3360434 0.3363229 0.3364486 0.3368421 0.3381295 0.3381295
## prop 372 prop 373 prop 374 prop 375 prop 376 prop 377 prop 378
## 0.3382353 0.3387978 0.3402778 0.3412698 0.3422222 0.3440367 0.3443709
## prop 379 prop 380 prop 381 prop 382 prop 383 prop 384 prop 385
## 0.3452915 0.3453237 0.3454545 0.3454545 0.3459716 0.3460317 0.3478261
## prop 386 prop 387 prop 388 prop 389 prop 390 prop 391 prop 392
## 0.3482759 0.3483146 0.3483871 0.3485714 0.3486842 0.3486842 0.3514644
## prop 393 prop 394 prop 395 prop 396 prop 397 prop 398 prop 399
## 0.3521127 0.3529412 0.3539326 0.3541667 0.3543478 0.3545817 0.3551724
## prop 400 prop 401 prop 402 prop 403 prop 404 prop 405 prop 406
## 0.3571429 0.3575949 0.3579545 0.3584906 0.3593750 0.3596838 0.3605634
## prop 407 prop 408 prop 409 prop 410 prop 411 prop 412 prop 413
## 0.3636364 0.3644860 0.3648649 0.3660714 0.3680556 0.3703704 0.3709677
## prop 414 prop 415 prop 416 prop 417 prop 418 prop 419 prop 420
## 0.3734940 0.3750000 0.3812950 0.3823529 0.3879310 0.3885714 0.4000000
## prop 421
## 0.4117647
This prints way too much. All we really want is the p-value.
prop.test(BABIPbypitcher$iph, BABIPbypitcher$N)$p.value
## [1] 0.000773256
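For reference, with more than two groups and no continuity correction, this test is a Pearson chi-squared test on the table of hits and outs, with the pooled proportion as the expected rate. The sketch below recomputes it by hand on four made-up pitchers (the counts are hypothetical, not taken from the data) and compares the result to `prop.test`.

```r
# Hypothetical counts for four pitchers (not from the real data)
x <- c(40, 70, 95, 150)      # in-play hits
n <- c(150, 220, 310, 480)   # balls in play

p_pool        <- sum(x) / sum(n)      # pooled proportion under the null
expected_hits <- n * p_pool
expected_outs <- n * (1 - p_pool)

# Pearson chi-squared statistic summed over both the hits and outs columns
x_squared <- sum((x - expected_hits)^2 / expected_hits) +
  sum(((n - x) - expected_outs)^2 / expected_outs)

# Degrees of freedom = number of groups minus one
p_by_hand <- pchisq(x_squared, df = length(x) - 1, lower.tail = FALSE)

fit <- prop.test(x, n)
c(by_hand = p_by_hand, prop_test = as.numeric(fit$p.value))
```

The two numbers should agree, which is also why the output above reports df = 420 for 421 pitchers.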
The p-value is less than 0.1%, which would qualify as statistically significant. But with this much data, almost anything that does not exactly follow the null hypothesis (equal proportions) will be significant. It will be more useful to compare this to the values calculated for other quantities.
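A rough way to see the sample-size effect is to simulate pitchers whose true rates differ only modestly and run the same test at two sample sizes. The spread of 0.02 around 0.295, the 400 pitchers, and the two sample sizes are all assumptions chosen for illustration; `p_value_for` is a hypothetical helper, not part of the original analysis.

```r
set.seed(2018)

n_pitchers <- 400
# Hypothetical true BABIPs: a modest spread around an assumed 0.295
true_p <- rnorm(n_pitchers, mean = 0.295, sd = 0.02)

# Hypothetical helper: simulate hits for every pitcher at a given number of
# balls in play, then test for equality of proportions
p_value_for <- function(n_bip) {
  hits <- rbinom(n_pitchers, size = n_bip, prob = true_p)
  prop.test(hits, rep(n_bip, n_pitchers))$p.value
}

p_small <- p_value_for(30)    # few balls in play per pitcher
p_large <- p_value_for(500)   # many balls in play per pitcher

c(small_sample = p_small, large_sample = p_large)
```

The underlying differences are identical in both runs; only the amount of data changes, and the large-sample p-value should come out far smaller.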
I also want to see how the confidence intervals compare to other pitching numbers to make sure it makes sense. I’m going to count, for each pitcher, some other common events whose rates are likely to vary more from pitcher to pitcher.
Checking proportion of balls in play
otherpitcherdata <- d2 %>% group_by(pitcher_name) %>% summarize(ipo=sum(event %in% inplayouts), iph=sum(event %in% inplayhits), ipall=ipo+iph, strikeout=sum(event %in% c("Strikeout", "Strikeout - DP")), walk=sum(event %in% c("Walk")), homerun=sum(event %in% c("Home Run")), allhits=iph+homerun, groundout=sum(event %in% c("Groundout", "Grounded Into DP", "Forceout", "Fielders Choice Out", "Fielders Choice")), flyout=sum(event %in% "Flyout"), N=n(), BABIP=iph/(iph+ipo)) %>% filter(N>100) %>% arrange(BABIP) %>% filter(!is.na(pitcher_name)) %>% mutate(i=1:n())
First let’s check the proportion of at bats that get a ball in play. This will be lower for strikeout/walk/home run pitchers.
otherpitcherdata %>% mutate(ballinplayprop=(iph+ipo)/N,
ballinplayprop_low=ballinplayprop-2*sqrt(ballinplayprop*(1-ballinplayprop)/N),
ballinplayprop_high=ballinplayprop+2*sqrt(ballinplayprop*(1-ballinplayprop)/N)
) %>%
{ggplot(data=., mapping=aes(ballinplayprop, i)) + geom_line(aes(value, i, group=i), reshape2::melt(., id.vars="i", measure.vars=c("ballinplayprop_low", "ballinplayprop_high"))) + geom_point(color="green") + geom_vline(xintercept = sum(.$iph+.$ipo)/sum(.$N))}
The proportion that includes the league mean is
otherpitcherdata %>% mutate(ballinplayprop=(iph+ipo)/N,
ballinplayprop_low=ballinplayprop-2*sqrt(ballinplayprop*(1-ballinplayprop)/N),
ballinplayprop_high=ballinplayprop+2*sqrt(ballinplayprop*(1-ballinplayprop)/N),
containsleagueavg = sum(.$iph+.$ipo)/sum(.$N) > ballinplayprop_low & sum(.$iph+.$ipo)/sum(.$N) < ballinplayprop_high
) %>% {table(.$containsleagueavg)}
##
## FALSE TRUE
## 252 283
This is just over half, or 52.9%, much less than 95%. And we get a p-value so small that it prints as zero.
otherpitcherdata %>% with(prop.test(iph+ipo, N)) %>% with(p.value)
## [1] 0
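A printed 0 here means the p-value is too small to represent as a double, not that it is exactly zero. One way to see the magnitude is to ask `pchisq` for the log of the tail probability. The statistic of 5000 below is a made-up stand-in, since the actual X-squared value is not shown above; the degrees of freedom follow from the 535 pitchers in the table.

```r
x_squared <- 5000      # hypothetical test statistic, for illustration only
df        <- 535 - 1   # 535 pitchers in the coverage table above

# Ordinary upper-tail p-value: underflows to 0
p <- pchisq(x_squared, df = df, lower.tail = FALSE)

# Same tail probability on the natural-log scale, which stays finite
log_p <- pchisq(x_squared, df = df, lower.tail = FALSE, log.p = TRUE)

p
log_p / log(10)   # approximate base-10 exponent of the p-value
```

Comparing log p-values is one option for ranking statistics that all print as 0.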
This code is awful, and difficult to copy/paste to try other events. I’ll write a function for it.
Function for comparing rates across pitchers
Here’s a function that will do all the steps we need.
- Get data for the events of interest, events1, as a proportion of events1+events2.
- Make a plot with confidence intervals.
- Find the confidence interval coverage.
- Find p-value.
I’m requiring more than 20 of each event type to avoid small sample issues.
compareevents <- function(events1, events2=setdiff(unique(d2$event), events1)) {
  # Per-pitcher counts, rate of events1 among events1+events2, and 2-standard-error interval
  otherpitcherdata <- d2 %>% group_by(pitcher_name) %>%
    summarize(e1=sum(event %in% events1), e2=sum(event %in% events2),
              N=e1+e2, e1rate=e1/N, e1stderr=sqrt(e1*e2/N^3),
              e1rate_low=e1rate-2*e1stderr, e1rate_high=e1rate+2*e1stderr) %>%
    filter(e1>20, e2>20) %>% filter(!is.na(pitcher_name)) %>%
    arrange(e1rate) %>% mutate(i=1:n())
  leagueavg <- sum(otherpitcherdata$e1)/sum(otherpitcherdata$N)
  otherpitcherdata %<>% mutate(containsleagueavg=e1rate_low < leagueavg & e1rate_high > leagueavg)
  # Make plot of CI
  print(ggplot(data=otherpitcherdata, mapping=aes(e1rate, i)) +
          geom_line(aes(value, i, group=i),
                    reshape2::melt(otherpitcherdata, id.vars="i", measure.vars=c("e1rate_low", "e1rate_high"))) +
          geom_point(color="green") +
          geom_vline(xintercept = leagueavg, color="red") +
          ggtitle(paste(c("Proportion of ", events1, "\nvs", events2), collapse = " "))
  )
  # Find coverage
  cat("Coverage is\n")
  print(table(otherpitcherdata$containsleagueavg))
  cat("\nProportion = ", sum(otherpitcherdata$containsleagueavg)/nrow(otherpitcherdata), '\n')
  # Find p-value
  cat("p-value is: ", with(otherpitcherdata, prop.test(e1, N)$p.value), "\n")
}
As a reminder, all of the events and their numbers of appearances are:
table(d2$event)
##
## Batter Interference Bunt Groundout Bunt Lineout
## 58 349 11
## Bunt Pop Out Catcher Interference Double
## 150 53 10372
## Double Play Fan interference Field Error
## 602 39 2166
## Fielders Choice Fielders Choice Out Flyout
## 127 390 25867
## Forceout Grounded Into DP Groundout
## 4438 4277 40187
## Hit By Pitch Home Run Intent Walk
## 2423 9136 3393
## Lineout Pop Out Runner Out
## 12900 10874 391
## Sac Bunt Sac Fly Sac Fly DP
## 911 1600 17
## Sacrifice Bunt DP Single Strikeout
## 1 32733 65614
## Strikeout - DP Triple Triple Play
## 263 1152 4
## Walk
## 27066
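As a rough cross-check, the league-wide BABIP under this post's definition can be recomputed from the counts in this table, using the same `inplayhits` and `inplayouts` labels defined earlier. This is only a sketch: it pools every at bat in `d2`, so it will not exactly match the league average computed from pitchers who pass the sample-size filter.

```r
# Counts copied from the table(d2$event) output above
hits <- c(Single = 32733, Double = 10372, Triple = 1152)
outs <- c(`Bunt Groundout` = 349, `Bunt Lineout` = 11, `Bunt Pop Out` = 150,
          `Double Play` = 602, `Field Error` = 2166, `Fielders Choice` = 127,
          `Fielders Choice Out` = 390, Flyout = 25867, Forceout = 4438,
          `Grounded Into DP` = 4277, Groundout = 40187, Lineout = 12900,
          `Pop Out` = 10874, `Triple Play` = 4)

# In-play hits divided by all balls in play, as in the BABIP column earlier
league_babip <- sum(hits) / (sum(hits) + sum(outs))
round(league_babip, 3)
```

Sacrifice flies and sacrifice bunts are left out here because they are not in either event list.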
Now we can quickly check how proportions vary across pitchers in the league for 2018.
BABIP
We’ve already checked this, so this run is just to make sure the new function is correct. The numbers will be a little different, since now I require more than 20 hits and more than 20 outs instead of more than 100 in-play events.
compareevents(inplayhits, inplayouts)
## Coverage is
##
## FALSE TRUE
## 35 464
##
## Proportion = 0.9298597
## p-value is: 1.284174e-05
Strikeout proportion
Let’s check strikeout proportion of all at bats.
compareevents(c("Strikeout", "Strikeout - DP"))
## Coverage is
##
## FALSE TRUE
## 268 323
##
## Proportion = 0.5465313
## p-value is: 0
Here there is clearly a difference across pitchers. Again, this means that pitchers have control of this aspect of the game and it’s not random chance, like BABIP appears to be.
Three true outcomes
compareevents(c("Strikeout", "Strikeout - DP", "Walk", "Intent Walk", "Home Run"))
## Coverage is
##
## FALSE TRUE
## 309 325
##
## Proportion = 0.5126183
## p-value is: 0
There is a difference across pitchers for the three true outcomes vs everything else (balls in play).
Home runs vs other hits
compareevents("Home Run", inplayhits)
## Coverage is
##
## FALSE TRUE
## 29 90
##
## Proportion = 0.7563025
## p-value is: 1.605688e-26
Some pitchers give up a lot of home runs relative to other hits.
Ground ball outs vs fly ball outs
I’d rather check ground balls vs fly balls, including hits and outs, but the various hits aren’t split up by hit type.
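# Note: "Fielder Error" matches no label in table(d2$event) (the label there is "Field Error"),
# so it adds nothing to the ground ball counts; "Grounded Into DP" is also not in this list.
compareevents(c("Bunt Groundout", "Double Play", "Fielder Error", "Fielders Choice",
                "Fielders Choice Out", "Forceout", "Groundout"),
              c("Flyout", "Lineout", "Pop Out", "Sac Fly", "Sac Fly DP"))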
## Coverage is
##
## FALSE TRUE
## 151 311
##
## Proportion = 0.6731602
## p-value is: 1.53392e-237
This is significant: there is a difference between flyball and groundball pitchers. League-wide, about half of these outs are fly balls and half are grounders.
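The "about half" figure can be sanity-checked from the league-wide counts in the `table(d2$event)` output, using only the labels from the call above that actually appear in that table. It is a rough check, since it pools all at bats rather than only the pitchers who pass the filter inside `compareevents`.

```r
# Ground ball events from the call above that exist in table(d2$event)
ground <- c(`Bunt Groundout` = 349, `Double Play` = 602, `Fielders Choice` = 127,
            `Fielders Choice Out` = 390, Forceout = 4438, Groundout = 40187)

# Fly ball events from the call above
fly <- c(Flyout = 25867, Lineout = 12900, `Pop Out` = 10874,
         `Sac Fly` = 1600, `Sac Fly DP` = 17)

# Share of these events that are ground balls
ground_share <- sum(ground) / (sum(ground) + sum(fly))
round(ground_share, 3)
```

The ground ball share comes out a little under one half, which fits the description in the text.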
On base percentage
compareevents(c("Home Run", inplayhits, "Walk", "Intent Walk", "Hit By Pitch"))
## Coverage is
##
## FALSE TRUE
## 168 470
##
## Proportion = 0.7366771
## p-value is: 3.336466e-160
There is a difference across pitchers for their allowed on base percentage.
Conclusion
I’ve looked again at some BABIP data. I’ve added confidence intervals and found that over 92% of the approximate 95% confidence intervals contain the league average, signifying that pitchers don’t have much control over BABIP. p-values from proportion tests aren’t very informative on their own for samples this big, but we looked at them to see how they differ across statistics. We checked some other rates for pitchers, such as ground ball rate, and everything else showed a clear difference across pitchers.