To get some perspective on the relationship between team performance and team payrolls, I have gathered data on player salaries for each season from 2011 through 2015 from the widely-cited database available at Spotrac. These data are broken out by positional categories and include a figure for “dead” money, contractual obligations to players who are no longer with the team.
Regression Results
I start with a model that includes two variables. The first is player salaries, the sum of payments to pitchers, catchers, infielders, outfielders, and designated hitters, in millions of dollars (“PlayersMM”). The second is a residual “other” category, the difference between reported total payroll and player salaries, again in millions of dollars (“OtherMM”). As these results show clearly, player salaries are what matter when it comes to winning:
    Pooled OLS, using 150 observations
    Included 30 cross-sectional units
    Time-series length = 5
    Dependent variable: Regular-Season Wins

                 coefficient   std. error   t-ratio   p-value
    ---------------------------------------------------------
    const        69.2663       2.22782      31.09     1.62e-66  ***
    PlayersMM     0.130358     0.0227014     5.742    5.17e-08  ***
    OtherMM       0.00928354   0.0340035     0.2730   0.7852

    Mean dependent var   81.02667   S.D. dependent var   11.10653
    Sum squared resid    14929.62   S.E. of regression   10.07780
    R-squared            0.187720   Adjusted R-squared   0.176668
The overall explanatory power of this model is pretty low. Just under 19% of the variance in wins can be statistically accounted for using these salary data. That’s just another way of stating what the graph in the main article shows, that there is a lot of “scatter” around the model’s predictions. Teams with identical payrolls can win considerably more games than the model predicts, or considerably fewer.
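The “just under 19%” figure can be cross-checked from the summary statistics printed with the regression. The snippet below is a minimal sketch, not the original estimation: it only assumes the standard definitions of R-squared and adjusted R-squared, and the variable names are my own.

```python
# Summary statistics copied from the simple model's regression output.
n_obs = 150            # 30 teams x 5 seasons
n_regressors = 2       # PlayersMM and OtherMM (the constant is not counted)
sd_wins = 11.10653     # S.D. dependent var
ssr = 14929.62         # Sum squared resid

# Total sum of squares, recovered from the sample standard deviation of wins.
sst = (n_obs - 1) * sd_wins ** 2

# Share of the variance in wins accounted for by the model.
r_squared = 1 - ssr / sst

# Adjusted R-squared penalizes the fit for the number of regressors.
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_regressors - 1)

print(f"R-squared:          {r_squared:.4f}")
print(f"Adjusted R-squared: {adj_r_squared:.4f}")
```

Both values should agree with the reported R-squared and adjusted R-squared up to rounding in the inputs.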
Because the data contain multiple measurements on each of the thirty teams, we can exploit that feature and allow for specific “team effects.” These are the source of the second graph in the main article and come from estimating a model with “dummy” variables included for each of the teams. After whittling down the results to the teams with “significant” effects, we’re left with this model, which forms the basis for the chart comparing teams in the main article:
    Pooled OLS, using 150 observations
    Included 30 cross-sectional units
    Time-series length = 5
    Dependent variable: Regular-Season wins

                 coefficient   std. error   t-ratio   p-value
    ---------------------------------------------------------
    const         75.3935      2.20255      34.23     2.64e-69  ***
    PlayersMM      0.103430    0.0207827     4.977    1.89e-06  ***
    OtherMM       −0.0360856   0.0302921    −1.191    0.2356
    CHC          −10.1867      3.98745      −2.555    0.0117    **
    COL          −13.5862      4.02725      −3.374    0.0010    ***
    CWS           −8.46620     3.98705      −2.123    0.0355    **
    HOU          −16.6031      4.11470      −4.035    9.00e-05  ***
    MIA          −11.2013      4.08010      −2.745    0.0069    ***
    MIN          −13.8250      4.01661      −3.442    0.0008    ***
    PHI           −8.65976     4.05484      −2.136    0.0345    **
    SEA           −7.50076     4.00319      −1.874    0.0631    *
    STL           10.1054      3.98768       2.534    0.0124    **

    Mean dependent var   81.02667   S.D. dependent var   11.10653
    Sum squared resid    10435.62   S.E. of regression   8.695999
    R-squared            0.432227   Adjusted R-squared   0.386969
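To see how the team effects shift the model's predictions, here is a small sketch that plugs the coefficients from this output into the fitted equation. The function name and the payroll figures in the example ($100 million on players, $10 million other) are hypothetical, chosen only for illustration. Teams without a listed dummy are treated as having no team effect.

```python
# Coefficients copied from the team-effects regression output.
CONST = 75.3935
PLAYERS_COEF = 0.103430     # wins per $1 million of player salaries
OTHER_COEF = -0.0360856     # wins per $1 million of "other" payroll

# Team dummies retained in the model; all other teams get an effect of zero.
TEAM_EFFECTS = {
    "CHC": -10.1867, "COL": -13.5862, "CWS": -8.46620,
    "HOU": -16.6031, "MIA": -11.2013, "MIN": -13.8250,
    "PHI": -8.65976, "SEA": -7.50076, "STL": 10.1054,
}


def predicted_wins(team, players_mm, other_mm):
    """Return the model's predicted regular-season wins for one team-season."""
    return (CONST
            + PLAYERS_COEF * players_mm
            + OTHER_COEF * other_mm
            + TEAM_EFFECTS.get(team, 0.0))


# Hypothetical payroll, identical for every team, to isolate the team effect.
for team in ("STL", "NYY", "HOU"):
    print(team, round(predicted_wins(team, 100.0, 10.0), 1))
```

With identical payrolls, the gap between any two predictions is just the difference between their team effects, which is what the team-comparison chart is built on.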
After adjusting for the extreme cases, two things happen to the estimated effects of spending. First, the coefficient for player salaries falls from 0.13 in the simple model without adjustments to 0.10 here. That suggests that an average team needs to spend about $10 million (1/0.10) to gain another victory, rather than the $8 million (1/0.13) implied by the simple model. However, a more nuanced view of the spending effect takes into account the coefficient’s “standard error” of 0.02. That can be used to construct a “confidence interval” around the estimated effect: subtracting and adding twice the standard error gives a range of 0.06 to 0.14 wins per million dollars, which we can be about 95% confident contains the actual effect of spending. That means it costs somewhere between $7 million (7 x 0.14 ≈ 1 win) and $16 million (16 x 0.06 ≈ 1 win) to gain another win.
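The cost-per-win arithmetic can be reproduced directly from the reported coefficients. This is a minimal sketch using the unrounded values from the two regression outputs and the same rule of thumb of two standard errors on either side of the estimate. The variable names are mine.

```python
# PlayersMM coefficients (wins per $1 million) from the two models.
simple_coef = 0.130358      # model without team effects
team_coef = 0.103430        # model with team effects
team_se = 0.0207827         # standard error of team_coef

# Approximate 95% confidence interval: estimate plus or minus two standard errors.
ci_low = team_coef - 2 * team_se
ci_high = team_coef + 2 * team_se

# Cost of one additional win, in $ millions, is the reciprocal of the coefficient.
cost_simple = 1 / simple_coef
cost_team = 1 / team_coef
cost_cheapest = 1 / ci_high      # a larger effect means a cheaper win
cost_dearest = 1 / ci_low        # a smaller effect means a more expensive win

print(f"Interval for the effect: {ci_low:.2f} to {ci_high:.2f} wins per $1 million")
print(f"Cost per win, simple model:       ${cost_simple:.1f} million")
print(f"Cost per win, team-effects model: ${cost_team:.1f} million")
print(f"Range of cost per win: ${cost_cheapest:.1f} to ${cost_dearest:.1f} million")
```

Note that the interval is symmetric around the coefficient but not around the cost per win, because taking the reciprocal stretches the upper end of the range.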
Second, the portion of teams’ payrolls not devoted to players on the field now has the “proper” negative sign, though it still falls considerably short of conventional significance levels. It’s possible that teams are penalized for making bad decisions that lead to large pools of “dead” money, but the evidence is still pretty weak, and the effect quite small.
Finally, we can ask whether spending on pitchers is more or less productive than spending on position players. The answer is that it doesn’t appear to matter. Including separate terms for the two groups’ salaries adds no predictive power to the model. As far as these data can tell, spending another ten million dollars on pitching has the same effect as investing that money in the rest of the team.