A Field-test of Basic Empirical Bayes and Bayes Methodologies: In-Season Prediction of Baseball Batting Averages
- Speaker: Lawrence D. Brown (Pennsylvania)
- Date & Time: Tuesday 14 October 2008, 17:00 - 18:00
- Venue: Wolfson Room (MR 2) Centre for Mathematical Sciences, Wilberforce Road, Cambridge
Abstract
Batting average is one of the principal performance measures for an individual baseball player. It has a simple numerical structure: the number of successful attempts, "Hits", as a proportion of the total number of qualifying attempts, "At-Bats". This situation, with Hits as a number of successes within a qualifying number of attempts, makes it natural to statistically model each player's batting average as a binomial variable outcome, with a true (but unknown) value of $p_i$ that represents the i-th player's latent ability. This is a common data structure in many statistical applications, so the methodological study here has implications for a range of such applications.
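In symbols, the model described above can be sketched as follows. The notation is assumed here for illustration and is not taken from the abstract: $H_i$ and $N_i$ denote the Hits and At-Bats of player $i$, $p_i$ is that player's latent ability, $\hat{p}_i$ is the observed batting average, and $m$ is the number of players.

```latex
H_i \mid p_i \sim \mathrm{Binomial}(N_i,\, p_i),
\qquad
\hat{p}_i = \frac{H_i}{N_i},
\qquad
i = 1, \dots, m .
```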
We will look at batting records for every Major League player over the course of a single season (2005). The primary focus is on using only the batting record from an earlier part of the season (e.g., the first 3 months) in order to predict the batter's latent ability, $p_i$, and consequently to predict their batting-average performance for the remainder of the season. Since we are using a season that has already concluded, we can validate our predictive performance by comparing the predicted values to the actual values for the remainder of the season.
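The split-season validation idea can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two baseline prediction rules, the squared-error criterion, and the function names are hypothetical, and the numbers are made up rather than drawn from the 2005 records.

```python
def naive_prediction(hits, at_bats):
    """Predict each player's later average by that player's own early-season average."""
    return [h / n for h, n in zip(hits, at_bats)]


def pooled_prediction(hits, at_bats):
    """Predict every player's later average by the pooled early-season average."""
    pooled = sum(hits) / sum(at_bats)
    return [pooled] * len(hits)


def total_squared_error(predicted, actual):
    """Sum of squared differences between predictions and realized averages."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual))


if __name__ == "__main__":
    # Made-up illustrative records, NOT the 2005 data.
    early_hits, early_at_bats = [30, 20, 45], [100, 90, 150]
    late_hits, late_at_bats = [40, 35, 50], [150, 120, 170]

    # Realized batting averages for the remainder of the (toy) season.
    actual = [h / n for h, n in zip(late_hits, late_at_bats)]

    for name, rule in [("naive", naive_prediction), ("pooled", pooled_prediction)]:
        error = total_squared_error(rule(early_hits, early_at_bats), actual)
        print(f"{name}: {error:.5f}")
```

Any candidate method can be scored in the same way by passing its predictions to `total_squared_error`.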
The methodological purpose of this study is to gain experience with a variety of predictive methods applicable to a much wider range of situations. Several of the methods to be investigated derive from empirical Bayes and hierarchical Bayes interpretations. Although the general ideas behind these techniques have been understood for many decades*, some of these methods have only been refined relatively recently in a manner that promises to more accurately fit data such as that at hand.
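As a rough illustration of the empirical Bayes idea, the sketch below shrinks each observation toward a common mean. It assumes a Normal model $X_i \sim N(\theta_i, v_i)$ with known variances $v_i$ and a Normal prior $\theta_i \sim N(\mu, \tau^2)$ whose parameters are estimated from the data by a simple method of moments. This is one basic member of the family, not necessarily one of the methods compared in the talk, and the function name `eb_shrink` is hypothetical.

```python
def eb_shrink(x, var):
    """Method-of-moments empirical Bayes shrinkage (illustrative sketch).

    Assumed model: x[i] ~ N(theta_i, var[i]) with known var[i],
    and theta_i ~ N(mu, tau2) with mu and tau2 estimated from the data.
    Requires at least two observations.
    """
    m = len(x)
    mu = sum(x) / m  # unweighted estimate of the prior mean

    # The sample variance of x estimates tau2 plus the average sampling
    # variance, so subtract the latter and truncate at zero.
    sample_var = sum((xi - mu) ** 2 for xi in x) / (m - 1)
    tau2 = max(0.0, sample_var - sum(var) / m)

    # Posterior mean under the fitted prior: shrink x[i] toward mu,
    # more strongly when var[i] is large relative to tau2.
    return [mu + tau2 / (tau2 + vi) * (xi - mu) for xi, vi in zip(x, var)]
```

When the estimated prior variance is zero, every estimate collapses to the common mean; when it is large relative to the sampling variances, the estimates stay close to the raw observations.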
One feature of all of the statistical methodologies here is the preliminary use of a particular form of variance-stabilizing transformation in order to transform the binomial data problem into a somewhat more familiar structure involving (approximately) Normal random variables with known variances. This transformation technique is also useful in validating the binomial model assumption that is the conceptual basis for all our analyses. If time permits, we will also describe how it can be used to test for the presence of "streaky hitters", batters whose latent ability appears to significantly change over time.
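The abstract does not say which transformation is used. A standard choice for binomial data is the arcsine square-root transform, sketched below as an assumption: for Hits $H$ out of $N$ At-Bats, the transformed value is approximately Normal with mean $\arcsin\sqrt{p}$ and variance close to $1/(4N)$, which does not depend on the unknown $p$. The small offset `c` and its default value are illustrative choices; `c=0` gives the classical form.

```python
import math


def arcsine_transform(hits, at_bats, c=0.25):
    """Arcsine square-root transform of a binomial proportion.

    The offset c is an illustrative assumption; c=0 is the classical form.
    """
    return math.asin(math.sqrt((hits + c) / (at_bats + 2 * c)))


def approx_variance(at_bats):
    """Approximate variance of the transformed value, free of the unknown p."""
    return 1.0 / (4.0 * at_bats)


def back_transform(x):
    """Map a value on the transformed scale back to a proportion."""
    return math.sin(x) ** 2


if __name__ == "__main__":
    # Made-up example: 27 hits in 100 at-bats.
    x = arcsine_transform(27, 100)
    print(x, approx_variance(100), back_transform(x))
```

Estimation can then be carried out on the transformed scale as a Normal-means problem with known variances, with `back_transform` returning results to the batting-average scale.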
No prior knowledge of the sport of baseball is required.
- A particularly relevant background reference is Efron, B. and Morris, C. (1977), "Stein's paradox in statistics", Scientific American 236, 119-127, and the earlier, more technical version (1975), "Data analysis using Stein's estimator and its generalizations", Jour. Amer. Stat. Assoc. 70, 311-319.
Series
This talk is part of the Kuwait Foundation Lectures series.