The Craft of Smoothing
20-21 December 2010
COURSE:
"The Craft of Smoothing"
Paul Eilers & Brian Marx
Course Description:
In this course we present the basics and use of P-splines, a combination of regression on a B-spline basis and difference penalties (on the B-spline coefficients). Our approach is practical, because we see smoothing as an everyday tool for data analysis and statistics. We emphasize the use of modern software and we provide functions for R/S-Plus and Matlab.
There will be eight sessions:
Session 1 presents the idea of bases for regression. It shows why global bases, like power functions or orthogonal polynomials, are ineffective and why local bases (like B-splines) are attractive.
In Session 2, penalties are introduced, as a tool to give complete and easy control over smoothness. The combination of B-splines and difference penalties will be studied for smoothing, interpolation and extrapolation. In these first two sessions the data are assumed to be normally distributed around a smooth curve.
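The combination of a B-spline basis and a difference penalty described above can be sketched in a few lines. This is only an illustrative sketch, not the course software: it is written in Python with NumPy (the course itself provides R/S-Plus and Matlab functions), the function names `bspline_basis` and `pspline_fit` and all default values are hypothetical, and the data are simulated.

```python
import numpy as np


def bspline_basis(x, xl, xr, nseg=20, deg=3):
    """B-spline basis on equally spaced knots, via the Cox-de Boor recursion."""
    x = np.asarray(x, dtype=float)
    dx = (xr - xl) / nseg
    # Knots extend `deg` segments beyond both ends of [xl, xr].
    knots = xl + dx * np.arange(-deg, nseg + deg + 1)
    # Degree 0: indicator of each half-open knot interval.
    B = ((x[:, None] >= knots[:-1]) & (x[:, None] < knots[1:])).astype(float)
    for d in range(1, deg + 1):
        n = B.shape[1] - 1
        left = (x[:, None] - knots[:n]) / (d * dx)
        right = (knots[d + 1:d + 1 + n] - x[:, None]) / (d * dx)
        B = left * B[:, :n] + right * B[:, 1:]
    return B  # shape (len(x), nseg + deg)


def pspline_fit(x, y, lam=1.0, nseg=20, deg=3, order=2):
    """Penalized least squares: minimize |y - B a|^2 + lam * |D a|^2."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    B = bspline_basis(x, x.min(), x.max(), nseg, deg)
    # D takes differences of the given order of the B-spline coefficients.
    D = np.diff(np.eye(B.shape[1]), n=order, axis=0)
    coef = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)
    return B @ coef, coef


# Simulated example: normally distributed noise around a smooth curve.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 200)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)
fitted, coef = pspline_fit(x, y, lam=1.0)
```

In this sketch a single number, `lam`, controls smoothness: small values follow the data closely, while very large values with a second-order penalty pull the fit towards a straight line.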
In Session 3, we extend P-splines to non-normal data, like counts or a binomial response. The penalized regression framework makes it straightforward to transplant most ideas from generalized linear models to P-spline smoothing. Important applications are density estimation and variance smoothing.
Any smoothing method has to balance fidelity to the data and smoothness of the fitted curve. The optimal balance can be found by cross-validation or AIC. This subject is studied in Session 4, as well as the computation of error bands of an estimated curve. We also show how optimal smoothing performs on simulated data, to give you confidence that it makes the right choices.
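As a rough illustration of choosing the balance by cross-validation, the sketch below scores a grid of penalty weights with leave-one-out cross-validation for a P-spline fit. It is an assumption-laden example, not the course material: it uses Python with NumPy, the names `bspline_basis` and `loo_cv`, the grid of `lambdas` and the simulated data are all hypothetical, and it relies on the standard shortcut for linear smoothers, in which the leave-one-out residual equals the ordinary residual divided by one minus the corresponding diagonal element of the hat matrix.

```python
import numpy as np


def bspline_basis(x, xl, xr, nseg=20, deg=3):
    """B-spline basis on equally spaced knots, via the Cox-de Boor recursion."""
    x = np.asarray(x, dtype=float)
    dx = (xr - xl) / nseg
    knots = xl + dx * np.arange(-deg, nseg + deg + 1)
    B = ((x[:, None] >= knots[:-1]) & (x[:, None] < knots[1:])).astype(float)
    for d in range(1, deg + 1):
        n = B.shape[1] - 1
        left = (x[:, None] - knots[:n]) / (d * dx)
        right = (knots[d + 1:d + 1 + n] - x[:, None]) / (d * dx)
        B = left * B[:, :n] + right * B[:, 1:]
    return B


def loo_cv(B, y, lam, order=2):
    """Leave-one-out CV score (root mean square) without refitting."""
    D = np.diff(np.eye(B.shape[1]), n=order, axis=0)
    # G maps y to the coefficients; the hat matrix is H = B @ G.
    G = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T)
    fitted = B @ (G @ y)
    h = np.einsum("ij,ji->i", B, G)  # diagonal of the hat matrix
    r = (y - fitted) / (1.0 - h)     # leave-one-out residuals
    return float(np.sqrt(np.mean(r ** 2)))


# Simulated data; scan a logarithmic grid of penalty weights.
rng = np.random.default_rng(1)
x = np.linspace(0, 1, 100)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)
B = bspline_basis(x, 0.0, 1.0)
lambdas = 10.0 ** np.arange(-3, 4)
scores = [loo_cv(B, y, lam) for lam in lambdas]
best = lambdas[int(np.argmin(scores))]
print("lambda with the lowest CV score:", best)
```

A logarithmic grid is used here because the fit usually changes slowly with the penalty weight; a finer grid around the minimum can follow if needed.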
Session 5 places P-splines in a wider perspective. It presents Bayesian and mixed model interpretations of P-splines. Special attention is paid to streamlined computation.
In the first five sessions we only consider one-dimensional smoothing. When there are multiple explanatory variables, we can use generalized additive models, varying-coefficient models, or combinations of them. Tensor products of B-splines and multi-dimensional difference penalties make an excellent tool for smoothing in two (or more) dimensions. This is the subject of Session 6.
Session 8, the final session, looks at the use of P-splines in regression problems with very many ordered variables, as in optical spectra. In the chemometric literature this is known as multivariate calibration.
There will be a computer lab session, in which R software will be used to solve a number of smoothing problems. One part of the lab will concentrate on simple functions with limited goals. This will improve your understanding of what is going on “under the hood”. The other part will use the mgcv package, written by Simon Wood, a large but powerful tool that can handle a variety of situations.
Deadline for registration: 17 December 2010. For registration and further practical information, please contact Dymph Wijnen (d.wijnen@erasmusmc.nl), Department of Biostatistics, Erasmus MC, Dr. Molewaterplein 50, 3015 GE Rotterdam, Ee 2124 (21st floor), tel: +31-10-70 44514
Schedule
| Day | Start | End | Type | Who | Subject |
| --- | --- | --- | --- | --- | --- |
| Mon | 10:00 | 10:45 | Lecture | Paul | Regression and basis functions |
| Mon | 11:00 | 11:45 | Lecture | Paul | The power of penalties |
| Mon | 11:45 | 12:30 | Lecture | Brian | Generalized linear smoothing |
| Mon | 13:30 | 14:15 | Lecture | Brian | Optimal smoothing in action |
| Mon | 14:30 | 16:30 | Lab | Brian | Computer lab |
| Tue | 09:30 | 10:30 | Lecture | Paul | Bayesian, mixed model smoothing |
| Tue | 11:00 | 12:00 | Lecture | Brian | Multi-dimensional smoothing |
| Tue | 13:00 | 14:00 | Lecture | Paul | Specialized penalties |
| Tue | 14:00 | 15:00 | Lecture | Brian | Penalized signal regression |