ICS 2011

Welcome to ICS2011
Innovations in Computer Science  ICS 2011, Tsinghua University, Beijing, China, January 79, 2011. Proceedings, 487495, 9787302245179
Tsinghua University Press
A popular practical method of obtaining a good estimate of the error rate of a learning algorithm is kfold crossvalidation. Here, the set of examples is first partitioned into k equalsized folds. Each fold acts as a test set for evaluating the hypothesis learned on the other k − 1 folds. The average error across the k hypotheses is used as an estimate of the error rate. Although widely used, especially with small values of k (such as 10), the crossvalidation method has heretofore resisted theoretical analysis due to the fact that the k distinct estimates have inherent correlations between them. With only sanitycheck bounds known, there is no compelling reason to use the kfold crossvalidation estimate over a simpler holdout estimate. Conventional wisdom is that the averaging in crossvalidation leads to a tighter concentration of the estimate of the error around its mean. In this paper, we show that the conventional wisdom is essentially correct. We analyze the reduction in variance of the gap between the crossvalidation estimate and the true error rate, and show that for a large family of stable algorithms crossvalidation achieves a near optimal variance reduction factor of (1 + o(1))k. In these cases, the k different estimates are essentially independent of each other. To proceed with the analysis, we define a new measure of algorithm stability, called meansquare stability. This measure is weaker than most stability notions described in the literature, and encompasses a large class of algorithms including bounded SVM regression and regularized leastsquares regression. For slightly less stable algorithms such as tnearestneighbor, we show that crossvalidation leads to an O( 1 ) reduction in the variance of the generalization error. Preview:

Copyright 20102011, Institute for Computer Science, Tsinghua University, All Rights Reserved.