function MODEL-SELECTION(Learner, examples, k) returns a hypothesis
  local variables: err, an array, indexed by size, storing validation-set error rates
  for size = 1 to ∞ do
    err[size] ← CROSS-VALIDATION(Learner, size, examples, k)
    if err is starting to increase significantly then
      best_size ← the value of size with minimum err[size]
      return Learner(best_size, examples)
function CROSS-VALIDATION(Learner, size, examples, k) returns average validation-set error rate
  errs ← 0
  for fold = 1 to k do
    training_set, validation_set ← PARTITION(examples, fold, k)
    h ← Learner(size, training_set)
    errs ← errs + ERROR-RATE(h, validation_set)
  return errs ⁄ k // average error rate on validation sets, across k-fold cross-validation
Figure ?? An algorithm that selects the model with the lowest error rate on validation data by building models of increasing complexity and choosing the one with the best empirical error rate, err, on the validation data. Learner(size, examples) returns a hypothesis whose complexity is set by the parameter size and which is trained on the examples. PARTITION(examples, fold, k) splits examples into two subsets: a validation set of size N ⁄ k and a training set containing all the other examples. The split is different for each value of fold.
In the fourth edition, CROSS-VALIDATION-WRAPPER has been renamed MODEL-SELECTION.
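The pseudocode above can be sketched as executable Python. This is a minimal illustration, not the book's implementation: the toy `bin_learner` (a histogram regressor whose complexity parameter is the number of bins), the mean-squared-error `error_rate`, and the concrete stopping rule (stop once the error has failed to improve for `patience` consecutive sizes, standing in for "err is starting to increase significantly") are all assumptions introduced here.

```python
import random

def error_rate(h, examples):
    """Mean squared error of hypothesis h on (x, y) examples."""
    return sum((h(x) - y) ** 2 for x, y in examples) / len(examples)

def bin_learner(size, examples):
    """Toy learner (assumed, not from the text): split [0, 1] into
    `size` bins and predict the mean y of each bin; `size` controls
    hypothesis complexity."""
    sums, counts = [0.0] * size, [0] * size
    for x, y in examples:
        b = min(int(x * size), size - 1)
        sums[b] += y
        counts[b] += 1
    overall = sum(y for _, y in examples) / len(examples)
    means = [sums[b] / counts[b] if counts[b] else overall
             for b in range(size)]
    return lambda x: means[min(int(x * size), size - 1)]

def partition(examples, fold, k):
    """Split examples into (training_set, validation_set) for this fold;
    the validation set holds roughly N/k examples and differs per fold."""
    validation = [ex for i, ex in enumerate(examples) if i % k == fold]
    training = [ex for i, ex in enumerate(examples) if i % k != fold]
    return training, validation

def cross_validation(learner, size, examples, k):
    """Average error rate on the validation sets across k folds."""
    errs = 0.0
    for fold in range(k):
        training_set, validation_set = partition(examples, fold, k)
        h = learner(size, training_set)
        errs += error_rate(h, validation_set)
    return errs / k

def model_selection(learner, examples, k, max_size=12, patience=2):
    """Try sizes of increasing complexity; stop when validation error
    has not improved for `patience` sizes (an assumed concrete reading
    of "starting to increase significantly")."""
    err, best_size = {}, None
    for size in range(1, max_size + 1):
        err[size] = cross_validation(learner, size, examples, k)
        best_size = min(err, key=err.get)
        if size - best_size >= patience:
            break
    # Retrain at the best complexity on ALL the examples.
    return learner(best_size, examples)
```

Note the final step: once the best complexity is chosen, the returned hypothesis is trained on the full example set, since the per-fold hypotheses each saw only a (k-1)/k fraction of the data.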