# MODEL-SELECTION

## AIMA4e

function MODEL-SELECTION(Learner, examples, k) returns a hypothesis
 local variables: err, an array, indexed by size, storing validation-set error rates
 for size = 1 to ∞ do
   err[size] ← CROSS-VALIDATION(Learner, size, examples, k)
   if err is starting to increase significantly then
     best_size ← the value of size with minimum err[size]
     return Learner(best_size, examples)


function CROSS-VALIDATION(Learner, size, examples, k) returns average validation-set error rate
 errs ← 0
 for fold = 1 to k do
   training_set, validation_set ← PARTITION(examples, fold, k)
   h ← Learner(size, training_set)
   errs ← errs + ERROR-RATE(h, validation_set)
 return errs / k // average error rate on validation sets, across k-fold cross-validation
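
The two functions above can be sketched in Python. This is a minimal illustration, not the book's reference implementation: the `learner` and `error_rate` callables are assumed interfaces, and the informal test "err is starting to increase significantly" is approximated here by stopping once the error has kept rising for `patience` consecutive sizes past the best one seen so far.

```python
def partition(examples, fold, k):
    """Split examples into (training_set, validation_set) for this fold.
    Every k-th example (offset by fold) goes into the validation set, so
    the split differs for each value of fold."""
    validation = [ex for i, ex in enumerate(examples) if i % k == fold]
    training = [ex for i, ex in enumerate(examples) if i % k != fold]
    return training, validation

def cross_validation(learner, size, examples, k, error_rate):
    """Average validation-set error rate across k folds."""
    errs = 0.0
    for fold in range(k):
        training_set, validation_set = partition(examples, fold, k)
        h = learner(size, training_set)
        errs += error_rate(h, validation_set)
    return errs / k  # average error across k-fold cross-validation

def model_selection(learner, examples, k, error_rate,
                    max_size=20, patience=2):
    """Grow model complexity until validation error starts rising, then
    retrain at the best size on all the examples."""
    err = {}
    best_size = 1
    for size in range(1, max_size + 1):
        err[size] = cross_validation(learner, size, examples, k, error_rate)
        best_size = min(err, key=err.get)
        if size - best_size >= patience:  # err increasing "significantly"
            break
    return learner(best_size, examples)
```

As a toy usage example, a `learner` whose hypothesis is just its `size` and an `error_rate` that is U-shaped in that size (minimum at 4) makes `model_selection` stop shortly after size 4 and return the hypothesis trained at the best size.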


Figure ?? An algorithm to select the model that has the lowest error rate on validation data, by building models of increasing complexity and choosing the one with the best empirical error rate, err, on validation data. Learner(size, examples) returns a hypothesis whose complexity is set by the parameter size and which is trained on the examples. PARTITION(examples, fold, k) splits examples into two subsets: a validation set of size N/k and a training set with all the other examples. The split is different for each value of fold.


In the fourth edition, the cross-validation wrapper has been renamed Model-Selection.