We revisit the oft-studied asymptotic (in sample size) behavior of the parameter or weight estimate returned by any member of a large family of neural network training algorithms. By properly accounting for the characteristic property of neural networks that their empirical and generalization errors possess multiple minima, we establish conditions under which the parameter estimate converges strongly into the set of minima of the generalization error. These results are then used to derive learning curves for generalization and empirical errors that lead to bounds on rates of convergence.