AIE

How can the generalization error be estimated?

asked 6 months ago

10.3K

How would you estimate the generalization error? What are the methods of achieving this?

machine-learning

deep-neural-networks

overfitting

computational-learning-theory

generalization

3 Answers

AIE

• answered 6 months ago

Generalization error is the error a model makes on data it has not seen before. To measure it, hold out a subset of your data and do not train the model on it. After training, evaluate the model's accuracy (or another performance measure) on the held-out subset. Because the model has never seen those examples, the result is an estimate of the generalization error. This held-out subset is called a test set.
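
A minimal sketch of the hold-out idea, using only the Python standard library. The synthetic data generator, the 1-D nearest-centroid classifier, and all function names (`make_data`, `fit_threshold`, `holdout_error`) are assumptions made up for illustration; any model and dataset could take their place.

```python
import random

def make_data(n, seed):
    """Hypothetical synthetic data: class 0 ~ N(0, 1), class 1 ~ N(2, 1)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        data.append((rng.gauss(2.0 * label, 1.0), label))
    return data

def fit_threshold(train):
    """Nearest-centroid rule in 1-D: threshold halfway between class means."""
    xs0 = [x for x, y in train if y == 0]
    xs1 = [x for x, y in train if y == 1]
    return (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2.0

def error_rate(threshold, data):
    """Fraction of points misclassified by the rule 'predict 1 if x > threshold'."""
    wrong = sum(1 for x, y in data if (x > threshold) != (y == 1))
    return wrong / len(data)

def holdout_error(data, test_fraction=0.25, seed=0):
    """Shuffle, set aside a test set, train on the rest, score on the test set."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    test, train = shuffled[:n_test], shuffled[n_test:]
    threshold = fit_threshold(train)          # the model never sees `test`
    return error_rate(threshold, test), len(train), len(test)

if __name__ == "__main__":
    err, n_train, n_test = holdout_error(make_data(400, seed=1))
    print(f"train={n_train} test={n_test} estimated generalization error={err:.3f}")
```

The estimate is only as good as the test set is large and representative; with a small test set it can vary noticeably from one random split to another.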

A second held-out subset, called a validation set, can be used for parameter (hyperparameter) selection. The training set is unsuitable for tuning because error on it does not measure generalization, and the test set is unsuitable too, because tuning against it would overfit the test data. That is why a third subset is needed.
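
The three-way split can be sketched as follows. The 60/20/20 proportions, the 1-D k-nearest-neighbour classifier, the candidate values of `k`, and the helper names are all illustrative assumptions, not something prescribed by this answer.

```python
import random

def make_data(n, seed):
    """Hypothetical synthetic data: class 0 ~ N(0, 1), class 1 ~ N(2, 1)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        data.append((rng.gauss(2.0 * label, 1.0), label))
    return data

def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    neighbors = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in neighbors)
    return 1 if 2 * votes > k else 0

def error_rate(train, data, k):
    wrong = sum(1 for x, y in data if knn_predict(train, x, k) != y)
    return wrong / len(data)

def select_and_test(data, candidates=(1, 5, 15), seed=0):
    """Tune k on the validation set, then touch the test set exactly once."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    a, b = n * 6 // 10, n * 8 // 10           # 60% train, 20% val, 20% test
    train, val, test = shuffled[:a], shuffled[a:b], shuffled[b:]
    val_errors = {k: error_rate(train, val, k) for k in candidates}
    best_k = min(val_errors, key=val_errors.get)
    test_error = error_rate(train, test, best_k)  # final, untuned-against estimate
    return best_k, val_errors, test_error

if __name__ == "__main__":
    best_k, val_errors, test_error = select_and_test(make_data(300, seed=2))
    print("validation errors:", val_errors)
    print("chosen k:", best_k, "test error:", round(test_error, 3))
```

The validation error of the chosen `k` tends to be optimistic, since `k` was picked to minimize it; the test error is the number to report.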

Finally, to obtain a more reliable estimate, we can repeat the procedure over many different train/test partitions and average the results. This is called cross-validation.
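
A minimal sketch of k-fold cross-validation, one common way of choosing the partitions. As before, the synthetic data, the 1-D nearest-centroid classifier, and the function names are assumptions for illustration only.

```python
import random

def make_data(n, seed):
    """Hypothetical synthetic data: class 0 ~ N(0, 1), class 1 ~ N(2, 1)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        data.append((rng.gauss(2.0 * label, 1.0), label))
    return data

def fit_threshold(train):
    """Nearest-centroid rule in 1-D: threshold halfway between class means."""
    xs0 = [x for x, y in train if y == 0]
    xs1 = [x for x, y in train if y == 1]
    return (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2.0

def error_rate(threshold, data):
    wrong = sum(1 for x, y in data if (x > threshold) != (y == 1))
    return wrong / len(data)

def k_fold_error(data, k=5, seed=0):
    """Each fold serves as the test set once; the k errors are averaged."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]    # k disjoint, near-equal folds
    errors = []
    for i in range(k):
        test = folds[i]
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        errors.append(error_rate(fit_threshold(train), test))
    return sum(errors) / k, errors

if __name__ == "__main__":
    mean_err, fold_errs = k_fold_error(make_data(200, seed=3), k=5)
    print("per-fold errors:", [round(e, 3) for e in fold_errs])
    print("cross-validated error:", round(mean_err, 3))
```

The spread of the per-fold errors gives a rough sense of how much a single train/test split could mislead.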

AIE

• answered 6 months ago

Error estimation is a subject with a long history. The test-set method is only one way to estimate generalization error. Others include resubstitution, cross-validation, bootstrap, posterior-probability estimators, and bolstered estimators. These and more are reviewed, for instance, in the book: Braga-Neto and Dougherty, "Error Estimation for Pattern Recognition," IEEE-Wiley, 2015.
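
To make two of these concrete, here is a rough sketch contrasting resubstitution (scoring the model on its own training data) with a simple out-of-bag bootstrap estimate. It is one basic variant written for illustration, not the specific estimators analysed in the book; the 1-nearest-neighbour classifier, the synthetic data, and the function names are assumptions.

```python
import random

def make_data(n, seed):
    """Hypothetical synthetic data: class 0 ~ N(0, 1), class 1 ~ N(2, 1)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        data.append((rng.gauss(2.0 * label, 1.0), label))
    return data

def nn_predict(train, x):
    """1-nearest-neighbour: copy the label of the closest training point."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def error_rate(train, data):
    return sum(1 for x, y in data if nn_predict(train, x) != y) / len(data)

def resubstitution_error(data):
    """Train and test on the same data; typically optimistic."""
    return error_rate(data, data)

def bootstrap_oob_error(data, n_rounds=50, seed=0):
    """Train on a resample drawn with replacement, test on the points left out."""
    rng = random.Random(seed)
    n = len(data)
    errors = []
    for _ in range(n_rounds):
        idx = [rng.randrange(n) for _ in range(n)]
        chosen = set(idx)
        train = [data[i] for i in idx]
        oob = [data[i] for i in range(n) if i not in chosen]
        if oob:                                   # skip the rare empty case
            errors.append(error_rate(train, oob))
    return sum(errors) / len(errors)

if __name__ == "__main__":
    data = make_data(200, seed=4)
    # For 1-NN each point is its own nearest neighbour, so (barring
    # duplicate inputs) the resubstitution error is zero.
    print("resubstitution:", resubstitution_error(data))
    print("bootstrap out-of-bag:", round(bootstrap_oob_error(data), 3))
```

The gap between the two numbers shows why resubstitution alone is a poor guide for flexible models.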

AIE

• answered 6 months ago

Beyond empirical experiments, there is essentially no way to test this. Generalization bounds hold only if your data-generating process actually satisfies the model's assumptions, and you usually cannot know that it does.
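
As one concrete illustration of how a bound rests on its assumptions, consider Hoeffding's inequality applied to a fixed model $h$ scored on $n$ test examples. The notation is introduced here and is not from the answer: $R(h)$ is the true error, $\hat{R}_n(h)$ is the test-set error, and $\delta$ is the allowed failure probability. Assuming the test examples are drawn i.i.d. from the distribution the model will face, the loss lies in $[0, 1]$, and $h$ was chosen without looking at the test set:

```latex
\Pr\left( \left| R(h) - \hat{R}_n(h) \right| \le \sqrt{\frac{\ln(2/\delta)}{2n}} \right) \ge 1 - \delta
```

For example, with $n = 1000$ and $\delta = 0.05$ the margin is about $0.043$. If the i.i.d. assumption fails, for instance under distribution shift, the guarantee no longer applies.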
