Statistical Learning Theory Part 2: Optimality of the Bayes Classifier
Motivation and Proof of Optimality of the Bayes Classifier
1: Background & Motivation
Statistical Learning Theory provides a probabilistic framework for understanding the problem of Machine Learning Inference. In mathematical terms, the basic goals of Statistical Learning Theory can be formulated as:
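The formulation referred to above did not survive in this draft; a standard way to state the goal (my sketch, using conventional notation that may differ from the original) is:

```latex
\[
  R(h) = \mathbb{P}\bigl(h(X) \neq Y\bigr),
  \qquad
  R^{*} = \inf_{h}\, R(h),
\]
% where h : \mathcal{X} \to \mathcal{Y} ranges over all classifiers and
% (X, Y) \sim P for an unknown joint distribution P.
```

That is, data pairs $(X, Y)$ are drawn from an unknown distribution $P$, and the aim is to understand how closely a classifier learned from samples can approach the best achievable (Bayes) risk $R^{*}$.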
The Table of Contents for this piece is as follows:
With that said, let’s jump in.
2: Proof of Bayes Classifier Optimality
As stated in the previous section, we rarely know the true Bayes Classifier in practice. However, studying the Bayes Classifier still gives us a benchmark: its risk is a lower bound on the risk of any classifier, i.e., the best performance it is possible to achieve. For Supervised Classification with a 0/1 loss function, we first prove optimality of the Bayes Classifier in the binary case, and then in the multiclass case.
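As a quick sanity check of this benchmark, the following sketch simulates a binary problem with a known conditional probability $\eta(x)$ and compares the empirical error of the Bayes rule (threshold $\eta$ at $1/2$) against a deliberately mis-thresholded rule. The choice of $\eta$ and all names are illustrative assumptions, not from the original piece.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated binary problem with a known regression function
# eta(x) = P(Y = 1 | X = x); this particular eta is illustrative only.
n = 200_000
X = rng.uniform(-3, 3, size=n)
eta = 1.0 / (1.0 + np.exp(-2.0 * X))      # true conditional probability
Y = (rng.uniform(size=n) < eta).astype(int)

# Bayes classifier for 0/1 loss: predict 1 exactly when eta(x) >= 1/2.
bayes_pred = (eta >= 0.5).astype(int)

# A deliberately suboptimal rule that thresholds eta at 0.7 instead.
other_pred = (eta >= 0.7).astype(int)

bayes_err = np.mean(bayes_pred != Y)      # empirical estimate of the Bayes risk
other_err = np.mean(other_pred != Y)

print(f"Bayes rule error: {bayes_err:.4f}")
print(f"Mis-thresholded rule error: {other_err:.4f}")
```

On a large sample the Bayes rule's empirical error comes out strictly below the mis-thresholded rule's, as the optimality result proved below guarantees in expectation.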
2.1: Binary Classification Case
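The binary proof was not reproduced in this draft; a standard version of the argument (my sketch, writing $\eta(x) = \mathbb{P}(Y = 1 \mid X = x)$ and $h^{*}(x) = \mathbf{1}\{\eta(x) \geq 1/2\}$ for the Bayes classifier, which may differ notationally from the original) runs as follows:

```latex
\[
  R(h) - R(h^{*})
  = \mathbb{E}\bigl[\, \lvert 2\eta(X) - 1 \rvert \,
      \mathbf{1}\{h(X) \neq h^{*}(X)\} \,\bigr] \;\geq\; 0 .
\]
```

Sketch: conditioning on $X = x$, any classifier $h$ errs with probability $\eta(x)\,\mathbf{1}\{h(x)=0\} + (1-\eta(x))\,\mathbf{1}\{h(x)=1\}$, which is minimized pointwise by predicting the more probable label, i.e., by $h^{*}$; integrating over $X$ gives $R(h) \geq R(h^{*})$ for every classifier $h$.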
2.2: Multiclass Classification Case
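Likewise for the multiclass case, a standard sketch (my notation, assuming labels $\{1, \dots, K\}$ and writing $\eta_k(x) = \mathbb{P}(Y = k \mid X = x)$):

```latex
\[
  h^{*}(x) = \arg\max_{k \in \{1,\dots,K\}} \eta_{k}(x),
  \qquad
  \mathbb{P}\bigl(h(X) \neq Y \mid X = x\bigr)
  = 1 - \eta_{h(x)}(x)
  \;\geq\; 1 - \max_{k} \eta_{k}(x)
  = \mathbb{P}\bigl(h^{*}(X) \neq Y \mid X = x\bigr).
\]
```

Taking expectations over $X$ yields $R(h) \geq R(h^{*})$ for any classifier $h$, so the binary result extends directly to $K$ classes.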
3: Wrap-up and Conclusions
I will be writing future pieces on Statistical Learning Theory that leverage the optimality of the Bayes Classifier as proven above.
For solid Statistical Learning Theory references, I would recommend the textbooks “All of Statistics” and “All of Nonparametric Statistics” by Larry Wasserman (Professor of Statistics and Machine Learning at Carnegie Mellon), “The Elements of Statistical Learning” by faculty at Stanford, and “Statistical Learning Theory” by Vladimir Vapnik.
I look forward to writing future pieces. Please subscribe and follow me here on Medium!