
Machine Learning and Generalization Error — In Practice

Helene
5 min read · Jan 13, 2021


In the previous two articles, we talked about Hoeffding's Inequality and how it can be used in the setting of the learning problem. We saw that Hoeffding's Inequality cannot be applied directly to the learning problem: doing so would amount to verifying a single hypothesis rather than learning, that is, choosing the best hypothesis from a hypothesis set. In this article, we will work through practical examples of how to use the two bounds, before the next article in the series continues with how to improve the bound. In the last article, concerning the generalization error, it was made clear that the bound had to be loosened to become applicable to the learning situation: exactly M times looser, with M being the size of the hypothesis set. This will be illustrated here through two examples. We will start with the inequality concerning a single hypothesis.

Verification of a Hypothesis

In the case of a single hypothesis, we had the following inequality to work with:

P[ |E_in(h) - E_out(h)| > ε ] ≤ 2 exp(-2ε²N)

This inequality is a direct translation of Hoeffding's Inequality into the setting of a single hypothesis. In case it has been forgotten, let us quickly reiterate what the different variables in the inequality mean (a small numerical sketch of the bound follows the list):

  • E_in(h): The in-sample error of h.
  • E_out(h): The out-of-sample error of h; this is the generalization error.
  • ε: Our tolerance for how much we accept E_in(h) to deviate from E_out(h).
  • N: The number of examples in the sample.
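To make the bound concrete, here is a minimal sketch in Python (not part of the original article) that evaluates the single-hypothesis bound 2·exp(-2ε²N) for a few sample sizes and, for comparison, the M-times-looser bound 2M·exp(-2ε²N) that applies when we choose among M hypotheses. The function name, the value of ε, and the sample sizes are illustrative assumptions.

```python
import math

def hoeffding_bound(epsilon: float, n: int, m: int = 1) -> float:
    """Upper bound on P[|E_in(h) - E_out(h)| > epsilon].

    m = 1 gives the single-hypothesis (verification) bound;
    m > 1 gives the bound for a hypothesis set of size m,
    which is exactly m times looser.
    """
    return 2 * m * math.exp(-2 * epsilon ** 2 * n)

# Illustrative values (assumptions, not taken from the article)
epsilon = 0.05  # tolerated deviation between E_in and E_out
for n in (100, 1_000, 10_000):
    single = hoeffding_bound(epsilon, n)           # verification of one hypothesis
    learning = hoeffding_bound(epsilon, n, m=100)  # hypothesis set of size M = 100
    print(f"N={n:>6}: single-hypothesis bound = {single:.4f}, "
          f"M = 100 bound = {learning:.4f}")
```

Note that for small N the bound can exceed 1, in which case it tells us nothing useful; the factor of M in the second column is exactly the M-times loosening discussed above.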
