On this page, we will discuss some math terms used in machine learning and deep learning that I have no idea about.

Lipschitz continuity

Given two metric spaces $(X, d_X)$ and $(Y, d_Y)$, a function $f : X \to Y$ is said to be Lipschitz continuous if there exists a constant $K \geq 0$ such that $d_Y(f(x_1), f(x_2)) \leq K \, d_X(x_1, x_2)$ for all $x_1, x_2 \in X$.

Any such $K$ is called a Lipschitz constant for the function $f$. The smallest constant for which the inequality holds is called the Lipschitz constant of $f$.
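
As a concrete example (my own illustration, not part of the definition): on $\mathbb{R}$ with the usual distance, $\sin$ is Lipschitz with constant $1$, while $x \mapsto x^2$ is not globally Lipschitz:

$$|\sin x_1 - \sin x_2| = |\cos c| \, |x_1 - x_2| \leq |x_1 - x_2| \quad \text{(mean value theorem, for some $c$ between $x_1$ and $x_2$)}$$

$$\frac{|(x+1)^2 - x^2|}{|(x+1) - x|} = |2x + 1| \to \infty \ \text{ as } x \to \infty, \ \text{ so no single $K$ works on all of } \mathbb{R}.$$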

Think of a double cone whose vertex slides along the graph of the function, with the slope of the cone equal to the Lipschitz constant. The function is Lipschitz continuous exactly when some such cone can be chosen so that the graph always stays outside it, i.e., when the slope between any two points on the graph is bounded.
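
To make this concrete, here is a minimal sketch (my own illustration, not from any library) that estimates a Lipschitz constant numerically by sampling slopes between random pairs of points; the choice of function and sampling range is arbitrary:

```python
import numpy as np

def estimate_lipschitz(f, low=-5.0, high=5.0, n_pairs=100_000, seed=0):
    """Estimate a Lipschitz constant of f on [low, high] by sampling slopes.

    The Lipschitz constant is the supremum of |f(x1) - f(x2)| / |x1 - x2|,
    so the sampled maximum is a lower bound that improves with more samples.
    """
    rng = np.random.default_rng(seed)
    x1 = rng.uniform(low, high, n_pairs)
    x2 = rng.uniform(low, high, n_pairs)
    mask = x1 != x2  # avoid dividing by zero
    slopes = np.abs(f(x1[mask]) - f(x2[mask])) / np.abs(x1[mask] - x2[mask])
    return slopes.max()

# sin is 1-Lipschitz, so the estimate should be close to (but never above) 1.
print(estimate_lipschitz(np.sin))
```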

Krylov subspace

Given an $n \times n$ matrix $A$ and a vector $b$, the Krylov subspace of order $r$ is defined as:

$$\mathcal{K}_r(A, b) = \operatorname{span}\{\, b, \, Ab, \, A^2 b, \, \ldots, \, A^{r-1} b \,\}$$
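
As a quick illustration (a sketch of my own, not a reference implementation), the snippet below stacks the vectors $b, Ab, \ldots, A^{r-1}b$ and orthonormalizes them with a QR decomposition to get a basis of the Krylov subspace; in practice, methods such as Arnoldi or Lanczos build this basis more stably:

```python
import numpy as np

def krylov_basis(A, b, r):
    """Return an orthonormal basis (as columns) of the order-r Krylov subspace K_r(A, b)."""
    vectors = [b]
    for _ in range(r - 1):
        vectors.append(A @ vectors[-1])  # next power: A^k b
    K = np.column_stack(vectors)         # columns b, Ab, ..., A^{r-1} b
    Q, _ = np.linalg.qr(K)               # orthonormalize the columns
    return Q

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
b = rng.standard_normal(6)
Q = krylov_basis(A, b, r=3)
print(Q.shape)  # (6, 3): an orthonormal basis of a 3-dimensional subspace
```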

Polyak-Lojasiewicz inequality

The Polyak-Lojasiewicz (PL) inequality is a mathematical inequality that is used to prove the convergence of optimization algorithms. It states that the gap between the function value and the minimum value of the function is bounded by the squared magnitude of the gradient:

$$\frac{1}{2} \|\nabla f(x)\|^2 \geq \mu \left( f(x) - f^* \right) \quad \text{for all } x,$$

where $f(x)$ is the function value at $x$, $f^*$ is the minimum value of the function, $\nabla f(x)$ is the gradient of the function at $x$, and $\mu > 0$ is a constant that depends on the function.

In other words, if the squared magnitude of the gradient of the loss function is always at least a constant times the gap between the loss and its minimum, then gradient descent drives the loss down exponentially fast (and hence the gradients as well).
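
To preview where the exponential rate comes from, here is the standard one-step argument (my own summary, assuming in addition that $f$ is $L$-smooth and that gradient descent uses step size $1/L$, i.e. $x_{t+1} = x_t - \frac{1}{L}\nabla f(x_t)$):

$$f(x_{t+1}) \leq f(x_t) - \frac{1}{2L}\|\nabla f(x_t)\|^2 \leq f(x_t) - \frac{\mu}{L}\big(f(x_t) - f^*\big),$$

$$\text{so} \quad f(x_{t+1}) - f^* \leq \Big(1 - \frac{\mu}{L}\Big)\big(f(x_t) - f^*\big) \quad \text{and} \quad f(x_t) - f^* \leq \Big(1 - \frac{\mu}{L}\Big)^t \big(f(x_0) - f^*\big).$$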

We care about a simpler version of this inequality: the minimum value $f^*$ of the function is hard to know in practice, so we drop that term, which amounts to assuming $f^* = 0$ and asking for $\frac{1}{2}\|\nabla f(x)\|^2 \geq \mu f(x)$. We will first prove this simpler form of the inequality and then see how it can be used to prove the convergence of optimization algorithms.
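
As a small numerical sanity check (a sketch, not tied to any particular result in these notes), gradient descent on a strongly convex quadratic, which satisfies the PL inequality with $\mu$ equal to the smallest eigenvalue of the Hessian, shows the predicted exponential decay of the loss:

```python
import numpy as np

# f(x) = 0.5 * x^T H x with H positive definite: f* = 0, the PL inequality holds
# with mu = smallest eigenvalue of H, and f is L-smooth with L = largest eigenvalue.
H = np.diag([1.0, 10.0])
mu, L = 1.0, 10.0

f = lambda x: 0.5 * x @ H @ x
grad = lambda x: H @ x

x = np.array([3.0, -2.0])
losses = []
for t in range(30):
    losses.append(f(x))
    x = x - (1.0 / L) * grad(x)  # gradient descent with step size 1/L

# Each step should shrink the loss by at least the factor 1 - mu/L = 0.9,
# i.e., the loss decays (at least) exponentially fast.
ratios = [losses[t + 1] / losses[t] for t in range(len(losses) - 1)]
print(max(ratios))  # about 0.81 here, comfortably below 0.9
```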

Generalization Error Bound

Rademacher Complexity