Hellinger Distance
The Hellinger distance is a measure of the similarity between two probability distributions. It is defined as:
where and are two probability distributions over the same set of events.
In the continuous case, the Hellinger distance between two probability density functions and is defined as:
Mahalanobis Distance
import numpy as np
import pandas as pd
import scipy as stats
# calculateMahalanobis function to calculate
# the Mahalanobis distance
def calculateMahalanobis(y=None, data=None, cov=None):
y_mu = y - np.mean(data)
if not cov:
cov = np.cov(data.values.T)
inv_covmat = np.linalg.inv(cov)
left = np.dot(y_mu, inv_covmat)
mahal = np.dot(left, y_mu.T)
return mahal.diagonal()