`qolmat.benchmark.metrics`.kl_divergence

qolmat.benchmark.metrics.kl_divergence(df1: DataFrame, df2: DataFrame, df_mask: DataFrame, method: str = 'columnwise', min_n_rows: int = 10) → Series

Estimate the KL divergence.

Estimation of the Kullback-Leibler divergence between two empirical distributions. The following methods are implemented:
- columnwise, relying on a uniform binarization and only taking marginals into account (https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence),
- gaussian, relying on a Gaussian approximation.
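To make the columnwise idea concrete, here is a minimal sketch of a histogram-based KL estimate for a single column. It is not qolmat's implementation: the helper name `binned_kl`, the number of bins, and the `eps` clipping are assumptions chosen for illustration, and only the standard library is used.

```python
import math


def binned_kl(sample_p, sample_q, n_bins=10, eps=1e-10):
    """Histogram estimate of KL(P || Q) for two 1-D samples (illustrative only)."""
    # Both samples share the same uniform bins, spanning their joint range.
    lo = min(min(sample_p), min(sample_q))
    hi = max(max(sample_p), max(sample_q))
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal

    def histogram(sample):
        counts = [0] * n_bins
        for x in sample:
            # The maximum value falls on the right edge, so clamp it into the last bin.
            idx = min(int((x - lo) / width), n_bins - 1)
            counts[idx] += 1
        return [c / len(sample) for c in counts]

    p = histogram(sample_p)
    q = histogram(sample_q)
    # Empty bins of P contribute zero; empty bins of Q are clipped to eps
    # (an assumption of this sketch) so that the logarithm stays finite.
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical data: a roughly uniform column and a skewed version of it.
column_a = [i / 100 for i in range(100)]
column_b = [x ** 2 for x in column_a]
print(binned_kl(column_a, column_a))
print(binned_kl(column_a, column_b))
```

Applying such an estimate to each column independently yields one value per column, which matches the fact that only marginals are taken into account.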

Parameters

df1: pd.DataFrame: First empirical distribution
df2: pd.DataFrame: Second empirical distribution
df_mask: pd.DataFrame: Mask indicating on what values the divergence should be computed
method: str: Method used to compute the divergence on multivariate datasets with missing values. Possible values are columnwise and gaussian.
min_n_rows: int: Minimum number of rows for a KL estimation
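As a rough illustration of what a Gaussian approximation means for the gaussian value of method, the sketch below fits a univariate Gaussian to each sample and applies the closed-form KL divergence between two normal distributions. It is a simplification under stated assumptions: the helper name `gaussian_kl_1d` is hypothetical, it handles a single column only, and it is not taken from the qolmat source.

```python
import math
import statistics


def gaussian_kl_1d(sample_p, sample_q):
    """KL(P || Q) after approximating each 1-D sample by a Gaussian (illustrative only)."""
    m1, m2 = statistics.fmean(sample_p), statistics.fmean(sample_q)
    v1, v2 = statistics.variance(sample_p), statistics.variance(sample_q)
    # Closed form for two univariate Gaussians N(m1, v1) and N(m2, v2):
    # 0.5 * ln(v2 / v1) + (v1 + (m1 - m2)^2) / (2 * v2) - 0.5
    return 0.5 * math.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / (2 * v2) - 0.5


# Hypothetical data: the second sample is the first one shifted by 1.
sample_a = [0.0, 1.0, 2.0, 3.0, 4.0]
sample_b = [x + 1.0 for x in sample_a]
print(gaussian_kl_1d(sample_a, sample_b))
```

Unlike a per-column histogram, a multivariate version of this formula uses the full covariance matrices and can therefore account for dependencies between columns.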

Returns

pd.Series: Kullback-Leibler divergence

Raises

AssertionError: If the empirical distributions do not have enough samples to estimate a KL divergence. Consider using a larger dataset or lowering the parameter min_n_rows.
