Qolmat API

Imputers API

imputations.imputers.ImputerEM([groups, ...])

EM imputer.

imputations.imputers.ImputerKNN([groups, ...])

K-nearest neighbors imputer.

imputations.imputers.ImputerInterpolation([...])

Interpolation imputer.

imputations.imputers.ImputerLOCF([groups])

LOCF imputer.

imputations.imputers.ImputerSimple([groups, ...])

Simple imputer.

imputations.imputers.ImputerMICE([groups, ...])

MICE imputer.

imputations.imputers.ImputerNOCB([groups])

NOCB imputer.

imputations.imputers.ImputerOracle()

Perfect imputer; requires knowledge of the true values.

imputations.imputers.ImputerRegressor([...])

Regressor imputer.

imputations.imputers.ImputerResiduals([...])

Residual imputer.

imputations.imputers.ImputerRpcaPcp([...])

PCP RPCA imputer.

imputations.imputers.ImputerRpcaNoisy([...])

Noisy RPCA imputer.

imputations.imputers.ImputerSoftImpute([...])

SoftImpute imputer.

imputations.imputers.ImputerShuffle([...])

Impute using random samples from the considered column.
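
LOCF (last observation carried forward) and NOCB (next observation carried backward) are the two simplest rules in the list above, and they are easy to state on a single column. The following is a minimal plain-Python sketch of both rules on a list that uses `None` for missing entries. It is an illustration only, not Qolmat's implementation: the helper names `locf` and `nocb` are hypothetical, and the choice to leave a leading gap (for LOCF) or a trailing gap (for NOCB) unfilled is an assumption of this sketch.

```python
from typing import List, Optional

Value = Optional[float]


def locf(values: List[Value]) -> List[Value]:
    """Carry the last observed value forward into each gap (hypothetical helper)."""
    filled: List[Value] = []
    last: Value = None
    for v in values:
        if v is not None:
            last = v  # remember the most recent observation
        filled.append(last)  # stays None until a first observation is seen
    return filled


def nocb(values: List[Value]) -> List[Value]:
    """Carry the next observed value backward: LOCF applied to the reversed list."""
    return locf(values[::-1])[::-1]


series = [None, 1.0, None, None, 4.0, None]
print(locf(series))  # [None, 1.0, 1.0, 1.0, 4.0, 4.0]
print(nocb(series))  # [1.0, 1.0, 4.0, 4.0, 4.0, None]
```

Writing NOCB as LOCF on the reversed sequence makes the symmetry between the two rules explicit.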

Comparator API

benchmark.comparator.Comparator(dict_models, ...)

Comparator class.

Missing Patterns API

benchmark.missing_patterns.UniformHoleGenerator(...)

UniformHoleGenerator class.

benchmark.missing_patterns.GeometricHoleGenerator(...)

GeometricHoleGenerator class.

benchmark.missing_patterns.EmpiricalHoleGenerator(...)

EmpiricalHoleGenerator class.

benchmark.missing_patterns.MultiMarkovHoleGenerator(...)

MultiMarkovHoleGenerator class.

benchmark.missing_patterns.GroupedHoleGenerator(...)

GroupedHoleGenerator class.
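
Hole generators produce artificial missing entries, which makes it possible to compare imputed values against known ones. As an illustration of the simplest case, the sketch below hides a fixed fraction of cells chosen uniformly at random. It is a plain-Python sketch under stated assumptions, not the code of `UniformHoleGenerator`: the function `uniform_mask` and its `seed` parameter are hypothetical, and only the name `ratio_masked` is borrowed from the `add_holes` signature in this reference.

```python
import random
from typing import List


def uniform_mask(
    n_rows: int, n_cols: int, ratio_masked: float, seed: int = 0
) -> List[List[bool]]:
    """Return a boolean mask where True marks a cell to hide (hypothetical helper)."""
    if not 0.0 <= ratio_masked <= 1.0:
        raise ValueError("ratio_masked must lie in [0, 1]")
    n_cells = n_rows * n_cols
    n_holes = round(ratio_masked * n_cells)
    rng = random.Random(seed)  # local generator, so results are reproducible
    # Sample flat cell indices without replacement: every cell is equally likely.
    holes = set(rng.sample(range(n_cells), n_holes))
    return [[(i * n_cols + j) in holes for j in range(n_cols)] for i in range(n_rows)]


mask = uniform_mask(n_rows=5, n_cols=4, ratio_masked=0.25, seed=42)
n_masked = sum(cell for row in mask for cell in row)
print(n_masked)  # 5
```

Sampling without replacement fixes the number of holes exactly; drawing each cell independently with probability `ratio_masked` would only match the ratio on average.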

Metrics API

benchmark.metrics.mean_squared_error(df1, ...)

Mean squared error between two dataframes.

benchmark.metrics.root_mean_squared_error(...)

Compute the root mean squared error between two dataframes.

benchmark.metrics.mean_absolute_error(df1, ...)

Compute the mean absolute error between two dataframes.

benchmark.metrics.mean_absolute_percentage_error(...)

Compute the mean absolute percentage error between two dataframes.

benchmark.metrics.weighted_mean_absolute_percentage_error(...)

Compute the weighted mean absolute percentage error between two dataframes.

benchmark.metrics.accuracy(df1, df2, df_mask)

Compute the matching ratio between the two datasets.

benchmark.metrics.dist_wasserstein(df1, df2, ...)

Compute the Wasserstein distances between columns of two dataframes.

benchmark.metrics.kl_divergence(df1, df2, ...)

Estimate the KL divergence.

benchmark.metrics.kolmogorov_smirnov_test(...)

Compute the Kolmogorov-Smirnov test for numerical features.

benchmark.metrics.total_variance_distance(...)

Compute the total variation distance for categorical features.

benchmark.metrics.mean_difference_correlation_matrix_numerical_features(...)

Compute the mean absolute difference between the correlation matrices of numerical features.

benchmark.metrics.mean_difference_correlation_matrix_categorical_features(...)

Compute the mean absolute difference between the correlation matrices of categorical features.

benchmark.metrics.mean_diff_corr_matrix_categorical_vs_numerical_features(...)

Compute the mean absolute difference between the correlation matrices of categorical versus numerical features.

benchmark.metrics.sum_energy_distances(df1, ...)

Compute the sum of energy distances between df1 and df2.

benchmark.metrics.frechet_distance(df1, df2, ...)

Compute the Fréchet distance using a pattern decomposition.

benchmark.metrics.pattern_based_weighted_mean_metric(...)

Compute a mean score based on missing patterns.
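
Several signatures above take a mask next to the two dataframes, for example `accuracy(df1, df2, df_mask)`. One natural reading, assumed here rather than confirmed by this reference, is that the error is evaluated only on the masked cells. The sketch below shows that idea for the squared, root squared, and absolute errors on flat lists. The helper names (`masked_errors`, `masked_mse`, `masked_rmse`, `masked_mae`) are hypothetical and are not the library's functions.

```python
import math
from typing import List, Sequence


def masked_errors(
    x_true: Sequence[float], x_imputed: Sequence[float], mask: Sequence[bool]
) -> List[float]:
    """Differences restricted to the cells where mask is True (hypothetical helper)."""
    if not (len(x_true) == len(x_imputed) == len(mask)):
        raise ValueError("inputs must have the same length")
    errors = [a - b for a, b, m in zip(x_true, x_imputed, mask) if m]
    if not errors:
        raise ValueError("mask selects no cell")
    return errors


def masked_mse(x_true, x_imputed, mask) -> float:
    errors = masked_errors(x_true, x_imputed, mask)
    return sum(e * e for e in errors) / len(errors)


def masked_rmse(x_true, x_imputed, mask) -> float:
    return math.sqrt(masked_mse(x_true, x_imputed, mask))


def masked_mae(x_true, x_imputed, mask) -> float:
    errors = masked_errors(x_true, x_imputed, mask)
    return sum(abs(e) for e in errors) / len(errors)


x_true = [1.0, 2.0, 3.0, 4.0]
x_imputed = [1.0, 2.5, 3.0, 2.0]
mask = [False, True, False, True]  # only the 2nd and 4th cells were hidden

print(masked_mse(x_true, x_imputed, mask))  # 2.125
print(masked_mae(x_true, x_imputed, mask))  # 1.25
```

Restricting the average to masked cells keeps untouched observations, where the error is zero by construction, from diluting the score.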

RPCA engine API

imputations.rpca.rpca_pcp.RpcaPcp([...])

Class for the basic RPCA decomposition.

imputations.rpca.rpca_noisy.RpcaNoisy([...])

Class for a noisy version of the so-called 'improved RPCA'.

Expectation-Maximization engine API

imputations.em_sampler.MultiNormalEM([...])

Multinormal EM imputer.

imputations.em_sampler.VARpEM([method, ...])

VAR(p) EM imputer.

Diffusion Model engine API

imputations.imputers_pytorch.ImputerDiffusion(...)

Imputer based on diffusion models.

imputations.diffusions.ddpms.TabDDPM([...])

Tab DDPM.

imputations.diffusions.ddpms.TsDDPM([...])

Time series DDPM.

Preprocessing API

imputations.preprocessing.MixteHGBM()

MixteHGBM class.

imputations.preprocessing.BinTransformer([cols])

BinTransformer class.

imputations.preprocessing.OneHotEncoderProjector(...)

Class for one-hot encoding of categorical features.

imputations.preprocessing.WrapperTransformer(...)

Wrap a transformer.

imputations.preprocessing.make_pipeline_mixte_preprocessing([...])

Create a preprocessing pipeline managing mixed type data.

imputations.preprocessing.make_robust_MixteHGB([...])

Create a robust pipeline for MixteHGBM.

Utils API

utils.data.add_holes(df, ratio_masked, mean_size)

Create holes in a dataset with no missing value, starting from df.