qolmat.benchmark.missing_patterns.UniformHoleGenerator

class qolmat.benchmark.missing_patterns.UniformHoleGenerator(n_splits: int, subset: Optional[List[str]] = None, ratio_masked: float = 0.05, random_state: Optional[Union[int, RandomState]] = None, sample_proportional: bool = False)

UniformHoleGenerator class.

This class generates holes (artificially masked values) in a dataframe. The hole positions are drawn at random, using the resample function of scikit-learn (sklearn.utils.resample).

Parameters
n_splits: int

Number of splits

subset: Optional[List[str]], optional

Names of the columns for which holes must be created, by default None

ratio_masked: float, optional

Ratio of masked values to add, by default 0.05.

random_state: int, RandomState instance or None, default=None

Controls the randomness. Pass an int for reproducible output across multiple function calls.

sample_proportional: bool, optional

If True, generates holes in the target columns with equal frequency. If False, reproduces the empirical proportions between the variables.
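To make the roles of subset, ratio_masked and random_state concrete, here is a minimal sketch of uniform hole generation written with NumPy and pandas only. It is not qolmat's implementation: sketch_uniform_mask is a hypothetical helper, NumPy's Generator.choice stands in for scikit-learn's resample, and computing the number of holes per column as ceil(ratio_masked * number of rows) is an assumption made for illustration.

```python
import math

import numpy as np
import pandas as pd


def sketch_uniform_mask(X, subset=None, ratio_masked=0.05, random_state=None):
    """Hypothetical helper (not part of qolmat): build a boolean hole mask.

    True marks a cell selected as a hole. Only cells that are currently
    observed (non-NaN) can be selected.
    """
    rng = np.random.default_rng(random_state)
    columns = list(X.columns) if subset is None else list(subset)
    mask = pd.DataFrame(False, index=X.index, columns=X.columns)
    for col in columns:
        # Row positions of the observed values in this column.
        observed = np.flatnonzero(X[col].notna().to_numpy())
        # Assumption: the hole count is relative to the number of rows.
        n_masked = min(len(observed), math.ceil(ratio_masked * len(X)))
        # Uniform draw without replacement among the observed positions.
        chosen = rng.choice(observed, size=n_masked, replace=False)
        mask.iloc[chosen, mask.columns.get_loc(col)] = True
    return mask


X = pd.DataFrame(
    {
        "a": [1.0, 2.0, np.nan, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
        "b": np.arange(10, dtype=float),
    }
)
mask = sketch_uniform_mask(X, subset=["a"], ratio_masked=0.2, random_state=0)
```

In this sketch, passing an integer as random_state makes the drawn positions reproducible, and columns outside subset are never masked.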

__init__(n_splits: int, subset: Optional[List[str]] = None, ratio_masked: float = 0.05, random_state: Optional[Union[int, RandomState]] = None, sample_proportional: bool = False)
generate_mask(X: DataFrame) -> DataFrame

Return a mask for the dataframe at hand.

Parameters
X: pd.DataFrame

Initial dataframe with a missing pattern to be imitated.
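As an illustration of how such a mask might be used downstream, the sketch below applies a hand-written boolean mask to a dataframe with plain pandas. It assumes the mask is a boolean DataFrame aligned with X in which True marks a hole; that convention, and the names df_mask, X_corrupted and held_out, are assumptions for this example rather than something stated above.

```python
import numpy as np
import pandas as pd

X = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})

# Hand-written mask for illustration; assumed convention: True = hole.
df_mask = pd.DataFrame(
    {"a": [False, True, False], "b": [False, False, True]},
    index=X.index,
)

# Replace the masked cells with NaN to obtain the corrupted dataframe.
X_corrupted = X.mask(df_mask)

# Keep only the masked cells, e.g. as ground truth to score an imputer.
held_out = X.where(df_mask)
```

The corrupted dataframe can then be handed to an imputer, and the imputed values compared with held_out on the masked cells only.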

Examples using qolmat.benchmark.missing_patterns.UniformHoleGenerator

Benchmark for categorical data

Comparison of basic imputers

Tutorial for imputers based on diffusion models

Tutorial for Testing the MCAR Case

Tutorial for hole generation in tabular data
