qolmat.benchmark.missing_patterns.GeometricHoleGenerator

class qolmat.benchmark.missing_patterns.GeometricHoleGenerator(n_splits: int, subset: Optional[List[str]] = None, ratio_masked: float = 0.05, random_state: Optional[Union[int, RandomState]] = None, groups: Tuple[str, ...] = ())

GeometricHoleGenerator class.

This class generates holes in a dataframe. The holes follow a one-dimensional (1D) Markov process.

Parameters
n_splits : int

Number of splits.

subset : Optional[List[str]], optional

Names of the columns in which holes must be created, by default None.

ratio_masked : float, optional

Ratio of masked values to add, by default 0.05.

random_state : int, RandomState instance or None, default=None

Controls the randomness. Pass an int for reproducible output across multiple function calls.

groups : Tuple[str, ...], optional

Column names used to group the data, by default ().
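
To illustrate the idea behind the class, the sketch below simulates a two-state (observed/missing) Markov chain with the standard library and measures the resulting hole lengths. It is not qolmat's implementation: `simulate_markov_mask`, `hole_lengths`, `p_enter` and `p_leave` are hypothetical names chosen for this example. Under this two-state assumption, hole lengths are geometrically distributed with mean 1 / p_leave.

```python
import random
from typing import List


def simulate_markov_mask(n: int, p_enter: float, p_leave: float, seed: int = 0) -> List[bool]:
    """Simulate a two-state chain; True marks a masked (missing) position.

    p_enter and p_leave are hypothetical names for the probabilities of
    opening and closing a hole at each step.
    """
    rng = random.Random(seed)
    mask = []
    missing = False  # start in the observed state
    for _ in range(n):
        if missing:
            missing = rng.random() >= p_leave  # stay missing with probability 1 - p_leave
        else:
            missing = rng.random() < p_enter  # open a hole with probability p_enter
        mask.append(missing)
    return mask


def hole_lengths(mask: List[bool]) -> List[int]:
    """Return the length of each run of consecutive True values."""
    lengths, run = [], 0
    for is_missing in mask:
        if is_missing:
            run += 1
        elif run:
            lengths.append(run)
            run = 0
    if run:  # a hole may still be open at the end of the series
        lengths.append(run)
    return lengths


mask = simulate_markov_mask(n=50_000, p_enter=0.02, p_leave=0.25, seed=0)
lengths = hole_lengths(mask)
mean_length = sum(lengths) / len(lengths)  # expected to be close to 1 / p_leave = 4
```

The only point of the sketch is that a per-step probability of leaving the missing state yields geometric hole lengths, which is presumably where the class name comes from.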

__init__(n_splits: int, subset: Optional[List[str]] = None, ratio_masked: float = 0.05, random_state: Optional[Union[int, RandomState]] = None, groups: Tuple[str, ...] = ())

fit(X: DataFrame) → GeometricHoleGenerator

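Fit the generator by computing the transition matrix from the states observed in X.

Parameters
X : pd.DataFrame

Dataframe whose states are used to compute the transition matrix. The transition matrix is a stochastic matrix with the current state in the index and the next state in the columns; state 1 means missing.

Returns
GeometricHoleGenerator

The model itself.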
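
As a minimal sketch of what "getting the transition matrix from a list of states" can look like, the example below counts consecutive pairs of states and normalizes each row. It uses only the standard library and is an illustration, not the code used by `fit`; the function name `transition_matrix` and the dictionary layout are assumptions made for this example.

```python
from collections import Counter
from typing import Dict, List, Tuple


def transition_matrix(states: List[bool]) -> Dict[Tuple[bool, bool], float]:
    """Estimate P(next | current) from consecutive pairs of states.

    Keys are (current, next) pairs; True means the value is missing.
    """
    counts = Counter(zip(states[:-1], states[1:]))
    matrix = {}
    for current in (False, True):
        total = counts[(current, False)] + counts[(current, True)]
        for nxt in (False, True):
            # Rows with no observed transition are left at 0.0.
            matrix[(current, nxt)] = counts[(current, nxt)] / total if total else 0.0
    return matrix


# Missingness indicator of one column, e.g. obtained with X[column].isna().
states = [False, False, True, True, True, False, False, True, False, False]
matrix = transition_matrix(states)
p_leave = matrix[(True, False)]  # probability of leaving the missing state
```

In this toy series the two holes have lengths 3 and 1, so the mean hole length is 2, which matches 1 / p_leave.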

sample_sizes(column: str, n_masked: int)

Sample hole sizes for a column.

Parameters
column : str

Name of the column.

n_masked : int

Number of values to mask.

Returns
pd.Series

Sampled hole sizes.
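
The following sketch shows one way hole sizes could be drawn from a geometric distribution until they cover a target number of masked values. It is an assumption about the general technique, not a description of how `sample_sizes` is implemented: the function `sample_geometric_sizes`, its `p_leave` argument and the stopping rule are all hypothetical, and it returns a plain list rather than a pd.Series.

```python
import math
import random
from typing import List


def sample_geometric_sizes(p_leave: float, n_masked: int, seed: int = 0) -> List[int]:
    """Draw hole sizes until they cover at least n_masked values.

    Each size is geometric on {1, 2, ...} with success probability p_leave,
    drawn by inverse transform sampling.
    """
    if not 0.0 < p_leave <= 1.0:
        raise ValueError("p_leave must be in (0, 1]")
    rng = random.Random(seed)
    sizes: List[int] = []
    while sum(sizes) < n_masked:
        if p_leave == 1.0:
            size = 1  # every hole closes immediately
        else:
            u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
            size = max(1, math.ceil(math.log(u) / math.log(1.0 - p_leave)))
        sizes.append(size)
    return sizes


sizes = sample_geometric_sizes(p_leave=0.25, n_masked=200, seed=1)
```

Because the last draw may overshoot, the sizes sum to at least `n_masked`; a caller wanting an exact count would need to truncate the final hole.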

Examples using qolmat.benchmark.missing_patterns.GeometricHoleGenerator

Tutorial for hole generation in tabular data
