`qolmat.benchmark.missing_patterns`.MultiMarkovHoleGenerator

class qolmat.benchmark.missing_patterns.MultiMarkovHoleGenerator(n_splits: int, subset: Optional[List[str]] = None, ratio_masked: float = 0.05, random_state: Optional[Union[int, RandomState]] = None, groups: Tuple[str, ...] = ())

MultiMarkovHoleGenerator class.

This class generates holes (missing values) in a dataframe according to a Markov process. Each row of the dataframe's mask (the pattern of np.nan entries across the columns) represents one state of the Markov chain.

Parameters

n_splits : int
    Number of splits.
subset : Optional[List[str]], optional
    Names of the columns for which holes must be created, by default None.
ratio_masked : float, optional
    Ratio of masked values to add, by default 0.05.
random_state : int, RandomState instance or None, default=None
    Controls the randomness. Pass an int for reproducible output across multiple function calls.
groups : Tuple[str, ...], optional
    Column names used to group the data, by default ().

__init__(n_splits: int, subset: Optional[List[str]] = None, ratio_masked: float = 0.05, random_state: Optional[Union[int, RandomState]] = None, groups: Tuple[str, ...] = ())

fit(X: DataFrame) → MultiMarkovHoleGenerator

Get the transition matrix.

Estimate the transition matrix (a stochastic matrix) from the list of states observed in X. Current states are in the index and next states are in the columns. Within a state, 1 means missing.

Parameters

X : pd.DataFrame
    Input dataframe.

Returns

MultiMarkovHoleGenerator: The model itself
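The idea of a transition matrix indexed by mask states can be illustrated without the library. The following is a minimal sketch, not qolmat's implementation: the helper `estimate_transitions` and the toy `mask_rows` are hypothetical, and states are represented as tuples of booleans where True marks a missing entry.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# A state is one row of the mask: True marks a missing entry (assumed convention).
State = Tuple[bool, ...]


def estimate_transitions(states: List[State]) -> Dict[State, Dict[State, float]]:
    """Hypothetical helper: row-normalised frequencies of consecutive states."""
    counts: Dict[State, Counter] = defaultdict(Counter)
    # Count how often each state is followed by each other state.
    for current, following in zip(states[:-1], states[1:]):
        counts[current][following] += 1
    # Normalise each row so that it sums to 1 (stochastic matrix).
    return {
        current: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for current, row in counts.items()
    }


# Toy mask with two columns and five rows.
mask_rows: List[State] = [
    (False, False),
    (False, False),
    (True, False),
    (True, False),
    (False, False),
]
transitions = estimate_transitions(mask_rows)
```

Outer keys play the role of the index (current state) and inner keys the role of the columns (next state). A state that only appears in the last row has no outgoing row in this sketch.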

generate_mask(X: DataFrame) → List[DataFrame]

Create missing data in an array-like object based on a Markov chain.

The states of the Markov chain are the different masks of missing values: there are at most 2 ** X.shape[1] possible states.

Parameters

X : pd.DataFrame
    Initial dataframe with missing (true) entries.

Returns

Dict[str, pd.DataFrame]: the initial dataframe, the dataframe with additional missing entries and the created mask
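The bound on the number of states follows from each column being either missing or observed in a given row. A minimal sketch of that counting argument, assuming three columns and a toy mask (the helper `all_possible_states` is hypothetical and not part of qolmat):

```python
from itertools import product
from typing import List, Tuple

# A state is one row of the mask: True marks a missing entry (assumed convention).
State = Tuple[bool, ...]


def all_possible_states(n_columns: int) -> List[State]:
    """Hypothetical helper: every missing/observed combination over the columns."""
    return list(product([False, True], repeat=n_columns))


# Toy mask with three columns; only some of the possible states occur.
mask_rows: List[State] = [
    (False, False, False),
    (True, False, False),
    (True, True, False),
    (False, False, False),
]

possible = all_possible_states(3)
observed = set(mask_rows)
```

In practice the observed states are usually a small subset of the possible ones, which keeps the transition matrix far smaller than the bound suggests.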

generate_multi_realisation(n_masked: int) → List[List[Tuple[bool, ...]]]

Generate a sequence of states (“states”) of a given size (“size”).

The sequence is generated from the transition matrix “df_transition”.

Parameters

n_masked : int
    Number of masks.

Returns

realisation : List[int]
    Sequence of states.
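Sampling a sequence of states from a transition matrix can be sketched as a simple random walk. This is an illustration under stated assumptions rather than the library's code: `sample_realisation` and `toy_transitions` are hypothetical, the matrix is a nested dict, and every reachable state is assumed to have an outgoing row.

```python
import random
from typing import Dict, List, Tuple

# A state is one row of the mask: True marks a missing entry (assumed convention).
State = Tuple[bool, ...]


def sample_realisation(
    transitions: Dict[State, Dict[State, float]],
    start: State,
    size: int,
    rng: random.Random,
) -> List[State]:
    """Hypothetical helper: walk the chain for `size` steps, beginning at `start`."""
    states = [start]
    while len(states) < size:
        row = transitions[states[-1]]
        # Draw the next state with probabilities given by the current row.
        next_state = rng.choices(list(row.keys()), weights=list(row.values()), k=1)[0]
        states.append(next_state)
    return states


# Toy transition matrix over two states of a two-column mask.
toy_transitions: Dict[State, Dict[State, float]] = {
    (False, False): {(False, False): 0.8, (True, False): 0.2},
    (True, False): {(False, False): 0.5, (True, False): 0.5},
}

# A seeded generator makes the draw reproducible, in the spirit of random_state.
realisation = sample_realisation(
    toy_transitions, start=(False, False), size=10, rng=random.Random(0)
)
```

Because consecutive states are correlated, the resulting holes tend to come in runs rather than being scattered uniformly.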

Examples using `qolmat.benchmark.missing_patterns.MultiMarkovHoleGenerator`

Tutorial for hole generation in tabular data
