qolmat.benchmark.missing_patterns.EmpiricalHoleGenerator
- class qolmat.benchmark.missing_patterns.EmpiricalHoleGenerator(n_splits: int, subset: Optional[List[str]] = None, ratio_masked: float = 0.05, random_state: Optional[Union[int, RandomState]] = None, groups: Tuple[str, ...] = ())
EmpiricalHoleGenerator class.
This class generates holes in a dataframe. The distribution of holes is learned from the data, column by column.
- Parameters
- n_splits: int
Number of splits
- subset: Optional[List[str]], optional
Names of the columns for which holes must be created, by default None
- ratio_masked: float, optional
Ratio of masked values to add, by default 0.05.
- random_state: int, RandomState instance or None, default=None
Controls the randomness. Pass an int for reproducible output across multiple function calls.
- groups: Tuple[str, …]
Column names used to group the data, by default ().
- __init__(n_splits: int, subset: Optional[List[str]] = None, ratio_masked: float = 0.05, random_state: Optional[Union[int, RandomState]] = None, groups: Tuple[str, ...] = ())
- compute_distribution_holes(states: Series) → Series
Compute the hole distribution.
- Parameters
- states: pd.Series
Series of states.
- Returns
- pd.Series
hole distribution
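As an illustration of what a hole distribution can look like, here is a minimal sketch in plain Python. It assumes that `states` is a sequence of booleans in which `True` marks a missing value, and that the distribution is a count of holes per hole length. The function name `hole_size_counts` is hypothetical and this is not qolmat's implementation, which operates on a `pd.Series`.

```python
from collections import Counter
from itertools import groupby
from typing import Dict, Sequence


def hole_size_counts(states: Sequence[bool]) -> Dict[int, int]:
    """Count how many holes of each length appear in a sequence of flags.

    Assumption: True marks a missing value, so a hole is a maximal run
    of consecutive True entries.
    """
    sizes = [
        sum(1 for _ in run)  # length of the current run
        for is_missing, run in groupby(states)
        if is_missing  # keep only runs of missing values
    ]
    return dict(Counter(sizes))


# Three holes: lengths 2, 1 and 2.
flags = [False, True, True, False, True, False, False, True, True]
distribution = hole_size_counts(flags)
print(distribution)
```

Dividing each count by the total number of holes would turn these counts into empirical probabilities, if a normalised distribution is preferred.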
- fit(X: DataFrame) → EmpiricalHoleGenerator
Compute the hole sizes of a dataframe.
The dataframe X has only one column.
- Parameters
- X: pd.DataFrame
data with holes
- Returns
- EmpiricalHoleGenerator
The model itself
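Once hole sizes have been learned for a column, one plausible way to create new holes is to draw sizes from that empirical distribution until roughly `ratio_masked` of the values are masked. The sketch below shows this idea with the standard library only. The function `sample_mask`, its arguments and its placement strategy are assumptions made for illustration and do not describe how qolmat actually builds its masks.

```python
import random
from typing import Dict, List


def sample_mask(
    n_rows: int,
    size_counts: Dict[int, int],
    ratio_masked: float,
    seed: int = 0,
) -> List[bool]:
    """Draw a boolean mask whose hole sizes follow an empirical distribution.

    Hypothetical helper: size_counts maps a hole length to the number of
    times that length was observed in the data.
    """
    rng = random.Random(seed)  # fixed seed gives reproducible masks
    sizes = list(size_counts)
    weights = [size_counts[s] for s in sizes]
    target = int(ratio_masked * n_rows)  # number of values to mask
    mask = [False] * n_rows
    n_masked = 0
    attempts = 0
    # Bound the number of attempts so the loop always terminates.
    while n_masked < target and attempts < 10 * n_rows:
        attempts += 1
        size = rng.choices(sizes, weights=weights, k=1)[0]
        size = min(size, target - n_masked)  # never exceed the target
        start = rng.randrange(0, n_rows - size + 1)
        if any(mask[start:start + size]):
            continue  # overlaps an existing hole, draw again
        mask[start:start + size] = [True] * size
        n_masked += size
    return mask


mask = sample_mask(n_rows=100, size_counts={1: 3, 2: 1}, ratio_masked=0.1, seed=0)
print(sum(mask), "values masked out of", len(mask))
```

Passing the same `seed` twice yields the same mask, which mirrors the role of `random_state` described in the parameters above.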