`qolmat.imputations.imputers`.ImputerEM

class qolmat.imputations.imputers.ImputerEM(groups: Tuple[str, ...] = (), model: Optional[str] = 'multinormal', columnwise: bool = False, random_state: Optional[Union[int, RandomState]] = None, method: Literal['mle', 'sample'] = 'sample', max_iter_em: int = 200, n_iter_ou: int = 50, ampli: float = 1, dt: float = 0.02, tolerance: float = 0.0001, stagnation_threshold: float = 0.005, stagnation_loglik: float = 2, period: int = 1, verbose: bool = False, p: Union[None, int] = None)

EM imputer.

This class implements an imputation method based on joint modelling of the variables, with inference performed by an Expectation-Maximization (EM) algorithm.
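To make the idea of EM-based imputation concrete, here is a minimal sketch of textbook EM under a multivariate normal hypothesis, written with NumPy. It is an illustration only, not qolmat's implementation: the class documented here exposes Ornstein-Uhlenbeck sampling parameters that this sketch does not use, and the function name `em_impute_multinormal` and the demo data are hypothetical.

```python
import numpy as np


def em_impute_multinormal(X, n_iter=50):
    """Impute NaNs in X (n_samples, n_features) with a textbook Gaussian EM."""
    X = np.array(X, dtype=float)  # work on a copy
    mask = np.isnan(X)
    n, d = X.shape
    # Initialise missing entries with the column means of the observed values.
    col_means = np.nanmean(X, axis=0)
    X[mask] = col_means[np.where(mask)[1]]
    extra_cov = np.zeros((d, d))  # summed conditional covariances
    for _ in range(n_iter):
        # M-step: update mean and covariance from the completed data.
        mu = X.mean(axis=0)
        centered = X - mu
        sigma = (centered.T @ centered + extra_cov) / n
        # E-step: replace missing entries by their conditional expectation.
        extra_cov = np.zeros((d, d))
        for i in range(n):
            m = mask[i]
            if not m.any():
                continue
            o = ~m
            if not o.any():
                # Fully missing row: fall back to the marginal distribution.
                X[i] = mu
                extra_cov += sigma
                continue
            coef = sigma[np.ix_(m, o)] @ np.linalg.pinv(sigma[np.ix_(o, o)])
            X[i, m] = mu[m] + coef @ (X[i, o] - mu[o])
            # Conditional covariance of the missing block given the observed one.
            extra_cov[np.ix_(m, m)] += sigma[np.ix_(m, m)] - coef @ sigma[np.ix_(o, m)]
    return X


# Hypothetical demo data: two strongly correlated columns.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 * x + 0.1 * rng.normal(size=200)
X_full = np.column_stack([x, y])
X_missing = X_full.copy()
X_missing[:40, 1] = np.nan  # hide 40 values of the second column
X_imputed = em_impute_multinormal(X_missing)
```

Because the two columns are correlated, the imputed values follow the first column rather than collapsing to the column mean, which is the benefit of modelling the variables jointly.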

Parameters

groups (Tuple[str, …], default=()): List of column names to group by.
model ({'multinormal', 'VAR'}, default='multinormal'): Hypothesis made on the data distribution. Possible values:
    - 'multinormal': the data points are independent and identically distributed, following a multinormal distribution
    - 'VAR': the data is a time series modeled by a VAR(p) process
columnwise (bool, default=False): If False, correlations between variables are used, which is advised. If True, each column is imputed independently: in the multinormal case each value is imputed by the column mean plus noise of fixed scale; in the VAR case the imputation is a noisy temporal interpolation.
random_state (RandomSetting, optional): Controls the randomness of fit_transform. Defaults to None.
method ({'mle', 'sample'}, default='sample'): Imputation method applied after EM convergence. Possible values:
    - 'mle': Maximum Likelihood Estimation
    - 'sample': Sample from the posterior distribution
max_iter_em (int, default=200): Maximum number of EM iterations.
n_iter_ou (int, default=50): Number of Ornstein-Uhlenbeck process iterations used for sampling.
ampli (float, default=1): Amplitude parameter of the Ornstein-Uhlenbeck process.
dt (float, default=0.02): Time step of the Ornstein-Uhlenbeck process discretization.
tolerance (float, default=1e-4): Convergence tolerance of the EM algorithm.
stagnation_threshold (float, default=5e-3): Threshold for element-wise stagnation detection in the EM algorithm.
stagnation_loglik (float, default=2): Threshold for log-likelihood stagnation in the EM algorithm.
period (int, default=1): If different from 1, the data is folded with respect to the given period before the imputation is applied.
verbose (bool, default=False): If True, print convergence information during fitting.
p (int, optional): Order of the VAR process (only used when model='VAR'). Defaults to None.
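The roles of `n_iter_ou`, `dt` and `ampli` can be illustrated with a one-dimensional sketch of a discretized Ornstein-Uhlenbeck chain targeting a normal distribution. This is a generic Euler discretization written for illustration, not qolmat's code: the helper name `ou_sample`, the way `ampli` scales the noise, and the target parameters `mu` and `sigma2` are all assumptions.

```python
import math
import random


def ou_sample(x0, mu, sigma2, n_iter_ou=50, dt=0.02, ampli=1.0, rng=None):
    """Run a discretized Ornstein-Uhlenbeck chain targeting N(mu, sigma2)."""
    rng = rng or random.Random()
    x = x0
    for _ in range(n_iter_ou):
        # Drift pulls x back toward the mean; dt sets the step size.
        drift = -dt * (x - mu)
        # Diffusion injects Gaussian noise, scaled by ampli (assumed role).
        diffusion = ampli * math.sqrt(2.0 * dt * sigma2) * rng.gauss(0.0, 1.0)
        x = x + drift + diffusion
    return x


rng = random.Random(0)

# ampli=0 removes the noise: the chain decays deterministically toward mu,
# which is the mode of the target distribution.
x_det = ou_sample(x0=5.0, mu=1.0, sigma2=4.0, n_iter_ou=50, dt=0.02, ampli=0.0, rng=rng)

# ampli=1 with a longer chain yields draws spread around mu.
draws = [
    ou_sample(x0=1.0, mu=1.0, sigma2=4.0, n_iter_ou=300, dt=0.02, ampli=1.0, rng=rng)
    for _ in range(2000)
]
mean_draws = sum(draws) / len(draws)
var_draws = sum((d - mean_draws) ** 2 for d in draws) / len(draws)
```

In this sketch each step shrinks the offset from the mean by a factor `1 - dt`, so with `dt=0.02` and `n_iter_ou=50` roughly 36% of the initial offset remains in the noise-free case. A smaller `dt` gives a finer discretization but needs more iterations to mix, which is the trade-off these two parameters control here.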

__init__(groups: Tuple[str, ...] = (), model: Optional[str] = 'multinormal', columnwise: bool = False, random_state: Optional[Union[int, RandomState]] = None, method: Literal['mle', 'sample'] = 'sample', max_iter_em: int = 200, n_iter_ou: int = 50, ampli: float = 1, dt: float = 0.02, tolerance: float = 0.0001, stagnation_threshold: float = 0.005, stagnation_loglik: float = 2, period: int = 1, verbose: bool = False, p: Union[None, int] = None)

get_model(**hyperparams) → EM

Get the underlying model of the imputer based on its attributes.

Returns

em_sampler.EM: EM model to be used in the fit and transform methods.

Examples using `qolmat.imputations.imputers.ImputerEM`

Benchmark for time series
