qolmat.imputations.imputers.ImputerResiduals
- class qolmat.imputations.imputers.ImputerResiduals(period: int = 1, groups: Tuple[str, ...] = (), model_tsa: Optional[str] = 'additive', extrapolate_trend: Optional[Union[int, str]] = 'freq', method_interpolation: Optional[str] = 'linear')
Residual imputer.
This class implements an imputation method based on an STL decomposition. Each series is de-seasonalised and de-trended, its residuals are imputed, and the imputed residuals are then re-seasonalised and re-trended.
- Parameters
- groups: Tuple[str, …]
Tuple of column names to group by, by default ()
- period: int
Period of the series. Must be used if x is not a pandas object or if the index of x does not have a frequency. Overrides default periodicity of x if x is a pandas object with a timeseries index.
- model_tsa: Optional[str]
Type of seasonal component, either “additive” or “multiplicative”. Abbreviations are accepted. By default, the value is set to “additive”.
- extrapolate_trend: int or ‘freq’, optional
If set to > 0, the trend resulting from the convolution is linear least-squares extrapolated on both ends (or the single one if two_sided is False) considering this many (+1) closest points. If set to ‘freq’, use freq closest points. Setting this parameter results in no NaN values in trend or resid components.
- method_interpolation: str
Method used to interpolate the residuals, by default “linear”.
Examples
>>> import numpy as np
>>> import pandas as pd
>>> from qolmat.imputations.imputers import ImputerResiduals
>>> np.random.seed(100)
>>> df = pd.DataFrame(index=pd.date_range("2015-01-01", "2020-01-01"))
>>> mean = 5
>>> offset = 10
>>> df["y"] = np.cos(df.index.dayofyear / 365 * 2 * np.pi - np.pi) * mean + offset
>>> trend = 5
>>> df["y"] = df["y"] + trend * np.arange(0, df.shape[0]) / df.shape[0]
>>> noise_mean = 0
>>> noise_var = 2
>>> df["y"] = df["y"] + np.random.normal(noise_mean, noise_var, df.shape[0])
>>> mask = np.random.choice([True, False], size=df.shape)
>>> df = df.mask(mask)
>>> imputor = ImputerResiduals(period=365, model_tsa="additive")
>>> imputor.fit_transform(df)
                    y
2015-01-01   1.501210
2015-01-02   5.691061
2015-01-03   4.404106
2015-01-04   3.531540
2015-01-05   3.129532
...               ...
2019-12-28  10.288054
2019-12-29  10.632659
2019-12-30  14.900671
2019-12-31  12.957837
2020-01-01  12.780517

[1827 rows x 1 columns]
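The de-seasonalise, de-trend, impute, and recompose steps described at the top of this page can be illustrated with a minimal pandas-only sketch. It is not qolmat's implementation: the helper name `impute_residuals_sketch` is hypothetical, the model is assumed to be additive, the trend is approximated by a centred moving average, and the seasonal component by the per-phase mean of the de-trended values.

```python
import numpy as np
import pandas as pd


def impute_residuals_sketch(series: pd.Series, period: int) -> pd.Series:
    """Hypothetical helper (not part of qolmat): additive residual imputation."""
    # Provisional fill so that trend and seasonality can be estimated everywhere.
    filled = series.interpolate(method="linear", limit_direction="both")

    # Trend: centred moving average over one period (an assumption of this sketch).
    trend = filled.rolling(window=period, center=True, min_periods=1).mean()

    # Seasonality: mean of the de-trended values at each phase of the cycle.
    detrended = filled - trend
    phase = np.arange(len(series)) % period
    seasonal = detrended.groupby(phase).transform("mean")

    # Residuals are computed from observed values only, then interpolated.
    residuals = series - trend - seasonal
    residuals = residuals.interpolate(method="linear", limit_direction="both")

    # Recompose: add the trend and the seasonality back to the residuals.
    return trend + seasonal + residuals


# Synthetic weekly-seasonal series with roughly 30% of the values removed.
rng = np.random.default_rng(0)
index = pd.date_range("2020-01-01", periods=120, freq="D")
steps = np.arange(120)
values = 10 + 0.05 * steps + 3 * np.sin(2 * np.pi * steps / 7)
series_missing = pd.Series(values, index=index).mask(rng.random(120) < 0.3)

imputed = impute_residuals_sketch(series_missing, period=7)
```

In this sketch the observed values pass through unchanged, because the trend and seasonal components subtracted before the residual interpolation are the same ones added back afterwards; only the missing positions receive new values.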