Non-NN Integration Path¶
Use this path for models that should not use BaseNNModel at all.
LOCF is the cleanest example.
When This Path Is Correct¶
Choose the non-NN path when:
There is no gradient-based training loop
There is no optimizer
The model is rule-based, statistical, or algorithmic
Wrapping it in a neural-network base class would add fake complexity
Good examples in PyPOTS:
LOCF (Last Observed Carried Forward)
Mean (fill with mean values)
Median (fill with median values)
Lerp (linear interpolation)
TRMF (Temporal Regularized Matrix Factorization)
BTTF (Bayesian Temporal Tensor Factorization)
LOCF as the Reference Pattern¶
LOCF inherits BaseImputer, not BaseNNImputer.
That choice immediately removes:
Optimizer setup
_train_model()
_assemble_input_* hooks
Checkpoint-selection logic tied to NN training
Its implementation is direct and clean:
# pypots/imputation/locf/model.py (simplified)
import warnings
from typing import Union, Optional

import h5py
import numpy as np
import torch

from .core import locf_numpy, locf_torch
from ..base import BaseImputer


class LOCF(BaseImputer):
    """LOCF imputation: fills missing values with the last observed value.

    Parameters
    ----------
    first_step_imputation : str, default='zero'
        Strategy for imputing missing values at the beginning of sequences.
        Can be 'backward', 'zero', 'median', or 'nan'.
    """

    def __init__(
        self,
        first_step_imputation: str = "zero",
        device: Optional[Union[str, torch.device, list]] = None,
    ):
        super().__init__(device=device)
        assert first_step_imputation in ["nan", "zero", "backward", "median"]
        self.first_step_imputation = first_step_imputation

    def fit(
        self,
        train_set: Union[dict, str],
        val_set: Optional[Union[dict, str]] = None,
        file_type: str = "hdf5",
    ) -> None:
        """LOCF does not need training. Issues a warning."""
        warnings.warn(
            "LOCF has no parameter to train. "
            "Please run func `predict()` directly."
        )

    def predict(
        self,
        test_set: Union[dict, str],
        file_type: str = "hdf5",
        **kwargs,
    ) -> dict:
        # Handle both dict and file input
        if isinstance(test_set, str):
            with h5py.File(test_set, "r") as f:
                X = f["X"][:]
        else:
            X = test_set["X"]

        assert len(X.shape) == 3, (
            f"Input X should have 3 dimensions "
            f"[n_samples, n_steps, n_features], "
            f"but got shape: {X.shape}"
        )

        # Dispatch on the array type of the input
        if isinstance(X, np.ndarray):
            imputed_data = locf_numpy(X, self.first_step_imputation)
        elif isinstance(X, torch.Tensor):
            imputed_data = locf_torch(X, self.first_step_imputation)
        else:
            # Without this branch, imputed_data would be undefined below
            raise TypeError(
                f"X should be a numpy.ndarray or torch.Tensor, but got {type(X)}"
            )

        result_dict = {
            "imputation": imputed_data,
        }
        return result_dict
This is exactly what a non-NN wrapper should look like: clean, explicit, and contract-driven.
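The wrapper delegates the actual filling to locf_numpy and locf_torch, which the excerpt does not show. As a rough illustration of what such a core function might do, here is a minimal NumPy sketch. The name locf_sketch is hypothetical, it is not the PyPOTS implementation, and it only handles the 'zero' and 'nan' strategies for leading gaps.

```python
import numpy as np


def locf_sketch(X: np.ndarray, first_step_imputation: str = "zero") -> np.ndarray:
    """Carry the last observed value forward along the time axis.

    Hypothetical stand-in for a LOCF core function; NaN marks missing values.
    """
    if X.ndim != 3:
        raise ValueError(f"expected [n_samples, n_steps, n_features], got {X.shape}")
    if first_step_imputation not in ("zero", "nan"):
        raise ValueError("this sketch only supports 'zero' and 'nan'")

    n_steps = X.shape[1]
    observed = ~np.isnan(X)
    # For each position, the index of the most recent observed step (0 if none yet)
    step_idx = np.arange(n_steps).reshape(1, n_steps, 1)
    last_seen = np.where(observed, step_idx, 0)
    last_seen = np.maximum.accumulate(last_seen, axis=1)
    # Gather the value at that step; leading gaps still point at a NaN at step 0
    filled = np.take_along_axis(X, last_seen, axis=1)
    if first_step_imputation == "zero":
        filled = np.where(np.isnan(filled), 0.0, filled)
    return filled
```

Called on a single series [NaN, 1, NaN, 3, NaN] with the 'zero' strategy, the sketch keeps the shape and carries 1 and 3 forward, filling only the leading gap with 0.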
Two Valid Non-NN Styles¶
Stateless Models¶
Examples: LOCF, Mean, Median, Lerp
These models do not learn parameters from data.
fit() is an explicit no-op with a warning.
class StatelessModel(BaseImputer):
    def fit(self, train_set, val_set=None, file_type="hdf5"):
        warnings.warn("This model has no parameters to train.")

    def predict(self, test_set, file_type="hdf5", **kwargs):
        X = test_set["X"]
        imputed_data = self._apply_algorithm(X)
        return {"imputation": imputed_data}
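To make the stateless skeleton concrete, the following sketch fills each missing value with the mean of its feature, computed from the input passed to predict(). BaseImputerStub and SampleMeanImputer are hypothetical names used so the example runs without PyPOTS installed; the real Mean imputer may compute its statistics differently.

```python
import warnings

import numpy as np


class BaseImputerStub:
    """Hypothetical stand-in for BaseImputer so the sketch runs on its own."""

    def impute(self, test_set, file_type="hdf5", **kwargs):
        return self.predict(test_set, file_type=file_type, **kwargs)["imputation"]


class SampleMeanImputer(BaseImputerStub):
    def fit(self, train_set, val_set=None, file_type="hdf5"):
        # Nothing is learned: the statistics come from the data given to predict()
        warnings.warn("This model has no parameters to train.")

    def predict(self, test_set, file_type="hdf5", **kwargs):
        X = np.asarray(test_set["X"], dtype=float)
        # Per-feature mean over samples and steps, ignoring NaN (missing) entries
        feature_means = np.nanmean(X, axis=(0, 1), keepdims=True)
        return {"imputation": np.where(np.isnan(X), feature_means, X)}
```

Because nothing is stored on the instance, calling predict() without ever calling fit() is valid for this style.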
Stateful Models¶
Examples: TRMF, BTTF
These models still do not use the NN training loop, but they do learn
algorithm state in fit().
class StatefulModel(BaseImputer):
    def fit(self, train_set, val_set=None, file_type="hdf5"):
        X = train_set["X"]
        # Learn parameters from training data
        self.learned_params = self._fit_algorithm(X)

    def predict(self, test_set, file_type="hdf5", **kwargs):
        X = test_set["X"]
        imputed_data = self._apply_algorithm(X, self.learned_params)
        return {"imputation": imputed_data}
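The stateful variant can be illustrated the same way. The sketch below is not TRMF or BTTF: it is a deliberately simple, hypothetical imputer that learns per-feature means in fit() and reuses them in predict(). In PyPOTS the class would inherit BaseImputer; here it stands alone so the example is runnable.

```python
import numpy as np


class TrainMeanImputer:
    """Hypothetical stateful imputer: state is learned in fit(), applied in predict()."""

    def __init__(self):
        self.feature_means = None  # set by fit()

    def fit(self, train_set, val_set=None, file_type="hdf5"):
        X = np.asarray(train_set["X"], dtype=float)
        # Learn one mean per feature from the observed training values
        self.feature_means = np.nanmean(X, axis=(0, 1))

    def predict(self, test_set, file_type="hdf5", **kwargs):
        if self.feature_means is None:
            raise RuntimeError("call fit() before predict()")
        X = np.asarray(test_set["X"], dtype=float)
        # Broadcast the learned means over samples and steps
        return {"imputation": np.where(np.isnan(X), self.feature_means, X)}
```

Unlike the stateless case, predict() depends on an earlier fit() call, so failing loudly when the state is missing is a reasonable design choice.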
In both cases, the public contract is the same:
predict() must return the task-level result key (e.g. "imputation").
Step-by-Step Implementation Guide¶
Step 1: Choose the Base Class¶
Inherit the correct non-NN task base:
| Task | Non-NN Base | Result Key |
|---|---|---|
| Imputation | BaseImputer | "imputation" |
| Forecasting | | |
| Classification | | |
| Anomaly Detection | | |
| Clustering | | |
Step 2: Implement fit()¶
Be explicit, even if it only warns:
def fit(self, train_set, val_set=None, file_type="hdf5"):
    """Train the model. For stateless models, this is a no-op."""
    warnings.warn("This model has no parameters to train.")
Step 3: Implement predict()¶
Keep it simple and contract-driven:
def predict(self, test_set, file_type="hdf5", **kwargs):
    # Handle both dict and file input
    if isinstance(test_set, str):
        with h5py.File(test_set, "r") as f:
            X = f["X"][:]
    else:
        X = test_set["X"]

    # Validate input shape
    assert len(X.shape) == 3, (
        f"Input X should have 3 dimensions, got {X.shape}"
    )

    # Apply your algorithm
    imputed_data = your_algorithm(X)
    return {"imputation": imputed_data}
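One design note on the shape check: assert statements are skipped when Python runs with the -O flag, so a wrapper that needs the check in every environment may prefer an explicit exception. A minimal sketch of that alternative, using a hypothetical helper name check_input_shape:

```python
import numpy as np


def check_input_shape(X) -> None:
    """Raise ValueError unless X has shape [n_samples, n_steps, n_features]."""
    shape = tuple(X.shape)
    if len(shape) != 3:
        raise ValueError(
            "Input X should have 3 dimensions "
            f"[n_samples, n_steps, n_features], but got shape: {shape}"
        )
```

The helper only relies on a .shape attribute, so it accepts both NumPy arrays and torch tensors.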
Step 4: Implement Helper Methods¶
Make helper methods like impute() or forecast() return the raw array users expect:
def impute(self, test_set, file_type="hdf5", **kwargs):
    result = self.predict(test_set, file_type, **kwargs)
    return result["imputation"]
Step 5: Wire the Package¶
Same as the standard NN path:
# pypots/imputation/your_model/__init__.py
from .model import YourModel

__all__ = ["YourModel"]
Common Mistake¶
Do not force a non-NN model into BaseNNModel just because most folders around it are neural models.
That usually creates:
Fake hooks that do nothing
Fake optimizers that are never used
Confusing tests with unnecessary training loops
Confusion for maintainers during code review
If there is no gradient, there should be no BaseNNModel.
Definition of Done¶
Your non-NN integration is done when:
The chosen base class matches the real algorithm
fit() behavior is explicit (even if it’s a no-op)
predict() returns the correct task result key
Helper methods return the expected array
Targeted tests cover the advertised input modes (both dict and file input)
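
As a sketch of what such a targeted test might check, the helper below feeds the same array to a model once as a dict and once as an HDF5 file, then compares the results. The function name check_both_input_modes is hypothetical, and the file-input branch only runs when h5py is importable.

```python
import os
import tempfile

import numpy as np

try:
    import h5py  # only needed for the file-input check
except ImportError:
    h5py = None


def check_both_input_modes(model, X: np.ndarray) -> bool:
    """Return True if dict input (and file input, when h5py is present) agree."""
    from_dict = model.predict({"X": X})["imputation"]
    ok = from_dict.shape == X.shape
    if h5py is not None:
        with tempfile.TemporaryDirectory() as tmp:
            path = os.path.join(tmp, "test_set.h5")
            # Write the same array under the "X" key the wrapper reads
            with h5py.File(path, "w") as f:
                f.create_dataset("X", data=X)
            from_file = model.predict(path)["imputation"]
        ok = ok and np.allclose(from_dict, from_file, equal_nan=True)
    return bool(ok)
```

A test function in the project's suite could then assert that this helper returns True for the new model and a small array containing NaN values.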