Non-NN Integration Path

Use this path for models that should not use BaseNNModel at all.

LOCF is the cleanest example.

When This Path Is Correct

Choose the non-NN path when:

  • There is no gradient-based training loop

  • There is no optimizer

  • The model is rule-based, statistical, or algorithmic

  • Wrapping it in a neural-network base class would add fake complexity

Good examples in PyPOTS:

  • LOCF (Last Observation Carried Forward)

  • Mean (fill with mean values)

  • Median (fill with median values)

  • Lerp (linear interpolation)

  • TRMF (Temporal Regularized Matrix Factorization)

  • BTTF (Bayesian Temporal Tensor Factorization)

LOCF as the Reference Pattern

LOCF inherits BaseImputer, not BaseNNImputer.

That choice immediately removes:

  • Optimizer setup

  • _train_model()

  • _assemble_input_* hooks

  • Checkpoint-selection logic tied to NN training

Its implementation is direct and clean:

# pypots/imputation/locf/model.py (simplified)

import warnings
from typing import Union, Optional

import h5py
import numpy as np
import torch

from .core import locf_numpy, locf_torch
from ..base import BaseImputer


class LOCF(BaseImputer):
    """LOCF imputation: fills missing values with the last observed value.

    Parameters
    ----------
    first_step_imputation : str, default='zero'
        Strategy for imputing missing values at the beginning of sequences.
        Can be 'backward', 'zero', 'median', or 'nan'.
    """

    def __init__(
        self,
        first_step_imputation: str = "zero",
        device: Optional[Union[str, torch.device, list]] = None,
    ):
        super().__init__(device=device)
        assert first_step_imputation in ["nan", "zero", "backward", "median"]
        self.first_step_imputation = first_step_imputation

    def fit(
        self,
        train_set: Union[dict, str],
        val_set: Optional[Union[dict, str]] = None,
        file_type: str = "hdf5",
    ) -> None:
        """LOCF does not need training. Issues a warning."""
        warnings.warn(
            "LOCF has no parameter to train. "
            "Please run func `predict()` directly."
        )

    def predict(
        self,
        test_set: Union[dict, str],
        file_type: str = "hdf5",
        **kwargs,
    ) -> dict:
        # Handle both dict and file input
        if isinstance(test_set, str):
            with h5py.File(test_set, "r") as f:
                X = f["X"][:]
        else:
            X = test_set["X"]

        assert len(X.shape) == 3, (
            f"Input X should have 3 dimensions "
            f"[n_samples, n_steps, n_features], "
            f"but got shape: {X.shape}"
        )

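        if isinstance(X, np.ndarray):
            imputed_data = locf_numpy(X, self.first_step_imputation)
        elif isinstance(X, torch.Tensor):
            imputed_data = locf_torch(X, self.first_step_imputation)
        else:
            # Without this branch, imputed_data would be undefined below
            raise TypeError(
                f"X must be a numpy.ndarray or torch.Tensor, but got {type(X)}"
            )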

        result_dict = {
            "imputation": imputed_data,
        }
        return result_dict

This is exactly what a non-NN wrapper should look like: clean, explicit, and contract-driven.
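For intuition about what the wrapper delegates to, here is a minimal NumPy sketch of carrying the last observation forward on a [n_samples, n_steps, n_features] array. It is not PyPOTS's locf_numpy: the name locf_sketch is hypothetical, and only the "zero" first-step strategy is covered.

```python
import numpy as np


def locf_sketch(X) -> np.ndarray:
    """Hypothetical stand-in for illustration; not the PyPOTS implementation.

    Carries the last observed value forward along the time axis (axis 1).
    Leading gaps with no earlier observation are filled with zero.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim != 3:
        raise ValueError(
            f"expected [n_samples, n_steps, n_features], got shape {X.shape}"
        )
    n_steps = X.shape[1]
    observed = ~np.isnan(X)
    # For every position, the index of the most recent observed step
    # (0 if nothing has been observed yet).
    step_idx = np.arange(n_steps).reshape(1, n_steps, 1)
    last_seen = np.where(observed, step_idx, 0)
    last_seen = np.maximum.accumulate(last_seen, axis=1)
    filled = np.take_along_axis(X, last_seen, axis=1)
    # Positions before the first observation still point at a NaN; zero them.
    return np.where(np.isnan(filled), 0.0, filled)
```

The input array is not modified; the function returns a new array of the same shape.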

Two Valid Non-NN Styles

Stateless Models

Examples: LOCF, Mean, Median, Lerp

These models do not learn parameters from data. fit() is an explicit no-op with a warning.

class StatelessModel(BaseImputer):
    def fit(self, train_set, val_set=None, file_type="hdf5"):
        warnings.warn("This model has no parameters to train.")

    def predict(self, test_set, file_type="hdf5", **kwargs):
        X = test_set["X"]
        imputed_data = self._apply_algorithm(X)
        return {"imputation": imputed_data}
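To make the stateless pattern concrete, the sketch below fills each gap with the per-feature mean of the batch being imputed. It is an illustration only: MeanFillSketch is a hypothetical name, it does not subclass BaseImputer so that it runs without PyPOTS installed, and computing the statistic from the input batch itself is an assumption rather than a description of PyPOTS's Mean.

```python
import warnings

import numpy as np


class MeanFillSketch:
    """Hypothetical stateless imputer; in PyPOTS this would subclass BaseImputer."""

    def fit(self, train_set, val_set=None, file_type="hdf5"):
        # Nothing to learn: everything predict() needs comes from its input.
        warnings.warn("MeanFillSketch has no parameters to train.")

    def predict(self, test_set, file_type="hdf5", **kwargs):
        X = np.asarray(test_set["X"], dtype=float)
        # Mean of each feature over all samples and steps, ignoring NaNs.
        feature_means = np.nanmean(X, axis=(0, 1), keepdims=True)
        imputed = np.where(np.isnan(X), feature_means, X)
        return {"imputation": imputed}
```

Because no state survives between calls, predict() gives the same answer whether or not fit() was ever called. A feature that is missing everywhere has no mean and stays NaN.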

Stateful Models

Examples: TRMF, BTTF

These models still do not use the NN training loop, but they do learn algorithm state in fit().

class StatefulModel(BaseImputer):
    def fit(self, train_set, val_set=None, file_type="hdf5"):
        X = train_set["X"]
        # Learn parameters from training data
        self.learned_params = self._fit_algorithm(X)

    def predict(self, test_set, file_type="hdf5", **kwargs):
        X = test_set["X"]
        imputed_data = self._apply_algorithm(X, self.learned_params)
        return {"imputation": imputed_data}
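One consequence of the stateful style is that predict() depends on fit() having run. A minimal sketch of guarding that dependency follows. FittedMeanSketch is a hypothetical class that stands in for a real algorithm such as TRMF, and it does not subclass BaseImputer so that it runs without PyPOTS installed.

```python
import numpy as np


class FittedMeanSketch:
    """Hypothetical stateful imputer; in PyPOTS this would subclass BaseImputer."""

    def __init__(self):
        self.learned_params = None  # set by fit()

    def fit(self, train_set, val_set=None, file_type="hdf5"):
        X = np.asarray(train_set["X"], dtype=float)
        # State learned from the training data only: one mean per feature.
        self.learned_params = np.nanmean(X, axis=(0, 1))

    def predict(self, test_set, file_type="hdf5", **kwargs):
        if self.learned_params is None:
            raise RuntimeError("Call fit() before predict().")
        X = np.asarray(test_set["X"], dtype=float)
        # learned_params has shape [n_features] and broadcasts over X.
        imputed = np.where(np.isnan(X), self.learned_params, X)
        return {"imputation": imputed}
```

Failing loudly here is a design choice: the alternative is an AttributeError from deep inside the algorithm, which tells the user much less.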

In both cases, the public contract is the same: predict() must return a dict that contains the task-level result key (e.g. "imputation").

Step-by-Step Implementation Guide

Step 1: Choose the Base Class

Inherit the correct non-NN task base:

Task               Non-NN Base      Result Key
Imputation         BaseImputer      "imputation"
Forecasting        BaseForecaster   "forecasting"
Classification     BaseClassifier   "classification"
Anomaly Detection  BaseDetector     "anomaly_detection"
Clustering         BaseClusterer    "clustering"

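The result-key contract can also be checked mechanically. The sketch below copies the keys from the table in this step. RESULT_KEYS, check_result, and the lowercase task names used as dictionary keys are all hypothetical and not part of PyPOTS.

```python
# Result keys copied from the Step 1 table; the task names are illustrative.
RESULT_KEYS = {
    "imputation": "imputation",
    "forecasting": "forecasting",
    "classification": "classification",
    "anomaly_detection": "anomaly_detection",
    "clustering": "clustering",
}


def check_result(task: str, result) -> None:
    """Hypothetical helper: raise if a predict() result breaks the contract."""
    if not isinstance(result, dict):
        raise TypeError(f"predict() must return a dict, got {type(result).__name__}")
    key = RESULT_KEYS[task]
    if key not in result:
        raise KeyError(f"predict() result for task {task!r} is missing key {key!r}")
```

A helper like this fits naturally in a unit test, for example check_result("imputation", model.predict(test_set)).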
Step 2: Implement fit()

Be explicit, even if it only warns:

def fit(self, train_set, val_set=None, file_type="hdf5"):
    """Train the model. For stateless models, this is a no-op."""
    warnings.warn("This model has no parameters to train.")
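Because the warning is the only observable effect of a no-op fit(), it is worth pinning down in a test. The sketch below uses only the standard library. NoOpFitSketch and fit_warns are hypothetical names introduced for illustration.

```python
import warnings


class NoOpFitSketch:
    """Hypothetical model whose fit() only warns."""

    def fit(self, train_set, val_set=None, file_type="hdf5"):
        warnings.warn("This model has no parameters to train.")


def fit_warns(model) -> bool:
    """Return True if model.fit() emits at least one warning."""
    with warnings.catch_warnings(record=True) as caught:
        # "always" stops Python from suppressing repeated warnings.
        warnings.simplefilter("always")
        model.fit(train_set={"X": None})
    return len(caught) > 0
```

In a pytest suite, pytest.warns(UserWarning) expresses the same check more compactly.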

Step 3: Implement predict()

Keep it simple and contract-driven:

def predict(self, test_set, file_type="hdf5", **kwargs):
    # Handle both dict and file input
    if isinstance(test_set, str):
        with h5py.File(test_set, "r") as f:
            X = f["X"][:]
    else:
        X = test_set["X"]

    # Validate input shape
    assert len(X.shape) == 3, (
        f"Input X should have 3 dimensions, got {X.shape}"
    )

    # Apply your algorithm
    imputed_data = your_algorithm(X)

    return {"imputation": imputed_data}
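The loading and validation steps above can be factored into one helper. This is a sketch under assumptions: load_X is a hypothetical name, it converts dict input to a NumPy array (so a torch.Tensor would not be preserved), and it raises explicit exceptions instead of using assert, since assert statements are skipped when Python runs with -O.

```python
import numpy as np


def load_X(test_set, file_type: str = "hdf5") -> np.ndarray:
    """Hypothetical helper: return X as a 3-D array from a dict or an HDF5 path."""
    if isinstance(test_set, str):
        if file_type != "hdf5":
            raise ValueError(f"unsupported file_type: {file_type!r}")
        import h5py  # imported lazily so dict-only callers do not need it

        with h5py.File(test_set, "r") as f:
            X = f["X"][:]
    elif isinstance(test_set, dict):
        if "X" not in test_set:
            raise KeyError("test_set must contain the key 'X'")
        X = np.asarray(test_set["X"])
    else:
        raise TypeError(
            f"test_set must be a dict or a file path, got {type(test_set).__name__}"
        )

    if X.ndim != 3:
        raise ValueError(
            "Input X should have 3 dimensions "
            f"[n_samples, n_steps, n_features], but got shape: {X.shape}"
        )
    return X
```

With this in place, predict() reduces to X = load_X(test_set, file_type) followed by the algorithm call.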

Step 4: Implement Helper Methods

Make helper methods like impute() or forecast() return the raw array users expect:

def impute(self, test_set, file_type="hdf5", **kwargs):
    # Delegate to predict() and unwrap the task-level result key
    result = self.predict(test_set, file_type=file_type, **kwargs)
    return result["imputation"]

Step 5: Wire the Package

Export the model class from its package, exactly as on the standard NN path:

# pypots/imputation/your_model/__init__.py
from .model import YourModel
__all__ = ["YourModel"]

Common Mistake

Do not force a non-NN model into BaseNNModel just because most folders around it are neural models.

That usually creates:

  • Fake hooks that do nothing

  • Fake optimizers that are never used

  • Confusing tests with unnecessary training loops

  • Review confusion for maintainers

If there is no gradient, there should be no BaseNNModel.

Definition of Done

Your non-NN integration is done when:

  • The chosen base class matches the real algorithm

  • fit() behavior is explicit (even if it’s a no-op)

  • predict() returns the correct task result key

  • Helper methods return the expected array

  • Targeted tests cover the advertised input modes (both dict and file input)