All APIs of BenchPOTS

BenchPOTS

benchpots.datasets

benchpots.datasets.preprocess_physionet2012(subset, rate, pattern='point', features=None, **kwargs)

Load and preprocess the PhysioNet2012 dataset.

Parameters:
  • subset – The name of the subset dataset to be loaded. Must be one of [‘all’, ‘set-a’, ‘set-b’, ‘set-c’].

  • rate – The missing rate.

  • pattern (str) – The missing pattern to apply to the dataset. Must be one of [‘point’, ‘subseq’, ‘block’].

  • features (Optional[list]) – The features to be used in the dataset. If None, all features except the static features will be used.

Returns:

processed_dataset – A dictionary containing the processed PhysioNet2012 dataset.

Return type:

dict
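A minimal usage sketch, assuming `benchpots` is installed and the dataset can be fetched on first use. The helper names `check_args` and `load_physionet2012` are hypothetical; only `preprocess_physionet2012` and its parameters come from the signature above. The bounds check on `rate` is an assumption, borrowed from the `[0, 1)` range documented for `missing_rate` in `preprocess_random_walk`.

```python
# Allowed values copied from the parameter list above.
PHYSIONET2012_SUBSETS = ("all", "set-a", "set-b", "set-c")
MISSING_PATTERNS = ("point", "subseq", "block")


def check_args(subset, rate, pattern="point"):
    """Hypothetical helper: fail early on values the docs rule out."""
    if subset not in PHYSIONET2012_SUBSETS:
        raise ValueError(f"subset must be one of {PHYSIONET2012_SUBSETS}, got {subset!r}")
    if pattern not in MISSING_PATTERNS:
        raise ValueError(f"pattern must be one of {MISSING_PATTERNS}, got {pattern!r}")
    # Assumption: rate is a fraction in [0, 1), as documented for random walk.
    if not 0 <= rate < 1:
        raise ValueError(f"rate should be in [0, 1), got {rate!r}")


def load_physionet2012(subset="set-a", rate=0.1, pattern="point", features=None):
    """Hypothetical wrapper around the documented function."""
    check_args(subset, rate, pattern)
    # Imported lazily so check_args stays usable without benchpots installed.
    from benchpots.datasets import preprocess_physionet2012

    return preprocess_physionet2012(
        subset=subset, rate=rate, pattern=pattern, features=features
    )
```

Calling `load_physionet2012()` would return the dictionary described under Returns; its keys are not listed on this page, so inspect them with `.keys()` rather than assuming names.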

benchpots.datasets.preprocess_physionet2019(subset, rate, pattern='point', features=None, **kwargs)

Load and preprocess the PhysioNet2019 dataset.

Parameters:
  • subset – The name of the subset dataset to be loaded. Must be one of [‘all’, ‘training_setA’, ‘training_setB’].

  • rate – The missing rate.

  • pattern (str) – The missing pattern to apply to the dataset. Must be one of [‘point’, ‘subseq’, ‘block’].

  • features (Optional[list]) – The features to be used in the dataset. If None, all features except the static features will be used.

Returns:

processed_dataset – A dictionary containing the processed PhysioNet2019 dataset.

Return type:

dict

benchpots.datasets.preprocess_beijing_air_quality(rate, n_steps, pattern='point', **kwargs)

Load and preprocess the dataset Beijing Multi-site Air Quality.

Parameters:
  • rate – The missing rate.

  • n_steps – The number of time steps in each generated data sample, which is also the size of the sliding window.

  • pattern (str) – The missing pattern to apply to the dataset. Must be one of [‘point’, ‘subseq’, ‘block’].

Returns:

processed_dataset – A dictionary containing the processed Beijing Multi-site Air Quality dataset.

Return type:

dict
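The role of `n_steps` as a window size can be illustrated with a small pure-Python sketch. This is not the library's implementation: the function name `sliding_windows` and the `stride` parameter are hypothetical, and defaulting the stride to `n_steps` (non-overlapping windows) is an assumption.

```python
def sliding_windows(series, n_steps, stride=None):
    """Cut a sequence of time steps into windows of length n_steps.

    Assumption: stride defaults to n_steps, giving non-overlapping windows.
    Trailing time steps that do not fill a whole window are dropped.
    """
    if n_steps <= 0:
        raise ValueError("n_steps must be positive")
    if stride is None:
        stride = n_steps
    if stride <= 0:
        raise ValueError("stride must be positive")
    windows = []
    for start in range(0, len(series) - n_steps + 1, stride):
        windows.append(series[start:start + n_steps])
    return windows


hourly = list(range(10))  # stand-in for ten consecutive readings
print(sliding_windows(hourly, n_steps=4))
# [[0, 1, 2, 3], [4, 5, 6, 7]]
print(sliding_windows(hourly, n_steps=4, stride=2))
# [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7], [6, 7, 8, 9]]
```

Each window becomes one sample, so a larger `n_steps` yields fewer but longer samples from the same series.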

benchpots.datasets.preprocess_italy_air_quality(rate, n_steps, pattern='point', **kwargs)

Load and preprocess the dataset Italy Air Quality.

Parameters:
  • rate – The missing rate.

  • n_steps – The number of time steps in each generated data sample, which is also the size of the sliding window.

  • pattern (str) – The missing pattern to apply to the dataset. Must be one of [‘point’, ‘subseq’, ‘block’].

Returns:

processed_dataset – A dictionary containing the processed Italy Air Quality dataset.

Return type:

dict
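To make the `rate` and `pattern='point'` parameters concrete, here is a sketch of what a point missing pattern is commonly taken to mean: every value is masked independently with probability `rate`. This page does not define the patterns, so treat this reading as an assumption; `apply_point_missing` is a hypothetical helper, not part of `benchpots`.

```python
import math
import random


def apply_point_missing(sample, rate, seed=None):
    """Mask each value independently with probability `rate`.

    `sample` is a list of time steps, each a list of feature values.
    Returns a new nested list in which masked entries are NaN.
    """
    if not 0 <= rate < 1:
        raise ValueError("rate should be in [0, 1)")
    rng = random.Random(seed)  # seeded only to make the demo repeatable
    return [
        [math.nan if rng.random() < rate else value for value in step]
        for step in sample
    ]


# 100 time steps with 3 features each.
sample = [[float(t + f) for f in range(3)] for t in range(100)]
masked = apply_point_missing(sample, rate=0.3, seed=0)
n_missing = sum(math.isnan(v) for step in masked for v in step)
print(f"{n_missing} of {100 * 3} values masked")
```

The masked fraction only approximates `rate`, because each value is drawn independently.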

benchpots.datasets.preprocess_electricity_load_diagrams(rate, n_steps, pattern='point', **kwargs)

Load and preprocess the dataset Electricity Load Diagrams.

Parameters:
  • rate – The missing rate.

  • n_steps – The number of time steps in each generated data sample, which is also the size of the sliding window.

  • pattern (str) – The missing pattern to apply to the dataset. Must be one of [‘point’, ‘subseq’, ‘block’].

Returns:

processed_dataset – A dictionary containing the processed Electricity Load Diagrams dataset.

Return type:

dict
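The `'subseq'` pattern is not defined on this page. One plausible reading, shown here purely as an assumption, is that a run of consecutive time steps is masked within a single feature. The helper `mask_subsequence` and its parameters are hypothetical and are not the library's API.

```python
import math


def mask_subsequence(sample, feature, start, seq_len):
    """Return a copy of `sample` with `seq_len` consecutive time steps
    of one feature set to NaN, beginning at time step `start`.

    `sample` is a list of time steps, each a list of feature values.
    The run is truncated if it would extend past the end of the sample.
    """
    masked = [step[:] for step in sample]  # copy so the input stays intact
    for t in range(start, min(start + seq_len, len(masked))):
        masked[t][feature] = math.nan
    return masked


# 6 time steps with 2 features each.
sample = [[float(t), float(t) * 10] for t in range(6)]
masked = mask_subsequence(sample, feature=1, start=2, seq_len=3)
for step in masked:
    print(step)
```

In this demo, feature 1 is NaN at time steps 2, 3, and 4, while feature 0 is untouched.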

benchpots.datasets.preprocess_ett(subset, rate, n_steps, pattern='point', **kwargs)

Load and preprocess the dataset ETT.

Parameters:
  • subset – The name of the subset dataset to be loaded. Must be one of [‘ETTm1’, ‘ETTm2’, ‘ETTh1’, ‘ETTh2’].

  • rate – The missing rate.

  • n_steps – The number of time steps in each generated data sample, which is also the size of the sliding window.

  • pattern (str) – The missing pattern to apply to the dataset. Must be one of [‘point’, ‘subseq’, ‘block’].

Returns:

processed_dataset – A dictionary containing the processed ETT dataset.

Return type:

dict
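Because ETT has four subsets, a common need is to preprocess all of them with the same settings. The sketch below assumes `benchpots` is installed; `ett_call_kwargs` and `preprocess_all_ett` are hypothetical helpers, and only `preprocess_ett` and its parameters come from the signature above.

```python
# Subset names copied from the parameter list above.
ETT_SUBSETS = ("ETTm1", "ETTm2", "ETTh1", "ETTh2")


def ett_call_kwargs(rate, n_steps, pattern="point"):
    """Build one keyword-argument dict per ETT subset."""
    return [
        {"subset": subset, "rate": rate, "n_steps": n_steps, "pattern": pattern}
        for subset in ETT_SUBSETS
    ]


def preprocess_all_ett(rate, n_steps, pattern="point"):
    """Hypothetical helper: map each subset name to its processed dict."""
    # Imported lazily so ett_call_kwargs works without benchpots installed.
    from benchpots.datasets import preprocess_ett

    return {
        kwargs["subset"]: preprocess_ett(**kwargs)
        for kwargs in ett_call_kwargs(rate, n_steps, pattern)
    }


for kwargs in ett_call_kwargs(rate=0.1, n_steps=48):
    print(kwargs)
```

The values `rate=0.1` and `n_steps=48` are illustrative only, not recommended settings.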

benchpots.datasets.preprocess_pems_traffic(rate, n_steps, pattern='point', **kwargs)

Load and preprocess the dataset PeMS traffic.

Parameters:
  • rate – The missing rate.

  • n_steps – The number of time steps in each generated data sample, which is also the size of the sliding window.

  • pattern (str) – The missing pattern to apply to the dataset. Must be one of [‘point’, ‘subseq’, ‘block’].

Returns:

processed_dataset – A dictionary containing the processed PeMS traffic dataset.

Return type:

dict

benchpots.datasets.preprocess_ucr_uea_datasets(dataset_name, rate, pattern='point', **kwargs)

Load and preprocess a dataset from the UCR & UEA collection.

Parameters:
  • dataset_name – The name of the UCR_UEA dataset to be loaded. Must start with ‘ucr_uea_’. Use tsdb.list() to get all available datasets.

  • rate – The missing rate.

  • pattern (str) – The missing pattern to apply to the dataset. Must be one of [‘point’, ‘subseq’, ‘block’].

Returns:

processed_dataset – A dictionary containing the processed UCR & UEA dataset.

Return type:

dict
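The `'ucr_uea_'` prefix rule and the `tsdb.list()` hint can be combined as follows. This is a sketch assuming `tsdb` and `benchpots` are installed; the helper names are hypothetical, and `tsdb.list()` is assumed to return an iterable of dataset name strings.

```python
UCR_UEA_PREFIX = "ucr_uea_"


def ucr_uea_names(all_names):
    """Keep only the names that satisfy the documented prefix rule."""
    return sorted(name for name in all_names if name.startswith(UCR_UEA_PREFIX))


def list_ucr_uea_datasets():
    """Hypothetical helper: query tsdb for the valid dataset names."""
    import tsdb  # imported lazily; requires the tsdb package

    return ucr_uea_names(tsdb.list())


def load_ucr_uea(dataset_name, rate, pattern="point"):
    """Hypothetical wrapper that checks the prefix before loading."""
    if not dataset_name.startswith(UCR_UEA_PREFIX):
        raise ValueError(f"dataset_name must start with {UCR_UEA_PREFIX!r}")
    from benchpots.datasets import preprocess_ucr_uea_datasets

    return preprocess_ucr_uea_datasets(dataset_name, rate, pattern=pattern)
```

Checking the prefix first gives a clear error before anything is downloaded.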

benchpots.datasets.preprocess_solar_alabama(rate, n_steps, pattern='point', **kwargs)

Load and preprocess the dataset Solar Alabama.

Parameters:
  • rate – The missing rate.

  • n_steps – The number of time steps in each generated data sample, which is also the size of the sliding window.

  • pattern (str) – The missing pattern to apply to the dataset. Must be one of [‘point’, ‘subseq’, ‘block’].

Returns:

processed_dataset – A dictionary containing the processed Solar Alabama dataset.

Return type:

dict

benchpots.datasets.preprocess_random_walk(n_steps=24, n_features=10, n_classes=2, n_samples_each_class=1000, missing_rate=0.1, pattern='point', **kwargs)

Generate random-walk data.

Parameters:
  • n_steps (int, default=24) – Number of time steps in each sample.

  • n_features (int, default=10) – Number of features.

  • n_classes (int, default=2) – Number of classes (types) of the generated data.

  • n_samples_each_class (int, default=1000) – Number of samples for each class to generate.

  • missing_rate (float, default=0.1) – The rate of randomly missing values to generate. Should be in [0, 1).

  • pattern (str) – The missing pattern to apply to the dataset. Must be one of [‘point’, ‘subseq’, ‘block’].

Returns:

data – A dictionary containing the generated data.

Return type:

dict
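A usage sketch for the generator, assuming `benchpots` is installed. The helpers `expected_n_samples` and `make_random_walk` are hypothetical; the defaults and the `[0, 1)` range for `missing_rate` are taken from the parameter list above.

```python
# Defaults copied from the documented signature.
RANDOM_WALK_DEFAULTS = {
    "n_steps": 24,
    "n_features": 10,
    "n_classes": 2,
    "n_samples_each_class": 1000,
    "missing_rate": 0.1,
    "pattern": "point",
}


def expected_n_samples(n_classes=2, n_samples_each_class=1000):
    """Total samples generated: one batch of samples per class."""
    return n_classes * n_samples_each_class


def make_random_walk(**overrides):
    """Hypothetical wrapper: validate overrides, then call the library."""
    unknown = set(overrides) - set(RANDOM_WALK_DEFAULTS)
    if unknown:
        raise TypeError(f"unknown parameters: {sorted(unknown)}")
    params = {**RANDOM_WALK_DEFAULTS, **overrides}
    if not 0 <= params["missing_rate"] < 1:
        raise ValueError("missing_rate should be in [0, 1)")
    # Imported lazily so the checks above run without benchpots installed.
    from benchpots.datasets import preprocess_random_walk

    return preprocess_random_walk(**params)


print(expected_n_samples())  # 2000 with the documented defaults
```

With the defaults, 2 classes of 1000 samples each give 2000 generated samples, each with 24 time steps and 10 features.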