pypots.nn package#

pypots.nn.functional#

pypots.nn.functional.nonstationary_norm(X, missing_mask=None)[source]#

Normalization from Non-stationary Transformer. Please refer to [22] for more details.

Parameters:
  • X (torch.Tensor) – Input data to be normalized. Shape: (n_samples, n_steps (seq_len), n_features).

  • missing_mask (torch.Tensor, optional) – Missing mask has the same shape as X. 1 indicates observed and 0 indicates missing.

Return type:

Tuple[Tensor, Tensor, Tensor]

Returns:

  • X_enc (torch.Tensor) – Normalized data. Shape: (n_samples, n_steps (seq_len), n_features).

  • means (torch.Tensor) – Mean values for de-normalization. Shape: (n_samples, n_features) or (n_samples, 1, n_features).

  • stdev (torch.Tensor) – Standard deviation values for de-normalization. Shape: (n_samples, n_features) or (n_samples, 1, n_features).

pypots.nn.functional.nonstationary_denorm(X, means, stdev)[source]#

De-Normalization from Non-stationary Transformer. Please refer to [22] for more details.

Parameters:
  • X (torch.Tensor) – Input data to be de-normalized. Shape: (n_samples, n_steps (seq_len), n_features).

  • means (torch.Tensor) – Mean values for de-normalization. Shape: (n_samples, n_features) or (n_samples, 1, n_features).

  • stdev (torch.Tensor) – Standard deviation values for de-normalization. Shape: (n_samples, n_features) or (n_samples, 1, n_features).

Returns:

X_denorm – De-normalized data. Shape: (n_samples, n_steps (seq_len), n_features).

Return type:

torch.Tensor
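
A minimal usage sketch of the two functions above, on toy data with random missingness (the shapes and missing rate are arbitrary choices for illustration):

    import torch
    from pypots.nn.functional import nonstationary_norm, nonstationary_denorm

    # Toy batch: 8 samples, 24 time steps, 5 features, with roughly 20% values missing.
    X = torch.randn(8, 24, 5)
    missing_mask = (torch.rand(8, 24, 5) > 0.2).float()  # 1 = observed, 0 = missing
    X = X * missing_mask  # zero out the missing positions

    # Normalize using statistics over the observed values, then invert the
    # normalization with the returned means and stdev.
    X_enc, means, stdev = nonstationary_norm(X, missing_mask)
    X_denorm = nonstationary_denorm(X_enc, means, stdev)
    # X_denorm should match X at the observed positions up to floating-point error.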

pypots.nn.modules.rnn#

The implementation of some commonly-used modules related to RNNs.

class pypots.nn.modules.rnn.TemporalDecay(input_size, output_size, diag=False)[source]#

The module used to generate the temporal decay factor gamma in the GRU-D model. Please refer to the original paper [16] for more details.

Attributes:
  • W (tensor) – The weights (parameters) of the module.

  • b (tensor) – The bias of the module.

Parameters:
  • input_size (int) – The feature dimension of the input.

  • output_size (int) – The feature dimension of the output.

  • diag (bool) – Whether to multiply the weight with an identity matrix (keeping only its diagonal) before forward processing.

forward(delta)[source]#

Forward processing of this NN module.

Parameters:

delta (tensor, shape [n_samples, n_steps, n_features]) – The time gaps.

Returns:

gamma – The temporal decay factor.

Return type:

tensor, of the same shape as delta, with values in (0, 1]
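
A minimal usage sketch, assuming toy time gaps with 5 features; with diag=True the weight is restricted to its diagonal, so input_size and output_size are kept equal here:

    import torch
    from pypots.nn.modules.rnn import TemporalDecay

    # Time gaps (deltas) for 8 samples, 24 steps, 5 features,
    # e.g. as produced by a GRU-D-style data pipeline.
    delta = torch.rand(8, 24, 5)

    temp_decay = TemporalDecay(input_size=5, output_size=5, diag=True)
    gamma = temp_decay(delta)  # same shape as delta, values in (0, 1]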

pypots.nn.modules.transformer#

class pypots.nn.modules.transformer.ScaledDotProductAttention(temperature, attn_dropout=0.1)[source]#

Scaled dot-product attention.

Parameters:
  • temperature (float) – The temperature for scaling.

  • attn_dropout (float) – The dropout rate for the attention map.

forward(q, k, v, attn_mask=None, **kwargs)[source]#

Forward processing of the scaled dot-product attention.

Parameters:
  • q (Tensor) – Query tensor.

  • k (Tensor) – Key tensor.

  • v (Tensor) – Value tensor.

  • attn_mask (Optional[Tensor]) – Masking tensor for the attention map. The shape should be [batch_size, n_heads, n_steps, n_steps]. 0 in attn_mask means the values at the corresponding positions in the attention map will be masked out.

Return type:

Tuple[Tensor, Tensor]

Returns:

  • output – The result of multiplying the value tensor with the scaled dot-product attention map.

  • attn – The scaled dot-product attention map.
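
A minimal usage sketch, assuming q, k, and v are laid out as [batch_size, n_heads, n_steps, d_k] (d_v for v), consistent with the documented attn_mask shape, and that temperature is set to sqrt(d_k) as in the original Transformer:

    import torch
    from pypots.nn.modules.transformer import ScaledDotProductAttention

    batch_size, n_heads, n_steps, d_k, d_v = 8, 4, 24, 16, 16
    q = torch.randn(batch_size, n_heads, n_steps, d_k)
    k = torch.randn(batch_size, n_heads, n_steps, d_k)
    v = torch.randn(batch_size, n_heads, n_steps, d_v)

    attention = ScaledDotProductAttention(temperature=d_k**0.5, attn_dropout=0.1)
    output, attn = attention(q, k, v)  # attn: [batch_size, n_heads, n_steps, n_steps]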

class pypots.nn.modules.transformer.MultiHeadAttention(n_heads, d_model, d_k, d_v, attention_operator)[source]#

Transformer multi-head attention module.

Parameters:
  • n_heads (int) – The number of heads in multi-head attention.

  • d_model (int) – The dimension of the input tensor.

  • d_k (int) – The dimension of the key and query tensor.

  • d_v (int) – The dimension of the value tensor.

  • attention_operator (AttentionOperator) – The attention operator, e.g. the self-attention proposed in Transformer.

forward(q, k, v, attn_mask, **kwargs)[source]#

Forward processing of the multi-head attention module.

Parameters:
  • q (Tensor) – Query tensor.

  • k (Tensor) – Key tensor.

  • v (Tensor) – Value tensor.

  • attn_mask (Optional[Tensor]) – Masking tensor for the attention map. The shape should be [batch_size, n_heads, n_steps, n_steps]. 0 in attn_mask means the values at the corresponding positions in the attention map will be masked out.

Return type:

Tuple[Tensor, Tensor]

Returns:

  • v – The output of the multi-head attention layer.

  • attn_weights – The attention map.
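
A minimal self-attention sketch, assuming q, k, and v enter as [batch_size, n_steps, d_model] and are projected per head inside the module; the ScaledDotProductAttention above is used as the attention operator, with temperature sqrt(d_k) as an assumption borrowed from the original Transformer:

    import torch
    from pypots.nn.modules.transformer import MultiHeadAttention, ScaledDotProductAttention

    batch_size, n_steps, d_model, n_heads, d_k, d_v = 8, 24, 64, 4, 16, 16
    x = torch.randn(batch_size, n_steps, d_model)

    attn_operator = ScaledDotProductAttention(temperature=d_k**0.5, attn_dropout=0.1)
    mha = MultiHeadAttention(n_heads, d_model, d_k, d_v, attn_operator)

    # Self-attention: query, key, and value are all the same sequence; no masking here.
    v_out, attn_weights = mha(x, x, x, attn_mask=None)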

class pypots.nn.modules.transformer.PositionalEncoding(d_hid, n_positions=1000)[source]#

The original positional-encoding module for Transformer.

Parameters:
  • d_hid (int) – The dimension of the hidden layer.

  • n_positions (int) – The max number of positions.

forward(x, return_only_pos=False)[source]#

Forward processing of the positional encoding module.

Parameters:
  • x (Tensor) – Input tensor.

  • return_only_pos (bool) – Whether to return only the positional encoding.

Return type:

Tensor

Returns:

  • pos_enc – The positional encoding, returned if return_only_pos is True.

  • x_with_pos – The input tensor with the positional encoding added, returned otherwise.
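
A minimal usage sketch on a toy input of shape [batch_size, n_steps, d_hid]:

    import torch
    from pypots.nn.modules.transformer import PositionalEncoding

    d_hid, n_steps = 64, 24
    x = torch.randn(8, n_steps, d_hid)

    pos_encoder = PositionalEncoding(d_hid=d_hid, n_positions=1000)
    x_with_pos = pos_encoder(x)                      # input with positional encoding added
    pos_only = pos_encoder(x, return_only_pos=True)  # only the positional encoding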

class pypots.nn.modules.transformer.EncoderLayer(d_model, d_ffn, n_heads, d_k, d_v, slf_attn_opt, dropout=0.1)[source]#

Transformer encoder layer.

Parameters:
  • d_model (int) – The dimension of the input tensor.

  • d_ffn (int) – The dimension of the hidden layer.

  • n_heads (int) – The number of heads in multi-head attention.

  • d_k (int) – The dimension of the key and query tensor.

  • d_v (int) – The dimension of the value tensor.

  • slf_attn_opt (AttentionOperator) – The attention operator for the self multi-head attention module in the encoder layer.

  • dropout (float) – The dropout rate.

forward(enc_input, src_mask=None, **kwargs)[source]#

Forward processing of the encoder layer.

Parameters:
  • enc_input (Tensor) – Input tensor.

  • src_mask (Optional[Tensor]) – Masking tensor for the attention map. The shape should be [batch_size, n_heads, n_steps, n_steps].

Return type:

Tuple[Tensor, Tensor]

Returns:

  • enc_output – Output tensor.

  • attn_weights – The attention map.
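
A minimal usage sketch with a ScaledDotProductAttention operator for self-attention (temperature sqrt(d_k) is an assumption borrowed from the original Transformer) and no source mask:

    import torch
    from pypots.nn.modules.transformer import EncoderLayer, ScaledDotProductAttention

    batch_size, n_steps, d_model, d_ffn, n_heads, d_k, d_v = 8, 24, 64, 128, 4, 16, 16
    x = torch.randn(batch_size, n_steps, d_model)

    slf_attn_opt = ScaledDotProductAttention(temperature=d_k**0.5, attn_dropout=0.1)
    enc_layer = EncoderLayer(d_model, d_ffn, n_heads, d_k, d_v, slf_attn_opt, dropout=0.1)

    enc_output, attn_weights = enc_layer(x)  # src_mask defaults to None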

class pypots.nn.modules.transformer.DecoderLayer(d_model, d_ffn, n_heads, d_k, d_v, slf_attn_opt, enc_attn_opt, dropout=0.1)[source]#

Transformer decoder layer.

Parameters:
  • d_model (int) – The dimension of the input tensor.

  • d_ffn (int) – The dimension of the hidden layer.

  • n_heads (int) – The number of heads in multi-head attention.

  • d_k (int) – The dimension of the key and query tensor.

  • d_v (int) – The dimension of the value tensor.

  • slf_attn_opt (AttentionOperator) – The attention operator for the self multi-head attention module in the decoder layer.

  • enc_attn_opt (AttentionOperator) – The attention operator for the encoding multi-head attention module in the decoder layer.

  • dropout (float) – The dropout rate.

forward(dec_input, enc_output, slf_attn_mask=None, dec_enc_attn_mask=None, **kwargs)[source]#

Forward processing of the decoder layer.

Parameters:
  • dec_input (Tensor) – Input tensor.

  • enc_output (Tensor) – Output tensor from the encoder.

  • slf_attn_mask (Optional[Tensor]) – Masking tensor for the self-attention module. The shape should be [batch_size, n_heads, n_steps, n_steps].

  • dec_enc_attn_mask (Optional[Tensor]) – Masking tensor for the encoding attention module. The shape should be [batch_size, n_heads, n_steps, n_steps].

Return type:

Tuple[Tensor, Tensor, Tensor]

Returns:

  • dec_output – Output tensor.

  • dec_slf_attn – The self-attention map.

  • dec_enc_attn – The encoding attention map.
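
A minimal usage sketch; the encoder output here is a random stand-in, and both attention operators are ScaledDotProductAttention instances with the assumed temperature sqrt(d_k):

    import torch
    from pypots.nn.modules.transformer import DecoderLayer, ScaledDotProductAttention

    batch_size, n_steps, d_model, d_ffn, n_heads, d_k, d_v = 8, 24, 64, 128, 4, 16, 16
    dec_input = torch.randn(batch_size, n_steps, d_model)
    enc_output = torch.randn(batch_size, n_steps, d_model)  # stand-in for a real encoder output

    slf_attn_opt = ScaledDotProductAttention(temperature=d_k**0.5, attn_dropout=0.1)
    enc_attn_opt = ScaledDotProductAttention(temperature=d_k**0.5, attn_dropout=0.1)
    dec_layer = DecoderLayer(d_model, d_ffn, n_heads, d_k, d_v,
                             slf_attn_opt, enc_attn_opt, dropout=0.1)

    dec_output, dec_slf_attn, dec_enc_attn = dec_layer(dec_input, enc_output)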

class pypots.nn.modules.transformer.PositionWiseFeedForward(d_in, d_hid, dropout=0.1)[source]#

Position-wise feed forward network (FFN) in Transformer.

Parameters:
  • d_in (int) – The dimension of the input tensor.

  • d_hid (int) – The dimension of the hidden layer.

  • dropout (float) – The dropout rate.

forward(x)[source]#

Forward processing of the position-wise feed forward network.

Parameters:

x (Tensor) – Input tensor.

Returns:

x – Output tensor.

Return type:

Tensor
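
A minimal usage sketch; the FFN is applied to every time step independently, and the toy shapes below are arbitrary:

    import torch
    from pypots.nn.modules.transformer import PositionWiseFeedForward

    x = torch.randn(8, 24, 64)  # [batch_size, n_steps, d_in]
    ffn = PositionWiseFeedForward(d_in=64, d_hid=128, dropout=0.1)
    out = ffn(x)  # output has the same shape as the input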

class pypots.nn.modules.transformer.Encoder(n_layers, n_steps, n_features, d_model, d_ffn, n_heads, d_k, d_v, dropout, attn_dropout)[source]#

Transformer encoder.

Parameters:
  • n_layers (int) – The number of layers in the encoder.

  • n_steps (int) – The number of time steps in the input tensor.

  • n_features (int) – The number of features in the input tensor.

  • d_model (int) – The dimension of the module manipulation space. The input tensor will be projected to a space with d_model dimensions.

  • d_ffn (int) – The dimension of the hidden layer in the feed-forward network.

  • n_heads (int) – The number of heads in multi-head attention.

  • d_k (int) – The dimension of the key and query tensor.

  • d_v (int) – The dimension of the value tensor.

  • dropout (float) – The dropout rate.

  • attn_dropout (float) – The dropout rate for the attention map.

forward(x, src_mask=None, return_attn_weights=False)[source]#

Forward processing of the encoder.

Parameters:
  • x (Tensor) – Input tensor.

  • src_mask (Optional[Tensor]) – Masking tensor for the attention map. The shape should be [batch_size, n_heads, n_steps, n_steps].

  • return_attn_weights (bool) – Whether to return the attention map.

Return type:

Union[Tensor, Tuple[Tensor, list]]

Returns:

  • enc_output – Output tensor.

  • attn_weights_collector – A list containing the attention map from each encoder layer.
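
A minimal usage sketch on a toy multivariate time series; the hyperparameters are arbitrary, and the n_features-dimensional input is projected to d_model inside the encoder as described above:

    import torch
    from pypots.nn.modules.transformer import Encoder

    batch_size, n_steps, n_features = 8, 24, 5
    x = torch.randn(batch_size, n_steps, n_features)

    encoder = Encoder(
        n_layers=2, n_steps=n_steps, n_features=n_features,
        d_model=64, d_ffn=128, n_heads=4, d_k=16, d_v=16,
        dropout=0.1, attn_dropout=0.1,
    )

    enc_output = encoder(x)  # a single tensor when return_attn_weights is False
    enc_output, attn_weights_collector = encoder(x, return_attn_weights=True)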

class pypots.nn.modules.transformer.Decoder(n_layers, n_steps, n_features, d_model, d_ffn, n_heads, d_k, d_v, dropout, attn_dropout)[source]#

Transformer decoder.

Parameters:
  • n_layers (int) – The number of layers in the decoder.

  • n_steps (int) – The number of time steps in the input tensor.

  • n_features (int) – The number of features in the input tensor.

  • d_model (int) – The dimension of the module manipulation space. The input tensor will be projected to a space with d_model dimensions.

  • d_ffn (int) – The dimension of the hidden layer in the feed-forward network.

  • n_heads (int) – The number of heads in multi-head attention.

  • d_k (int) – The dimension of the key and query tensor.

  • d_v (int) – The dimension of the value tensor.

  • dropout (float) – The dropout rate.

  • attn_dropout (float) – The dropout rate for the attention map.

forward(trg_seq, enc_output, trg_mask=None, src_mask=None, return_attn_weights=False)[source]#

Forward processing of the decoder.

Parameters:
  • trg_seq (Tensor) – Input tensor.

  • enc_output (Tensor) – Output tensor from the encoder.

  • trg_mask (Optional[Tensor]) – Masking tensor for the self-attention module.

  • src_mask (Optional[Tensor]) – Masking tensor for the encoding attention module.

  • return_attn_weights (bool) – Whether to return the attention map.

Return type:

Union[Tensor, Tuple[Tensor, list, list]]

Returns:

  • dec_output – Output tensor.

  • dec_slf_attn_collector – A list containing the self-attention map from each decoder layer.

  • dec_enc_attn_collector – A list containing the encoding attention map from each decoder layer.
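
A minimal usage sketch, assuming the target sequence, like the encoder input, has n_features channels and is projected to d_model internally (as the d_model description above suggests); the encoder output here is a random stand-in:

    import torch
    from pypots.nn.modules.transformer import Decoder

    batch_size, n_steps, n_features, d_model = 8, 24, 5, 64
    trg_seq = torch.randn(batch_size, n_steps, n_features)
    enc_output = torch.randn(batch_size, n_steps, d_model)  # stand-in for a real encoder output

    decoder = Decoder(
        n_layers=2, n_steps=n_steps, n_features=n_features,
        d_model=d_model, d_ffn=128, n_heads=4, d_k=16, d_v=16,
        dropout=0.1, attn_dropout=0.1,
    )

    dec_output = decoder(trg_seq, enc_output)
    dec_output, slf_attns, enc_attns = decoder(trg_seq, enc_output, return_attn_weights=True)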