Preprocessing#

The module tsl.data.preprocessing exposes an API for preprocessing spatiotemporal data.

Scalers#

`Scaler`	Base class for linear `SpatioTemporalDataset` scalers.
`StandardScaler`	Apply standardization to data by removing mean and scaling to unit variance.
`MinMaxScaler`	Rescale data such that all values lie in the specified range (default is \([0,1]\)).
`RobustScaler`	Removes the median and scales the data according to the quantile range.
`ScalerModule`	Converts a `Scaler` to a `torch.nn.Module`, to insert transformation parameters and functions into the minibatch.

class Scaler(bias=0.0, scale=1.0)[source]#

Base class for linear SpatioTemporalDataset scalers.

A linear scaler applies a linear transformation to the input, parameterized by a bias \(\mu\) and a scale \(\sigma\):

\[f(x) = (x - \mu) / \sigma.\]

Parameters:

bias (float) – the offset of the linear transformation. (default: 0.)
scale (float) – the scale of the linear transformation. (default: 1.)
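
The forward and inverse maps can be illustrated with plain NumPy. This is a minimal sketch of the formula only; the helper names `linear_transform` and `linear_inverse_transform` are hypothetical and are not part of tsl.

```python
import numpy as np


# Hypothetical helpers mirroring the documented formula; not tsl's implementation.
def linear_transform(x, bias=0.0, scale=1.0):
    """Forward map f(x) = (x - bias) / scale."""
    return (x - bias) / scale


def linear_inverse_transform(y, bias=0.0, scale=1.0):
    """Inverse map x = y * scale + bias."""
    return y * scale + bias


x = np.array([10.0, 20.0, 30.0])
y = linear_transform(x, bias=20.0, scale=10.0)
x_back = linear_inverse_transform(y, bias=20.0, scale=10.0)
```

With the default values bias = 0 and scale = 1, both maps reduce to the identity.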

params() → dict[source]#

Dictionary of the scaler parameters bias and scale.

Returns: The scaler’s parameters bias and scale.
Return type: dict

numpy(inplace=True)[source]#: Transform all tensors to numpy arrays.

transform(x: Union[Tensor, ndarray])[source]#: Apply transformation \(f(x) = (x - \mu) / \sigma\).

inverse_transform(x: Union[Tensor, ndarray])[source]#: Apply inverse transformation \(f^{-1}(x) = (x \cdot \sigma) + \mu\).

fit_transform(x: Union[Tensor, ndarray], *args, **kwargs)[source]#: Fit scaler’s parameters using input x and then transform x.

save(filename: str, make_dir: bool = True) → str[source]#

Save the scaler to disk.

Parameters:

filename (str) – The path to the filename for storage.
make_dir (bool) – If True, then create non-existing directories in filename. (default: True)

Returns:

The absolute path to the saved file.

Return type:

str

classmethod load(filename: str) → Scaler[source]#

Load instance of this type of scaler from disk.

Parameters: filename (str) – The path to the scaler file.
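
The save/load contract described above (create missing directories, return the absolute path, restore the object from that path) might look like the following sketch. The on-disk format used by tsl is not stated here, so `pickle` is an assumption, and `save_object` and `load_object` are hypothetical stand-ins rather than the library's methods.

```python
import os
import pickle
import tempfile


# Hypothetical stand-ins for save/load; the pickle format is an assumption.
def save_object(obj, filename, make_dir=True):
    path = os.path.abspath(filename)
    if make_dir:
        # Create any missing parent directories of the target file.
        os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as fp:
        pickle.dump(obj, fp)
    return path  # absolute path to the saved file


def load_object(filename):
    with open(filename, "rb") as fp:
        return pickle.load(fp)


params = {"bias": 20.0, "scale": 10.0}
with tempfile.TemporaryDirectory() as tmp:
    saved_path = save_object(params, os.path.join(tmp, "scalers", "demo.pkl"))
    restored = load_object(saved_path)
```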

class StandardScaler(axis: Union[int, Tuple] = 0)[source]#

Apply standardization to data by removing mean and scaling to unit variance.

Parameters: axis (int or tuple) – dimension(s) of the input along which the parameters are fitted. (default: 0)
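
Standardization along an axis can be sketched in NumPy as follows. The helper `fit_standard` is hypothetical; the use of the population standard deviation and of `keepdims=True` are assumptions, and tsl's handling of zero-variance features is not covered.

```python
import numpy as np


# Hypothetical helper; assumes population std (ddof=0) and kept dimensions.
def fit_standard(x, axis=0):
    bias = x.mean(axis=axis, keepdims=True)   # per-feature mean
    scale = x.std(axis=axis, keepdims=True)   # per-feature standard deviation
    return bias, scale


x = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
bias, scale = fit_standard(x, axis=0)
z = (x - bias) / scale  # zero mean and unit variance along axis 0
```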

class MinMaxScaler(axis: Union[int, Tuple] = 0, out_range: Tuple[float, float] = (0.0, 1.0))[source]#

Rescale data such that all values lie in the specified range (default is \([0,1]\)).

Parameters:

axis (int or tuple) – dimension(s) of the input along which the parameters are fitted. (default: 0)
out_range (tuple) – output range of transformed data. (default: (0, 1))
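
Min-max rescaling can be written in the linear form \(f(x) = (x - \mu) / \sigma\) used by the base class. The sketch below shows one way to derive bias and scale for a target range \((a, b)\): scale is \((x_{\max} - x_{\min}) / (b - a)\) and bias is \(x_{\min} - a \cdot \text{scale}\). The helper `fit_min_max` is hypothetical and this derivation is an assumption about how the mapping could be expressed, not a copy of tsl's code.

```python
import numpy as np


# Hypothetical helper; expresses min-max scaling as (x - bias) / scale.
def fit_min_max(x, axis=0, out_range=(0.0, 1.0)):
    out_min, out_max = out_range
    x_min = x.min(axis=axis, keepdims=True)
    x_max = x.max(axis=axis, keepdims=True)
    scale = (x_max - x_min) / (out_max - out_min)
    bias = x_min - out_min * scale
    return bias, scale


x = np.array([[2.0], [4.0], [6.0]])
bias, scale = fit_min_max(x, out_range=(-1.0, 1.0))
y = (x - bias) / scale  # minimum maps to -1, maximum maps to 1
```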

class RobustScaler(axis: Union[int, Tuple] = 0, quantile_range: Tuple[float, float] = (25.0, 75.0), unit_variance: bool = False)[source]#

Removes the median and scales the data according to the quantile range.

The default range is the interquartile range (IQR), i.e., the range between the 1st quartile (25th percentile) and the 3rd quartile (75th percentile).

Parameters:

axis (int or tuple) – dimension(s) of the input along which the parameters are fitted. (default: 0)
quantile_range (tuple) – quantile range \((q_{\min}, q_{\max})\), with \(0.0 < q_{\min} < q_{\max} < 100.0\), used to calculate scale. (default: (25.0, 75.0))
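
A minimal NumPy sketch of median and quantile-range scaling follows. The helper `fit_robust` is hypothetical, NumPy's default linear interpolation for percentiles is an assumption, and the `unit_variance` option is not covered.

```python
import numpy as np


# Hypothetical helper; unit_variance is intentionally not handled here.
def fit_robust(x, axis=0, quantile_range=(25.0, 75.0)):
    q_min, q_max = quantile_range
    bias = np.median(x, axis=axis, keepdims=True)
    low, high = np.percentile(x, [q_min, q_max], axis=axis, keepdims=True)
    scale = high - low  # width of the quantile range
    return bias, scale


# The outlier (100) barely affects the fitted parameters.
x = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])
bias, scale = fit_robust(x)
y = (x - bias) / scale
```

Because the median and the quantile range ignore the magnitude of extreme values, the fitted parameters stay close to those of the bulk of the data.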

class ScalerModule(scaler: Optional[Union[Scaler, ScalerModule]] = None, *, bias: Union[Tensor, float] = 0.0, scale: Union[Tensor, float] = 1.0, pattern: Optional[str] = None)[source]#

Converts a Scaler to a torch.nn.Module, to insert transformation parameters and functions into the minibatch.

extra_repr() → str[source]#

Set the extra representation of the module.

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

params() → dict[source]#

Dictionary of the scaler parameters bias and scale.

Returns: The scaler’s parameters bias and scale.
Return type: dict

property t: int#: Size of the temporal dimension (None if time-invariant).

property n: int#: Size of the node dimension (None if node-invariant).

transform_tensor(x: Tensor) → Tensor[source]#: Apply transformation \(f(x) = (x - \mu) / \sigma\) to tensor x.

inverse_transform_tensor(x: Tensor) → Tensor[source]#: Apply inverse transformation \(f^{-1}(x) = (x \cdot \sigma) + \mu\) to tensor x.

transform(x)[source]#: Recursively apply transformation \(f(x) = (x - \mu) / \sigma\) to x.

inverse_transform(x)[source]#: Recursively apply inverse transformation \(f^{-1}(x) = (x \cdot \sigma) + \mu\) to x.
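
Recursive application over nested inputs can be sketched as below. Which container types tsl actually traverses is not stated here; handling of dict, list, and tuple is an assumption, the helper `recursive_apply` is hypothetical, and NumPy arrays stand in for tensors.

```python
import numpy as np


# Hypothetical helper; the set of supported containers is an assumption.
def recursive_apply(x, fn):
    if isinstance(x, dict):
        return {key: recursive_apply(value, fn) for key, value in x.items()}
    if isinstance(x, (list, tuple)):
        return type(x)(recursive_apply(value, fn) for value in x)
    return fn(x)  # leaf: apply the transformation directly


batch = {"x": np.array([10.0, 30.0]), "aux": [np.array([20.0])]}
out = recursive_apply(batch, lambda a: (a - 20.0) / 10.0)
```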

numpy()[source]#: Transform ScalerModule to Scaler.

rearrange(pattern: str, inplace=False, **axes_lengths) → ScalerModule[source]#: Rearrange parameters in the scaler according to the provided pattern using einops.rearrange.

slice(time_index: Optional[Union[List, Tensor]] = None, node_index: Optional[Union[List, Tensor]] = None)[source]#

Slice the parameters of the scaler with the given time and node indices.

The scaler must have a pattern defining the dimensions of the parameters. This operation is not in place: it always returns a new ScalerModule. Along each slicing axis, the parameters of the new scaler have the same size as the provided indices, or size 1 for parameters with a single, broadcastable value.
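
The slicing rule can be illustrated on a single parameter array with NumPy. This is a sketch under assumptions: the pattern is taken to be a space-separated string such as "t n f" in which "t" names the time axis and "n" the node axis, and the helper `slice_param` is hypothetical.

```python
import numpy as np


# Hypothetical helper; the "t n f" pattern tokens are an assumption.
def slice_param(param, pattern, time_index=None, node_index=None):
    dims = pattern.split()
    for name, index in (("t", time_index), ("n", node_index)):
        if index is None or name not in dims:
            continue
        axis = dims.index(name)
        # Axes of size 1 hold a single broadcastable value and are left as-is.
        if param.shape[axis] > 1:
            param = np.take(param, index, axis=axis)  # returns a new array
    return param


# Time-invariant bias: shape (1, 4, 3) for pattern "t n f".
bias = np.arange(12.0).reshape(1, 4, 3)
sliced = slice_param(bias, "t n f", time_index=[0, 1], node_index=[0, 2])
```

Here the time axis keeps size 1 because the parameter is time-invariant, while the node axis shrinks to the two requested nodes; the original array is left unchanged.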