Operations#

The module tsl.ops exposes an API for operations and utilities on spatiotemporal data. It is divided into submodules, one for each operation scope.

Connectivity#

adj_to_edge_index(adj: Union[Tensor, ndarray], backend: Optional[module] = None) → Tuple[Union[Tensor, ndarray], Union[Tensor, ndarray]][source]#

Convert adjacency matrix from dense layout to (edge_index, edge_weight) tuple. The input adjacency matrix is transposed before conversion.

Parameters:

adj (TensArray) – dense adjacency matrix as Tensor or ndarray.
backend (ModuleType, optional) – backend matching the type of adj (either numpy or torch); if None, it is inferred from the type of adj. (default None)

Returns:

(edge_index, edge_weight) tuple of the same type as adj (Tensor or ndarray).

Return type:

tuple
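The conversion described above can be illustrated with a minimal plain-Python sketch. It is not the library implementation: the helper name dense_to_edge_index is hypothetical, nested lists stand in for Tensor/ndarray, and it assumes that zero entries mean "no edge" and that edges are listed in row-major order of the transposed matrix.

```python
def dense_to_edge_index(adj):
    """Hypothetical helper: dense adjacency (list of lists) -> (edge_index, edge_weight)."""
    n_rows, n_cols = len(adj), len(adj[0])
    # Transpose first, mirroring the note that the input is transposed before conversion.
    adj_t = [[adj[i][j] for i in range(n_rows)] for j in range(n_cols)]
    rows, cols, weights = [], [], []
    for r, row in enumerate(adj_t):
        for c, w in enumerate(row):
            if w != 0:  # assumption: a zero entry means the edge is absent
                rows.append(r)
                cols.append(c)
                weights.append(w)
    return [rows, cols], weights


adj = [
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 3.0],
    [1.0, 0.0, 0.0],
]
edge_index, edge_weight = dense_to_edge_index(adj)
print(edge_index)   # [[0, 1, 2], [2, 0, 1]]
print(edge_weight)  # [1.0, 2.0, 3.0]
```

Note how the entry adj[0][1] = 2.0 ends up as the edge (1, 0) because of the transposition.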

reduce_graph(subset: Union[Tensor, List[int]], edge_index: Union[Tensor, SparseTensor, ndarray, coo_matrix, csr_matrix, csc_matrix], num_nodes: Optional[int] = None, backend: Optional[module] = None) → Tuple[Union[Tensor, ndarray], Optional[Union[Tensor, ndarray]]][source]#

Returns the subgraph with all nodes in subset and only the edges between them.

Parameters:

subset – The index of the nodes in the output subgraph.
edge_index – Adjacency matrix as COO edge_index or torch_sparse.SparseTensor.
num_nodes – The number of nodes. (default: None)
backend (ModuleType, optional) – Backend matching the type of edge_index (either numpy or torch); if None, it is inferred from the type of edge_index. (default None)

Returns:

edge_index, edge_mask

Return type:

tuple
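As a rough illustration of the subgraph extraction, the sketch below keeps only the edges whose two endpoints are both in subset and returns them together with a boolean edge mask. The name reduce_graph_sketch and the relabel flag are hypothetical. Whether the library relabels the retained nodes to 0..len(subset)-1 is an assumption here, so it is exposed as an option.

```python
def reduce_graph_sketch(subset, edge_index, relabel=True):
    """Hypothetical helper: keep the edges with both endpoints in `subset`."""
    sources, targets = edge_index
    keep = set(subset)
    # True for every edge that survives the reduction.
    edge_mask = [s in keep and t in keep for s, t in zip(sources, targets)]
    kept = [(s, t) for s, t, m in zip(sources, targets, edge_mask) if m]
    if relabel:
        # Assumption: surviving nodes are renumbered by their position in `subset`.
        new_id = {node: i for i, node in enumerate(subset)}
        kept = [(new_id[s], new_id[t]) for s, t in kept]
    reduced = [[s for s, _ in kept], [t for _, t in kept]]
    return reduced, edge_mask


edge_index = [[0, 0, 1, 2], [1, 2, 2, 0]]
reduced, edge_mask = reduce_graph_sketch([0, 2], edge_index)
print(reduced)    # [[0, 1], [1, 0]]
print(edge_mask)  # [False, True, False, True]
```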

weighted_degree(index: Union[Tensor, ndarray], weights: Optional[Union[Tensor, ndarray]] = None, num_nodes: Optional[int] = None) → Union[Tensor, ndarray][source]#

Computes the weighted degree of a given one-dimensional index tensor.

Parameters:

index (LongTensor) – Index tensor.
weights (Tensor) – Edge weights tensor.
num_nodes (int, optional) – The number of nodes, i.e. max_val + 1 of index. (default: None)
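A minimal plain-Python sketch of the computation, assuming that the weighted degree of a node is the sum of the weights at the positions where the node appears in index, and that missing weights default to 1 (both are assumptions, and weighted_degree_sketch is a hypothetical name):

```python
def weighted_degree_sketch(index, weights=None, num_nodes=None):
    """Hypothetical helper: sum the weights of the entries pointing at each node."""
    if num_nodes is None:
        # Mirrors the documented default: max value of `index` plus one.
        num_nodes = max(index, default=-1) + 1
    if weights is None:
        weights = [1.0] * len(index)  # assumption: unweighted edges count as 1
    degree = [0.0] * num_nodes
    for node, w in zip(index, weights):
        degree[node] += w
    return degree


deg = weighted_degree_sketch([0, 0, 2], weights=[0.5, 1.5, 2.0])
print(deg)  # [2.0, 0.0, 2.0]
```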

asymmetric_norm(edge_index: Union[Tensor, SparseTensor, ndarray, coo_matrix, csr_matrix, csc_matrix], edge_weight: Optional[Union[Tensor, ndarray]] = None, dim: int = 0, num_nodes: Optional[int] = None, add_self_loops: bool = False) → Tuple[Union[Tensor, SparseTensor, ndarray, coo_matrix, csr_matrix, csc_matrix], Optional[Union[Tensor, ndarray]]][source]#

Normalize edge weights across dimension dim.

\[e_{i,j} = \frac{e_{i,j}}{d_{i,j}}, \qquad d_{i,j} = \begin{cases} deg_{i} & \text{if dim} = 0 \\ deg_{j} & \text{otherwise} \end{cases}\]

Parameters:

edge_index (LongTensor) – Edge index tensor.
edge_weight (Tensor) – Edge weights tensor.
dim (int) – Dimension over which to compute normalization.
num_nodes (int, optional) – The number of nodes, i.e. max_val + 1 of edge_index. (default: None)
add_self_loops – Whether to add self loops to the adjacency matrix.
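The normalization formula can be sketched in plain Python as follows. This is an illustration only: asymmetric_norm_sketch is a hypothetical name, the degree is taken as the sum of edge weights grouped by row dim of edge_index, missing weights are assumed to be 1, and add_self_loops is not covered.

```python
def asymmetric_norm_sketch(edge_index, edge_weight=None, dim=0, num_nodes=None):
    """Hypothetical helper: divide each edge weight by the degree of its endpoint at `dim`."""
    if edge_weight is None:
        edge_weight = [1.0] * len(edge_index[0])  # assumption: unweighted edges count as 1
    if num_nodes is None:
        num_nodes = max(edge_index[0] + edge_index[1], default=-1) + 1
    index = edge_index[dim]
    degree = [0.0] * num_nodes
    for node, w in zip(index, edge_weight):
        degree[node] += w
    # Guard against zero degrees to avoid a division by zero.
    return [w / degree[node] if degree[node] != 0 else 0.0
            for node, w in zip(index, edge_weight)]


edge_index = [[0, 0, 1], [1, 2, 2]]
edge_weight = [1.0, 3.0, 2.0]
norm_dim0 = asymmetric_norm_sketch(edge_index, edge_weight, dim=0)
norm_dim1 = asymmetric_norm_sketch(edge_index, edge_weight, dim=1)
print(norm_dim0)  # [0.25, 0.75, 1.0]
```

With dim=0 the weights of the edges sharing the same first endpoint sum to one; with dim=1 the same holds for edges sharing the second endpoint.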

power_series(edge_index: Union[Tensor, ndarray], edge_weights: Optional[Union[Tensor, ndarray]] = None, k: int = 2, num_nodes: Optional[int] = None) → Tuple[Union[Tensor, ndarray], Union[Tensor, ndarray]][source]#

Compute order \(k\) power series of sparse adjacency matrix (\(A^k\)).

Parameters:

edge_index (LongTensor) – Edge index tensor.
edge_weights (Tensor) – Edge weights tensor.
k (int) – Order of power series.
num_nodes (int, optional) – The number of nodes, i.e. max_val + 1 of edge_index. (default: None)
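Reading the description as computing \(A^k\), the sketch below multiplies a sparse adjacency stored as nested dictionaries by itself k - 1 times. It is a plain-Python illustration under that reading, not the library code: sparse_power_sketch is a hypothetical name, k >= 1 is assumed, and the output edges are sorted for readability.

```python
from collections import defaultdict


def sparse_power_sketch(edge_index, edge_weights=None, k=2):
    """Hypothetical helper: entries of A^k for a sparse matrix given as an edge list."""
    rows, cols = edge_index
    if edge_weights is None:
        edge_weights = [1.0] * len(rows)
    # Store A as {row: {col: weight}}, summing duplicate entries.
    adj = defaultdict(dict)
    for r, c, w in zip(rows, cols, edge_weights):
        adj[r][c] = adj[r].get(c, 0.0) + w
    power = {r: dict(entries) for r, entries in adj.items()}  # A^1
    for _ in range(k - 1):
        nxt = defaultdict(dict)
        for r, entries in power.items():
            for m, w_rm in entries.items():
                # Extend every path r -> m by one more hop m -> c.
                for c, w_mc in adj.get(m, {}).items():
                    nxt[r][c] = nxt[r].get(c, 0.0) + w_rm * w_mc
        power = nxt
    triples = sorted((r, c, w) for r, entries in power.items() for c, w in entries.items())
    new_index = [[r for r, _, _ in triples], [c for _, c, _ in triples]]
    return new_index, [w for _, _, w in triples]


# Path graph 0 -> 1 -> 2 with weights 2 and 3: the only two-hop path is 0 -> 2.
idx2, w2 = sparse_power_sketch([[0, 1], [1, 2]], [2.0, 3.0], k=2)
print(idx2, w2)  # [[0], [2]] [6.0]
```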

get_dummy_edge_index(dummy: str, num_nodes: int, edge_prob: float = 0.1, directed: bool = True, device=None)[source]#

Create an edge index corresponding to a certain dummy connectivity (e.g., full graph).

Parameters:

dummy (str) – The dummy connectivity, can be one of "identity" (A = I), "full" (A = np.ones((N, N))), "random" or "none" (returns None).
num_nodes (int) – Number of nodes in the graph.
edge_prob (float) – Edge probability for the random graph. (default 0.1)
directed (bool) – Whether to generate a directed/undirected graph. (default True)
device (optional) – Device for the created tensor. (default None)
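For intuition, the deterministic options can be sketched in plain Python as below. The helper dummy_edge_index_sketch is hypothetical and uses lists instead of tensors. The "random" option is left out because the sampling procedure is not specified here.

```python
def dummy_edge_index_sketch(dummy, num_nodes):
    """Hypothetical helper covering the "identity", "full" and "none" options."""
    if dummy == "none":
        return None
    if dummy == "identity":
        # A = I: every node is connected only to itself.
        nodes = list(range(num_nodes))
        return [nodes, list(nodes)]
    if dummy == "full":
        # A = ones((N, N)): every ordered pair (i, j), self-loops included.
        rows = [i for i in range(num_nodes) for _ in range(num_nodes)]
        cols = [j for _ in range(num_nodes) for j in range(num_nodes)]
        return [rows, cols]
    raise ValueError(f"Unsupported dummy connectivity in this sketch: {dummy!r}")


print(dummy_edge_index_sketch("identity", 3))  # [[0, 1, 2], [0, 1, 2]]
print(dummy_edge_index_sketch("full", 2))      # [[0, 0, 1, 1], [0, 1, 0, 1]]
```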

FrameArray#

aggregate(x: Union[DataFrame, ndarray], index: Union[List, Tuple, Tensor, ndarray], aggr_fn: Callable = <function sum>, axis: int = 1, level: int = 0) → Union[DataFrame, ndarray][source]#

Aggregate rows/columns in (MultiIndexed) DataFrame according to a new index.

Parameters:

x (pd.DataFrame) – DataFrame to be aggregated.
index (Index) – A sequence of cluster ids with the same length as the index over which aggregation is performed. The i-th element of the index at axis and level is mapped to the index[i]-th position in the new index.
aggr_fn (Callable) – Function to be used for aggregation.
axis (int) – Axis over which to perform aggregation, 0 for index, 1 for columns. (default 1)
level (int) – Level over which to perform aggregation if axis is a MultiIndex. (default 0)
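The mapping from old columns to clusters can be illustrated with a small plain-Python sketch. It works on nested lists rather than DataFrames, handles only the column axis, and ignores MultiIndex levels. The name aggregate_columns_sketch is hypothetical.

```python
def aggregate_columns_sketch(rows, cluster_ids, aggr_fn=sum):
    """Hypothetical helper: merge the columns that share the same cluster id."""
    clusters = sorted(set(cluster_ids))
    aggregated = []
    for row in rows:
        aggregated.append([
            # Collect the values of all columns assigned to cluster `k`.
            aggr_fn([v for v, c in zip(row, cluster_ids) if c == k])
            for k in clusters
        ])
    return clusters, aggregated


rows = [[1, 2, 3],
        [4, 5, 6]]
# Columns 0 and 2 go to cluster 0, column 1 goes to cluster 1.
clusters, aggregated = aggregate_columns_sketch(rows, [0, 1, 0])
print(clusters)    # [0, 1]
print(aggregated)  # [[4, 2], [10, 5]]
```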

temporal_mean(x: Union[DataFrame, ndarray], index: Optional[DatetimeIndex] = None) → Union[DataFrame, ndarray][source]#

Compute the mean values for each row.

The mean is first computed per hour within each week of the year. NaN values that remain are imputed using the hourly mean over the same month across the years. If NaN values are still present, they are replaced with the mean for the hour of the day alone. Any remaining missing values are filled with ffill and bfill.

Parameters:

x (np.array | pd.DataFrame) – Array-like with missing values.
index (pd.DatetimeIndex, optional) – Temporal index if x is not a pandas.DataFrame with a temporal index. Must have the same length as x. (default None)

get_trend(df, period='week', train_len=None, valid_mask=None)[source]#

Perform detrending on a time series by subtracting from each value of the input dataframe the average value computed over the training dataset for each hour/weekday.

Parameters:

df – dataframe
period – period of the trend (‘day’, ‘week’, ‘month’)
train_len – train length

Returns:

the detrended dataset and the trend values

Return type:

tuple

normalize(x: Union[DataFrame, ndarray], by: Optional[Any] = None, axis: int = 0, level: int = 0)[source]#

Normalize input ndarray or DataFrame using mean and standard deviation. If x is a DataFrame, normalization can be done on a specific group.

Parameters:

x (FrameArray) – the FrameArray to be normalized.
by – the conditions used to determine the groups for the groupby(). (default None)
axis (int) – axis for the function to be applied on. (default 0)
level (int) – level of axis for the function to be applied on (for MultiIndexed DataFrames). (default 0)

Returns:

the normalized FrameArray

Return type:

FrameArray

Imputation#

prediction_dataframe(y, index, columns=None, aggregate_by='mean')[source]#

Aggregate batched predictions in a single DataFrame.

Parameters:

y (list or np.ndarray) – The list of predictions.
index (list or np.ndarray) – The list of time indexes coupled with the predictions.
columns (list or pd.Index) – The columns of the returned DataFrame.
aggregate_by (str or list) –
How to aggregate the predictions when there is more than one for a step.
- mean: take the mean of the predictions;
- central: take the prediction at the central position, assuming that the predictions are ordered chronologically;
- smooth_central: average the predictions weighted by a Gaussian signal with std=1.

Returns:

The DataFrame of aggregated predictions.

Return type:

pd.DataFrame
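The three aggregate_by options can be compared on the predictions available for a single time step. The sketch below is a plain-Python illustration with a hypothetical name. It assumes that "central" picks position len // 2 and that the Gaussian weights are centered on the middle of the chronologically ordered list.

```python
import math


def aggregate_predictions_sketch(preds, aggregate_by="mean"):
    """Hypothetical helper: combine the overlapping predictions for one time step."""
    n = len(preds)
    if aggregate_by == "mean":
        return sum(preds) / n
    if aggregate_by == "central":
        return preds[n // 2]  # assumption: middle position of the ordered list
    if aggregate_by == "smooth_central":
        center = (n - 1) / 2
        # Gaussian weights with std=1, largest at the central position.
        weights = [math.exp(-0.5 * (i - center) ** 2) for i in range(n)]
        return sum(w * p for w, p in zip(weights, preds)) / sum(weights)
    raise ValueError(f"Unknown aggregation: {aggregate_by!r}")


preds = [1.0, 2.0, 6.0]  # three predictions for the same step, oldest first
mean_value = aggregate_predictions_sketch(preds, "mean")
central_value = aggregate_predictions_sketch(preds, "central")
smooth_value = aggregate_predictions_sketch(preds, "smooth_central")
print(mean_value, central_value)  # 3.0 2.0
```

The smooth_central result lies between the central prediction and the plain mean, since the outer predictions still contribute but with smaller weights.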

Pattern#

check_pattern(pattern: str, split: bool = False, ndim: Optional[int] = None, include_batch: bool = False) → Union[str, list][source]#

Check that pattern is allowed. A pattern is a string of tokens interleaved with blank spaces, where each token specifies what an axis in a tensor refers to. The supported tokens are:

‘t’, for the time dimension
‘n’, for the node dimension
‘e’, for the edge dimension
‘f’ or ‘c’, for the feature/channel dimension (‘c’ token is automatically converted to ‘f’)

In order to be valid, a pattern must have:

at most one ‘t’ dimension, as the first token;
at most two (consecutive) ‘n’ dimensions, right after the ‘t’ token or at the beginning of the pattern;
at most one ‘e’ dimension, either as the first token or after a ‘t’;
either ‘n’ or ‘e’ dimensions, but not both together;
all further tokens must be ‘c’ or ‘f’.

Parameters:

pattern (str) –
The input pattern, specifying with a token what an axis in a tensor refers to. The supported tokens are:
- ‘t’, for the time dimension
- ‘n’, for the node dimension
- ‘e’, for the edge dimension
- ‘f’ or ‘c’, for the feature/channel dimension (‘c’ token is automatically converted to ‘f’)
split (bool) – If True, then return an ordered list of the tokens in the sanitized pattern. (default: False)
ndim (int, optional) – If it is not None, then check that pattern has ndim tokens. (default: None)
include_batch (bool) – If True, then allow the token ‘b’. (default: False)

Returns:

The sanitized pattern as a string, or a list of the tokens in the pattern.

Return type:

str or list
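The validity rules listed above can be expressed compactly with a regular expression over the concatenated tokens. The sketch below is an independent illustration, not the library implementation: check_pattern_sketch is a hypothetical name, the include_batch option is not covered, and raising ValueError on invalid input is an assumption.

```python
import re

# Optional 't', then up to two 'n' or a single 'e', then any number of 'f'.
_PATTERN_RE = re.compile(r"t?(n{1,2}|e)?f*")


def check_pattern_sketch(pattern, split=False, ndim=None):
    """Hypothetical helper: validate and sanitize a pattern such as "t n c"."""
    # 'c' is an alias of 'f', so normalize it before checking.
    tokens = ["f" if tok == "c" else tok for tok in pattern.split()]
    single_chars = all(len(tok) == 1 for tok in tokens)
    if not single_chars or not _PATTERN_RE.fullmatch("".join(tokens)):
        raise ValueError(f"Pattern {pattern!r} is not allowed.")
    if ndim is not None and len(tokens) != ndim:
        raise ValueError(f"Pattern {pattern!r} must have {ndim} tokens.")
    return tokens if split else " ".join(tokens)


print(check_pattern_sketch("t n c"))            # t n f
print(check_pattern_sketch("n n", split=True))  # ['n', 'n']
```

Patterns such as "n t f" (time not first) or "t n e" (nodes and edges together) are rejected by this sketch.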

AZ-Test#

class AZWhitenessTestResult(statistic, pvalue)#

pvalue#: Alias for field number 1

statistic#: Alias for field number 0

class AZWhitenessMultiTestResult(statistic, pvalue, componentwise_tests)#

componentwise_tests#: Alias for field number 2

pvalue#: Alias for field number 1

statistic#: Alias for field number 0

az_whiteness_test(x: Union[Tensor, ndarray], edge_index: Union[Tensor, ndarray], mask: Optional[Union[Tensor, ndarray]] = None, pattern: str = 't n f', edge_weight: Optional[Union[Tensor, ndarray, float]] = None, edge_weight_temporal: Optional[float] = None, lamb: float = 0.5, multivariate: bool = False, remove_median: bool = False) → Union[AZWhitenessTestResult, AZWhitenessMultiTestResult][source]#

Implementation of the AZ-whiteness test from the paper “AZ-whiteness test: a test for uncorrelated noise on spatio-temporal graphs” (D. Zambon and C. Alippi, NeurIPS 2022).

Parameters:

x (TensArray) – graph signal, typically with pattern “t n f” and representing the prediction residuals.
edge_index (TensArray) – indices of the spatial edges with shape (2, E). Current implementation supports only a static topology.
mask (TensArray, optional) – boolean mask of signal x, with the same size as x. The mask is True where the observations in x are valid and False otherwise. (default: None)
pattern (str) – string encoding the index pattern of x, typically “t n f” representing time, nodes and node features dimensions, respectively. (default: "t n f")
edge_weight (TensArray or float, optional) – positive weights of the spatial edges. It can be a TensArray of shape (E,), or a scalar value (same weight for all edges). (default: None)
edge_weight_temporal (float, optional) – positive scalar weight for all temporal edges. If None or "auto", the weight is computed to balance the contribution of the spatial and temporal components (see Zambon and Alippi, 2022). (default: None)
lamb (float, optional) – scalar factor between \(0.0\) and \(1.0\) defining a convex combination of the spatial and temporal components; if lamb == 1.0 the test is applied on the spatial topology only, while for lamb == 0.0 only the serial correlation is considered. (default: 0.5)
multivariate (bool) – whether to run a single test on a multivariate signal or combine multiple scalar tests, one for each of the f features. It applies only when f > 1. (default: False)
remove_median (bool) – whether to manually enforce, where possible, the assumption of null median. (default: False)

Returns:

The test statistics.

Return type:

AZWhitenessTestResult or AZWhitenessMultiTestResult