Operations#
The module tsl.ops exposes an API for operations and utilities on spatiotemporal data. It is divided into submodules, one for each operation scope.
Connectivity#
- adj_to_edge_index(adj: Union[Tensor, ndarray], backend: Optional[module] = None) Tuple[Union[Tensor, ndarray], Union[Tensor, ndarray]] [source]#
Convert an adjacency matrix from dense layout to an (edge_index, edge_weight) tuple. The input adjacency matrix is transposed before conversion.
- Parameters:
- Returns:
- Return type:
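The conversion described for adj_to_edge_index can be illustrated with a minimal NumPy-only sketch. The helper name dense_to_edge_index is hypothetical, and treating zero entries as missing edges is an assumption; this is not the tsl implementation.

```python
import numpy as np

def dense_to_edge_index(adj):
    """Hypothetical helper: dense adjacency -> (edge_index, edge_weight)."""
    adj_t = np.asarray(adj).T            # transpose first, as the description states
    rows, cols = np.nonzero(adj_t)       # assumption: zeros mean "no edge"
    edge_index = np.stack([rows, cols])  # shape (2, E)
    edge_weight = adj_t[rows, cols]      # shape (E,)
    return edge_index, edge_weight

adj = np.array([[0.0, 2.0],
                [0.0, 0.0]])
edge_index, edge_weight = dense_to_edge_index(adj)
# edge_index -> [[1], [0]], edge_weight -> [2.0]
```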
- reduce_graph(subset: Union[Tensor, List[int]], edge_index: Union[Tensor, SparseTensor, ndarray, coo_matrix, csr_matrix, csc_matrix], num_nodes: Optional[int] = None, backend: Optional[module] = None) Tuple[Union[Tensor, ndarray], Optional[Union[Tensor, ndarray]]] [source]#
Returns the subgraph with all nodes in subset and only the edges between them.
- Parameters:
subset – The index of the nodes in the output subgraph.
edge_index – Adjacency matrix as COO edge_index or torch_sparse.SparseTensor.
num_nodes – The number of nodes. (default: None)
backend (ModuleType, optional) – Backend matching the edge_index type (either numpy or torch); if None, it is inferred from the edge_index type. (default: None)
- Returns:
edge_index, edge_mask
- Return type:
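As a rough illustration of what reduce_graph returns, the sketch below keeps only the edges whose two endpoints are in subset and also returns the boolean edge mask. The function name reduce_graph_sketch is hypothetical, and relabelling the kept nodes to 0..len(subset)-1 is an assumption that the text above does not confirm.

```python
import numpy as np

def reduce_graph_sketch(subset, edge_index, num_nodes=None):
    """Hypothetical NumPy-only sketch of subgraph extraction."""
    subset = np.asarray(subset)
    edge_index = np.asarray(edge_index)
    if num_nodes is None:
        num_nodes = int(max(edge_index.max(initial=-1),
                            subset.max(initial=-1))) + 1
    node_mask = np.zeros(num_nodes, dtype=bool)
    node_mask[subset] = True
    # An edge survives only if both of its endpoints are in the subset.
    edge_mask = node_mask[edge_index[0]] & node_mask[edge_index[1]]
    # Assumption: kept nodes are relabelled following the order of `subset`.
    new_ids = np.full(num_nodes, -1)
    new_ids[subset] = np.arange(subset.size)
    return new_ids[edge_index[:, edge_mask]], edge_mask

ring = np.array([[0, 1, 2, 3],
                 [1, 2, 3, 0]])
sub_edge_index, edge_mask = reduce_graph_sketch([1, 2, 3], ring)
# sub_edge_index -> [[0, 1], [1, 2]], edge_mask -> [False, True, True, False]
```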
- weighted_degree(index: Union[Tensor, ndarray], weights: Optional[Union[Tensor, ndarray]] = None, num_nodes: Optional[int] = None) Union[Tensor, ndarray] [source]#
Computes the weighted degree of a given one-dimensional index tensor.
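A weighted degree over a one-dimensional index can be sketched with np.bincount. The name weighted_degree_sketch is hypothetical, and counting occurrences when no weights are given is an assumption rather than documented behaviour.

```python
import numpy as np

def weighted_degree_sketch(index, weights=None, num_nodes=None):
    """Hypothetical sketch: sum the weights of the entries pointing at each node."""
    index = np.asarray(index)
    if num_nodes is None:
        num_nodes = int(index.max()) + 1
    # With weights=None, bincount simply counts occurrences (assumed behaviour).
    return np.bincount(index, weights=weights, minlength=num_nodes)

index = np.array([0, 0, 2])
deg = weighted_degree_sketch(index, weights=np.array([0.5, 1.5, 3.0]), num_nodes=4)
# deg -> [2.0, 0.0, 3.0, 0.0]
```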
- asymmetric_norm(edge_index: Union[Tensor, SparseTensor, ndarray, coo_matrix, csr_matrix, csc_matrix], edge_weight: Optional[Union[Tensor, ndarray]] = None, dim: int = 0, num_nodes: Optional[int] = None, add_self_loops: bool = False) Tuple[Union[Tensor, SparseTensor, ndarray, coo_matrix, csr_matrix, csc_matrix], Optional[Union[Tensor, ndarray]]] [source]#
Normalize edge weights across dimension dim.
\[e_{i,j} = \frac{e_{i,j}}{deg_{i}\ \text{if dim=0 else}\ deg_{j}}\]
- Parameters:
edge_index (LongTensor) – Edge index tensor.
edge_weight (Tensor) – Edge weights tensor.
dim (int) – Dimension over which to compute normalization.
num_nodes (int, optional) – The number of nodes, i.e. max_val + 1 of index. (default: None)
add_self_loops – Whether to add self loops to the adjacency matrix.
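The normalization formula can be checked numerically with a small NumPy sketch. It assumes that an edge (i, j) stores i in edge_index[0] and j in edge_index[1], and that the degree is the sum of the weights sharing the same index along dim. The name asymmetric_norm_sketch is hypothetical and self loops are not handled.

```python
import numpy as np

def asymmetric_norm_sketch(edge_index, edge_weight=None, dim=0, num_nodes=None):
    """Hypothetical sketch: divide each edge weight by the degree along `dim`."""
    edge_index = np.asarray(edge_index)
    if edge_weight is None:
        edge_weight = np.ones(edge_index.shape[1])
    if num_nodes is None:
        num_nodes = int(edge_index.max()) + 1
    index = edge_index[dim]
    # Weighted degree of every node appearing along the chosen dimension.
    deg = np.bincount(index, weights=edge_weight, minlength=num_nodes)
    return edge_index, edge_weight / deg[index]

edge_index = np.array([[0, 0, 1],
                       [1, 2, 2]])
edge_weight = np.array([1.0, 3.0, 2.0])
_, w_dim0 = asymmetric_norm_sketch(edge_index, edge_weight, dim=0)
_, w_dim1 = asymmetric_norm_sketch(edge_index, edge_weight, dim=1)
# w_dim0 -> [0.25, 0.75, 1.0]; w_dim1 -> [1.0, 0.6, 0.4]
```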
- power_series(edge_index: Union[Tensor, ndarray], edge_weights: Optional[Union[Tensor, ndarray]] = None, k: int = 2, num_nodes: Optional[int] = None) Tuple[Union[Tensor, ndarray], Union[Tensor, ndarray]] [source]#
Compute order \(k\) power series of sparse adjacency matrix (\(A^k\)).
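Reading the description as the single matrix power \(A^k\), a dense NumPy sketch looks as follows. The name power_series_sketch is hypothetical, the dense intermediate matrix is only for illustration, and placing the weight of edge (i, j) at A[i, j] is an assumption.

```python
import numpy as np

def power_series_sketch(edge_index, edge_weights=None, k=2, num_nodes=None):
    """Hypothetical sketch: k-th power of the adjacency, returned in COO form."""
    edge_index = np.asarray(edge_index)
    if edge_weights is None:
        edge_weights = np.ones(edge_index.shape[1])
    if num_nodes is None:
        num_nodes = int(edge_index.max()) + 1
    adj = np.zeros((num_nodes, num_nodes))
    # np.add.at accumulates the weights of duplicated edges.
    np.add.at(adj, (edge_index[0], edge_index[1]), edge_weights)
    adj_k = np.linalg.matrix_power(adj, k)
    rows, cols = np.nonzero(adj_k)
    return np.stack([rows, cols]), adj_k[rows, cols]

# Path graph 0 -> 1 -> 2 with weights 2 and 3: A^2 has the single entry (0, 2) = 6.
path = np.array([[0, 1],
                 [1, 2]])
ei_k, ew_k = power_series_sketch(path, np.array([2.0, 3.0]), k=2)
```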
- get_dummy_edge_index(dummy: str, num_nodes: int, edge_prob: float = 0.1, directed: bool = True, device=None)[source]#
Create an edge index corresponding to a certain dummy connectivity (e.g., full graph).
- Parameters:
dummy (str) – The dummy connectivity, can be one of "identity" (A = I), "full" (A = np.ones((N, N))), "random" or "none" (returns None).
num_nodes (int) – Number of nodes in the graph.
edge_prob (float) – Edge probability for the random graph. (default: 0.1)
directed (bool) – Whether to generate a directed or an undirected graph. (default: True)
device (optional) – Device for the created tensor. (default: None)
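The four dummy connectivities can be sketched in NumPy as below. The function name dummy_edge_index_sketch and the seed argument are hypothetical, the device argument is omitted, and the way the random graph is sampled and symmetrized is an assumption rather than the library's procedure.

```python
import numpy as np

def dummy_edge_index_sketch(dummy, num_nodes, edge_prob=0.1, directed=True, seed=None):
    """Hypothetical sketch returning a (2, E) edge index, or None for "none"."""
    if dummy == "none":
        return None
    if dummy == "identity":
        adj = np.eye(num_nodes, dtype=bool)                 # A = I
    elif dummy == "full":
        adj = np.ones((num_nodes, num_nodes), dtype=bool)   # all-ones matrix
    elif dummy == "random":
        rng = np.random.default_rng(seed)
        # Assumption: each entry is an independent Bernoulli(edge_prob) draw.
        adj = rng.random((num_nodes, num_nodes)) < edge_prob
        if not directed:
            upper = np.triu(adj)
            adj = upper | upper.T                           # mirror the upper triangle
    else:
        raise ValueError(f"Unknown dummy connectivity: {dummy!r}")
    return np.stack(np.nonzero(adj))

identity_ei = dummy_edge_index_sketch("identity", 3)
full_ei = dummy_edge_index_sketch("full", 3)
```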
FrameArray#
- aggregate(x: Union[DataFrame, ndarray], index: Union[List, Tuple, Tensor, ndarray], aggr_fn: Callable = <function sum>, axis: int = 1, level: int = 0) Union[DataFrame, ndarray] [source]#
Aggregate rows/columns in (MultiIndexed) DataFrame according to a new index.
- Parameters:
x (pd.DataFrame) – DataFrame to be aggregated.
index (Index) – A sequence of cluster_id with length equal to the index over which aggregation is performed. The i-th element of the index at axis and level will be mapped to the index[i]-th position in the new index.
aggr_fn (Callable) – Function to be used for aggregation.
axis (int) – Axis over which to perform aggregation, 0 for index, 1 for columns. (default: 1)
level (int) – Level over which to perform aggregation if axis is a MultiIndex. (default: 0)
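For plain ndarray input and axis=1, the mapping from columns to cluster ids can be sketched as follows. The name aggregate_columns_sketch is hypothetical, and returning the sorted cluster ids next to the result is a choice made for this illustration only.

```python
import numpy as np

def aggregate_columns_sketch(x, index, aggr_fn=np.sum):
    """Hypothetical sketch: merge the columns of `x` that share a cluster id."""
    x = np.asarray(x)
    index = np.asarray(index)
    cluster_ids = np.unique(index)
    # One output column per cluster, built from the columns mapped to it.
    columns = [aggr_fn(x[:, index == cid], axis=1) for cid in cluster_ids]
    return np.stack(columns, axis=1), cluster_ids

x = np.array([[1, 2, 3],
              [4, 5, 6]])
aggregated, cluster_ids = aggregate_columns_sketch(x, index=[0, 1, 0])
# Columns 0 and 2 are summed into cluster 0: aggregated -> [[4, 2], [10, 5]]
```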
- temporal_mean(x: Union[DataFrame, ndarray], index: Optional[DatetimeIndex] = None) Union[DataFrame, ndarray] [source]#
Compute the mean values for each row.
The mean is first computed hourly over the week of the year. Further NaN values are imputed using the hourly mean over the same month through the years. If other NaN values are present, they are replaced with the mean computed by hour alone. Remaining missing values are filled with ffill and bfill.
- Parameters:
x (np.array | pd.DataFrame) – Array-like with missing values.
index (pd.DatetimeIndex, optional) – Temporal index if x is not a pandas.DataFrame with a temporal index. Must have the same length as x. (default: None)
- get_trend(df, period='week', train_len=None, valid_mask=None)[source]#
Perform detrending on a time series by subtracting from each value of the input dataframe the average value computed over the training dataset for each hour/weekday.
- Parameters:
df – dataframe
period – period of the trend (‘day’, ‘week’, ‘month’)
train_len – train length
- Returns:
the detrended dataset and the trend values
- Return type:
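The detrending idea can be sketched on a one-dimensional series in NumPy. Here slot is a hypothetical integer array giving the position of each step within the period (for instance the hour of the week), valid_mask is ignored, and the name detrend_sketch is not part of the library.

```python
import numpy as np

def detrend_sketch(x, slot, train_len=None):
    """Hypothetical sketch: subtract the per-slot mean estimated on the training part."""
    x = np.asarray(x, dtype=float)
    slot = np.asarray(slot)
    train_len = len(x) if train_len is None else train_len
    num_slots = int(slot.max()) + 1
    sums = np.bincount(slot[:train_len], weights=x[:train_len], minlength=num_slots)
    counts = np.bincount(slot[:train_len], minlength=num_slots)
    # Slots never seen during training keep a zero mean instead of dividing by zero.
    slot_means = sums / np.maximum(counts, 1)
    trend = slot_means[slot]
    return x - trend, trend

values = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
slots = np.array([0, 1, 0, 1, 0, 1])
detrended, trend = detrend_sketch(values, slots, train_len=4)
# trend -> [2, 3, 2, 3, 2, 3]; detrended -> [-1, -1, 1, 1, 3, 3]
```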
- normalize(x: Union[DataFrame, ndarray], by: Optional[Any] = None, axis: int = 0, level: int = 0)[source]#
Normalize input ndarray or DataFrame using mean and standard deviation. If x is a DataFrame, normalization can be done on a specific group.
- Parameters:
x (FrameArray) – the FrameArray to be normalized.
by – the conditions used to determine the groups for the groupby(). (default: None)
axis (int) – axis for the function to be applied on. (default: 0)
level (int) – level of axis for the function to be applied on (for MultiIndexed DataFrames). (default: 0)
- Returns:
the normalized FrameArray
- Return type:
FrameArray
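For ndarray input without groups, the normalization amounts to standard scaling along axis. The sketch below uses the population standard deviation (ddof=0) and leaves constant slices unscaled; both are assumptions, and normalize_sketch is a hypothetical name.

```python
import numpy as np

def normalize_sketch(x, axis=0):
    """Hypothetical sketch: subtract the mean and divide by the std along `axis`."""
    x = np.asarray(x, dtype=float)
    mean = x.mean(axis=axis, keepdims=True)
    std = x.std(axis=axis, keepdims=True)   # assumption: ddof=0
    std = np.where(std == 0, 1.0, std)      # avoid dividing by zero on constant slices
    return (x - mean) / std

data = np.array([[1.0, 10.0],
                 [3.0, 30.0]])
normalized = normalize_sketch(data, axis=0)
# normalized -> [[-1, -1], [1, 1]]
```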
Imputation#
- prediction_dataframe(y, index, columns=None, aggregate_by='mean')[source]#
Aggregate batched predictions in a single DataFrame.
- Parameters:
y (list or np.ndarray) – The list of predictions.
index (list or np.ndarray) – The list of time indexes coupled with the predictions.
columns (list or pd.Index) – The columns of the returned DataFrame.
aggregate_by – How to aggregate the predictions when there is more than one for a step:
mean: take the mean of the predictions;
central: take the prediction at the central position, assuming that the predictions are ordered chronologically;
smooth_central: average the predictions weighted by a Gaussian signal with std=1.
- Returns:
The evaluation mask for the DataFrame.
- Return type:
pd.DataFrame
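The "mean" aggregation of overlapping predictions can be sketched for a single series in NumPy. Each row of y is one predicted window and index holds the matching time steps; integer time steps, the single-column setting, and the name aggregate_predictions_sketch are all assumptions made for illustration.

```python
import numpy as np

def aggregate_predictions_sketch(y, index):
    """Hypothetical sketch: average all predictions that refer to the same step."""
    y = np.asarray(y, dtype=float).ravel()
    index = np.asarray(index).ravel()
    steps, inverse = np.unique(index, return_inverse=True)
    sums = np.bincount(inverse, weights=y, minlength=steps.size)
    counts = np.bincount(inverse, minlength=steps.size)
    return steps, sums / counts

# Two windows of length 2 that overlap on step 1.
y = [[1.0, 2.0],
     [4.0, 6.0]]
index = [[0, 1],
         [1, 2]]
steps, mean_pred = aggregate_predictions_sketch(y, index)
# steps -> [0, 1, 2]; mean_pred -> [1.0, 3.0, 6.0]
```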
Pattern#
- check_pattern(pattern: str, split: bool = False, ndim: Optional[int] = None, include_batch: bool = False) Union[str, list] [source]#
Check that pattern is allowed. A pattern is a string of tokens interleaved with blank spaces, where each token specifies what an axis in a tensor refers to. The supported tokens are:
‘t’, for the time dimension
‘n’, for the node dimension
‘e’, for the edge dimension
‘f’ or ‘c’, for the feature/channel dimension (‘c’ token is automatically converted to ‘f’)
In order to be valid, a pattern must have:
at most one ‘t’ dimension, as the first token;
at most two (consecutive) ‘n’ dimensions, right after the ‘t’ token or at the beginning of the pattern;
at most one ‘e’ dimension, either as the first token or after a ‘t’;
either ‘n’ or ‘e’ dimensions, but not both together;
all further tokens must be ‘c’ or ‘f’.
- Parameters:
pattern (str) – The input pattern, specifying with a token what an axis in a tensor refers to. The supported tokens are:
‘t’, for the time dimension
‘n’, for the node dimension
‘e’, for the edge dimension
‘f’ or ‘c’, for the feature/channel dimension (‘c’ token is automatically converted to ‘f’)
split (bool) – If True, then return an ordered list of the tokens in the sanitized pattern. (default: False)
ndim (int, optional) – If it is not None, then check that pattern has ndim tokens. (default: None)
include_batch (bool) – If True, then allow the token b. (default: False)
- Returns:
The sanitized pattern as a string, or a list of the tokens in the pattern.
- Return type:
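The validity rules listed for check_pattern can be condensed into one regular expression over the sanitized tokens. The sketch below is a hypothetical re-implementation, not the library code: the name check_pattern_sketch, the use of ValueError, and the omission of include_batch are assumptions.

```python
import re

def check_pattern_sketch(pattern, split=False, ndim=None):
    """Hypothetical sketch of the pattern rules: 't', then 'n'/'n n' or 'e', then 'f's."""
    # 'c' is an alias of 'f' and is converted during sanitization.
    tokens = ["f" if tok == "c" else tok for tok in pattern.split()]
    compact = "".join(tokens)
    single_chars = all(len(tok) == 1 for tok in tokens)
    # Optional leading 't', then one or two 'n' or a single 'e', then only 'f'.
    if not single_chars or re.fullmatch(r"t?(?:n{1,2}|e)?f*", compact) is None:
        raise ValueError(f"Pattern '{pattern}' is not allowed.")
    if ndim is not None and len(tokens) != ndim:
        raise ValueError(f"Pattern '{pattern}' does not have {ndim} tokens.")
    return tokens if split else " ".join(tokens)

sanitized = check_pattern_sketch("t n c")             # -> "t n f"
token_list = check_pattern_sketch("t e f", split=True)  # -> ["t", "e", "f"]
```

A pattern such as "n t f" is rejected because the 't' token is not first, and "t n e f" is rejected because it mixes 'n' and 'e'.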
AZ-Test#
- class AZWhitenessTestResult(statistic, pvalue)#
- pvalue#
Alias for field number 1
- statistic#
Alias for field number 0
- class AZWhitenessMultiTestResult(statistic, pvalue, componentwise_tests)#
- componentwise_tests#
Alias for field number 2
- pvalue#
Alias for field number 1
- statistic#
Alias for field number 0
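The "Alias for field number" entries suggest that both result classes are named tuples, so their fields can be read by name, by position, or by unpacking. The sketch below uses a stand-in class built with collections.namedtuple; the class name AZResultSketch and the numeric values are placeholders, not outputs of the library.

```python
from collections import namedtuple

# Stand-in with the same field order as AZWhitenessTestResult (assumed named tuple).
AZResultSketch = namedtuple("AZResultSketch", ["statistic", "pvalue"])

result = AZResultSketch(statistic=1.2, pvalue=0.23)  # placeholder values
statistic, pvalue = result                           # positional unpacking
same_statistic = result[0] == result.statistic       # field 0 is the statistic
same_pvalue = result[1] == result.pvalue             # field 1 is the p-value
```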
- az_whiteness_test(x: Union[Tensor, ndarray], edge_index: Union[Tensor, ndarray], mask: Optional[Union[Tensor, ndarray]] = None, pattern: str = 't n f', edge_weight: Optional[Union[Tensor, ndarray, float]] = None, edge_weight_temporal: Optional[float] = None, lamb: float = 0.5, multivariate: bool = False, remove_median: bool = False) Union[AZWhitenessTestResult, AZWhitenessMultiTestResult] [source]#
Implementation of the AZ-whiteness test from the paper “AZ-whiteness test: a test for uncorrelated noise on spatio-temporal graphs” (D. Zambon and C. Alippi, NeurIPS 2022).
- Parameters:
x (TensArray) – graph signal, typically with pattern “t n f” and representing the prediction residuals.
edge_index (TensArray) – indices of the spatial edges with shape (2, E). The current implementation supports only a static topology.
mask (TensArray, optional) – boolean mask of signal x, with the same size as x. The mask is True where the observations in x are valid and False otherwise. (default: None)
pattern (str) – string encoding the index pattern of x, typically “t n f” representing time, nodes and node features dimensions, respectively. (default: "t n f")
edge_weight (TensArray or float, optional) – positive weights of the spatial edges. It can be a TensArray of shape (E,), or a scalar value (same weight for all edges). (default: None)
edge_weight_temporal (float, optional) – positive scalar weight for all temporal edges. If None or "auto", the weight is computed to balance the contribution of the spatial and temporal components (see Zambon and Alippi, 2022). (default: None)
lamb (float, optional) – scalar factor between \(0.0\) and \(1.0\) defining a convex combination of the spatial and temporal components; if lamb == 1.0 the test is applied on the spatial topology only, while for lamb == 0.0 only the serial correlation is considered. (default: 0.5)
multivariate (bool) – whether to run a single test on a multivariate signal or combine multiple scalar tests, one for each of the f features. It applies only when f > 1. (default: False)
remove_median (bool) – whether to manually fulfill, where possible, the assumption of null median. (default: False)
- Returns:
The test statistics.
- Return type: