Organizing data
DataModule
SpatioTemporalDataModule | Base LightningDataModule for SpatioTemporalDataset.
- class SpatioTemporalDataModule(dataset: SpatioTemporalDataset, scalers: Optional[Mapping] = None, mask_scaling: bool = True, splitter: Optional[Splitter] = None, batch_size: int = 32, workers: int = 0, pin_memory: bool = False)
Base LightningDataModule for SpatioTemporalDataset.
- Parameters:
dataset (SpatioTemporalDataset) – The complete dataset.
scalers (dict, optional) – Named mapping of Scaler objects to be used for data rescaling after splitting. Each scaler receives as input the dataset attribute whose name matches the scaler’s key. If None, no scaling is performed. (default None)
mask_scaling (bool) – If True, then compute the statistics for the dataset.target scaler (if any) by considering only valid values (according to dataset.mask). (default True)
splitter (Splitter, optional) – The Splitter to be used for splitting dataset into train/validation/test sets. (default None)
batch_size (int) – Size of the mini-batches for the dataloaders. (default 32)
workers (int) – Number of workers to use in the dataloaders. (default 0)
pin_memory (bool) – If True, then enable pinned GPU memory for train_dataloader(). (default False)
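The effect of mask_scaling can be illustrated with a minimal, library-independent sketch. It is not the library's implementation: the helper masked_mean_std and the toy data are hypothetical, and a plain mean/standard-deviation scaler is assumed. The point is only that invalid entries (mask False) are left out of the statistics.

```python
from math import sqrt


def masked_mean_std(values, mask):
    """Hypothetical helper: mean and std computed over valid entries only."""
    valid = [v for v, m in zip(values, mask) if m]
    if not valid:
        raise ValueError("mask selects no valid values")
    mean = sum(valid) / len(valid)
    var = sum((v - mean) ** 2 for v in valid) / len(valid)
    return mean, sqrt(var)


# Toy target series: the trailing zeros are placeholders for missing readings.
values = [1.0, 2.0, 3.0, 0.0, 0.0]
mask = [True, True, True, False, False]

mean, std = masked_mean_std(values, mask)
naive_mean = sum(values) / len(values)  # biased by the placeholder zeros

# Standardize every entry with the statistics of the valid ones.
scaled = [(v - mean) / std for v in values]
```

Here the masked mean reflects only the three observed values, whereas the naive mean is pulled toward the placeholder zeros.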
- setup(stage: Optional[Literal['fit', 'validate', 'test', 'predict']] = None)
Called at the beginning of fit (train + validate), validate, test, or predict. This is a good hook when you need to build models dynamically or adjust something about them. This hook is called on every process when using DDP.
- Parameters:
stage – either 'fit', 'validate', 'test', or 'predict'
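Example:
    class LitModel(...):
        def __init__(self):
            self.l1 = None

        def prepare_data(self):
            download_data()
            tokenize()

            # don't do this: prepare_data() runs on a single process,
            # so state assigned here is not available on the others
            self.something = ...

        def setup(self, stage):
            # setup() runs on every process, so state can be assigned here
            data = load_data(...)
            self.l1 = nn.Linear(28, data.num_classes)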
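Splitters
Splitter | Base class for splitter module.
CustomSplitter | Create a Splitter using custom validation and test sets splitting functions.
TemporalSplitter | Split the data sequentially with specified lengths.
AtTimeStepSplitter | Split the data at given time steps (only for SpatioTemporalDataset with DatetimeIndex index).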
- class CustomSplitter(*args, **kwargs)
Create a Splitter using custom validation and test sets splitting functions.
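The idea of a splitter driven by user-supplied functions can be sketched without the library. Everything below is hypothetical and for illustration only: the class FunctionSplitter, its argument names, and the order in which the functions are applied (test set first, then validation on what remains) are assumptions, not the CustomSplitter API.

```python
class FunctionSplitter:
    """Hypothetical splitter built from two user-supplied functions."""

    def __init__(self, val_split_fn, test_split_fn):
        self.val_split_fn = val_split_fn
        self.test_split_fn = test_split_fn

    def split(self, n_samples):
        indices = list(range(n_samples))
        # Pick the test samples first, then the validation samples
        # among the indices that are still available.
        test = list(self.test_split_fn(indices))
        held_out = set(test)
        remaining = [i for i in indices if i not in held_out]
        val = list(self.val_split_fn(remaining))
        held_out.update(val)
        # Whatever is left becomes the training set.
        train = [i for i in indices if i not in held_out]
        return train, val, test


def last_two(indices):
    return indices[-2:]


def every_fourth(indices):
    return indices[::4]


splitter = FunctionSplitter(val_split_fn=every_fourth, test_split_fn=last_two)
train_idx, val_idx, test_idx = splitter.split(10)
```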
- class TemporalSplitter(*args, **kwargs)
Split the data sequentially with specified lengths.
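A sequential split with specified lengths can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the TemporalSplitter implementation: the function sequential_split and its arguments are hypothetical, the test set is assumed to take the last samples with the validation set right before it, and a float length is assumed to be a fraction of the total number of samples.

```python
def sequential_split(n_samples, val_len, test_len):
    """Hypothetical helper: split range(n_samples) into ordered blocks."""

    def to_count(length):
        # Assumption: floats are fractions of the whole series, ints are counts.
        if isinstance(length, float):
            return int(length * n_samples)
        return length

    n_val, n_test = to_count(val_len), to_count(test_len)
    if n_val < 0 or n_test < 0 or n_val + n_test > n_samples:
        raise ValueError("invalid validation/test lengths")

    test_start = n_samples - n_test
    val_start = test_start - n_val
    indices = list(range(n_samples))
    # Temporal order is preserved: train, then validation, then test.
    return indices[:val_start], indices[val_start:test_start], indices[test_start:]


train_idx, val_idx, test_idx = sequential_split(12, val_len=2, test_len=0.25)
```

Keeping the three blocks contiguous and ordered means no validation or test sample precedes a training sample in time.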
- class AtTimeStepSplitter(*args, **kwargs)
Split the data at given time steps (only for SpatioTemporalDataset with DatetimeIndex index).
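Splitting at given time steps can be sketched with the standard library alone. The function split_at_timestamps and its argument names are hypothetical, and a plain list of datetime objects stands in for a DatetimeIndex; this shows the idea, not the AtTimeStepSplitter implementation.

```python
from datetime import datetime, timedelta


def split_at_timestamps(index, first_val_ts, first_test_ts):
    """Hypothetical helper: assign each time step to a set by its timestamp."""
    if first_val_ts > first_test_ts:
        raise ValueError("validation must start before the test set")
    train = [i for i, ts in enumerate(index) if ts < first_val_ts]
    val = [i for i, ts in enumerate(index) if first_val_ts <= ts < first_test_ts]
    test = [i for i, ts in enumerate(index) if ts >= first_test_ts]
    return train, val, test


# Six hourly time steps starting at midnight.
start = datetime(2024, 1, 1)
index = [start + timedelta(hours=h) for h in range(6)]

train_idx, val_idx, test_idx = split_at_timestamps(
    index,
    first_val_ts=datetime(2024, 1, 1, 3),
    first_test_ts=datetime(2024, 1, 1, 5),
)
```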