DataLoader
- class dgl.graphbolt.DataLoader(datapipe, num_workers=0, persistent_workers=True, max_uva_threads=10240)
Bases: MiniBatchTransformer
Multiprocessing DataLoader.
Iterates over the data pipeline with everything before feature fetching (i.e. dgl.graphbolt.FeatureFetcher) in subprocesses, and everything after feature fetching in the main process. The datapipe is modified in-place as a result. When the copy_to operation is placed earlier in the data pipeline, num_workers must be 0, because using CUDA in multiple worker processes is not supported.
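The split described above can be illustrated with a toy model. This is a minimal sketch, not DGL code: `split_pipeline` and the stage names are hypothetical, the pipeline is reduced to a list of strings, and it assumes that "earlier" means "before feature fetching" and that the feature-fetching stage itself stays on the main-process side.

```python
def split_pipeline(stages, num_workers=0, fetch_stage="fetch_feature"):
    """Toy model of the worker/main split (hypothetical helper, not part of DGL).

    Stages before `fetch_stage` are treated as worker-side; the fetch stage
    and everything after it are treated as main-process-side (an assumption).
    """
    cut = stages.index(fetch_stage)
    worker_side, main_side = stages[:cut], stages[cut:]
    # Assumed reading of the rule: a CUDA copy on the worker side is only
    # acceptable when no worker processes are requested.
    if num_workers > 0 and "copy_to" in worker_side:
        raise ValueError("copy_to before feature fetching requires num_workers=0")
    return worker_side, main_side


# copy_to at the end of the pipeline: worker processes are allowed.
pipeline = ["item_sampler", "sample_neighbor", "fetch_feature", "copy_to"]
worker_side, main_side = split_pipeline(pipeline, num_workers=4)
print(worker_side)
print(main_side)

# copy_to moved earlier: only accepted in this sketch with num_workers=0.
early_copy = ["item_sampler", "copy_to", "sample_neighbor", "fetch_feature"]
split_pipeline(early_copy, num_workers=0)
```

In this model, calling `split_pipeline(early_copy, num_workers=4)` raises `ValueError`, mirroring the constraint stated in the text.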
- Parameters:
datapipe (DataPipe) – The data pipeline.
num_workers (int, optional) – Number of worker processes. Default is 0.
persistent_workers (bool, optional) – If True, the data loader will not shut down the worker processes after a dataset has been consumed once. This keeps the worker instances alive.
max_uva_threads (int, optional) – Limits the number of CUDA threads used for UVA copies so that the rest of the computations can run simultaneously with them. Setting it too high limits the amount of overlap, while setting it too low may leave the PCIe bandwidth underutilized. The manually tuned default is 10240, meaning around 5-7 Streaming Multiprocessors.
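One possible reading of "around 5-7 Streaming Multiprocessors" is the thread budget divided by the maximum number of resident threads per SM. The sketch below works through that arithmetic. The per-SM limits of 2048 and 1536 threads are assumptions taken from NVIDIA's compute capability tables, not from this page, and `sm_equivalent` is a hypothetical helper.

```python
MAX_UVA_THREADS = 10240  # default value of max_uva_threads

# Assumed maximum resident threads per SM on two common GPU families.
ASSUMED_THREADS_PER_SM = (2048, 1536)


def sm_equivalent(max_uva_threads, threads_per_sm):
    """Number of fully occupied SMs that the thread budget corresponds to."""
    return max_uva_threads / threads_per_sm


estimates = {
    limit: sm_equivalent(MAX_UVA_THREADS, limit)
    for limit in ASSUMED_THREADS_PER_SM
}

for limit, sms in estimates.items():
    print(f"{limit} threads/SM: about {sms:.1f} SMs")
```

Under these assumptions the default corresponds to 5 SMs at 2048 threads per SM and roughly 6.7 SMs at 1536 threads per SM, which is consistent with the stated range.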