This repository was archived by the owner on Nov 3, 2023. It is now read-only.
The GPUStatsMonitor callback records GPU utilization in TensorBoard logs. However, when running with ray_lightning, it raises a MisconfigurationException:
pytorch_lightning.utilities.exceptions.MisconfigurationException: You are using GPUStatsMonitor but are not running on GPU since gpus attribute in Trainer is set to None.
This is due to the code in the stats monitor callback:
if trainer._device_type != DeviceType.GPU:
    raise MisconfigurationException(
        "You are using GPUStatsMonitor but are not running on GPU"
        f" since gpus attribute in Trainer is set to {trainer.gpus}."
    )
It seems that ray_lightning therefore doesn't set the DeviceType to GPU, which may have other unintended consequences later on.
This may also be solved by #118, but it's not entirely clear.
Hey @DavidMChan, yes, that's right; this is the same issue as #99. Ray Lightning does set the device type to GPU (when use_gpu=True), but only on the workers that actually execute training. For things like mixed precision or the GPUStatsMonitor callback, PyTorch Lightning (PTL) requires GPUs to be enabled on the driver side as well, even though they are not actually used there. If you set gpus=1 in your Trainer, this tells PTL that the driver has GPUs available, and then this should work.
Unfortunately, this gets a bit tricky when using Ray Client, or when executing a script on a CPU-only head node with GPU worker nodes. PTL is not designed to support these types of deployments.
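For reference, here is a minimal sketch of the gpus=1 workaround described above. It assumes a PyTorch Lightning 1.x release in which Trainer still accepts the gpus argument and GPUStatsMonitor still exists, plus a ray_lightning release that exposes RayPlugin. The helper names driver_trainer_kwargs and build_trainer are hypothetical and not part of either library. It also assumes the driver machine really has a GPU visible, so it does not cover the CPU-only head node case.

```python
from typing import Any, Dict


def driver_trainer_kwargs(use_gpu: bool, max_epochs: int = 1) -> Dict[str, Any]:
    """Hypothetical helper: build Trainer keyword arguments for the driver."""
    kwargs: Dict[str, Any] = {"max_epochs": max_epochs}
    if use_gpu:
        # Workaround from the reply above: declare one GPU on the driver so
        # that driver-side checks (such as the one in GPUStatsMonitor) pass.
        kwargs["gpus"] = 1
    return kwargs


def build_trainer(num_workers: int = 2, use_gpu: bool = True):
    """Hypothetical helper: assemble a Trainer that trains on Ray workers.

    The library imports are deferred so that driver_trainer_kwargs can be
    inspected without pytorch_lightning or ray_lightning installed.
    """
    import pytorch_lightning as pl
    from pytorch_lightning.callbacks import GPUStatsMonitor
    from ray_lightning import RayPlugin

    # use_gpu=True asks Ray Lightning to give each training worker a GPU.
    plugin = RayPlugin(num_workers=num_workers, use_gpu=use_gpu)
    # Only attach the GPU callback when GPUs were requested.
    callbacks = [GPUStatsMonitor()] if use_gpu else []
    return pl.Trainer(
        plugins=[plugin],
        callbacks=callbacks,
        **driver_trainer_kwargs(use_gpu),
    )
```

Calling build_trainer() and then trainer.fit(model) with your own LightningModule would exercise this setup; whether it runs cleanly depends on the installed versions and on the driver actually having a GPU.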