Description
While exploring the optimizations listed in the documentation, I found that I cannot free GPU memory after using torch.compile
on a StableDiffusionXLPipeline UNet.
import gc

import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
'stabilityai/stable-diffusion-xl-base-1.0',
torch_dtype=torch.float16,
variant="fp16",
use_safetensors=True
).to('cuda')
# Compile UNet
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
generator = torch.Generator(device="cuda").manual_seed(42)
prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt=prompt, num_inference_steps=20, generator=generator).images[0]
del pipe
gc.collect()
torch._dynamo.reset()
torch.cuda.empty_cache()
torch.cuda.synchronize()
# GPU memory is still in use here; without compiling the UNet, the same cleanup frees it.
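
For reference, this is roughly how I look at the remaining usage (a minimal sketch; the helper name and output format are mine, torch.cuda.memory_allocated and torch.cuda.memory_reserved are the standard PyTorch calls):

def report_cuda_memory(tag: str) -> None:
    # memory_allocated: bytes currently held by live tensors
    # memory_reserved: bytes kept by the CUDA caching allocator (close to what nvidia-smi shows)
    allocated_gib = torch.cuda.memory_allocated() / 1024**3
    reserved_gib = torch.cuda.memory_reserved() / 1024**3
    print(f"[{tag}] allocated={allocated_gib:.2f} GiB | reserved={reserved_gib:.2f} GiB")

# Called right after the cleanup above: without torch.compile both values drop
# back near zero, with torch.compile a large chunk stays reserved.
report_cuda_memory("after cleanup")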
Being able to free the GPU memory is sometimes useful, especially when you want to load and compile another pipeline checkpoint to run another large batch of generations.
I put together a code reproduction in Colab for testing.
Am I missing something? Or could this be a memory leak on the compilation backend side, in which case it might be better to raise this with the PyTorch team?
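
In case I am missing a step, the most thorough teardown I could think of is sketched below; the ordering and the use of the wrapper's _orig_mod attribute are my own guesses about what might keep references alive, not documented recommendations:

# Drop the reference to the compiled wrapper first, so only the original
# module (reachable through _orig_mod on the OptimizedModule returned by
# torch.compile) remains, then release everything.
pipe.unet = pipe.unet._orig_mod
del pipe
gc.collect()

# Clear dynamo caches and compiled artifacts, then release the
# CUDA caching allocator's blocks.
torch._dynamo.reset()
gc.collect()
torch.cuda.synchronize()
torch.cuda.empty_cache()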
System Info
python: 3.10.12
diffusers: 0.30.3
torch: 2.4.1+cu121
Running on Google Colab?: Yes