
NaN in DPMSolverMultistepInverseScheduler #10748

Open
@DekuLiuTesla

Description


Hi everyone, I'm new to diffusers. I'm trying to use DPMSolverMultistepInverseScheduler for DDIM inversion with the following config:

dpmpp_2m_sde_karras_scheduler_inv = DPMSolverMultistepInverseScheduler(
    num_train_timesteps=1000,
    beta_start=0.00085,
    beta_end=0.012,
    algorithm_type="sde-dpmsolver++",
    use_karras_sigmas=True,
    steps_offset=1
)

The DDIM inversion is then performed with:

self.scheduler.set_timesteps(self.inv_config.steps)
timesteps = self.scheduler.timesteps
with torch.autocast(device_type=self.device, dtype=self.dtype):
    for i, t in enumerate(tqdm(timesteps)):
        noises = []
        x_index = torch.arange(len(x))
        batches = x_index.split(self.batch_size, dim=0)
        for batch in batches:
            noise = self.pred_noise(
                x[batch], conds, timesteps[i], concat_conds=x[batch], batch_idx=batch)
            noises += [noise]
        noises = torch.cat(noises)
        
        x = self.scheduler.step(noises, t, x, generator=self.rng, return_dict=False)[0]

But NaN occurs in the first scheduler step. I dug into it and found that it happens in dpm_solver_first_order_update:

sigma_t, sigma_s = self.sigmas[self.step_index + 1], self.sigmas[self.step_index]
alpha_t, sigma_t = self._sigma_to_alpha_sigma_t(sigma_t)
alpha_s, sigma_s = self._sigma_to_alpha_sigma_t(sigma_s)
lambda_t = torch.log(alpha_t) - torch.log(sigma_t)
lambda_s = torch.log(alpha_s) - torch.log(sigma_s)
h = lambda_t - lambda_s

Here self.sigmas is tensor([ 0.0292, 0.0462, 0.0710, 0.1065, 0.1563, 0.2249, 0.3178, 0.4417, 0.6050, 0.8176, 1.0911, 1.4396, 1.8795, 2.4300, 3.1132, 3.9548, 4.9844, 6.2356, 7.7471, 9.5622, 11.7303, 14.3068, 17.3539, 20.9411, 25.1461, 25.1461]). Because the sigmas are in increasing order, lambda_t is smaller than lambda_s, and therefore h is negative.
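The sign of h can be checked with a small sketch using two consecutive sigmas from the schedule above. The alpha/sigma conversion here (alpha = 1/sqrt(sigma^2 + 1), sigma' = sigma * alpha) is my reading of the usual VP parameterization behind _sigma_to_alpha_sigma_t, so treat it as an assumption:

```python
import math

# Two consecutive sigmas from the reported (increasing) Karras schedule.
sigma_s, sigma_t = 0.0292, 0.0462

def sigma_to_alpha_sigma(sigma):
    # Assumed VP-style conversion: alpha = 1/sqrt(sigma^2 + 1), sigma' = sigma * alpha
    alpha = 1.0 / math.sqrt(sigma**2 + 1.0)
    return alpha, sigma * alpha

alpha_t, s_t = sigma_to_alpha_sigma(sigma_t)
alpha_s, s_s = sigma_to_alpha_sigma(sigma_s)

# lambda = log(alpha) - log(sigma * alpha) = -log(sigma)
lambda_t = math.log(alpha_t) - math.log(s_t)
lambda_s = math.log(alpha_s) - math.log(s_s)

# h = log(sigma_s / sigma_t), which is negative whenever sigma_t > sigma_s
h = lambda_t - lambda_s
print(h)  # negative
```

So with an increasing sigma schedule, h is negative at every step regardless of the particular values.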
As a result, in

elif self.config.algorithm_type == "sde-dpmsolver++":
    assert noise is not None
    x_t = (
        (sigma_t / sigma_s * torch.exp(-h)) * sample
        + (alpha_t * (1 - torch.exp(-2.0 * h))) * model_output
        + sigma_t * torch.sqrt(1.0 - torch.exp(-2 * h)) * noise
    )

torch.sqrt(1.0 - torch.exp(-2 * h)) becomes NaN: for negative h, exp(-2h) > 1, so the argument of the square root is negative. But I noticed that in DPMSolverMultistepScheduler, the problem is avoided by flipping the Karras sigmas into decreasing order:

if self.config.use_karras_sigmas:
    sigmas = np.flip(sigmas).copy()
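The failure mode can be verified numerically. For a negative h (an illustrative value below), the radicand of the square root drops below zero; torch.sqrt maps a negative input to NaN (where math.sqrt would instead raise), which is exactly the NaN observed in the step:

```python
import math

h = -0.459  # illustrative negative h, roughly log(0.0292 / 0.0462)

# Noise coefficient's radicand in the sde-dpmsolver++ branch: 1 - exp(-2h)
radicand = 1.0 - math.exp(-2.0 * h)
print(radicand)  # < 0 for any h < 0

# torch.sqrt of a negative tensor yields NaN rather than raising
noise_scale = math.sqrt(radicand) if radicand >= 0 else float("nan")
print(math.isnan(noise_scale))
```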

I've searched through many usage examples, but I still can't figure out the root of the problem. Can anybody help? 🙏
