About the idea behind the implementation of the method "_get_prev_sample" in the PNDMScheduler class. #7533

moonryul · 2024-03-31T06:54:19Z

Hi.

Let me quote the code for _get_prev_sample():

def _get_prev_sample(self, sample, timestep, prev_timestep, model_output):  # MJ: e.g. timestep=981, prev_timestep=961
    # See formula (9) of PNDM paper https://arxiv.org/pdf/2202.09778.pdf
    # this function computes x_(t−δ) using the formula of (9)
    # Note that x_t needs to be added to both sides of the equation

    # Notation (<variable name> -> <name in paper>
    # alpha_prod_t -> α_t
    # alpha_prod_t_prev -> α_(t−δ)
    # beta_prod_t -> (1 - α_t)
    # beta_prod_t_prev -> (1 - α_(t−δ))
    # sample -> x_t
    # model_output -> e_θ(x_t, t)
    # prev_sample -> x_(t−δ)
    alpha_prod_t = self.alphas_cumprod[timestep]  #MJ: self.alphas_cumprod has 1000 elements; timestep=981, prev_timestep=961
    alpha_prod_t_prev = self.alphas_cumprod[prev_timestep] if prev_timestep >= 0 else self.final_alpha_cumprod
    beta_prod_t = 1 - alpha_prod_t
    beta_prod_t_prev = 1 - alpha_prod_t_prev

    if self.config.prediction_type == "v_prediction":
        model_output = (alpha_prod_t**0.5) * model_output + (beta_prod_t**0.5) * sample
    elif self.config.prediction_type != "epsilon":
        raise ValueError(
            f"prediction_type given as {self.config.prediction_type} must be one of `epsilon` or `v_prediction`"
        )

    # corresponds to (α_(t−δ) - α_t) divided by
    # denominator of x_t in formula (9) and plus 1; MJ: "plus 1" is used to add x_t to the right-hand side of eq. (9), but how?
    # Note: (α_(t−δ) - α_t) / (sqrt(α_t) * (sqrt(α_(t−δ)) + sqrt(α_t))) =
    # sqrt(α_(t−δ)) / sqrt(α_t)
    sample_coeff = (alpha_prod_t_prev / alpha_prod_t) ** (0.5)  # MJ: = sqrt(α_(t−δ)) / sqrt(α_t)

    # corresponds to denominator of e_θ(x_t, t) in formula (9)
    model_output_denom_coeff = alpha_prod_t * beta_prod_t_prev ** (0.5) + (
        alpha_prod_t * beta_prod_t * alpha_prod_t_prev
    ) ** (0.5)

    # full formula (9)
    prev_sample = (
        sample_coeff * sample - (alpha_prod_t_prev - alpha_prod_t) * model_output / model_output_denom_coeff
    )

    return prev_sample
    
    
Q1. I read formula (9) of the PNDM paper (https://arxiv.org/pdf/2202.09778.pdf), and I understand the comment
"Note that x_t needs to be added to both sides of the equation".

But the last statement of the function reads:

    # full formula (9)
    prev_sample = (
        sample_coeff * sample - (alpha_prod_t_prev - alpha_prod_t) * model_output / model_output_denom_coeff
    )

I thought the statement should be as follows, adding x_t (sample) to both sides of equation (9):

    prev_sample = sample + (
        sample_coeff * sample - (alpha_prod_t_prev - alpha_prod_t) * model_output / model_output_denom_coeff
    )

What is going on here?
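
A small numeric sketch may help make the question concrete. It assumes that formula (9) reads x_(t−δ) − x_t = (α_(t−δ) − α_t) · [ x_t / (sqrt(α_t) (sqrt(α_(t−δ)) + sqrt(α_t))) − e_θ(x_t, t) / (sqrt(α_t) (sqrt((1 − α_(t−δ)) α_t) + sqrt((1 − α_t) α_(t−δ)))) ]. This is a transcription that should be checked against the paper. The helper names and scalar inputs below are made up for illustration and are not part of Diffusers.

```python
import math


def diffusers_style(x_t, eps, a_t, a_prev):
    """Scalar copy of the last statements of _get_prev_sample (epsilon prediction)."""
    sample_coeff = (a_prev / a_t) ** 0.5
    denom = a_t * (1 - a_prev) ** 0.5 + (a_t * (1 - a_t) * a_prev) ** 0.5
    return sample_coeff * x_t - (a_prev - a_t) * eps / denom


def formula_9_rhs(x_t, eps, a_t, a_prev):
    """Assumed right-hand side of formula (9), i.e. the difference x_(t-δ) - x_t."""
    x_denom = math.sqrt(a_t) * (math.sqrt(a_prev) + math.sqrt(a_t))
    eps_denom = math.sqrt(a_t) * (
        math.sqrt((1 - a_prev) * a_t) + math.sqrt((1 - a_t) * a_prev)
    )
    return (a_prev - a_t) * (x_t / x_denom - eps / eps_denom)


# Hypothetical scalar values, chosen only for illustration.
x_t, eps, a_t, a_prev = 0.7, -0.3, 0.05, 0.08

library_value = diffusers_style(x_t, eps, a_t, a_prev)
paper_value = x_t + formula_9_rhs(x_t, eps, a_t, a_prev)  # x_t moved to the right-hand side
proposed_value = x_t + library_value  # the variant suggested in Q1

print(abs(library_value - paper_value))   # gap between library code and formula (9) plus x_t
print(abs(proposed_value - paper_value))  # gap between the Q1 variant and formula (9) plus x_t
```

Under these assumptions the first gap should be at rounding-error level and the second should be about |x_t|, which would suggest that sample_coeff already absorbs the added x_t and that the extra `sample +` counts it twice.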
    
Q2. I think the following comment is related to this issue:

    # corresponds to (α_(t−δ) - α_t) divided by
    # denominator of x_t in formula (9) and plus 1;
    # Note: (α_(t−δ) - α_t) / (sqrt(α_t) * (sqrt(α_(t−δ)) + sqrt(α_t))) =
    # sqrt(α_(t−δ)) / sqrt(α_t)

I read the first two lines of this comment as saying: add 1 to (α_(t−δ) - α_t) / (denominator of x_t in formula (9)), where

    (α_(t−δ) - α_t) / (denominator of x_t in formula (9)) = (α_(t−δ) - α_t) / (sqrt(α_t) * (sqrt(α_(t−δ)) + sqrt(α_t)))

However, the "Note" part of the comment states:

    # Note: (α_(t−δ) - α_t) / (sqrt(α_t) * (sqrt(α_(t−δ)) + sqrt(α_t))) =
    # sqrt(α_(t−δ)) / sqrt(α_t)

I think this comment should have read as follows, in line with the preceding comment that mentions "plus 1":

    # Note: (α_(t−δ) - α_t) / (sqrt(α_t) * (sqrt(α_(t−δ)) + sqrt(α_t))) + 1 =
    # sqrt(α_(t−δ)) / sqrt(α_t)
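
One way to compare the two readings of the note is a quick numeric check. The sketch below uses made-up (α_t, α_(t−δ)) pairs rather than a real noise schedule, and the helper name is hypothetical.

```python
import math


def x_t_coefficient(a_t, a_prev):
    """(α_(t−δ) - α_t) divided by the denominator of x_t in formula (9)."""
    return (a_prev - a_t) / (math.sqrt(a_t) * (math.sqrt(a_prev) + math.sqrt(a_t)))


# Hypothetical cumulative-alpha pairs (α_t, α_(t−δ)); not taken from a real noise schedule.
pairs = [(0.05, 0.08), (0.30, 0.45), (0.90, 0.99)]

for a_t, a_prev in pairs:
    target = math.sqrt(a_prev / a_t)  # sqrt(α_(t−δ)) / sqrt(α_t)
    without_plus_one = x_t_coefficient(a_t, a_prev)
    with_plus_one = without_plus_one + 1
    # Print the gap to the target for both readings of the note.
    print(a_t, a_prev, target - without_plus_one, target - with_plus_one)
```

If the last printed column stays at rounding-error level while the one before it stays near 1, that would support the "+ 1" reading of the note for these sample values.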

If this holds, then sample_coeff, defined as (alpha_prod_t_prev / alpha_prod_t) ** (0.5), already contains the "plus 1", so that

    # full formula (9)
    prev_sample = (
        sample_coeff * sample - (alpha_prod_t_prev - alpha_prod_t) * model_output / model_output_denom_coeff
    )

works. Because sample_coeff contains the "plus 1", the statement above comes down to the following (abusing notation: here sample_coeff stands for the coefficient of x_t without the "plus 1"):

    prev_sample = (sample_coeff + 1) * sample - (alpha_prod_t_prev - alpha_prod_t) * model_output / model_output_denom_coeff
                = sample + sample_coeff * sample - (alpha_prod_t_prev - alpha_prod_t) * model_output / model_output_denom_coeff

BUT I could not prove that

    (α_(t−δ) - α_t) / (sqrt(α_t) * (sqrt(α_(t−δ)) + sqrt(α_t))) + 1 = sqrt(α_(t−δ)) / sqrt(α_t)
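
A possible derivation, sketched under the assumption that α_t > 0 and α_(t−δ) ≥ 0 (so that all square roots are real and the denominator is nonzero), uses the difference of squares α_(t−δ) − α_t = (sqrt(α_(t−δ)) − sqrt(α_t)) (sqrt(α_(t−δ)) + sqrt(α_t)):

```latex
\begin{align*}
\frac{\alpha_{t-\delta} - \alpha_t}{\sqrt{\alpha_t}\,\bigl(\sqrt{\alpha_{t-\delta}} + \sqrt{\alpha_t}\bigr)} + 1
&= \frac{\bigl(\sqrt{\alpha_{t-\delta}} - \sqrt{\alpha_t}\bigr)\bigl(\sqrt{\alpha_{t-\delta}} + \sqrt{\alpha_t}\bigr)}{\sqrt{\alpha_t}\,\bigl(\sqrt{\alpha_{t-\delta}} + \sqrt{\alpha_t}\bigr)} + 1 \\
&= \frac{\sqrt{\alpha_{t-\delta}} - \sqrt{\alpha_t}}{\sqrt{\alpha_t}} + 1 \\
&= \frac{\sqrt{\alpha_{t-\delta}}}{\sqrt{\alpha_t}} .
\end{align*}
```

If this sketch is right, the "+ 1" is exactly what turns the coefficient of x_t in formula (9) into sqrt(α_(t−δ)) / sqrt(α_t).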

I consulted the original implementation by the authors of the paper and had no problem understanding it.
My problem is with understanding the implementation in the Diffusers library.

Because the Diffusers library is widely used, I hope this question can be resolved with the help of colleagues.

Sincerely


Replies: 0 comments
