Untrained parameter problem #3

Breeze-Zero · 2024-04-08T06:12:21Z

Lines 339 to 347 in 69e13fb

    
        x = self.conv1(x)
        mask = torch.zeros(x.shape).to(x.device)
        h, w = x.shape[-2:]
        threshold = F.adaptive_avg_pool2d(x, 1)
        threshold = self.rate_conv(threshold).sigmoid()
        for i in range(mask.shape[0]):
            h_ = (h//n * threshold[i,0,:,:]).int()
            w_ = (w//n * threshold[i,1,:,:]).int()

I ran into a problem while training with DDP: it reported parameters that were not involved in training. From my investigation, self.score_gen and self.conv are simply unused, which is not serious. The most important issue is that self.rate_conv does not participate in the gradient calculation, because generating the mask from threshold (the .int() conversion above) is not differentiable.
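A minimal sketch of how this can be reproduced outside DDP, assuming PyTorch is available. TinyMaskBlock and params_without_grad are hypothetical names, not code from the repository. The quoted excerpt ends before the mask is filled, so the way h_ and w_ index into mask here is also an assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyMaskBlock(nn.Module):
    """Hypothetical stand-in for the quoted block, not the repository's code."""

    def __init__(self, channels=4, n=2):
        super().__init__()
        self.n = n
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.rate_conv = nn.Conv2d(channels, 2, 1)  # two ratios per sample

    def forward(self, x):
        x = self.conv1(x)
        mask = torch.zeros_like(x)
        h, w = x.shape[-2:]
        threshold = F.adaptive_avg_pool2d(x, 1)
        threshold = self.rate_conv(threshold).sigmoid()
        for i in range(mask.shape[0]):
            # .int() yields an integer tensor that autograd does not track,
            # so the path back to rate_conv is cut here.
            h_ = int((h // self.n * threshold[i, 0, :, :]).int())
            w_ = int((w // self.n * threshold[i, 1, :, :]).int())
            # Assumed use of h_ and w_ as index bounds (not shown in the excerpt).
            mask[i, :, : h_ + 1, : w_ + 1] = 1
        return x * mask


def params_without_grad(model, x):
    """Run one backward pass and list parameters that received no gradient."""
    model(x).sum().backward()
    return [name for name, p in model.named_parameters() if p.grad is None]


torch.manual_seed(0)
model = TinyMaskBlock()
missing = params_without_grad(model, torch.randn(2, 4, 8, 8))
print(missing)  # ['rate_conv.weight', 'rate_conv.bias']
```

Parameters listed this way are the ones DDP complains about unless find_unused_parameters=True is passed to DistributedDataParallel. That flag silences the error but does not make rate_conv trainable.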

c-yn · 2024-04-08T16:51:01Z

Hi, thanks for your interest.

self.score_gen and self.conv are not used in our models; we forgot to delete them from the code.
Thank you for pointing out the gradient-related issue. We will look into it.

sentinel8b · 2024-07-15T04:51:44Z

I think the gradient issue with self.rate_conv is crucial. Do you have a specific plan or milestone for fixing it?

c-yn closed this as completed Apr 8, 2024
