Pull requests: PaddlePaddle/PaddleNLP
Pull requests list
support sharding stage3 for deepseekv3 model
Labels: contributor
#11149 opened Oct 23, 2025 by AlAuAu

【FlexCheckpoint】fix_the_optimizer_init
Labels: contributor
#11123 opened Sep 27, 2025 by zty-king, 2 tasks

hack offload optimizer to save one offload & reload of the master weight
#11111 opened Sep 23, 2025 by Wennie396

add script for training gpt3 on XPU machine using flagcx as comm backend
Labels: contributor
#11014 opened Aug 26, 2025 by mikethegoblin, 2 tasks

[NOT MERGE]Pr adapt flex checkpoint
Labels: contributor
#10996 opened Aug 25, 2025 by zty-king, 2 tasks

[BUG]: fix the bug in PretrainedModel.recompute_disable()
Labels: contributor, stale
#10988 opened Aug 21, 2025 by hongjx175, 2 tasks

recompute support offload tensor
#10981 opened Aug 21, 2025 by blacksheep-Aristotle, 2 tasks

moe_layer support fine_grained_forward
#10980 opened Aug 21, 2025 by blacksheep-Aristotle, 2 tasks

update expert parallel init logic
Labels: stale
#10966 opened Aug 18, 2025 by blacksheep-Aristotle, 2 tasks