[Newton] Graphed Workflows #3657
base: dev/newton
Conversation
this is amazing
Greptile Overview
Greptile Summary
This PR introduces a major architectural enhancement by implementing graphed workflows using Warp for GPU-accelerated RL training in Isaac Lab. The implementation adds a complete parallel pipeline alongside the existing Torch-based workflow.
Key Changes
- **New Backend Abstraction**: Factory pattern in `backend_utils.py` enables dynamic loading of Newton vs. base implementations, with the backend currently hardcoded to "newton"
- **Warp-based Environment**: `DirectRLEnvWarp` class implements CUDA graph capture for `step_warp_action()` and `step_warp_end()` to eliminate kernel launch overhead
- **Newton Assets & Actuators**: Complete reimplementation of the articulation and actuator systems using Warp arrays and kernels instead of Torch tensors
- **Performance Improvements**: Benchmarks show significant speedups (Cartpole: 3.4 ms → 1.1 ms, Ant: 7.6 ms → 4.6 ms, Humanoid: 17.5 ms → 14.9 ms per collection step)
- **Example Environments**: Warp implementations of Cartpole, Ant, and Humanoid demonstrating the new workflow
Issues Found
- **Critical**: Line 375 in `direct_rl_env_warp.py` creates a new tensor each step (`wp.from_torch(action)`), preventing full graph capture and undermining the performance benefits
- **Type inconsistency**: The `_pre_physics_step` signature expects `torch.Tensor` but receives `wp.array`
- **Code cleanup needed**: Significant commented-out code for event managers and noise models throughout `direct_rl_env_warp.py`
- **Minor**: Missing newline at EOF in `__init__.py`
Confidence Score: 2/5
- This PR has significant architectural changes with critical performance-impacting issues that need resolution before merging
- Score reflects the critical issue at line 375 where tensor conversion breaks CUDA graph capture, undermining the core performance benefits of the graphed workflow. The extensive commented-out code suggests incomplete implementation of event management and noise models. While the architecture is sound and benchmarks show promise, the tensor conversion bug needs immediate attention as it prevents achieving the advertised performance gains in production.
- Pay special attention to `source/isaaclab/isaaclab/envs/direct_rl_env_warp.py`: the action tensor conversion at line 375 must be fixed to achieve full CUDA graph benefits
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| source/isaaclab/isaaclab/envs/direct_rl_env_warp.py | 3/5 | New warp-based RL environment with CUDA graph capture for performance. Contains commented-out event manager and noise model code that should be cleaned up or enabled. |
| source/isaaclab/isaaclab/__init__.py | 4/5 | Added Backend IntEnum to distinguish Newton from PhysX. Missing newline at end of file. |
| source/isaaclab/isaaclab/actuators/actuator_base.py | 3/5 | Refactored to support backend abstraction. Removed torch-specific attributes, simplified constructor signature for backend compatibility. |
| source/isaaclab/isaaclab/newton/assets/articulation/articulation.py | 3/5 | Large new file (2500+ lines) implementing Newton-based articulation with warp kernels. Core implementation for the graphed workflow. |
| source/isaaclab/isaaclab/utils/backend_utils.py | 5/5 | Factory pattern implementation for dynamic backend loading (Newton vs base). Clean abstraction layer. |
| source/isaaclab_tasks/isaaclab_tasks/direct/cartpole/cartpole_warp_env.py | 4/5 | Warp-based Cartpole environment using kernels for observations, rewards, and resets. Good example of the new workflow. |
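The table above credits `backend_utils.py` with a clean factory-pattern abstraction for backend loading. A minimal sketch of that idea is below; all class and function names here are hypothetical stand-ins, not the actual Isaac Lab API, and the real module resolves real articulation/actuator classes rather than these placeholders:

```python
# Registry mapping a backend name to its implementation class.
_BACKEND_REGISTRY: dict = {}

def register_backend(name: str):
    """Class decorator that records an implementation under a backend name."""
    def wrap(cls):
        _BACKEND_REGISTRY[name] = cls
        return cls
    return wrap

@register_backend("newton")
class NewtonArticulation:
    frontend = "warp"

@register_backend("base")
class BaseArticulation:
    frontend = "torch"

def get_articulation_class(backend: str = "newton"):
    # The PR currently hardcodes backend="newton"; a registry lookup keeps
    # both implementations loadable behind one call site.
    return _BACKEND_REGISTRY[backend]
```

With this shape, `get_articulation_class()` resolves the Newton/Warp path by default while `get_articulation_class("base")` falls back to the Torch implementation, which is what lets the two pipelines coexist.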
Sequence Diagram
```mermaid
sequenceDiagram
    participant User
    participant DirectRLEnvWarp
    participant Scene
    participant Articulation
    participant ActuatorModel
    participant NewtonSolver
    participant WarpKernels
    Note over User,WarpKernels: Initialization Phase
    User->>DirectRLEnvWarp: __init__(cfg)
    DirectRLEnvWarp->>Scene: InteractiveScene(cfg.scene)
    DirectRLEnvWarp->>Articulation: Articulation(cfg, frontend=Frontend.WARP)
    Articulation->>ActuatorModel: Initialize actuator models with warp arrays
    Note over User,WarpKernels: Training Loop
    User->>DirectRLEnvWarp: step(action)
    DirectRLEnvWarp->>DirectRLEnvWarp: _pre_physics_step(wp.from_torch(action))
    loop decimation times
        DirectRLEnvWarp->>DirectRLEnvWarp: step_warp_action() [CUDA Graph]
        DirectRLEnvWarp->>Articulation: _apply_action()
        Articulation->>ActuatorModel: compute()
        ActuatorModel->>WarpKernels: compute_pd_actuator / clip_efforts
        DirectRLEnvWarp->>Scene: write_data_to_sim()
        Scene->>NewtonSolver: Apply computed efforts
        DirectRLEnvWarp->>NewtonSolver: sim.step(render=False)
        NewtonSolver-->>Articulation: Update physics state
        DirectRLEnvWarp->>Scene: update(dt)
    end
    DirectRLEnvWarp->>DirectRLEnvWarp: step_warp_end() [CUDA Graph]
    DirectRLEnvWarp->>WarpKernels: add_to_env (episode_length_buf++)
    DirectRLEnvWarp->>DirectRLEnvWarp: _get_dones()
    DirectRLEnvWarp->>WarpKernels: get_dones kernel
    DirectRLEnvWarp->>DirectRLEnvWarp: _get_rewards()
    DirectRLEnvWarp->>WarpKernels: compute_rewards kernel
    DirectRLEnvWarp->>DirectRLEnvWarp: _reset_idx(reset_buf)
    DirectRLEnvWarp->>Scene: reset(mask=reset_buf)
    DirectRLEnvWarp->>DirectRLEnvWarp: _get_observations()
    DirectRLEnvWarp->>WarpKernels: get_observations kernel
    DirectRLEnvWarp-->>User: (obs, rewards, dones, extras)
```
94 files reviewed, 5 comments
```python
__version__ = ISAACLAB_METADATA["package"]["version"]


class Backend(IntEnum):
    NEWTON = 0
```
style: missing newline at end of file
Suggested change:
```python
    NEWTON = 0
    PHYSX = 1
```
```python
        action = self._action_noise_model(action)

    # process actions, #TODO pass the torch tensor directly.
    self._pre_physics_step(wp.from_torch(action))  # Creates a tensor and throws it away. --> Not graphable unless the training loop is using the same pointer.
```
logic: the TODO comment indicates this creates a new tensor each step, preventing CUDA graph capture. This will significantly impact the performance benefits of the graphed workflow since the action conversion happens outside the graph
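One common fix for this class of problem is to allocate the graph's input buffer once and copy each step's actions into it in place, so the captured graph always reads from the same address. The sketch below illustrates the pattern with NumPy standing in for torch/warp arrays; the class name and API are hypothetical, not part of the PR:

```python
import numpy as np

class GraphSafeActionBuffer:
    """Illustrative buffer-reuse pattern. CUDA graphs replay fixed memory
    addresses, so a per-step allocation like wp.from_torch(action) breaks
    capture; an in-place copy into a persistent buffer does not."""

    def __init__(self, num_envs: int, action_dim: int):
        # Allocated once; the buffer's address stays stable across steps.
        self._buf = np.zeros((num_envs, action_dim), dtype=np.float32)

    def write(self, actions: np.ndarray) -> np.ndarray:
        # In-place copy: the graph always reads from the same pointer.
        np.copyto(self._buf, actions)
        return self._buf

buf = GraphSafeActionBuffer(num_envs=2, action_dim=3)
first = buf.write(np.ones((2, 3), dtype=np.float32))
second = buf.write(np.full((2, 3), 2.0, dtype=np.float32))
assert first is second  # same underlying storage on every step
```

In the torch/warp setting, the equivalent would be a persistent tensor updated with an in-place copy before graph replay, with `wp.from_torch` applied once to that persistent tensor outside the hot loop.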
```python
        pass

    @abstractmethod
    def _pre_physics_step(self, actions: torch.Tensor):
```
syntax: the type hint says `torch.Tensor` but should be `wp.array`, since this is the warp environment and line 375 converts the input with `wp.from_torch(action)`
```python
        # TODO We could split it out.
        # post-step: step interval event
        # if self.cfg.events:
        #     if "interval" in self.event_manager.available_modes:
        #         self.event_manager.apply(mode="interval", dt=self.step_dt)

        # update observations
        self._get_observations()

        # add observation noise
        # note: we apply no noise to the state space (since it is used for critic networks)
        # if self.cfg.observation_noise_model:
        #     self.obs_buf["policy"] = self._observation_noise_model(self.obs_buf["policy"])
```
style: large blocks of commented-out code for event management and observation noise should either be removed or properly implemented with a clear plan
```python
        # apply events such as randomization for environments that need a reset
        # if self.cfg.events:
        #     if "reset" in self.event_manager.available_modes:
        #         env_step_count = self._sim_step_counter // self.cfg.decimation
        #         self.event_manager.apply(mode="reset", env_ids=env_ids, global_env_step_count=env_step_count)

        # reset noise models
        # if self.cfg.action_noise_model:
        #     self._action_noise_model.reset(env_ids)
        # if self.cfg.observation_noise_model:
        #     self._observation_noise_model.reset(env_ids)
```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: commented-out event manager and noise model reset logic should be addressed before merging
Greptile Overview
Greptile Summary
Introduces a separate Newton-based graphed workflow pipeline for Isaac Lab using Warp kernels to achieve significant performance improvements. The implementation adds a complete parallel execution path with warp-based actuators, articulation management, and a new DirectRLEnvWarp environment class.
Major changes include:
- New Newton backend actuator implementations (`actuator_base.py`, `actuator_pd.py`, `actuator_net.py`) using warp arrays instead of torch tensors
- Complete articulation system rewrite for Newton (`articulation.py`, `articulation_data.py`) with dual Torch/Warp frontend support
- Warp kernels for PD control, DC motor clipping, and effort limiting
- New `DirectRLEnvWarp` environment that aims to support CUDA graph capture for performance
- Performance improvements demonstrated: Cartpole overhead reduced from 2560 µs to 357 µs, Ant from 3404 µs to 471 µs
Key concerns:
- The DC motor torque-speed clipping formula may not correctly handle negative velocities
- TODOs indicate tensor conversions in the action pipeline block CUDA graph capture, limiting the full performance benefit
- Large commented-out sections for event management and noise models suggest incomplete implementation
Confidence Score: 3/5
- Safe to merge with caveats - the parallel pipeline is isolated but has incomplete features and one potential physics bug
- The PR introduces a complete parallel execution path which limits risk to existing code. However, the DC motor clipping logic has a potential bug in handling negative velocities, and the CUDA graph support (a key performance feature) is blocked by tensor conversion issues noted in TODOs. The implementation is well-structured but incomplete (missing event management, noise models).
- Pay special attention to `source/isaaclab/isaaclab/newton/actuators/kernels.py` for DC motor physics verification, and `source/isaaclab/isaaclab/envs/direct_rl_env_warp.py` for resolving the CUDA graph blocking issues
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| source/isaaclab/isaaclab/newton/actuators/kernels.py | 3/5 | New warp kernels for PD control and DC motor clipping; DC motor torque-speed curve logic needs verification |
| source/isaaclab/isaaclab/newton/actuators/actuator_base.py | 4/5 | Base actuator class for Newton backend with warp array support; clean implementation of parameter parsing |
| source/isaaclab/isaaclab/newton/assets/articulation/articulation.py | 4/5 | Core articulation implementation for Newton with extensive state management and frontend conversions |
| source/isaaclab/isaaclab/newton/assets/articulation/articulation_data.py | 4/5 | Data container with dual Torch/Warp frontend support; comprehensive property accessors for articulation state |
| source/isaaclab/isaaclab/envs/direct_rl_env_warp.py | 2/5 | Warp-based RL environment with CUDA graph support; contains TODOs about tensor conversion blocking graphs |
Sequence Diagram
```mermaid
sequenceDiagram
    participant User as Training Script
    participant Env as DirectRLEnvWarp
    participant Art as Articulation (Newton)
    participant Act as Actuator Models
    participant Sim as Newton Solver
    participant WP as Warp Kernels
    User->>Env: step(actions)
    Env->>Env: wp.from_torch(actions)
    Note over Env: TODO: Tensor conversion blocks CUDA graphs
    Env->>Env: _pre_physics_step(wp_actions)
    Env->>Art: set_joint_*_target()
    Art->>Art: _torch_to_warp_dual_index()
    Art->>Act: Store targets in data._actuator_target
    Env->>Art: write_data_to_sim()
    Art->>Act: _apply_actuator_model()
    Act->>WP: compute_pd_actuator()
    Note over WP: PD control: k_p*(target-pos) + k_d*(-vel)
    WP->>Act: computed_effort
    Act->>WP: clip_efforts_with_limits()
    WP->>Act: applied_effort
    Act->>Art: Update data._sim_bind_joint_effort
    Env->>Sim: step(dt)
    Sim->>Sim: Newton physics integration
    Env->>Art: update(dt)
    Art->>Art: _data.update(dt)
    Note over Art: Read new states from Newton
    Env->>Env: _get_observations()
    Env->>Env: _get_rewards()
    Env->>User: obs, rewards, dones, info
```
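The `compute_pd_actuator` step in the diagram computes `k_p*(target - pos) + k_d*(-vel)`, i.e. a proportional term on position error plus a pure damping term. A minimal NumPy stand-in for that kernel is sketched below (the function name and the optional feed-forward argument are illustrative, not the kernel's actual signature):

```python
import numpy as np

def compute_pd_effort(k_p, k_d, pos_target, joint_pos, joint_vel, ff_effort=0.0):
    """NumPy stand-in for the compute_pd_actuator warp kernel in the diagram:
    effort = k_p * (target - pos) + k_d * (-vel), a pure damping term rather
    than a velocity-target error, per the note above."""
    return k_p * (pos_target - joint_pos) + k_d * (-joint_vel) + ff_effort

# One joint per env, vectorized over 3 envs (cartpole-like gains).
effort = compute_pd_effort(
    k_p=np.array([50.0, 50.0, 50.0]),
    k_d=np.array([5.0, 5.0, 5.0]),
    pos_target=np.array([0.0, 0.1, -0.1]),
    joint_pos=np.array([0.1, 0.1, 0.0]),
    joint_vel=np.array([0.0, 1.0, -1.0]),
)
# effort → [-5.0, -5.0, 0.0]
```

The real kernel runs this arithmetic element-wise over warp arrays on the GPU, followed by the `clip_efforts_with_limits` pass shown in the diagram.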
21 files reviewed, 1 comment
```python
    joint_vel: float,
    effort: float,
):
    max_effort = saturation_effort * (1.0 - joint_vel) / vel_limit
```
logic: DC motor clipping formula appears incorrect. The max_effort calculation uses (1.0 - joint_vel) but this doesn't account for the sign of velocity correctly. When joint_vel is negative (motor spinning backward), (1.0 - joint_vel) becomes larger than 1.0, which seems incorrect for a DC motor torque-speed curve. Standard DC motor model: torque = k * (V - k * omega) suggests the formula should handle forward/backward rotation symmetrically.
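A sign-symmetric treatment, mirroring the clipping logic of Isaac Lab's existing torch-based DCMotor model (function name and signature below are illustrative), bounds the torque separately from above and below so both rotation directions follow the same torque-speed curve:

```python
def dc_motor_clip(effort, joint_vel, saturation_effort, vel_limit, effort_limit):
    """Sketch of symmetric DC-motor torque-speed clipping. Available torque
    shrinks linearly as the joint speeds up in the torque's own direction,
    for both rotation signs."""
    # Upper bound: positive torque fades as velocity approaches +vel_limit.
    max_effort = saturation_effort * (1.0 - joint_vel / vel_limit)
    max_effort = min(max(max_effort, 0.0), effort_limit)
    # Lower bound: negative torque fades as velocity approaches -vel_limit.
    min_effort = saturation_effort * (-1.0 - joint_vel / vel_limit)
    min_effort = min(max(min_effort, -effort_limit), 0.0)
    return min(max(effort, min_effort), max_effort)

# At zero velocity the curve allows +-saturation_effort (capped by effort_limit);
# at +vel_limit no further positive torque is available, and symmetrically at
# -vel_limit no further negative torque is available.
```

Note also that the snippet's `(1.0 - joint_vel) / vel_limit` divides the whole parenthesized term by `vel_limit`, whereas the torque-speed curve above uses `1.0 - joint_vel / vel_limit`; both the sign handling and the parenthesization are worth verifying against the torch implementation.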
Description
Adds graphed workflows as a separate pipeline for now.
How to use:

```bash
./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task Isaac-Cartpole-Warp-v0 --num_envs 4096 --headless --max_iterations 300
./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task Isaac-Ant-Warp-v0 --num_envs 4096 --headless --max_iterations 300
./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task Isaac-Humanoid-Warp-v0 --num_envs 4096 --headless --max_iterations 300
```
Results
Provides some speed-up on the Isaac front. (Benchmark screenshots for Cartpole, Ant, and Humanoid were attached in the original PR but are not reproduced here.)
Checklist
- I have run the `pre-commit` checks with `./isaaclab.sh --format`
- I have updated the corresponding version in the extension's `config/extension.toml` file
- I have added my name to `CONTRIBUTORS.md` or my name already exists there