fix: last segment handling in ShotRS2VPipeline with precise video and audio trimming #982

Merged
helloyongyang merged 2 commits into main from gp/merged on Apr 2, 2026


@GACLove GACLove commented Apr 1, 2026

No description provided.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request improves the trimming logic for the last segment of generated video and audio by tracking the actual required frames when padding is applied. It also removes an unnecessary float conversion when appending video segments to the result list. A review comment suggests simplifying the trimming logic in the case where no padding is present, as the current calculations become redundant when the padding length is zero.

Comment on lines +215 to +219
else:
    video_pad_len = pad_len // audio_per_frame
    audio_pad_len = video_pad_len * audio_per_frame
    video_seg = gen_clip_video[:, :, : gen_clip_video.shape[2] - video_pad_len]
    audio_seg = audio_clip[:, : audio_clip.shape[1] - audio_pad_len].sum(dim=0)

medium

The logic in the else block can be simplified. Since is_last is defined as True whenever pad_len > 0 (line 142), and the if block at line 211 handles the case where segment_actual_video_frames is set (which happens when is_last and pad_len > 0), the else block is only reached when pad_len is 0. In this case, video_pad_len and audio_pad_len will always be 0, making the trimming logic redundant.

Suggested change
-else:
-    video_pad_len = pad_len // audio_per_frame
-    audio_pad_len = video_pad_len * audio_per_frame
-    video_seg = gen_clip_video[:, :, : gen_clip_video.shape[2] - video_pad_len]
-    audio_seg = audio_clip[:, : audio_clip.shape[1] - audio_pad_len].sum(dim=0)
+else:
+    video_seg = gen_clip_video
+    audio_seg = audio_clip.sum(dim=0)
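
The redundancy claim can be checked with a small standalone sketch of the arithmetic in the original else block. It uses plain integers and a list in place of the pipeline's tensors. The helper name trimmed_lengths and the sizes (640 audio samples per frame, 81 frames) are hypothetical values chosen for illustration, not taken from ShotRS2VPipeline.

```python
def trimmed_lengths(pad_len, audio_per_frame, num_frames, num_samples):
    """Apply the original else-block arithmetic; return (kept_frames, kept_samples)."""
    video_pad_len = pad_len // audio_per_frame        # whole padded frames only
    audio_pad_len = video_pad_len * audio_per_frame   # samples covered by those frames
    return num_frames - video_pad_len, num_samples - audio_pad_len


# Hypothetical sizes, for illustration only.
AUDIO_PER_FRAME = 640
NUM_FRAMES = 81
NUM_SAMPLES = NUM_FRAMES * AUDIO_PER_FRAME

# With pad_len == 0 both pad lengths are 0, so nothing is trimmed.
no_pad = trimmed_lengths(0, AUDIO_PER_FRAME, NUM_FRAMES, NUM_SAMPLES)

# For contrast: 1000 padded samples floor to one whole frame (640 samples).
padded = trimmed_lengths(1000, AUDIO_PER_FRAME, NUM_FRAMES, NUM_SAMPLES)

# Slicing with "length - 0" as the stop index keeps every element.
frames = list(range(NUM_FRAMES))
slice_is_identity = frames[: len(frames) - 0] == frames

print(no_pad, padded, slice_is_identity)
```

If the else branch really is reachable only with pad_len equal to 0, as the comment argues, the slices there keep the full clip and the simplified branch is equivalent.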

@helloyongyang helloyongyang merged commit 5230943 into main Apr 2, 2026
2 checks passed
@helloyongyang helloyongyang deleted the gp/merged branch April 2, 2026 03:18

2 participants