fix: last segment handling in ShotRS2VPipeline with precise video and audio trimming #982
helloyongyang merged 2 commits into main
…video and audio trimming
Code Review
This pull request improves the trimming logic for the last segment of generated video and audio by tracking the actual required frames when padding is applied. It also removes an unnecessary float conversion when appending video segments to the result list. A review comment suggests simplifying the trimming logic in the case where no padding is present, as the current calculations become redundant when the padding length is zero.
lightx2v/shot_runner/rs2v_infer.py
Outdated
    else:
        video_pad_len = pad_len // audio_per_frame
        audio_pad_len = video_pad_len * audio_per_frame
        video_seg = gen_clip_video[:, :, : gen_clip_video.shape[2] - video_pad_len]
        audio_seg = audio_clip[:, : audio_clip.shape[1] - audio_pad_len].sum(dim=0)
The logic in the else block can be simplified. Since is_last is defined as True whenever pad_len > 0 (line 142), and the if block at line 211 handles the case where segment_actual_video_frames is set (which happens when is_last and pad_len > 0), the else block is only reached when pad_len is 0. In this case, video_pad_len and audio_pad_len will always be 0, making the trimming logic redundant.
Suggested change:
-    else:
-        video_pad_len = pad_len // audio_per_frame
-        audio_pad_len = video_pad_len * audio_per_frame
-        video_seg = gen_clip_video[:, :, : gen_clip_video.shape[2] - video_pad_len]
-        audio_seg = audio_clip[:, : audio_clip.shape[1] - audio_pad_len].sum(dim=0)
+    else:
+        video_seg = gen_clip_video
+        audio_seg = audio_clip.sum(dim=0)
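The equivalence claimed in the review comment can be checked with a small sketch. It is not the pipeline's code: NumPy arrays stand in for the torch tensors (so `.sum(axis=0)` replaces `.sum(dim=0)`), and the helper names `trim_padding` and `no_trim`, the tensor shapes, and the value `audio_per_frame = 640` are all assumptions chosen for illustration.

```python
import numpy as np


def trim_padding(gen_clip_video, audio_clip, pad_len, audio_per_frame):
    """Mirror of the original else branch: drop whole padded frames."""
    video_pad_len = pad_len // audio_per_frame
    audio_pad_len = video_pad_len * audio_per_frame
    video_seg = gen_clip_video[:, :, : gen_clip_video.shape[2] - video_pad_len]
    audio_seg = audio_clip[:, : audio_clip.shape[1] - audio_pad_len].sum(axis=0)
    return video_seg, audio_seg


def no_trim(gen_clip_video, audio_clip):
    """Mirror of the suggested else branch: no trimming at all."""
    return gen_clip_video, audio_clip.sum(axis=0)


# Hypothetical shapes: video is (batch, channels, frames, height, width),
# audio is (tracks, samples). These are illustrative, not taken from the repo.
audio_per_frame = 640
video = np.arange(1 * 3 * 8 * 2 * 2, dtype=np.float32).reshape(1, 3, 8, 2, 2)
audio = np.ones((2, 8 * audio_per_frame), dtype=np.float32)

# With pad_len == 0 both pad lengths are 0, so the slices keep everything.
v_old, a_old = trim_padding(video, audio, pad_len=0, audio_per_frame=audio_per_frame)
v_new, a_new = no_trim(video, audio)
same_when_unpadded = np.array_equal(v_old, v_new) and np.array_equal(a_old, a_new)

# For contrast, a non-zero pad_len does change the result.
v_pad, a_pad = trim_padding(video, audio, pad_len=2 * audio_per_frame, audio_per_frame=audio_per_frame)

print(same_when_unpadded, v_pad.shape, a_pad.shape)
```

The simplification is only safe if the else branch is never reached with a non-zero pad_len, which is the condition the review comment argues from the definition of is_last.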
…ne by removing padding logic
No description provided.