fix(web): match YouTube URLs by hostname instead of substring #1182
abhay-codes07 wants to merge 1 commit
isYouTubeUrl() and a few inline checks classified any URL containing youtube.com / youtu.be as YouTube, so lookalike hosts like notyoutube.com, youtube.com.evil.example, or URLs that only carry youtube.com in the path were rendered with the YouTube icon, grouped as YouTube videos, and opened in the embed player.

Centralize detection in lib/url-helpers.ts using the same parseWebUrl/hostnameMatches approach isTwitterUrl already uses, and switch the document modal, document icon, and timeline view to the shared helper. The modal now also recognizes youtu.be short links, which the player component already supported.

Add regression tests covering lookalike domains, subdomain tricks, path-only matches, short/mobile links, uppercase schemes, scheme-less URLs, and nullish input.

Fixes supermemoryai#1175
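A minimal sketch of the hostname-based matching described above, written against the standard `URL` API. The names `parseUrlSketch`, `hostMatchesSketch`, and `isYouTubeUrlSketch` are hypothetical stand-ins; the repo's actual `parseWebUrl` and `hostnameMatches` helpers are not shown in this PR and may differ.

```typescript
// Hypothetical stand-in for the repo's parseWebUrl helper (real signature not shown here).
const parseUrlSketch = (input: string | null | undefined): URL | null => {
  if (!input) return null;
  const trimmed = input.trim();
  if (!trimmed) return null;
  // Prepend a scheme when none is present, so "youtube.com/watch?v=x" still parses.
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(trimmed);
  try {
    const parsed = new URL(hasScheme ? trimmed : `https://${trimmed}`);
    // Only treat http(s) URLs as web URLs.
    return parsed.protocol === "http:" || parsed.protocol === "https:"
      ? parsed
      : null;
  } catch {
    return null;
  }
};

// Hypothetical stand-in for hostnameMatches: exact host or a dot-separated subdomain.
const hostMatchesSketch = (hostname: string, domain: string): boolean =>
  hostname === domain || hostname.endsWith(`.${domain}`);

const isYouTubeUrlSketch = (url: string | null | undefined): boolean => {
  const parsed = parseUrlSketch(url);
  if (!parsed) return false;
  // The URL parser lowercases the hostname of http(s) URLs, so no manual
  // case folding is needed before comparing.
  return ["youtube.com", "youtu.be"].some((domain) =>
    hostMatchesSketch(parsed.hostname, domain),
  );
};
```

Requiring the leading dot in `endsWith` is what rejects `notyoutube.com`, and comparing only `hostname` is what rejects `youtube.com.evil.example` and path-only matches.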
Summary
Reviewed; found 1 issue. This PR changes YouTube URL detection to use hostname-based matching and updates the web UI call sites plus helper coverage. The review focused on correctness around the widened detection behavior and consistency with existing video ID extraction.
Findings
apps/web/lib/url-helpers.ts
- Uppercase YouTube hosts are now detected, but the existing video ID extraction path is still case-sensitive, so some newly accepted URLs can render as invalid instead of embedding.
Verdict
 * Lookalike hosts (`notyoutube.com`, `youtube.com.evil.example`) and URLs
 * that only contain "youtube.com" in the path do not match.
 */
export const isYouTubeUrl = (url: string | undefined | null): boolean => {
isYouTubeUrl() now intentionally accepts uppercase YouTube hosts, as covered by the new cases for HTTPS://youtube.com / WWW.YOUTUBE.COM, but the existing video ID extractors still use case-sensitive regexes for youtube.com / youtu.be. A URL like HTTPS://WWW.YOUTUBE.COM/watch?v=dQw4w9WgXcQ will now be routed to YoutubeVideo / YoutubePreview, but extractVideoId() / extractYouTubeVideoId() return null, so the modal shows Invalid YouTube URL format and the card cannot embed. Please make the extraction path URL/hostname-based too, or at least case-insensitive, before broadening detection.
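One way the extraction path could be made hostname-based, sketched under assumptions: `extractVideoIdSketch` is a hypothetical helper, not the repo's `extractVideoId()` / `extractYouTubeVideoId()`, and it assumes the common 11-character ID shape and only the watch, embed, shorts, and short-link URL forms.

```typescript
// Assumption: video IDs are 11 characters from [A-Za-z0-9_-].
const VIDEO_ID_PATTERN = /^[A-Za-z0-9_-]{11}$/;

// Parse the input, adding a scheme if it has none; null on failure.
const toUrlSketch = (input: string): URL | null => {
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(input);
  try {
    return new URL(hasScheme ? input : `https://${input}`);
  } catch {
    return null;
  }
};

// Hypothetical replacement for the case-sensitive regex extractors.
const extractVideoIdSketch = (input: string): string | null => {
  const parsed = toUrlSketch(input.trim());
  if (!parsed) return null;

  const host = parsed.hostname; // lowercased by the URL parser
  const segments = parsed.pathname.split("/").filter(Boolean);
  const first = segments[0] ?? "";
  let candidate: string | null = null;

  if (host === "youtu.be" || host.endsWith(".youtu.be")) {
    // Short link: the ID is the first path segment.
    candidate = first || null;
  } else if (host === "youtube.com" || host.endsWith(".youtube.com")) {
    if (first === "watch") {
      candidate = parsed.searchParams.get("v");
    } else if (first === "embed" || first === "shorts") {
      candidate = segments[1] ?? null;
    }
  }

  // Only the host is case-normalized; the ID itself stays case-sensitive.
  return candidate !== null && VIDEO_ID_PATTERN.test(candidate)
    ? candidate
    : null;
};
```

With this shape, detection and extraction agree on which hosts count as YouTube, so a URL accepted by the detector is not later rejected only because of host casing.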
Testing
The testing subagent verified the hostname-based YouTube detection behavior through targeted helper coverage plus authenticated UI checks for canonical YouTube URLs and non-YouTube URLs containing …

Commands run: PATH="/home/ubuntu/.bun/bin:$PATH" bun test apps/web/lib/url-helpers.test.ts

Verdict
✅ Passed. The tested detection behavior works for canonical hosts and avoids the original substring false positives; one separate review finding remains around keeping video ID extraction consistent with the broader detection.

Recording: pr1182_youtube_hostname_real_api_ui.webm








What
Fixes #1175

YouTube detection in the web app was done with substring checks (`url.includes("youtube.com")` etc.) in four places, so any URL that merely contains those strings got classified as YouTube.

Those URLs then got the YouTube icon in `document-icon.tsx`, were grouped under "YouTube Videos" in the timeline view, rendered as a YouTube preview on memory cards, and opened in the embed player in the document modal.

How
- Added `isYouTubeUrl()` to `apps/web/lib/url-helpers.ts`, reusing the existing `parseWebUrl` + `hostnameMatches` helpers that `isTwitterUrl` already uses. It matches `youtube.com` / `youtu.be` and their real subdomains (`www.`, `m.`, `music.`) only.
- `components/utils.ts` now re-exports the shared helper, so existing imports (memory card previews, `useYouTubeChannelName`) are unchanged.
- Switched the inline checks (`document-modal/content/index.tsx`, `document-icon.tsx`, `timeline-view.tsx`) to the shared helper.
- The document modal previously matched only `youtube.com`, so `youtu.be` short links fell through to the webpage renderer. They now open in the embed player; `yt-video.tsx` already knew how to extract IDs from short links, so no change was needed there.

Uppercase schemes (`HTTPS://youtube.com/...`) and scheme-less inputs (`youtube.com/watch?v=...`) keep working since `parseWebUrl` normalizes both.

Testing
- `apps/web/lib/url-helpers.test.ts`: 10 cases covering canonical/watch/embed/shorts URLs, real subdomains, short links, uppercase schemes, scheme-less input, lookalike domains, subdomain tricks, path-only matches, and nullish input. All pass with `bun test`.
- `biome check` clean on all touched files.
- `bun run build` for `apps/web` completes successfully.

cc @MaheshtheDev