@dataverse-hub

The actor returns selected_language as 'English'.

We need to convert the language name to an ISO 639-1 code.
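
One possible shape for that conversion, as a minimal sketch: a lookup table keyed on the lowercased language name. The helper name `language_name_to_iso639_1` and the set of languages covered are assumptions for illustration, not code from this pull request. A fuller table or a dedicated library would be needed for broader coverage.

```python
from typing import Optional

# Hypothetical lookup table: a small sample of language names mapped to
# their ISO 639-1 codes. Extend it as needed.
_NAME_TO_ISO639_1 = {
    "english": "en",
    "spanish": "es",
    "french": "fr",
    "german": "de",
    "italian": "it",
    "portuguese": "pt",
    "russian": "ru",
    "japanese": "ja",
    "korean": "ko",
    "chinese": "zh",
    "arabic": "ar",
    "hindi": "hi",
}


def language_name_to_iso639_1(name: str) -> Optional[str]:
    """Return the ISO 639-1 code for a language name, or None if unknown."""
    key = name.strip().casefold()
    # Pass through values that are already a known two-letter code.
    if key in _NAME_TO_ISO639_1.values():
        return key
    return _NAME_TO_ISO639_1.get(key)


print(language_name_to_iso639_1("English"))
```

Returning `None` for unknown names leaves it to the caller to decide whether to fall back to a default language or raise an error.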

Mark and others added 15 commits September 15, 2025 23:11
…aper by implementing _scrape_channel method, which calls Apify actor with channel_url, max_videos, start_date, and end_date to fetch multiple video transcripts and metadata.

- Enhanced input schema documentation to clarify field usage:
  - youtube_url: For single video scraping, paired with language for transcript.
  - channel_url: For fetching multiple videos from a channel, paired with max_videos, start_date, and end_date (both YYYY-MM-DD, optional)
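
The channel fields described above could be validated and assembled roughly as follows. This is a sketch under assumptions: the helper name `build_channel_run_input` is hypothetical, the validation rules are guesses, and the actual actor call made by `_scrape_channel` is not shown because its details do not appear in this conversation.

```python
import re
from datetime import datetime
from typing import Any, Dict, Optional

_DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def _parse_date(value: str) -> datetime:
    """Parse a strict YYYY-MM-DD string, raising ValueError otherwise."""
    if not _DATE_PATTERN.fullmatch(value):
        raise ValueError(f"expected YYYY-MM-DD, got {value!r}")
    return datetime.strptime(value, "%Y-%m-%d")


def build_channel_run_input(
    channel_url: str,
    max_videos: int,
    start_date: Optional[str] = None,
    end_date: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble the input for a channel scrape (hypothetical helper)."""
    if not channel_url:
        raise ValueError("channel_url is required")
    if max_videos < 1:
        raise ValueError("max_videos must be at least 1")

    run_input: Dict[str, Any] = {
        "channel_url": channel_url,
        "max_videos": max_videos,
    }
    # Both dates are optional; include only the ones that were supplied.
    start = _parse_date(start_date) if start_date is not None else None
    end = _parse_date(end_date) if end_date is not None else None
    if start is not None and end is not None and start > end:
        raise ValueError("start_date must not be after end_date")
    if start_date is not None:
        run_input["start_date"] = start_date
    if end_date is not None:
        run_input["end_date"] = end_date
    return run_input
```

Omitting absent dates from the dictionary, instead of sending `None`, keeps the payload limited to fields the caller actually set.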
…itional_fields

remove required metadata date, fix conditional field validation
…itional_fields

convert only if x content is nested
Implement character-level similarity using difflib for non-spaced languages
Include unicodedata for text normalization to handle diacritics
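
The last two commits can be illustrated with a short sketch. It assumes the comparison is done with `difflib.SequenceMatcher` over characters instead of whitespace-separated tokens, and that normalization means unifying equivalent Unicode forms with NFKC. The commits do not say whether diacritics are stripped or only unified, so this sketch only unifies them. The function names are hypothetical.

```python
import difflib
import unicodedata


def normalize_text(text: str) -> str:
    """Unify equivalent Unicode forms and letter case (assumed behavior)."""
    # NFKC composes "e" + combining acute into a single "é" and folds
    # compatibility variants such as full-width letters.
    return unicodedata.normalize("NFKC", text).casefold()


def char_similarity(a: str, b: str) -> float:
    """Character-level similarity ratio in the range [0.0, 1.0]."""
    # Remove all whitespace so spacing differences do not affect the score.
    a_chars = "".join(normalize_text(a).split())
    b_chars = "".join(normalize_text(b).split())
    # autojunk=False disables the popularity heuristic, which can skew
    # results on long strings with frequently repeated characters.
    matcher = difflib.SequenceMatcher(None, a_chars, b_chars, autojunk=False)
    return matcher.ratio()


# Precomposed and decomposed spellings compare as identical.
print(char_similarity("café", "cafe\u0301"))
# Japanese has no spaces between words, so a token-based comparison would
# treat each sentence as a single token; characters still overlap.
print(char_similarity("今日は晴れです", "今日は雨です"))
```

Comparing characters avoids the need for a word segmenter in languages such as Japanese or Chinese, at the cost of ignoring word boundaries in languages that do use spaces.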