Subtitles/CC turned off solution? #2

Open
parth31533 opened this issue Dec 16, 2024 · 1 comment
Comments

@parth31533

Hey man, I was working on this and had a similar approach to yours using the YT APIs, but in my case the videos are from older years and are 7 hrs long. On top of that, most of the older videos from 2019 have their closed captions turned off by the creator. I have thought of using the AssemblyAI approach: download the audio file and then run it through AssemblyAI, but that approach takes a lot of time in my case.

Do you run into this issue when the subtitles are turned off?

My repo: https://github.com/parth31533/YT-Project/blob/main/Josh.ipynb?short_path=69eb1b4

Br,
Parth

@therohitdas
Owner

Hi @parth31533

Transcripts being turned off is a huge issue. Why don't you extract the audio using ffmpeg and then use Whisper or another model to extract the text? Speech-to-text models have come a long way.
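
For illustration, a minimal sketch of the ffmpeg extraction step in Python. It assumes `ffmpeg` is installed and on the PATH; the function names, the file paths, and the choice of 16 kHz mono WAV are assumptions made for this example, not something taken from either repo.

```python
import subprocess


def build_extract_cmd(video_path, audio_path, sample_rate=16000):
    """Build an ffmpeg command that drops the video stream and writes mono audio.

    The 16 kHz mono default is an assumption; use whatever the chosen
    speech-to-text model expects.
    """
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", video_path,         # input video
        "-vn",                    # no video in the output
        "-ac", "1",               # one audio channel (mono)
        "-ar", str(sample_rate),  # audio sample rate in Hz
        audio_path,
    ]


def extract_audio(video_path, audio_path):
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_extract_cmd(video_path, audio_path), check=True)
```

Calling `extract_audio("talk.mp4", "talk.wav")` (hypothetical file names) would write the audio track next to the video, ready to hand to a transcription model.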

If I had to speed things up, I'd cut the audio into X chunks and then run the speech-to-text API requests in parallel.
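
A rough sketch of that chunk-and-parallelize idea in Python, under stated assumptions: `transcribe_chunk` is a hypothetical callable that you supply (for example a wrapper around a Whisper or AssemblyAI request), and the helper names and the default worker count are invented for this example. A thread pool is used because the work is mostly waiting on API responses.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_spans(total_seconds, num_chunks):
    """Return (start, duration) pairs, in seconds, that cover the whole audio."""
    size = total_seconds / num_chunks
    return [(i * size, size) for i in range(num_chunks)]


def build_chunk_cmd(audio_path, start, duration, out_path):
    """Build an ffmpeg command that copies one time slice of the audio."""
    return [
        "ffmpeg", "-y",
        "-ss", str(start),    # seek to the start of the slice
        "-t", str(duration),  # length of the slice
        "-i", audio_path,
        "-c", "copy",         # copy the stream, no re-encoding
        out_path,
    ]


def transcribe_parallel(chunk_paths, transcribe_chunk, max_workers=4):
    """Send every chunk to transcribe_chunk concurrently and join the text.

    transcribe_chunk is a hypothetical user-supplied function: path -> text.
    pool.map returns results in input order, so the transcript stays in sequence.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return " ".join(pool.map(transcribe_chunk, chunk_paths))
```

Fixed-length cuts can land in the middle of a word, so a small overlap between chunks or cutting on silence may give cleaner text at the boundaries.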

Sorry for the late reply; I have notifications turned off.
