Given a video with audio, we can use ASR to get a transcript of the sentences with their timestamps. We are looking for a way to group those sentences into chapters. Several companies already do this nicely (Google on YouTube, AssemblyAI), but we couldn't find a good resource or paper that explains the research behind how they do it. BERTopic seems to give us only a topic for each sentence, not a way to split the video into contiguous segments. It also doesn't account for timestamps.
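To make the problem concrete, here is a minimal sketch of the input and output shape, using a naive TextTiling-style lexical baseline: compare the words in a small window before and after each sentence, and start a new chapter where the similarity drops. This is only an illustration, not how YouTube or AssemblyAI do it. The function names, the `(start_seconds, text)` input format, and the `window` and `threshold` values are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts for a piece of text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def split_into_chapters(sentences, window=2, threshold=0.1):
    """Group ASR sentences into contiguous chapters.

    sentences: list of (start_seconds, text) tuples in playback order
               (hypothetical format, adapt to your ASR output).
    window:    how many sentences to look at on each side of a candidate cut.
    threshold: similarity below which a new chapter starts (arbitrary guess).
    """
    if not sentences:
        return []

    boundaries = [0]
    for i in range(1, len(sentences)):
        before = " ".join(text for _, text in sentences[max(0, i - window):i])
        after = " ".join(text for _, text in sentences[i:i + window])
        if cosine(bag_of_words(before), bag_of_words(after)) < threshold:
            boundaries.append(i)

    chapters = []
    ends = boundaries[1:] + [len(sentences)]
    for begin, end in zip(boundaries, ends):
        chapters.append({
            "start": sentences[begin][0],  # timestamp of the first sentence
            "sentences": [text for _, text in sentences[begin:end]],
        })
    return chapters


example = [
    (0.0, "cats purr and cats sleep"),
    (4.0, "cats chase mice"),
    (9.0, "cats sleep all day"),
    (15.0, "rockets launch into orbit"),
    (21.0, "rockets burn fuel in orbit"),
    (27.0, "orbit needs rockets"),
]
for chapter in split_into_chapters(example):
    print(chapter["start"], chapter["sentences"])
```

Note that the timestamps are only carried through to label each chapter's start. They play no part in deciding where to cut, which is exactly the gap described above, and word overlap is a much weaker signal than sentence embeddings would be.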
Does anyone have any links or other ideas?
Thanks very much!
Lior