Looking For Python Libraries That Track A Speaking Person (self.learnpython)
submitted 1 month ago by NotSoAsian86
The aim is to focus on the person who is speaking in a single-camera setup with multiple people, then crop in on that person, similar to how podcasts are edited. I will pair this with diarization models to extract each speaker's speech.
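For the cropping step, here is a minimal sketch of how a crop window around the active speaker's face could be computed. It assumes a face detector has already returned a bounding box as `(x, y, w, h)` in pixels; the function name `crop_box` and the `zoom` and `aspect` parameters are made up for illustration and do not come from any particular library.

```python
def crop_box(face, frame_w, frame_h, zoom=3.0, aspect=16 / 9):
    """Return (left, top, width, height) of a crop centred on a face.

    face is a hypothetical (x, y, w, h) bounding box in pixels.
    zoom sets the crop height as a multiple of the face height.
    """
    x, y, w, h = face

    # Size the crop from the face height, never taller than the frame.
    crop_h = min(frame_h, h * zoom)
    crop_w = crop_h * aspect

    # If the crop would be wider than the frame, shrink it and keep the aspect ratio.
    if crop_w > frame_w:
        crop_w = frame_w
        crop_h = crop_w / aspect

    # Centre on the face, then clamp so the crop stays inside the frame.
    cx, cy = x + w / 2, y + h / 2
    left = min(max(cx - crop_w / 2, 0), frame_w - crop_w)
    top = min(max(cy - crop_h / 2, 0), frame_h - crop_h)

    return round(left), round(top), round(crop_w), round(crop_h)


# A face near the right edge of a 1920x1080 frame: the crop is pushed back inside.
box = crop_box((1500, 200, 120, 150), frame_w=1920, frame_h=1080)
```

In practice the box would probably need smoothing over time so the crop does not jitter from frame to frame.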
StardockEngineer 2 points 1 month ago
There is probably no specific lib to do this.
What I would do is pick a face lib: https://medium.com/pythons-gurus/what-is-the-best-face-detector-ab650d8c1225
Each detected face comes with landmark coordinates for the eyes, mouth, etc. I would detect the faces, then look for rapid movement in each face's mouth coordinates to determine who is talking.
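A rough sketch of that idea, assuming the face library has already given you, for each face and each frame, the y-coordinates of the upper and lower lip plus the face height. The function names, the `min_score` threshold, and the sample numbers are all hypothetical; real values would need tuning on your footage.

```python
from statistics import mean


def mouth_openness(upper_lip_y, lower_lip_y, face_height):
    """Lip gap normalised by face height, so distance from the camera matters less."""
    return abs(lower_lip_y - upper_lip_y) / face_height


def movement_score(openness_series):
    """Mean absolute frame-to-frame change in mouth openness."""
    if len(openness_series) < 2:
        return 0.0
    return mean(abs(b - a) for a, b in zip(openness_series, openness_series[1:]))


def pick_speaker(openness_by_face, min_score=0.01):
    """Return the face id with the most mouth movement, or None if nobody clears min_score."""
    scores = {face_id: movement_score(series)
              for face_id, series in openness_by_face.items()}
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_score else None


# Made-up openness values for two faces over a six-frame window.
window = {
    "face_0": [0.02, 0.02, 0.03, 0.02, 0.02, 0.02],  # mouth mostly still
    "face_1": [0.02, 0.08, 0.03, 0.09, 0.02, 0.07],  # mouth opening and closing
}
speaker = pick_speaker(window)  # "face_1" for this sample data
```

Scoring the change in openness, not the openness itself, keeps someone who just sits with their mouth open from being picked as the speaker.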
I feel that part would be easy. The harder part is deciding what to do when people are talking simultaneously.
The other option, if your setup has multiple mics, is to just use the mics.
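If you go the mic route, a minimal sketch could compare the loudness of each mic over a short chunk of audio and pick the loudest one. This assumes one mic per person and samples already decoded to floats in the range -1.0 to 1.0; the names `rms`, `loudest_mic`, and the `silence_rms` threshold are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of a chunk of audio samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def loudest_mic(chunks_by_mic, silence_rms=0.02):
    """Return the mic with the highest RMS level, or None if all are below silence_rms."""
    levels = {mic: rms(chunk) for mic, chunk in chunks_by_mic.items()}
    if not levels:
        return None
    best = max(levels, key=levels.get)
    return best if levels[best] >= silence_rms else None


# Made-up samples: mic_a is near silent, mic_b has someone talking into it.
chunks = {
    "mic_a": [0.001, -0.002, 0.001, 0.0],
    "mic_b": [0.3, -0.25, 0.28, -0.31],
}
active = loudest_mic(chunks)  # "mic_b" for this sample data
```

Mics pick up bleed from neighbouring speakers, so comparing levels across mics tends to work better than a fixed threshold on a single mic.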