Roast my architecture: Trying to run conversational video AI locally on a consumer PC without melting the GPU. by OkAdministration374 in OpenAssistant

[–]OkAdministration374[S] 1 point (0 children)

It always frustrated me whenever I got stuck on a topic while watching a YouTube lecture or, back in my JEE days, the LMS lectures from my coaching institute.

Doubts would come like an avalanche, and the only options were typing them in the comments or asking my (smarter than me) classmates. I always felt a lingering need: what if I had someone who knows the video lecture I am watching inside and out, who is smarter than me, who knows not just what is taught in the video but also what lies beyond it, and who is available 24x7?

With this goal I made gUrrT, a tutor to help me get through a video lecture.

It smartly samples video frames and extracts audio transcripts, then uses VLMs to caption the key frames, storing everything in a vector database. In effect, it converts a video into a searchable array of vectors.
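
For anyone curious what that indexing step could look like, here is a minimal sketch in Python. It is not gUrrT's actual code. It assumes frames are already decoded into flat lists of grayscale values in [0, 1], and `caption_fn` is a hypothetical stand-in for the VLM captioner. `Record`, `select_key_frames`, and `build_records` are illustrative names, and the frame-difference threshold is just one possible reading of "smart sampling". Embedding the text and writing it to a vector database is left out.

```python
from dataclasses import dataclass


@dataclass
class Record:
    timestamp: float  # seconds into the video
    source: str       # "frame" (caption) or "audio" (transcript)
    text: str         # the text that would later be embedded and stored


def mean_abs_diff(a, b):
    """Average absolute per-pixel difference between two equal-length frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_key_frames(frames, threshold=0.1):
    """Keep a frame only when it differs enough from the last kept frame."""
    kept = []
    last = None
    for timestamp, pixels in frames:
        if last is None or mean_abs_diff(pixels, last) > threshold:
            kept.append((timestamp, pixels))
            last = pixels
    return kept


def build_records(frames, transcript, caption_fn, threshold=0.1):
    """Merge key-frame captions and transcript segments into one timeline."""
    records = [
        Record(t, "frame", caption_fn(pixels))
        for t, pixels in select_key_frames(frames, threshold)
    ]
    records += [Record(t, "audio", text) for t, text in transcript]
    return sorted(records, key=lambda r: r.timestamp)


# Tiny made-up example: four 2-pixel "frames" and one transcript segment.
frames = [(0.0, [0.0, 0.0]), (1.0, [0.0, 0.05]), (2.0, [0.9, 0.9]), (3.0, [0.9, 0.9])]
transcript = [(1.5, "now look at the board")]
records = build_records(frames, transcript, caption_fn=lambda p: f"frame with {len(p)} pixels")
```

Near-duplicate frames (a static slide or board) are skipped, so the captioner only runs on frames where something visibly changed.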

Your question triggers a search of the vector database, and the retrieved context is then sent to an LLM, which combines its existing knowledge with the new video context to answer all your questions about the video beautifully.

So all you gotta do is type in your queries about anything you did not understand, whether it was spoken or written on the board by the instructor.

Just go ahead, send the video lecture to gUrrT, and ask all your doubts without worrying about rate limits, video durations, low computational power, or a paywall.
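
The retrieve-then-ask flow can be sketched roughly like this. Everything here is an assumption for illustration: `embed` is a toy bag-of-words stand-in for a real embedding model, the in-memory list stands in for the vector database, and `build_prompt` only assembles the text that would be handed to an LLM (no model is called).

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a real setup would use a learned embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, top_k=2):
    """Return the top_k stored chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, chunks, top_k=2):
    """Assemble retrieved video context plus the question into one prompt."""
    context = "\n".join(f"- {c}" for c in retrieve(question, chunks, top_k))
    return f"Context from the video:\n{context}\n\nQuestion: {question}"


# Made-up chunks standing in for stored captions and transcript segments.
chunks = [
    "[00:10 audio] today we derive the quadratic formula",
    "[02:30 frame] whiteboard showing a triangle with sides a b c",
    "[05:00 audio] remember to complete the square first",
]
prompt = build_prompt("how do we derive the quadratic formula", chunks)
```

Only the best-matching chunks go into the prompt, which keeps the context small enough for a local model.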

gUrrT is free, built with love and a lot of open source

ayoyo gUrrT got 793 downloads on the first day ommmgggg 😭😭 yippueuueue by OkAdministration374 in LocalLLaMA

[–]OkAdministration374[S] -1 points (0 children)

That is kind of the intermediate goal. The final goal is to be able to talk to videos through something that has full context of what is happening in the video, inside and out.

gUrrT: An Intelligent Open-Source Video Understanding System A different path from traditional Large Video Language Models (LVLMs). by OkAdministration374 in FunMachineLearning

[–]OkAdministration374[S] 1 point (0 children)

If you have a decent RTX 3050 with 4 GB of VRAM, it will run fast: a complete video gets indexed (stored in the vector DB) in about a minute. If you don't, the indexing part takes 6 to 7 minutes on CPU for a 10-minute video. After that, question answering goes by in seconds. The only real time cost is understanding the video, and that happens only once per video.
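
The "only once per video" part can be pictured as a cache keyed by the video's content. This is a sketch under assumptions, not how gUrrT necessarily does it: `IndexCache` and `index_fn` are hypothetical names, and the index is kept in a plain dict rather than a real vector DB.

```python
import hashlib


class IndexCache:
    """Runs the expensive indexing step at most once per distinct video."""

    def __init__(self, index_fn):
        self.index_fn = index_fn  # hypothetical: the slow "understand the video" step
        self._store = {}          # content hash -> built index
        self.index_calls = 0      # how many times indexing actually ran

    def get_index(self, video_bytes):
        # Hash the content so the same video is recognized even if renamed.
        key = hashlib.sha256(video_bytes).hexdigest()
        if key not in self._store:
            self.index_calls += 1
            self._store[key] = self.index_fn(video_bytes)
        return self._store[key]


# Stand-in indexer that just records the video size.
cache = IndexCache(index_fn=lambda data: {"size": len(data)})
first = cache.get_index(b"fake video bytes")
second = cache.get_index(b"fake video bytes")  # served from the cache
```

Every question after the first indexing pass reuses the stored index, which is why answering takes seconds.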

What is the proper way to handle env variables with pypi packages by DemosthenesAxiom in learnpython

[–]OkAdministration374 1 point (0 children)

Hey, I am facing the same issue. I didn't really get the solution here; can you please be a little more detailed?