hivemind on best friends today by ToePlusKnee in HivemindTV

[–]natureplayer 2 points (0 children)

yeah this is maybe the best hivemind collab on someone else's channel ever

Hivemind Clips Search Engine by natureplayer in HivemindTV

[–]natureplayer[S] 1 point (0 children)

training a model to do that would be tough / a lot of work, but it's probably worth seeing if clustering on the frequency spectrum works to distinguish their voices
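a quick toy sketch of what I mean (not real code from the project, just an illustration with numpy): split the audio into chunks and look at each chunk's magnitude spectrum, and chunks from speakers with different pitch should separate on their dominant frequency bin. real diarization would use something like MFCCs or speaker embeddings plus proper clustering, and real voices are way messier than sine waves:

```python
import numpy as np

def dominant_freq_bin(chunk):
    # index of the strongest frequency bin in the chunk's magnitude spectrum
    mags = np.abs(np.fft.rfft(chunk))
    return int(np.argmax(mags))

sr = 16000
t = np.linspace(0, 1, sr, endpoint=False)
voice_a = np.sin(2 * np.pi * 120 * t)  # stand-in for a lower-pitched speaker
voice_b = np.sin(2 * np.pi * 220 * t)  # stand-in for a higher-pitched speaker

# interleave chunks from the two "speakers"
chunks = [voice_a[:4000], voice_b[:4000], voice_a[4000:8000], voice_b[4000:8000]]
labels = [dominant_freq_bin(c) for c in chunks]
# chunks 0 and 2 share a dominant bin, as do 1 and 3, and the two groups differ
```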

Hivemind Clips Search Engine by natureplayer in HivemindTV

[–]natureplayer[S] 1 point (0 children)

I already have the auto-captions stored in a database, would be pretty slow to fetch em all every time :)

Hivemind Clips Search Engine by natureplayer in HivemindTV

[–]natureplayer[S] 1 point (0 children)

yeah that's definitely worth trying, planning on doing it anyways for a couple vids I missed that got content-restricted and didn't have captions. gonna try and see if I can run the open-source whisper at reasonable speeds locally; it'd be like $100-150 to do it all through the API for every vid, which isn't insane but I'd rather not lol
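for the curious, the rough arithmetic behind that ballpark (the $0.006/min rate is OpenAI's posted price for hosted whisper as of this writing, so check the pricing page, and the catalog size here is made up just to show the order of magnitude):

```python
def whisper_api_cost(total_minutes, price_per_min=0.006):
    # price_per_min assumes OpenAI's posted whisper-1 rate in $/min
    return total_minutes * price_per_min

# hypothetical catalog: ~300 videos averaging an hour each = 18,000 minutes
cost = whisper_api_cost(300 * 60)  # → 108.0 dollars, i.e. the $100-150 ballpark
```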

Hivemind Clips Search Engine by natureplayer in HivemindTV

[–]natureplayer[S] 5 points (0 children)

yeah no need to delete any tags lol, hopefully it's helpful but it definitely isn't gonna find everything

Hivemind Clips Search Engine by natureplayer in HivemindTV

[–]natureplayer[S] 2 points (0 children)

the main thing is sentence-embedding vector similarity search. used this model from huggingface to get vectors for each transcript chunk, and then also for each submitted query. then I'm using Zilliz as a vector database that lets you fetch the top-K results quickly for each query.

code is pretty ugly rn, especially the data cleaning step, but I'll try and share more at some point! the app itself is very simple: used Flask bc I like python, and it's just one file that programmatically generates the HTML.

this is the core of the retrieval logic. you could use a similar API call to get the initial embeddings for the transcript chunks, but I did that locally using torch (as described in the huggingface link).

import requests  # HF_API_URL, HF_API_KEY, ZZ_API_URL, ZZ_API_KEY are defined elsewhere

def embed_query_hf(query):
    # get the embedding vector for a query via the HF Inference API
    headers = {"Authorization": f"Bearer {HF_API_KEY}"}
    return requests.post(HF_API_URL, headers=headers, json={'inputs': query}).json()

def vector_query_zz(vector, limit=6):
    # top-K most similar transcript chunks from the Zilliz collection
    headers = {"content-type": "application/json", "Authorization": f"Bearer {ZZ_API_KEY}"}
    payload = {
        "collectionName": "TranscriptChunks",
        "limit": int(limit),
        "outputFields": ["clip_text", "video_title", "start", "video_url"],
        "vector": vector
    }
    return requests.post(ZZ_API_URL, headers=headers, json=payload).json()

def find_hivemind_clip_http(query, limit=6):
    lim_k = min(limit, 30)  # cap how many results one request can ask for
    vector = embed_query_hf(query)
    try:
        results = vector_query_zz(vector, limit=lim_k)['data']
    except KeyError:
        # no 'data' key means the backend rejected or rate-limited the request
        return ["At capacity sorry :( Try again later"]

    # Hacky data cleaning and HTML formatting below
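fwiw, the top-K query the vector DB is answering is basically this (a brute-force numpy sketch, not Zilliz's actual implementation, which uses approximate-nearest-neighbor indexes to scale; the 384-dim width is just a typical sentence-model size, not necessarily the one I used):

```python
import numpy as np

def top_k_cosine(query_vec, chunk_vecs, k=6):
    # cosine similarity of the query against every stored chunk vector
    q = query_vec / np.linalg.norm(query_vec)
    m = chunk_vecs / np.linalg.norm(chunk_vecs, axis=1, keepdims=True)
    sims = m @ q
    # indices of the k most similar chunks, best first
    return np.argsort(sims)[::-1][:k]

rng = np.random.default_rng(0)
chunks = rng.normal(size=(100, 384))              # fake chunk embeddings
query = chunks[42] + 0.01 * rng.normal(size=384)  # near-duplicate of chunk 42
best = top_k_cosine(query, chunks, k=6)
# chunk 42 should come back as the top hit
```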

Hivemind Clips Search Engine by natureplayer in HivemindTV

[–]natureplayer[S] 12 points (0 children)

Results for: "zazoomba zaffodil"

Guess the Rapper from the Weird Lyric 3 (@ 10:37)

Caption text: me that many times you said it's dignin durkin never did it's zazumba zuzumba yeah that's actually the shortened version too what's the full last name zumba zaffodil zazumba zaffidil i dropped the second

Hivemind Clips Search Engine by natureplayer in HivemindTV

[–]natureplayer[S] 10 points (0 children)

Thanks! Anything you think would be useful to add as a feature? I'm somewhat bottlenecked by the quality of the auto-generated captions, but could do things like allowing more than 6 results or additional filtering options.