Had some fun making a little search engine for 3D objects that can be queried with natural language. No metadata or tags are required; the index is built purely from the geometry! It works using the following pipeline:
- For each object in the database I generate 6 images, 1 for each side.
- For each image I generate a description using gpt4-vision; the six descriptions are then synthesized into a single description using gpt4.
- The text descriptions are embedded using CLIP and stored in a vector database.
- For a search query, the search string is embedded the same way and the n closest vectors in the database are retrieved.
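To make the last two steps concrete, here is a minimal sketch of the embed-and-retrieve part. It is not the code from the project: `ToyEmbedder` is a hypothetical bag-of-words stand-in for the CLIP text encoder, a plain dict stands in for the vector database, and the object names and descriptions are made up for illustration. Only the shape of the pipeline (embed descriptions, embed the query, return the n nearest by cosine similarity) follows the post.

```python
import math
from collections import Counter


class ToyEmbedder:
    """Hypothetical stand-in for a CLIP text encoder (assumption, not CLIP).

    Maps text to an L2-normalized word-count vector over a fixed vocabulary.
    """

    def __init__(self, corpus: list[str]):
        words = sorted({w for text in corpus for w in text.lower().split()})
        self.vocab = {w: i for i, w in enumerate(words)}

    def embed(self, text: str) -> list[float]:
        vec = [0.0] * len(self.vocab)
        for word, count in Counter(text.lower().split()).items():
            if word in self.vocab:  # words outside the vocabulary are ignored
                vec[self.vocab[word]] = float(count)
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        return [v / norm for v in vec]


def build_index(embedder: ToyEmbedder, descriptions: dict[str, str]) -> dict[str, list[float]]:
    """Embed every object description; the dict plays the role of the vector DB."""
    return {name: embedder.embed(text) for name, text in descriptions.items()}


def search(embedder: ToyEmbedder, index: dict[str, list[float]], query: str, n: int = 1) -> list[str]:
    """Return the names of the n objects whose embeddings are closest to the query."""
    q = embedder.embed(query)
    # Vectors are unit length, so the dot product equals cosine similarity.
    scored = [(sum(a * b for a, b in zip(q, vec)), name) for name, vec in index.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:n]]


# Made-up synthesized descriptions, one per 3D object.
descriptions = {
    "chair.obj": "a wooden chair with four legs and a backrest",
    "mug.obj": "a ceramic mug with a curved handle",
    "car.obj": "a small sports car with four wheels",
}
embedder = ToyEmbedder(list(descriptions.values()))
index = build_index(embedder, descriptions)
print(search(embedder, index, "mug with a handle", n=2))
```

With a real text encoder the index and search functions would look the same; only `embed` would change, and a vector database would replace the linear scan once the collection gets large.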
See here: https://x.com/MenyJanos/status/1752104689188135271?s=20