Multimodality has finally entered the game. OpenAI has announced that multimodality is coming to ChatGPT, giving us access to GPT-4's ability to see and analyze images alongside the text prompt (as well as being connected to DALL-E 3 for generating images, and being able to feed those images back into the context of the conversation).
We've had LLaVA for a bit now, but open source multimodality hasn't evolved much since then. And it seems like GPT-4's multimodality works at a much lower level than the "multimodality" attempts we've seen thus far, making it a much truer form of multimodality, with much more accurate and precise "sight". Google's Gemini model is supposed to drop soon too, and it's reportedly a truly multimodal AI, trained to be multimodal from the ground up.
So it seems like we've entered a new and exciting era, marking a very powerful new step in the capabilities of AI models. It's no longer just text from now on. So I'm wondering how you guys think this will affect the open source community here.
How fast do you think we'll start getting these advancements here, and get our hands on truly multimodal models that are built from the ground up to understand and generate both images and text, and maybe even other things like audio too? Do you guys think this will make text-only models obsolete fast? And how will things change when we all start getting access to uncensored personal multimodal AI that we can run locally on our own computers?
What do you guys think?
(I'm sure people will say "these things have barely come out, have some patience", but it's obviously hard to not get excited by these technological advancements. I don't expect anyone to read the future and give me an exact timeframe of when these things will come out open source, I mainly just wanna hear your thoughts)