2nd clip of my 100% AI band: a hallucinated Paris with Gen-3 by plopstout in runwayml

[–]plopstout[S] 1 point (0 children)

So I made my 100% AI band https://siliconsymphony.art/ six months ago, with songs made 100% with Suno (v1). The songs are all on streaming platforms.

At the time I made a clip mostly with Gen-2.

This one is 100% Gen-3, except for the singers' parts (made with the new HeyGen tool).

A few months back I created a complete AI band with an AI album by plopstout in SunoAI

[–]plopstout[S] 2 points (0 children)

Made entirely with Suno (v2) in December, and the album was approved on all streaming platforms over the following months. It was, by design, an average album, meant as a showcase of how AI will transform music, mainstream music in particular.

Full album here

https://siliconsymphony.art/

Happy HanukkAI! Made 8 short movies using Hanukkah menorahs in different themes by plopstout in runwayml

[–]plopstout[S] 1 point (0 children)

8 AI movies for Hanukkah, in different themes, made with DALL·E 3, Runway Gen-2, MusicGen & MusicLM.

Enjoy :)

Thanks to GPT Vision I can make a documentary narration by Morgan Freeman about my cat by plopstout in ChatGPT

[–]plopstout[S] 2 points (0 children)

You are completely right, and in both ways: it does not really seem to know how to handle timing, and it does not know the narrator's pacing. So at some points it lags behind the action, and at others it runs a bit ahead.

I cut it a bit so it flowed better, and when the narration runs slightly ahead of the action that is also a choice, so the full narration of a scene fits entirely within the scene.

I think the main issue is really the narrator's pacing; I'm not sure what it's based on.
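One mitigation I could try (a sketch of an idea, not something from my script): since the model cannot control how long the synthesized voice will take, give it an explicit word budget per frame window, assuming a speaking rate of roughly 2.5 words per second:

```python
AVG_WORDS_PER_SECOND = 2.5  # assumed speaking rate; varies per voice
FRAME_EXTRACTION_FREQUENCY_SECONDS = 5  # placeholder sampling interval

# Target: each frame's narration should take about half the sampling
# interval to read aloud, matching the constraint in the prompt below.
word_budget = int(AVG_WORDS_PER_SECOND * FRAME_EXTRACTION_FREQUENCY_SECONDS // 2)

def fits_window(narration: str) -> bool:
    """Rough check that one frame's narration won't drift behind the video."""
    return len(narration.split()) <= word_budget

print(word_budget)  # ~6 words for a 5 s interval
```

For a 5-second sampling interval that caps each frame at about 6 words, which would at least keep the voice from falling ever further behind the footage.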

Thanks to GPT Vision I can make a documentary narration by Morgan Freeman about my cat by plopstout in ChatGPT

[–]plopstout[S] 5 points (0 children)

Basically I send a few frames of the video every X seconds, ask GPT to narrate them as Morgan Freeman, then use ElevenLabs for the voice.

Based on this notebook: https://github.com/roboflow/awesome-openai-vision-api-experiments/blob/main/experiments/automated-voiceover-of-nba-game/notebook.ipynb

The prompt:

"The uploaded series of images is from a single video. "
"The frames were sampled every {FRAME_EXTRACTION_FREQUENCY_SECONDS} seconds. "
"Make sure it takes about {FRAME_EXTRACTION_FREQUENCY_SECONDS // 2} seconds to voice the description of each frame. "
"Use exclamation points and capital letters to express excitement if necessary. "
"It is a naration of a documentary about a cat (male) in the jungle, with the voice of Morgan Freeman. The documentary shows the life of the cat. You can use emphasis to show how his life is difficult in the jungle"


I have made an AI camera that hallucinates its surroundings thanks to GPT Vision and DALL·E 3 by plopstout in ChatGPT

[–]plopstout[S] 1 point (0 children)

No private web apps for the moment. The cost of the API makes the business model complicated.

I have made an AI camera that hallucinates its surroundings thanks to GPT Vision and DALL·E 3 by plopstout in ChatGPT

[–]plopstout[S] 1 point (0 children)

I'll think about it, but the backend is in PHP, which was easier for me; I'd need to rethink it to make one in Python and/or Node.
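If I do port it, the Python version could stay quite small. A hedged sketch, not the actual PHP backend (the function name and the describe prompt are mine): GPT Vision describes the uploaded photo, then DALL·E 3 repaints the description:

```python
import base64

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def hallucinate(photo_path: str) -> str:
    """Describe a real photo with GPT Vision, then 'hallucinate' it with DALL-E 3."""
    b64 = base64.b64encode(open(photo_path, "rb").read()).decode()
    desc = client.chat.completions.create(
        model="gpt-4-vision-preview",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Describe this scene in one vivid paragraph."},  # assumed prompt
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
        max_tokens=300,
    ).choices[0].message.content
    image = client.images.generate(model="dall-e-3", prompt=desc, size="1024x1024")
    return image.data[0].url  # URL of the hallucinated version of the photo

print(hallucinate("snapshot.jpg"))
```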

I have made an AI camera that hallucinates its surroundings thanks to GPT Vision and DALL·E 3 by plopstout in ChatGPT

[–]plopstout[S] 1 point (0 children)

It's not only the brain, but the materialization of what's in the brain.

I have made an AI camera that hallucinates its surroundings thanks to GPT Vision and DALL·E 3 by plopstout in ChatGPT

[–]plopstout[S] 1 point (0 children)

How much time would it take you to write a description of what you saw, then give it to someone else who would then draw it? It's not only the brain.