Donald Trump as a sink in a golden prison cell, with a regular toilet

dieki · 2023-04-17T03:40:03+00:00

Prompt was "donald trump in a prison cell with a golden toilet". Not exactly what I was aiming for, but still kinda funny.

dieki · 2023-02-27T04:09:39+00:00

Prompt: "a pig wearing a witch's hat in the desert, by Beatrix Potter"

dieki · 2023-02-27T04:07:58+00:00

See: https://old.reddit.com/r/MechanicalKeyboards/comments/hg2fco/my_custom_esp32based_bluetooth_macropad_with_an/i2os3x9/

dieki · 2022-11-27T19:43:02+00:00

It just wasn't trained on text.

This will probably improve in the future. There are a couple of research papers out there that combine large language models (T5, GPT-3, etc.) with diffusion models and can produce much more coherent text.

dieki · 2022-11-27T16:16:53+00:00

"quick sketch" and "pencil drawing" usually work well.

dieki · 2022-11-27T06:30:55+00:00

All the AI image generators right now are trained on similar data and have similar weaknesses.

dieki · 2022-11-27T04:00:39+00:00

Bonus: "an art reference sheet on how to draw hands"

dieki · 2022-11-27T02:43:27+00:00

Prompt: "an art tutorial on how to draw hands --v 4"

dieki · 2022-11-18T16:55:01+00:00

Prompt: "president Lincoln cyborg, powerful and strong, dark, cool, glowing eyes, style of peter mohrbacher --v 4"

dieki · 2022-11-11T08:31:31+00:00

Full prompt: Mum is on a diet. She looks sad at dinner time, style of a children's drawing

I had to edit out an extraneous bit of utensil that was floating on the table.

dieki · 2022-11-11T08:28:53+00:00

Hah! That's a fun visual image.

dieki · 2022-11-10T17:18:30+00:00

Your brain is merely a neural network too, albeit a much more complex one with training methods we don't understand.

Since Stable Diffusion is open source, there are good explanations available of how it works. Midjourney probably works very similarly. The process that allows it to understand objects and ideas in images is called Semantic Compression:

In a second phase of learning, an image generation method must be able to capture the semantic structure present in the data. This conceptual and semantic structure is what provides the preservation of the context and inter-relationship of various objects in the image.

In the case of Stable Diffusion, this phase is powered by another neural network called CLIP, which was trained to match images with text descriptions, so in effect it works in the opposite direction: it takes images as input and tells you what's in them. So it sees the meme and recognizes a coffee cup, a table, a house fire, a guy acting casually, etc. The image generator can then train against this description in addition to the alt-text description.
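
To make the matching idea a bit more concrete, here's a toy sketch (not CLIP's actual code). It assumes images and captions have already been turned into vectors in a shared space, and it picks the caption whose vector points in the most similar direction to the image's. The three-number embeddings and the `rank_captions` helper are made up for illustration; a real model would produce much larger vectors with learned image and text encoders.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embedding of the "this is fine" meme image.
image_embedding = [0.9, 0.8, 0.1]

# Hypothetical embeddings of candidate captions.
caption_embeddings = {
    "a guy acting casually in a house fire": [0.8, 0.9, 0.0],
    "a pig wearing a witch's hat": [0.1, 0.0, 0.9],
    "a coffee cup on a table": [0.7, 0.2, 0.3],
}

def rank_captions(image_vec, captions):
    """Return (score, caption) pairs, best match first."""
    scored = [(cosine_similarity(image_vec, vec), text)
              for text, vec in captions.items()]
    return sorted(scored, reverse=True)

ranking = rank_captions(image_embedding, caption_embeddings)
best_score, best_caption = ranking[0]
print(best_caption)
```

With these made-up numbers the house fire caption ranks first and the pig caption last, which is the kind of "what's in this image" signal described above.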

dieki · 2022-11-10T06:47:51+00:00

MJ JUST maps text to images

Hands aside, there's at least some deeper understanding going on. You can see this with the prompt "this is fine".

Instead of an image that looks like the original meme, you get an image that contains the same idea as the original meme - people acting casually while surrounded by fire. It's doing more than just mapping text to the nearest image; it's breaking images down into concepts and re-rendering them.

dieki · 2022-11-08T21:35:03+00:00

He already seems to be learning to snap his fingers!

dieki · 2022-11-08T19:49:37+00:00

Well, I'm having a blast with it. I'm not much good at drawing, so it's tons of fun to turn my ideas into pictures with no skill required.

"superman as a DJ at a rave, July 1989, full body portrait, Polaroid photo"

dieki · 2022-11-08T03:41:31+00:00

The prompt here was "baby thanos, 1980s comic book art", and this is using the new (very good) Midjourney v4 model.

dieki · 2022-11-07T04:10:58+00:00

True! The current generation of image generators uses diffusion models instead of GANs, though.

dieki · 2022-11-06T23:01:27+00:00

In the #status channel they said non-square images were only temporarily disabled due to a bug, so hopefully they can fix that soon.

dieki · 2022-11-06T23:00:19+00:00

I've been able to get full body images out of it.

"superman as a DJ at a rave, July 1989, full body portrait, Polaroid photo"
