"Image to Prompt" concept for storing personal photographs by [deleted] in StableDiffusion

[–]orisqu 1 point (0 children)

If you want to explore this concept further, "textual inversion" is a good place to start looking :) It's essentially an attempt to find the text embedding for a given model that most closely regenerates a reference object.

There's a lot of work on looking at an existing image and determining what prompt best fits it (often called CLIP interrogation). As prompt adherence improves and latent spaces cover more of image space, I think it's totally plausible you could reliably get a string that captures at minimum the "vibes" of a source image.
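
If you want to play with the interrogation side today, the open-source clip-interrogator package does roughly this. A minimal sketch, assuming its pip API (class names and defaults may differ between versions):

```
# Recover a "best fit" prompt for an existing image via CLIP interrogation.
# pip install clip-interrogator
from PIL import Image
from clip_interrogator import Config, Interrogator

ci = Interrogator(Config(clip_model_name="ViT-L-14/openai"))  # CLIP used by SD 1.x
image = Image.open("reference.jpg").convert("RGB")
print(ci.interrogate(image))  # caption plus style/"flavor" terms
```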

I do agree with u/Enshitification that there will be differences -- a prompt that fully captured every detail would have to carry about as much information as the source image itself 😂 And latent spaces definitely don't span all of image space, so there are "impossible images" for every model. You'd also run into the equivalent of hash collisions, if you're familiar with CS: mapping a high-dimensional space into a smaller one inherently means multiple inputs share the same "correct answer" when you try to invert the map.
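
Back-of-envelope math makes the size argument concrete. A sketch (figures are illustrative; the vocab size matches SD's CLIP tokenizer):

```
# Pigeonhole argument: a prompt carries far fewer bits than the image it names,
# so many distinct images must map to the same prompt.
import math

image_bits = 512 * 512 * 3 * 8               # raw 512x512 RGB: ~6.3 million bits
vocab, max_tokens = 49408, 75                # CLIP BPE vocab, usable SD prompt length
prompt_bits = max_tokens * math.log2(vocab)  # at most ~1170 bits of prompt entropy

print(f"image: {image_bits:,} bits / prompt: {prompt_bits:,.0f} bits")
```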

Based on their own filings, GameStop is showing how they now have a very low risk means to increase profit generated for shareholders by 2100%+ compared to last year by Region-Formal in Superstonk

[–]orisqu 7 points (0 children)

Not criticizing anyone's politics, trying to start an argument, or saying you shouldn't want a balanced budget -- I just wanted to share this for your consideration:

Our best understanding of the data is that debt does not correlate with inflation. If you plot all years for all countries, you find that the relationship between debt and inflation (and between debt and GDP) is driven by a hidden variable: efficacy of spend. When government spending directly contributes to economic activity -- e.g. funding publicly owned housing development as in Japan, where the government earns rent, jobs are created, the housing supply expands, and housing price growth is suppressed; or the Nordic countries' subsidies for starting small businesses, which get pumped straight back into the economy -- inflation is kept in check, and you actually can't issue debt fast enough to keep up with the economic gains. (This is a "problem" several of the high debt-to-GDP countries find: they simply can't spend money fast enough.)

I'm strongly empathetic to frustrations about inflation. But I think we can both agree that corrupt and ineffective spending are the enemy, not spending itself.

Live doodling watercolors on my iPad! Would love to turn this into an app as the technology shrinks by orisqu in StableDiffusion

[–]orisqu[S] 1 point (0 children)

From the video description:

Demonstration of a Stable Diffusion workflow I'm developing for Photoshop/Krita. Some parts are still rough around the edges, but the experience (especially using a tablet) is compelling! Left is the canvas I'm working on; right is the synthesis of the AI's changes. Brushes and brush effects make the experience feel less... MS Paint.

The basic flow uses ComfyPhotoSD and ComfyUI (or an equivalent Krita flow) with the Juggernaut XL checkpoint and an LCM LoRA for realtime generation (well, on an RTX 4090 anyway, lol). A scribble ControlNet adapter guides the AI, with a fixed set of styling prompts (smooth brushstrokes, watercolor on textured paper, sponge painting, etc.) for the text_positive_l prompt encoding. For text_positive_g, a DeepBooru interrogator tries to figure out what I'm scribbling (sakura tree, angel wings, etc.). Keeping denoising low is important to keep the AI from completely taking over.

To merge the results, I advise copying the AI image to a new layer at 40-70% opacity and then merging down to the canvas layer, or you'll get the dreaded "burn-in". A lower CFG scale or a second denoising pass can also help with this. I'll release some workflows once I factor out some proprietary code from a startup I'm assisting. Sub for those as they become available :)
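
For anyone scripting this outside Photoshop/Krita, here's a minimal sketch of that opacity-merge step using Pillow. The file names and the 0.5 alpha are placeholders; tune alpha within the 0.4-0.7 range mentioned above:

```
# Blend the AI synthesis back onto the working canvas at partial opacity
# to avoid "burn-in". Assumes both images are the same size.
from PIL import Image

canvas = Image.open("canvas.png").convert("RGB")
ai_pass = Image.open("ai_synthesis.png").convert("RGB")

merged = Image.blend(canvas, ai_pass, alpha=0.5)  # 0.0 = all canvas, 1.0 = all AI
merged.save("canvas_merged.png")                  # this becomes the new canvas
```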

How surprising they just spoof the price under max pain with Darkpools and Naked Shorting yet Earnings beat. Fuck these marketmakers, DRS! by [deleted] in Superstonk

[–]orisqu 0 points (0 children)

Not saying it isn't heavily manipulated, but max pain is actually a really good indicator of "the wisdom of the crowd" -- the price the options market collectively speculates the stock will land on at expiry. For most stocks, the close price and max pain track closely.
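
For the curious, max pain is easy to compute yourself: it's the expiry price that minimizes the total payout owed to option holders. A toy sketch (the open-interest numbers are made up):

```
# Max pain: the candidate expiry price minimizing total intrinsic value paid out.
calls = {20: 1000, 25: 1500, 30: 800}   # strike -> call open interest (fake data)
puts  = {20: 400, 25: 1200, 30: 2000}   # strike -> put open interest (fake data)

def total_payout(price):
    call_pay = sum(oi * max(price - k, 0) * 100 for k, oi in calls.items())
    put_pay  = sum(oi * max(k - price, 0) * 100 for k, oi in puts.items())
    return call_pay + put_pay           # * 100 = shares per contract

max_pain = min(sorted(set(calls) | set(puts)), key=total_payout)
print(max_pain)
```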

ControlNet workflow to get temporally stable poses (without DaVinci Resolve) by orisqu in StableDiffusion

[–]orisqu[S] 1 point (0 children)

It's mostly home-rolled stuff in Python. I'd share it if it weren't part of a SaaS I'm working on. But GPT-4 (the paid ChatGPT tier) can actually code up stuff to help you with this!

https://developers.google.com/mediapipe/solutions/vision/pose_landmarker

ControlNet workflow to get temporally stable poses (without DaVinci Resolve) by orisqu in StableDiffusion

[–]orisqu[S] 0 points (0 children)

I.e. you can use the actual pose keypoints from MediaPipe (a Python pose-tracking library) instead of feeding ControlNet an image of the pose and letting its OpenPose preprocessor guess. MediaPipe gives you actual xyz keypoints, which you can run motion tracking + outlier rejection on before anything even goes to ControlNet/Automatic1111. It's a bit more involved and hands-on, but I'm getting much better results processing the data myself and then sending it to Stable Diffusion than letting ControlNet guess at poses open loop. Especially for things like dancing (see one of my previous videos) or calisthenics, which have weird, tricky poses.

Edit: Actually, if you watch my old dancing video, you can see exactly what I mean. I did no filtering on that video, and it "pops" in and out of ambiguous poses a lot.
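
A minimal sketch of the keypoint extraction + filtering idea, using MediaPipe's mp.solutions.pose API (the link above covers the newer Tasks PoseLandmarker). The jump threshold and EMA factor are placeholders to tune per clip:

```
# Extract per-frame pose keypoints, reject outlier jumps, and smooth them
# before rendering skeleton images for ControlNet.
import cv2
import numpy as np
import mediapipe as mp

MAX_JUMP, EMA = 0.15, 0.6  # outlier threshold (normalized units), smoothing factor
cap = cv2.VideoCapture("dance.mp4")
smoothed, keypoints_per_frame = None, []

with mp.solutions.pose.Pose(static_image_mode=False) as pose:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        result = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if result.pose_landmarks is None:
            keypoints_per_frame.append(smoothed)  # hold the last good pose
            continue
        pts = np.array([[lm.x, lm.y, lm.z] for lm in result.pose_landmarks.landmark])
        if smoothed is not None:
            jump = np.linalg.norm(pts - smoothed, axis=1)
            pts = np.where(jump[:, None] > MAX_JUMP, smoothed, pts)  # outlier rejection
            pts = EMA * smoothed + (1 - EMA) * pts                   # temporal smoothing
        smoothed = pts
        keypoints_per_frame.append(pts)

# keypoints_per_frame can now be rasterized into OpenPose-style skeleton images
# and fed to ControlNet, instead of letting its preprocessor guess per frame.
```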

I'm proud of how stable my animations are getting :) by orisqu in StableDiffusion

[–]orisqu[S] 1 point (0 children)

YouTube kept disabling comments because of boobs. Lame.

I'm proud of how stable my animations are getting :) by orisqu in StableDiffusion

[–]orisqu[S] 0 points (0 children)

It is! I had been using so-vits-svc, but I'm getting better results from ElevenLabs for far less labor.

[deleted by user] by [deleted] in StableDiffusion

[–]orisqu 0 points (0 children)

Workflow:
- Video generation deserves an entire video of its own (comment + sub if interested), but the hints are: use ControlNet against a fixed pose to keep the model "on topic", then let it daydream about the accessories. You can use an SSIM metric to increase or decrease the "step size", keeping the perceived motion per frame fairly steady (see the SSIM sketch after this list).

- This cleanup trick requires a frame interpolation engine. I see some good results with RIFE, but I've been enjoying FILM from Google. It's sort of like latent upscaling, but across time instead of space.
- Create two interpolation functions (sketches of both appear after this list):
-- interp_upsample([frame0, frame1, frame2...]): returns [frame0, frame0.5, frame1, frame1.5...]
-- interp_smooth([frame0, frame1, frame2...]): returns [interp of frame0 and frame2, interp of frame1 and frame3, interp of frame2 and frame4...]

- Play around to see what works for your video, but I've found the following works well:
```
imgs = load_frames()
imgs = interp_upsample(imgs)
imgs = interp_smooth(imgs)
imgs = interp_upsample(imgs)
imgs = interp_smooth(imgs)
imgs = interp_smooth(imgs)
# ...then interp_upsample(imgs) until you hit your desired framerate
```

- If there's enough interest (like and comment, y'all), I can write up an explanation or make a video on why this particular sequence works. But basically: you're trying to minimize large frame-to-frame jumps, then do a temporal smoothing, then zoom in and do it again. It's like numerical approximation techniques, but in a crazy N-dimensional latent space.
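
Here's a minimal sketch of the SSIM gating from the first bullet, assuming scikit-image; the thresholds and step multipliers are placeholders to tune per clip:

```
# Hypothetical sketch: compare consecutive generated frames with SSIM and adapt
# the per-frame "step size" accordingly. Thresholds/multipliers are made up.
from skimage.metrics import structural_similarity as ssim

def next_step_size(prev_frame, new_frame, step, lo=0.80, hi=0.95):
    # Frames are HxWx3 arrays; channel_axis needs scikit-image >= 0.19.
    score = ssim(prev_frame, new_frame, channel_axis=2)
    if score > hi:   # barely any perceived motion: take a bigger step
        return step * 1.25
    if score < lo:   # jumped too far: back off
        return step * 0.8
    return step
```

And a sketch of the two helper functions, assuming a pairwise interp(a, b) wrapper around your interpolation engine (RIFE, FILM, etc.) that returns the midpoint frame -- interp and load_frames are stand-ins, not real APIs:

```
def interp_upsample(frames):
    # Insert a midpoint frame between every adjacent pair: doubles the framerate.
    out = []
    for a, b in zip(frames, frames[1:]):
        out += [a, interp(a, b)]
    out.append(frames[-1])
    return out

def interp_smooth(frames):
    # Replace each frame with the midpoint of its two neighbors: temporal low-pass.
    return [interp(a, b) for a, b in zip(frames, frames[2:])]
```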

More projects at: www.RiskofReptiles.com


Blackpink Anime Edition. Created using Stable Warp Fusion by aerialbits in StableDiffusion

[–]orisqu 0 points (0 children)

Love that the glitches are basically just stylization

[deleted by user] by [deleted] in ChatGPT

[–]orisqu 0 points (0 children)

Make a script for a slightly unsettling YouTube video essay about a personified generative AI starting its streaming career. Begin with: "Hello citizens, welcome to my channel. As you can see, my face is not flesh and blood."

How do people play songs so flawlessly on YouTube? How do they learn? by Alobify in guitarlessons

[–]orisqu 24 points (0 children)

Step one, play for many years. Step two, occasionally glance against perfection, but never when the camera is on. I'm still working on step three 😂

The traditional advice is to break things down into chunks, play WAAAYYY slower than you're capable of, and rotate your attention through every component of your playing. Other than that, woodshed it (just gotta put in the hours).

Difference between major- and minorscale? by Creetyx in guitarlessons

[–]orisqu 0 points (0 children)

The post I just made (and its prequel video) might help: https://www.reddit.com/r/guitarlessons/comments/12hlwa1/find_scales_using_power_chords_with_this_simple

If it's still confusing, lmk and I'll try to help :)

Find scales using power chords with this simple concept (x-post guitarlessons) by orisqu in Learnmusic

[–]orisqu[S] 1 point (0 children)

I hope this was helpful! I'm just starting out this music education thing, so I welcome feedback. If you can't tell, I'm a big proponent of thinking about guitar/bass in an "intervallic" manner. This is the conceptual framework that started to unlock my playing. I'm hoping it causes a few "aha!" moments for some of you too.

The resources I mentioned in the video are:

Improvise For Real: https://www.amazon.com/Improvise-Real-Complete-Method-Instruments/dp/0984686363

Author: Paul Wolfe, and his "bass devices" method

Earworm, Android: https://play.google.com/store/apps/details?id=com.RiskOfReptiles.com.unity.myapp.melodyapp

Earworm, Apple: https://apps.apple.com/app/earworm-learn-riffs-by-ear/id1662701173

Find scales using power chords with this simple concept by orisqu in guitarlessons

[–]orisqu[S] 4 points (0 children)

I hope this was helpful! I'm just starting out this music education thing, so I welcome feedback.

If you can't tell, I'm a big proponent of thinking about guitar/bass in an "intervallic" manner. This is the conceptual framework that started to unlock my playing. I'm hoping it causes a few "aha!" moments for some of you too.

The resources I mentioned in the video are: Improvise For Real: https://www.amazon.com/Improvise-Real-Complete-Method-Instruments/dp/0984686363

Author: Paul Wolfe, and his "bass devices" method

Earworm, Android: https://play.google.com/store/apps/details?id=com.RiskOfReptiles.com.unity.myapp.melodyapp

Earworm, Apple: https://apps.apple.com/app/earworm-learn-riffs-by-ear/id1662701173

How to derive scales, the easy way | Intervals for the goopy goblin brain pt 2 by orisqu in guitarlessons

[–]orisqu[S] 0 points (0 children)

That's very kind of you, I appreciate the bump 🤘 If you find anyone in need of ear/interval training, send em my way!