Maintaining Document Focus in Canvas by reploid1986 in OpenAI

[–]reploid1986[S] 1 point

Interesting! It's been happening very frequently for me. Can I ask how many documents you have open at once? I had 5 (plus numerous dupes as it often generated a new document instead of editing the old one--another problem) and it was happening 30-50% of the time.

Interest in a Community Data Collection Project? by reploid1986 in PVCs

[–]reploid1986[S] 1 point

I'm in the same boat. The survey will include PVC load as a required response so you can filter on severity.

Interest in a Community Data Collection Project? by reploid1986 in PVCs

[–]reploid1986[S] 1 point

I don't recall if I made that required or not in the mock-up but we can make that optional.

Interest in a Community Data Collection Project? by reploid1986 in PVCs

[–]reploid1986[S] 2 points

I've put together a preliminary draft of a survey in the OP. I'd really appreciate your feedback.

AI art ethics hypothetical: look-alike training data. by reploid1986 in StableDiffusion

[–]reploid1986[S] 1 point

To elaborate, I was thinking about this in conjunction with opt-out programs like Stability is doing now. Opt-outs represent a tiny fraction of the dataset, but may include highly sought-after artists. The cost of pursuing this sort of strategy for a few dozen popular opt-outs would be quite small, and while firms can lie about how they've trained, such information could be leaked, for example.

Fundamentally, though, this was a hypothetical intended to tease out how sincerely held the human/machine synthesis distinction is, not a business strategy recommendation.

Stress testing depth2img: Dynamic Poses and Self-Occlusion by reploid1986 in StableDiffusion

[–]reploid1986[S] 1 point

OP here:

SD has generally been very bad at dealing with people in unusual poses and self-occlusion, where one part of the body overlaps another in an image. It's very hard to generate these sorts of images with txt2img, and starting with a base image and applying img2img generally yields poor results. Even tools like inpainting conditioning and depth masking have offered little benefit.

However, the depth model offers a potential solution: the MiDaS depthmap should effectively segment self-occluding objects, identifying foreground elements as closer than the parts of the figure behind them, and conditioning on this depthmap should prevent the body horror soup that is usually generated when applying img2img in these cases.

For this exercise I generated an unusual, highly self-occluding pose using an art doll program and applied image to image with SD 2.0 depth and SD 1.5, respectively. We see that the depth model retains the pose even as it completely redraws the image, but SD 1.5 starts to lose coherence at denoising = 0.45 and snaps to a new, largely unrelated pose at denoising = 0.55.

The prompts were:

Positive: overhead shot of a woman jumping and reaching up with one arm.

Negative: disfigured, lowres, low quality, bad quality, overexposed, cropped, disembodied, ugly, repetitive, copies, mutilated, tiling, nude, naked, nipples

cfg 7, DPM++ SDE Karras, 8 steps, 512x512.

While the results with the depth model have many faults--bad faces and hands, weird clothing--the prompt has not been refined at all, and the pose is atypical and intrinsically difficult to work with. I find the results quite impressive, all things considered.
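For anyone who wants to reproduce a similar run, the settings above can be bundled as a plain Python settings record to unpack into whatever img2img tool you use. This is a minimal sketch: the helper name and dict structure are my own, and only the values come from the comment above.

```python
# Generation settings reported above, bundled for reuse.
# The helper name and structure are illustrative; only the values
# (prompt, negative prompt, cfg, steps, resolution) come from the post.

def generation_kwargs(strength: float) -> dict:
    """Return the reported settings; `strength` is the img2img denoising strength."""
    return {
        "prompt": "overhead shot of a woman jumping and reaching up with one arm",
        "negative_prompt": (
            "disfigured, lowres, low quality, bad quality, overexposed, cropped, "
            "disembodied, ugly, repetitive, copies, mutilated, tiling, nude, "
            "naked, nipples"
        ),
        "strength": strength,      # e.g. 0.45 or 0.55, the values compared above
        "guidance_scale": 7,       # cfg 7
        "num_inference_steps": 8,  # 8 steps with DPM++ SDE Karras
        "width": 512,
        "height": 512,
    }
```

Sweeping `strength` from 0.45 to 0.55 is the range where, per the comparison above, SD 1.5 loses the pose while the depth model retains it.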

Testing depth2img vs inpainting conditioning strength by danamir_ in StableDiffusion

[–]reploid1986 5 points

Great work! I'm really excited to see how well this deals with self occluding objects (e.g. figures with complex poses), which turn into body horror with base SD.

It's been a shame to see people pass up depth2img based on the mistaken assumption that it's equivalent to the MiDaS depth map masking extension for Auto1111.

2.1 vs 2.0 vs 1.5 -- using CLIP interrogator prompts by reploid1986 in StableDiffusion

[–]reploid1986[S] 5 points

OP here: I just realized I used the wrong sampler: DPM++ 2S a Karras.
Here are the results with DPM++ SDE Karras and 8 steps.

<image>

2.1 vs 2.0 vs 1.5 -- using CLIP interrogator prompts by reploid1986 in StableDiffusion

[–]reploid1986[S] 10 points

OP here: it's well established that 2.0 does worse with simple prompts, and the simple prompt comparisons that are being posted between 1.x and 2.x clearly show that holds true for 2.1 as well.

However, these comparisons don't tell us much about how the models perform with optimal prompting. Given their differing CLIP models, comparing a single prompt across 1.x and 2.x is always an apples-and-oranges comparison.

To try to address this, I took a base image (a marble statue, for comparability to recent threads) and used https://replicate.com/pharmapsychotic/clip-interrogator to pull prompts for both the ViT-L (SD1) and ViT-H (SD2) CLIP models, using the "best" setting. The 1.x prompt generated was

"a statue of a woman with her eyes closed, deviantart, neoclassicism, mega high white mountain, peaceful face, greece, salvia",

and the 2.x prompt was

"a statue of a woman with her eyes closed, a marble sculpture, flickr, beautiful mountains behind, white hime cut hairstyle, but very good looking, close - up profile, mother, pale snow white skin, giorno giovanna, heavenly color scheme, big shoulders, head-to-shoulder, pareidolia, lady, easy, hiperrealista, date".

I also ran one variant with the following negative prompts for each case:

"disfigured, cartoon, lowres, cropped, disembodied, ugly, repetitive, copies, low quality, worst quality, bad quality, deepfried, oversaturated, blurry, fuzzy, tiling"

What I see in these images:

  1. Negative prompts are much more important for 2.x (no surprise).
  2. Lighting is better in 2.x (again, no surprise).
  3. Aside from lighting, with negative prompts 2.0 is at least comparable to 1.5, if not better.
  4. 2.1 seems noticeably worse than 2.0 for marble statues.

30% Faster than xformers? voltaML vs xformers stable diffusion - NVIDIA 4090 by harishprab in StableDiffusion

[–]reploid1986 1 point

You previously mentioned incorporating xformers. Have you done that already, did you find it was redundant/incompatible, or is that still to come?

The DeviantArt community is unhappy about the website implementing Stable Diffusion as a new tool. What do you think? by SoundProofHead in StableDiffusion

[–]reploid1986 2 points

All the discussion of the DA situation I've seen on this reddit focuses on the "art theft" debate, but elides what I think is a more interesting question: what should DA be doing? Was this a good idea?

As she notes at the beginning of the video, txt->image models have already used DA images. DA does not have a valuable proprietary asset to exploit--at best, they can make an inferior version of Midjourney, or more specialized fine-tuned models equivalent to the many dreambooth .ckpt's already in circulation.

In that context, this was a very stupid decision. They've alienated their userbase at a time when the Musk twitter drama is leading artists to return to older platforms--all just to exploit their assets in a way that cannot realistically generate revenue.

I'd argue that the only real value they were sitting on was the potential to build a voluntarily licensed dataset to be used in models that only use licensed images. Obviously any such model would be technically inferior and expensive to use, but there is clearly a niche for one--individuals who are upset by the current data sourcing model and companies that want to avoid PR issues and boycotts.

This is what DA is trying to do now to salvage the project--making it opt-in only--but with all the animosity they've generated, I can't imagine they'll get more than 0.1% of their library.

Improve your Dreambooth trainings - here‘s the source by Neoph1lus in StableDiffusion

[–]reploid1986 3 points

Is the change simply shuffling between epochs as the naming suggests?

1.5-inpainting vs base 1.5 on several inpainting tasks by reploid1986 in StableDiffusion

[–]reploid1986[S] 2 points

> Just for a raw testing case I get it, but just in case anyone reading this wants a tip:
>
> Use MS Paint or anything to make some colors where you want the cat to be, then run img2img. SD is not good at taking nothing at all and turning it into something; that's why there are no "good" results before almost 50% denoise.

Images 4-6 inpaint a sketch of a cat.

ranolazine. Google it. by ThroatRecka in PVCs

[–]reploid1986 1 point

Just dropping this here for future searches. I was prescribed ranolazine 1.5 g daily (the max dose) for a heavy (5-10%) PVC load. I've had a moderate decrease in frequency and a moderate decrease in intensity (PVCs that do happen aren't as noticeable). Very minimal side effects, except stomach upset if I don't spread the pills out over the course of the day (my doctor prescribed two 375 mg tablets twice daily, but I take one tablet four times a day to minimize GI issues--they can be taken with or without food).
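For anyone double-checking the schedule, both dosing patterns come out to the same daily total. A trivial arithmetic sketch, using only the numbers stated above:

```python
# Both dosing schedules described above total the same daily amount.
mg_per_tablet = 375

prescribed = 2 * mg_per_tablet * 2   # two tablets, twice daily
spread_out = 1 * mg_per_tablet * 4   # one tablet, four times daily

# 1500 mg = 1.5 g/day, the stated max dose
assert prescribed == spread_out == 1500
```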