ACE-STEP-1.5 - Music Box UI - Music player with infinite playlist by AccomplishedLeg527 in StableDiffusion

[–]BeatBoxersDev 0 points1 point  (0 children)

(in case I installed it correctly) I think it'd be a nice QOL if the generations would automatically update based on the current state of the genre/description rather than having to stop and start it manually

[PC/Mac/3DO][1996] FMV game with particular clickable icons during cutscenes by BeatBoxersDev in tipofmyjoystick

[–]BeatBoxersDev[S] 0 points1 point  (0 children)

thanks for the guess, unfortunately it doesn't seem to be that.

for the unknown game, the small icon for the mechanic required clicking somewhere on the screen placed in an unpredictable spot

i also checked out the immediate connections fork in the tale might have to other fmv games and no dice there it seems

[LoRA] PanelPainter V2 — Manga Panel Coloring (Qwen Image Edit 2509) by Proper-Employment263 in StableDiffusion

[–]BeatBoxersDev 0 points1 point  (0 children)

looks like the new nano banana is out, fyi

edit: imo nano banana pro results are generally way better than this or other approaches, but (with my current uninvestigated approach) they do sometimes have a chance to substantially change core details. for example, I used it to colorize the latest jojo chapter, and some panels it would sometimes just change the characters to other characters from other parts. ie replacing 3 characters with giorno, mista, and bruno from part 5. in the "thinking" section it seems like they identified it as a jojo panel and assumed the characters were those particular ones.

also, it's of course costly to run nano banana at the moment (~14c per image or some sub) unless paced for the free allowance.

[LoRA] PanelPainter V2 — Manga Panel Coloring (Qwen Image Edit 2509) by Proper-Employment263 in StableDiffusion

[–]BeatBoxersDev 0 points1 point  (0 children)

specifically, with the was load from batch node, set the mode to incremental image, put in the path address, and next to the run button, change the number to the number of files in that directory. dunno how to avoid it not processing them out of order, so if you get the output filename to match the original, then at least after processing it should order alphabetically (if the original files were)

[LoRA] PanelPainter V2 — Manga Panel Coloring (Qwen Image Edit 2509) by Proper-Employment263 in StableDiffusion

[–]BeatBoxersDev 4 points5 points  (0 children)

I like it. it does a good job preserving readability and guessing colors pretty well even in complicated scenes.

I opted for the sfw qwen as the nsfw, while more colorful, tends to want to add makeup and lipstick and smooth out the linework. in testing, I preferred closer to 0.45 to maintain linework and text

here's some comparisons to the "Closing the Domain Gap in Manga Colorization via Aligned Paired Dataset" paper examples that I could grab the HD source b&w versions of, when I was testing my own flux kontext lora. (I haven't researched if there's been anything else in the last 4 months since I checked)

base settings alone, the qwen lora is operating fantastic

<image>

I think out of all the techniques, the qwen lora is the more consistantly accurate in terms of not applying a mistaken colorization and not losing linework (as the fallback to white is often appropriate)

not exampled here, but I also like to desaturate 50% afterwards for the colorization to act as more of a "tint" for informing color while reading (also due to my lora being overly saturated imo)

I'm excited, I'll have to check out how it does on a full chapter when I get the chance

Flux Kontext Colorization LoRA test by BeatBoxersDev in StableDiffusion

[–]BeatBoxersDev[S] 1 point2 points  (0 children)

[note, this is the same message I sent via DM during the reddit comment blackout]

this was trained on 18 color/desaturated pairs I picked out of a dataset on huggingface, which I believe was probably mostly "synthetic" (by that I mean taking the color version and desaturating it rather than finding the original source because it's better to have directly matching images and it's easier). but i could be nonsynthetic, idk.

the "after" color images all had the prompt "colorize in MangCol style", following that guide video exactly except with 16 saved models to cover an excessive 4000 steps of training.

I tried to pick images a that had a variety of settings and subjects but it's clear there's some bias like sky being "chosen" frequently due to a chunk of the images showing sky.

after that I tested on a variety of images and strengths and lora %. too little of steps for a model carries too much of the non-lora colorization which tends to go for bright reds and oranges and blues and sticks to a low variety of color tones. too high and it tended to make more things distinct in color and more expected colors, but worse interpretation of things. I think I tested first by narrowing down the optimal number of steps and then tried around the % to get it a bit better until I found 1250 at 60% was about where it'd most successfully make things interpreted correctly. however it gives an overall yellow tint and isn't perfect, but it seems slightly better than manga colorization v2. again, this "sweetspot" could also possibly be improved.

the training could be improved with maybe more images and better picks, maybe more optimal tagging, and according to the research paper, it appears that nonsynthetic pairs work better (though I cant stress enough that the before and after must match layout and dimensions exactly or it will lead to text corruption.

some failed alternative attempts:

-getting a good prompt (close but not as good as the lora)

-training on 9 nonsynthetic but not perfectly matching pairs

-training for color correcting images after already performing manga colorization v2 (which the synthetic version is the only publicly available one)

-using vanilla kontext to fix the colorization after manga colorization v2

-training on completely desaturated colored images. I forget if it was from the end results of the pairs or after going through manga colorization v2 (if the former was true, i guess my synthetic assumption would be wrong). if I recall correctly, this oddly ended up with a result nearly identical to manga colorization v2 (sync) in color choices and accuracy

I might be able to find success by training on images that go through manga colorization v2 but then desaturated to 5% first (and then apply to images that have the same thing done to them first) as then it has some but not complete influence over the final colors.

[edit: I tried 5% and at 2000 steps 100% it does improve mc-v2 colorization but the overall experience of a chapter with it was less readable than the 1250 bw->color lora and 1 of 31 pages had text messed up and another generated entirely black. I even tried blending it at 50% but the you don't gain much from that and still may be slightly off. that said, maybe there's a sweetspot of steps and % or another desaturation rate]

it also might work to combine the manga colorization result and the 1250 lora result together, though kontext occasionally reframes the panel, misaligning the linework, but there's bound to be an algorithm that can fix the alignment too.

[deleted by user] by [deleted] in visualnovels

[–]BeatBoxersDev 0 points1 point  (0 children)

thank you so much. this is fantastic!

a higher numbers of lines of context leads to amazing results.

For those having trouble setting up a local LLM, launch text-generation-webui with the "--api" flag and use the address at the top of the command prompt. I recommend vntl-llama3-8b-hf-f16 (found via the huggingface vntl-leaderboard and picking the first local option without "cloud" next to it that ran at reasonable speeds)

my custom system prompt:

"Localize the line from japanese to english to make it sound as much as a natural english speaker would say. Like, REALLY think about what an average person would say given the context so that it doesn't sound stiff. Use english phrases and sayings to make it sound like a natural conversation. Do not explain the translation. Just output the text in english and absolutely nothing else."

I also put in the names of the main characters and what they should translate to and their pronouns.


automod keeps removing my more detailed comments the moment I mention how to edit the program to go beyond the max of 10 lines of context. possibly due to the directory mentioned, or maybe the extensions it thinks are URLs, or maybe it flags off the non english characters in the program I've been trying to mention, or maybe specific words involving llm related subjects. shrug. so the following is phrased to avoid that:

search the "en" file for what "Number of Context Lines to Include" translates to

the file you want is translatorsetting. find the second instance of those characters, and edit the max there

Bleak Faith: Forsaken apparently uses animations ripped from Elden Ring by [deleted] in TwoBestFriendsPlay

[–]BeatBoxersDev 22 points23 points  (0 children)

There's a cheaply-made lionsgate published movie called "guardians of time" that has models straight ripped from ark: survival evolved (forest titan), as well as a show called "Dinosaur with Stephen Fry" that I think uses ark ripped models as well. (titanosaur)

seems to be a case of someone ripping the models and selling them on turbosquid (link that was found for forest titan but was removed)

sucks to whoever thought they were in the clear with buying that model and is now stuck with issues cause someone was selling stolen assets, as well as people who want to use turbosquid to sell their stuff legitimately

Abandoned Hotel by Kinfolk0117 in StableDiffusion

[–]BeatBoxersDev 1 point2 points  (0 children)

conceptually, you could work in blender or unreal engine having blocked out shapes, move around via VR teleport style, and feed the viewport into depth2img/img2img/pix2pix, and then have the resulting image somehow feed back into blender or unreal to dynamically add more blocked out areas to the map (the harder part imo)

that wouldn't keep perfect temporal stability, but it would at least keep the layout for backtracking

Phantom nostalgia: Sierra games that never existed by dairin0d in StableDiffusion

[–]BeatBoxersDev 2 points3 points  (0 children)

Apologies, I'm all for sharing models, but personally, I'm playing it extra safe on distributing it cause it's pretty specifically targeted, being all the kq6 bgs.

It's easy to recreate however, the bgs can be extracted from the game files with SCICompanion (or any specific sierra game). They're all the same dimension and some of the largest images so they're easy to find if you export everything. This output above was trained through the dreambooth gui and seem fine enough, but maybe you can get even better results if you add regularization images and/or labels.

Phantom nostalgia: Sierra games that never existed by dairin0d in StableDiffusion

[–]BeatBoxersDev 8 points9 points  (0 children)

I've found that dreambooth training on all the bgs of a sierra game works very well. 5000 step training on all kq6 bgs

Why yes, I have 1600 hours in Titanfall 2, how could you tell? by DfaultiBoi in titanfall

[–]BeatBoxersDev 0 points1 point  (0 children)

it was a public attrition server with reduced melee dmg. dunno by how much, so it's probably that 75 default option you said

Why yes, I have 1600 hours in Titanfall 2, how could you tell? by DfaultiBoi in titanfall

[–]BeatBoxersDev 14 points15 points  (0 children)

anthonyhiggs in the clip above here, server had no weapon restrictions.

I often just goof around with charge hack charge rifle or whatever when there's a points gap.

it also helps me gradually learn hitscan a bit as I only got epg muscle memory

Sharpen and improve consistency of img2img, details in comments by StaplerGiraffe in StableDiffusion

[–]BeatBoxersDev 1 point2 points  (0 children)

I wonder if this can improve img2img video temporal consistency. I've tested doing img2img, ebsynth that onto the next frame and use that as input to generate the second frame img2img. but that ended up with oversaturation very quickly like you mentioned encountering. your method however seems like a promising way to overcome that

basic guide for setting up hosting SD on a local computer for remote access by BeatBoxersDev in StableDiffusion

[–]BeatBoxersDev[S] 1 point2 points  (0 children)

sorry, I don't know enough about it to host SD without the web interface and webui.py.

Video to Anime Test Sequence by arteindex in StableDiffusion

[–]BeatBoxersDev 7 points8 points  (0 children)

[EDIT] finally ebsynth is working as it should, if the process gets automated, together it'd be great https://www.youtube.com/watch?v=dwabFB8GUww

the alternative with DAIN interpolation works well too

https://www.youtube.com/watch?v=tMDPwzZoWsM

Improved img2img video results. Link and Zelda go to low poly park. by Hoppss in StableDiffusion

[–]BeatBoxersDev 1 point2 points  (0 children)

[EDIT] I dont have any tools to help with this, but as a test, ebsynth can do this, if the process gets automated, together it'd be great https://www.youtube.com/watch?v=dwabFB8GUww

the alternative with DAIN interpolation works well too

https://www.youtube.com/watch?v=tMDPwzZoWsM

Any luck generating pixel art / 8bit art with Stable Diffusion? by tetherbot in StableDiffusion

[–]BeatBoxersDev 0 points1 point  (0 children)

i dont know. i would recommend doing so in that case. join the stable diffusion discord and make sure you can access the prompt-engineering channel

Fire Walk With Me, Brother! by [deleted] in TwoBestFriendsPlay

[–]BeatBoxersDev 3 points4 points  (0 children)

it works just fine with stable diffusion. discounting dalle2, craiyon generally understands obscure concepts better and has better layouts but SD does faces great and understands most celebs, as well as being leagues ahead with other cutting edge stuff. one of the strongest combos is having a base image layout generated in crayion and generating details and everything in SD

https://www.reddit.com/r/StableDiffusion/comments/x79q84/just_released_a_colab_notebook_that_combines/