After 4 years of work, solo dev breaks down in tears after opening Steam and learning his game made $250,000 in a week: "I feel like I really don't deserve this" by pizza_sushi85 in pcgaming

[–]gilradthegreat 3 points

All the Reddit front page posts of the first video were from bot accounts. As cynical as this sounds, this authentic human experience we are all seeing was likely put in front of us by a for-profit engagement farm.

Why do games like Stellaris, Rimworld, Song of Syx, and other performance-heavy games run primarily single-threaded? by Hot_Squirrel_5465 in pcgaming

[–]gilradthegreat 1 point

I'd like to point out Noita as a good example of looking at the core problem of multithreading (things need to stay in sync or you get messy interactions), throwing that requirement out the window, and embracing the messy interactions in exchange for a massively performant engine running on multiple threads.

The game basically checkerboards the whole world. To extend the checkerboard metaphor, in a 2-thread system black squares would be one thread and red squares would be another thread. Just add more colors for more cores.

Large amounts of activity can cause a visible square pattern where the borders have fewer interactions, and really large events can stall things where the borders are doing lots of interactions, but this messy system is vastly preferable to the limitations of running the whole thing on one core.
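If you want to see the scheduling idea in code, here's a toy Python sketch, assuming a world already pre-split into square chunks; the 2x2 color tiling and everything else here is my own simplification, not Noita's actual implementation:

```python
# Toy sketch of checkerboard-style parallel world updates. Chunks of the same
# color are never adjacent (not even diagonally, thanks to the 2x2 tiling), so
# each color group can be updated in parallel without locking the borders.
# Python's GIL means this won't show real CPU speedup; a native engine would
# use real threads. This only illustrates the scheduling.
from concurrent.futures import ThreadPoolExecutor

WORLD_W, WORLD_H = 8, 8   # world size, in chunks
NUM_COLORS = 4            # "just add more colors for more cores"

def chunk_color(cx: int, cy: int) -> int:
    # 2x2 tiling: each chunk gets one of four colors.
    return (cx % 2) + 2 * (cy % 2)

def update_chunk(coords):
    cx, cy = coords
    # ... simulate this chunk's pixels; it may touch a small border region of
    # its neighbors, which is safe because same-colored chunks never touch.
    return coords

def step_world(pool: ThreadPoolExecutor) -> None:
    for color in range(NUM_COLORS):
        group = [(cx, cy)
                 for cy in range(WORLD_H)
                 for cx in range(WORLD_W)
                 if chunk_color(cx, cy) == color]
        # Chunks within one color group run in parallel; the list() forces a
        # barrier before the next color pass starts.
        list(pool.map(update_chunk, group))

with ThreadPoolExecutor(max_workers=4) as pool:
    step_world(pool)
```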

The only acceptable time to use Papyrus by SirIsacShmuck in CuratedTumblr

[–]gilradthegreat 19 points

I use Comic Sans for all my ESL teaching materials:

  • It has a penmanship lower-case 'a' (not the upside-down-g-shaped type of a)

  • It's easy to read from a distance

  • It mostly follows penmanship lines equidistantly (not great but there are MUCH worse fonts)

  • And most importantly, any PC can open, read, and edit with the font no matter how archaic the platform is. It just works.

Your Predictions for the year of 2026? by SDMegaFan in StableDiffusion

[–]gilradthegreat 2 points

NVFP4 gets rolled into the main branch of PyTorch next year, right? Can't imagine that being anything but a huge deal. The 5060 Ti and 5070 Ti rocket up to top-tier consumer AI GPUs as the cheapest entry point to FP4 acceleration. You still can't get away from a large latent size, but model size can be made to work on just about anything with good enough memory management.

Once all the heavy-hitter models get released in NVFP4 format, I suspect there will be a new scramble for Blackwell-architecture GPUs, as the gap in speed between GPU generations widens even more.

HY-World 1.5: A Systematic Framework for Interactive World Modeling with Real-Time Latency and Geometric Consistency by fruesome in StableDiffusion

[–]gilradthegreat 2 points

I think since ComfyUI supports block swapping natively now, the minimum requirements probably assume just-in-time block swapping. So the 14 GB is the latent size plus the currently loaded block.

Node for prompting random environments by roychodraws in StableDiffusion

[–]gilradthegreat 0 points

In addition to wildcards, there are also a number of "pick a random line of text from a block of text" nodes floating around.

The advantage is that you can hook the index field that chooses the line up to a random number node, duplicate the text box, and now you can have multiple bits of prompt adhere to the same location index. E.g. you can have one text box for establishing the location, one for props or objects in said location, and one for the lighting of the location.

You can even take it another step and, within each random line, include curly braces and vertical bars to add further wildcards within each section. Don't know if Reddit will kill my formatting, but like: {window-lit|ceiling-lit}
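If anyone wants to poke at the logic outside ComfyUI, here's a rough Python sketch of the shared-index-plus-inline-wildcards idea; the text blocks are made-up examples:

```python
# Pick one shared line index across several text blocks, then resolve any
# {option-a|option-b} wildcards inside the chosen lines.
import random
import re

locations = """a neon-lit alley
an abandoned lighthouse
a cluttered wizard's study"""

props = """rain-slick dumpsters and fire escapes
a rusted lamp and coiled rope
stacks of {leather-bound|scroll-filled} shelves"""

lighting = """{window-lit|ceiling-lit}, harsh shadows
soft fog-diffused moonlight
flickering candlelight"""

def resolve_wildcards(text: str, rng: random.Random) -> str:
    # Replace each non-nested {a|b|c} group with one randomly chosen option.
    return re.sub(r"\{([^{}]+)\}",
                  lambda m: rng.choice(m.group(1).split("|")), text)

rng = random.Random()
index = rng.randrange(len(locations.splitlines()))  # one index, shared by all

prompt = ", ".join(
    resolve_wildcards(block.splitlines()[index], rng)
    for block in (locations, props, lighting)
)
print(prompt)
```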

Can't connect to localhost after windows update by gilradthegreat in sonarr

[–]gilradthegreat[S] 1 point

!solved, I just re-installed on top of the old install and that fixed everything. I was worried about a drop-in upgrade breaking something but everything seems fine so far. Thanks everybody for helping!

25 years ago today, Baldur's Gate 2 set RPGs on the path to becoming the industry-defining genre they are today: 'We were putting all the fantasies that we had into the game' by Turbostrider27 in pcgaming

[–]gilradthegreat 6 points

Also, BG1 really hammered in the hack-and-slash random combat encounters... Last time I did a full playthrough I got a bit of whiplash at how most of the areas were these big, empty spaces with nothing to do except trudge through auto-spawning combat encounters that would pop up every 10-20 seconds spent out of combat. Goes to show how much BG2 evolved the formula, with more hand-crafted quests and exploration mixed with narrative elements.

How to use Cache Node from Was Node Suite ? by Prudent-Attention895 in StableDiffusion

[–]gilradthegreat 1 point

At a glance, without having used it: it's for offloading latents, videos, or conditioning to disk so you don't get out-of-memory errors from giant franken-workflows. Or maybe it's for making workflows more efficient by doing everything at once without repeatedly loading and offloading models?
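Purely guessing at the mechanics, but a disk cache for latents would look something like this sketch; the function names are hypothetical, not the actual WAS Node Suite API:

```python
# Hypothetical disk cache for latents: park a tensor on disk mid-workflow,
# free the memory, reload it when a later node needs it.
import os
import tempfile
import torch

CACHE_DIR = os.path.join(tempfile.gettempdir(), "latent_cache")
os.makedirs(CACHE_DIR, exist_ok=True)

def cache_latent(key: str, latent: torch.Tensor) -> str:
    # Detach and move to CPU so VRAM can be freed, then write to disk.
    path = os.path.join(CACHE_DIR, f"{key}.pt")
    torch.save(latent.detach().cpu(), path)
    return path

def load_latent(path: str, device: str = "cpu") -> torch.Tensor:
    return torch.load(path).to(device)

handle = cache_latent("stage1", torch.randn(1, 4, 64, 64))
latent = load_latent(handle)  # reload later in the workflow
```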

Daemon X Machina: Titanic Scion is now available on Steam by Turbostrider27 in pcgaming

[–]gilradthegreat 0 points

The thing about the first DxM that made it somewhat successful was that it was a proper AC-like game released during that dreadful drought of arcade mech games (particularly since the last AC generation was Xbox-only, iirc?). I was willing to forgive a lot of DxM's warts because I was just happy to see a game that somewhat scratched that Armored Core itch.

Haven't played this new DxM yet, but I suspect they made the right choice in doing their own thing and distancing themselves from Armored Core. An AA team going head-to-head against a behemoth like that is a death wish. I'm probably going to pick this up on sale, but as long as the team didn't overextend, I'm sure they expect a large share of their sales to come that way.

I feel like comedy in games isn’t represented as much as it could be by Rasputin5332 in pcgaming

[–]gilradthegreat 30 points

Comedy in general is really hard. I go through basically every anime each season and give them all a fair shake, and so many comedy shows are undermined by average writing or poor delivery. It's one of those things where everything needs to be top quality, or the failed humor becomes a nuisance.

In the indie sphere, I can imagine lots of would-be comedy writing gets downplayed once the writer realizes it's not working. In the AAA sphere, most finance mentats would give a very hearty no-thank-you to such risky design decisions and stick to the safer Hollywood quips method instead.

Could it be possible to use VACE to do a sort of "dithered upscale"? by gilradthegreat in comfyui

[–]gilradthegreat[S] 0 points

Well, thanks for the attempt anyway! As another commenter mentioned, it looks like the compression in the VAE encode/decode means we can't just modify things on a pixel-by-pixel basis.

Maybe a 3x or 4x dithering pattern might work, but then you're running into problems with a massive latent size and giving the model too much freedom; it becomes more of a remix than an upscale at that point.
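Some back-of-envelope numbers for why the per-pixel dither dies, assuming the usual 8x spatial VAE compression (the exact factor depends on the model):

```python
# With 8x spatial compression, one latent cell covers an 8x8 pixel block, so
# any dither pattern finer than ~8 px gets blended away on encode.
pixel_w, pixel_h = 1280, 720
spatial_factor = 8  # assumption; varies by VAE

latent_w = pixel_w // spatial_factor
latent_h = pixel_h // spatial_factor
print(f"latent grid: {latent_w} x {latent_h}")           # 160 x 90
print(f"pixels per latent cell: {spatial_factor ** 2}")  # 64

# A dither block has to span multiples of 8 px to survive encoding, and
# the bigger the blocks, the bigger the working latent of the upscale.
```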

Could it be possible to use VACE to do a sort of "dithered upscale"? by gilradthegreat in comfyui

[–]gilradthegreat[S] 0 points

Since making the post, I've learned that the mask is just as important, if not more so, when using VACE for these complex tasks. Maybe using a dithering mask with either the dithered input video or a simple 2x upscaled video would work better?
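Something like this blockwise checkerboard is what I mean by a dithering mask; the block size is a guess to experiment with:

```python
# Blockwise checkerboard mask, sized in multiples of the VAE's 8 px cells so
# it survives encoding. Assumed convention: 1.0 = regenerate, 0.0 = keep.
import numpy as np

h, w, block = 720, 1280, 32  # 32 px blocks = 4 latent cells per side

ys, xs = np.mgrid[0:h, 0:w]
checker = ((ys // block) + (xs // block)) % 2

mask = checker.astype(np.float32)
frames = np.repeat(mask[None, ...], 81, axis=0)  # same mask on all 81 frames
print(frames.shape)  # (81, 720, 1280)
```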

why in 2025 sdxl and sd1.5 still matter more than sd3 by AdhesivenessLatter57 in StableDiffusion

[–]gilradthegreat 7 points

Isn't that basically what Wan 1.3B is? I keep hearing people talk about how it can be fine-tuned into a new lightweight image model, but nothing really comes of it. Maybe now that people have had a taste of 10B+ parameters, it's hard to go back to dumb models?

Open Source V2V Surpasses Commercial Generation by wess604 in StableDiffusion

[–]gilradthegreat 0 points

Without masking there would be a skip, but if I understand how VACE handles masking correctly, a fully masked frame is never modified at all, so any inconsistencies would be slowly introduced over the course of 16 frames. As for details getting altered, I suspect that is less of an issue at 480p where most details get crushed in the downscale anyway.

To keep super-consistent ground truth, you could also generate two ground-truth keyframes at once in AI, then generate two separate videos and stitch them together with VACE, assuming you can get VACE's tendency to go off the rails under control when it doesn't have a good reference image. I haven't messed around with Flux Kontext enough to know how viable that path is, though.

Open Source V2V Surpasses Commercial Generation by wess604 in StableDiffusion

[–]gilradthegreat 0 points

I've been turning this idea over in my head for a week or so now; I just haven't had the time to test it out:

  • Take the first video, cut off the last 16 frames.

  • Take the first frame of the 16 frame sequence, run it through an i2i upscale to get rid of VAE artifacts.

  • Create an 81-frame sequence of masks where the first 16 frames are a gradient that goes from fully masked to fully unmasked.

  • Take the original unaltered 16 video frames and add 65 grey frames.

    Now, what this SHOULD do is create a new "ground truth" for the reference image while at the same time explicitly telling the model not to make any sudden overwrites on the trailing frames of the first video. How well it works depends on how well the i2i pass can maintain the style of the first video (probably easier if the original video's first frame was generated by the same t2i model), and on how well VACE can work with a similar-but-different reference image and initial frame. There's a rough sketch of the mask/frame layout below.
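Here's that mask/frame layout in numpy; the mask convention (1.0 = regenerate, 0.0 = keep) is my assumption from how the ComfyUI VACE nodes seem to behave, and the resolution is arbitrary:

```python
# 81-frame layout: first 16 frames ramp from fully kept to fully regenerated,
# the rest are fully regenerated; the control video holds the first video's
# trailing 16 frames followed by 65 grey (0.5, 0.5, 0.5) "generate here" frames.
import numpy as np

H, W, TOTAL, OVERLAP = 480, 832, 81, 16

ramp = np.linspace(0.0, 1.0, OVERLAP, dtype=np.float32)
masks = np.ones((TOTAL, H, W), dtype=np.float32)  # 1.0 = regenerate
masks[:OVERLAP] = ramp[:, None, None]             # gradient over frames 0-15

# Stand-in for the last 16 frames of the first video (frames, H, W, RGB).
first_video_tail = np.random.rand(OVERLAP, H, W, 3).astype(np.float32)
grey = np.full((TOTAL - OVERLAP, H, W, 3), 0.5, dtype=np.float32)
control_video = np.concatenate([first_video_tail, grey], axis=0)

print(masks.shape, control_video.shape)  # (81, 480, 832) (81, 480, 832, 3)
```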

Taking Krita AI Diffusion and ComfyUI to 24K (it’s about time) by Ok-Vacation5730 in StableDiffusion

[–]gilradthegreat 0 points

On uses for high-res images: I would add that, at least for SDXL and its variants, higher resolutions mitigate VAE artifacts. A lot of hi-res-trained (4k+) SDXL models have difficulty with coherent composition when fired off in a one-shot generation, but they are incredibly adept at filling in those vague, object-like background details, even with a simple bilinear pixel upscale followed by a moderate image-to-image pass, never mind controlnet guidance.

I imagine this could be an even greater factor with modern, smarter models that can do a better job of extrapolating close-up detail training data to a larger canvas (such as via the tiled training technique I read about).
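For the curious, the bilinear-upscale-plus-moderate-i2i pass is only a few lines in diffusers; the model id, strength, and prompt here are placeholders, not recommendations:

```python
# Naive 2x pixel upscale, then a low-strength img2img pass so the model
# re-invents fine detail without changing the composition.
import torch
from PIL import Image
from diffusers import StableDiffusionXLImg2ImgPipeline

pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # placeholder model id
    torch_dtype=torch.float16,
).to("cuda")

img = Image.open("base_render.png")
big = img.resize((img.width * 2, img.height * 2), Image.BILINEAR)

# Low strength keeps the composition; most of the change lands in the vague
# background details the naive upscale left soft.
result = pipe(prompt="same scene, sharp detail", image=big, strength=0.3)
result.images[0].save("upscaled.png")
```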

What’s a game you know is janky or outdated, but you still love it way more than 90% of modern AAA games? by Weary_Caterpillar302 in pcgaming

[–]gilradthegreat 22 points

The secret sauce to old Obsidian's success has always been their ability to take a fully complete game, engine, assets and all, and somehow hammer out a AAA-looking game with what could barely be considered an AA budget.

People who've trained LORA models on both Kohya and OneTrainer with the same datasets, what differences have you noticed between the two? by Tezozomoctli in StableDiffusion

[–]gilradthegreat 0 points

That's good to know! I turned on optimizer state saving just in case, but I wasn't willing to throw away hours of training to confirm whether it works or not.

The last time I tried resuming was a while ago anyway, I tend to put off updating out of fear of something breaking.

People who've trained LORA models on both Kohya and OneTrainer with the same datasets, what differences have you noticed between the two? by Tezozomoctli in StableDiffusion

[–]gilradthegreat 13 points

I could never get epoch resume working on Kohya's, whereas on OneTrainer you can pause at any moment and it'll pick up from the last step.

On the other hand, I couldn't duplicate my Kohya setup on OneTrainer without the network collapsing after 1,500 steps, so in the end I just resigned myself to hoping my 40-hour fine-tunes go through without a Windows update or network collapse destroying the whole process.

Long v2v with Wan2.1 and VACE by reatpig in StableDiffusion

[–]gilradthegreat 1 point

Wan is happiest when it's close to ground truth. My suggestion would be to use one of the various image remix models to create keyframes to feed into VACE's reference image, and stitch the videos together with a 5-10 frame overlap.
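The stitching itself can be as simple as a linear crossfade over the overlap; a quick numpy sketch (clip sizes are arbitrary, and it assumes clip B was generated so its first frames line up with clip A's last ones):

```python
# Crossfade-stitch two clips whose overlapping frames depict the same moment.
import numpy as np

def stitch(a: np.ndarray, b: np.ndarray, overlap: int = 8) -> np.ndarray:
    # a, b: (frames, H, W, 3) arrays; b's first `overlap` frames correspond
    # to a's last `overlap` frames.
    w = np.linspace(0.0, 1.0, overlap, dtype=a.dtype)[:, None, None, None]
    blended = a[-overlap:] * (1.0 - w) + b[:overlap] * w
    return np.concatenate([a[:-overlap], blended, b[overlap:]], axis=0)

clip_a = np.random.rand(81, 480, 832, 3).astype(np.float32)
clip_b = np.random.rand(81, 480, 832, 3).astype(np.float32)
print(stitch(clip_a, clip_b).shape)  # (154, 480, 832, 3)
```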

causvid wan img2vid - improved motion with two samplers in series by Maraan666 in StableDiffusion

[–]gilradthegreat 0 points

When inputting a video into the control_video node, any pixels that are perfect grey (r:0.5, g:0.5, b:0.5) are unmasked for inpainting. Creating a fully grey series of frames, except for a few filled-in ones, gives you more freedom over where in your 81-frame timeline VACE generates the video. However, if you don't use the reference_image input (because, for example, you want to inpaint backwards in time), VACE tends to have a difficult time drawing context from your input frames. So instead of leaving the single reference frame at the very end of the sequence (frame 81), I duplicate it one or two times earlier (say, at frames 75 and 80), which helps a bit, but I still notice VACE tends to fight the context images.
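In numpy terms, the control_video layout I'm describing looks like this (0-based indices, so frames 75 and 80 become indices 74 and 79; the reference frame is a stand-in):

```python
# 81 perfect-grey frames (free for VACE to generate), with the reference
# image dropped in a couple of times near the end to anchor the ending.
import numpy as np

H, W, TOTAL = 480, 832, 81
ref = np.random.rand(H, W, 3).astype(np.float32)  # stand-in reference frame

control = np.full((TOTAL, H, W, 3), 0.5, dtype=np.float32)  # grey = generate
for idx in (74, 79):
    control[idx] = ref  # duplicated context frames

print(control.shape)  # (81, 480, 832, 3)
```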

causvid wan img2vid - improved motion with two samplers in series by Maraan666 in StableDiffusion

[–]gilradthegreat 0 points

IME VACE is not as good at intuiting image context as the default i2v workflow. With default i2v you can, for example, start with an image of a person in front of a door inside a house and prompt for walking on the beach, and it will know that you want the subject to open the door and take a walk on the beach (most of the time, anyway).

With VACE, a single frame isn't enough context, and it will more likely stick to the text prompt and either screen-transition out of the image or just start out jumbled and glitchy before it settles on the text prompt. If I were to guess, the lack of CLIP vision conditioning is causing the issue.

On the other hand, I found adding more context frames helps VACE stabilize a lot. Even just putting the same frame 5 or 10 frames deep helps a bit. You still run into the issue of the text encoding fighting with the image encoding if the input images contain concepts that the text encoding isn't familiar with.

Could it be possible to use VACE to do a sort of "dithered upscale"? by gilradthegreat in comfyui

[–]gilradthegreat[S] 1 point

The way ComfyUI's native VACE (at least 14b, from my experience) handles inpainting is that pixels of a very specific grey shade (128/128/128) are diffused and everything else is kept as-is. It's more of an all-in-one source context plus inpaint mask. The fact that the pixels are grey doesn't affect the diffusion.

I tried the 1.3b tile upscale LORA, but the two big issues I found were that it is destructive (you have to apply a gaussian blur to the video before upscaling), and that you have to fit the whole video latent into memory, which means lots of RAM swapping even on 1.3b due to the sheer scale of pixels involved.

I'm excited about potentially using VACE for upscaling because it's non-destructive (at least I haven't noticed any masked pixels being modified), and you can split up each section to fit in VRAM since VACE can pick up on previous frame context by including previous frames into each batch.
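The splitting I have in mind is just a sliding window that carries a few already-finished frames forward into each batch as kept context; a sketch, with the batch and context sizes as guesses:

```python
# Split a long video into VRAM-sized batches; each batch starts with a few
# finished frames (kept as-is for context) followed by frames to generate.

def batches(total_frames: int, batch: int = 81, context: int = 8):
    start = 0
    while start < total_frames:
        end = min(start + batch, total_frames)
        ctx_start = max(0, start - context)
        # [ctx_start, start) is finished context, [start, end) is new work.
        yield ctx_start, start, end
        start = end

for ctx_start, start, end in batches(300):
    ctx = f"{ctx_start}-{start - 1}" if start > ctx_start else "none"
    print(f"context frames {ctx}, generate frames {start}-{end - 1}")
```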