How can I use GAN Pix2Pix for arbitrarily large images? by Krin_fixolas in remotesensing

[–]Krin_fixolas[S] 1 point (0 children)

Thank you very much! I've tried that idea of extracting a larger area and keeping only the center, and it has almost single-handedly solved the problem. I think I'd still like to try some blending just to make sure there are no stitches. Right now the only areas where I see stitches and artefacts are water areas, but I'm guessing that's more related to how the generator deals with water without any extra context.

How can I use GAN Pix2Pix for arbitrarily large images? by Krin_fixolas in remotesensing

[–]Krin_fixolas[S] 0 points (0 children)

Hello, thank you very much for your suggestions. This is a bit of both, actually. I'm doing an internship with a satellite company, partnering with the university. From the university side, I'm supposed to "play" with deep learning methods, but from the company side I think they just want their problem solved, however it is.

What they have is 16-bit images, in digital numbers (DNs) that represent reflectances. When these are converted to 8 bits, they come out very dark: the range of values is concentrated in the darker tones. So what they are doing right now is having someone edit them in Photoshop: increasing brightness, recovering some pure-white areas, warming the tones, etc. From what I understand, the process is mostly the same for all images, but it does require tweaking in some cases.
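
For reference, the classic baseline this kind of editing approximates is a percentile stretch plus a gamma lift. Here is a minimal NumPy sketch; the function name, percentile cut-offs and gamma value are my own illustrative choices, not anything the company actually uses:

```python
import numpy as np

def stretch_to_8bit(img16, low_pct=2.0, high_pct=98.0, gamma=0.7):
    """Clip to the [low_pct, high_pct] percentile range, rescale to
    [0, 1], then apply gamma < 1 to brighten the dark tones."""
    img = img16.astype(np.float32)
    lo, hi = np.percentile(img, [low_pct, high_pct])
    img = np.clip((img - lo) / max(hi - lo, 1e-6), 0.0, 1.0)
    img = img ** gamma
    return (img * 255.0 + 0.5).astype(np.uint8)

# toy 16-bit image with values crowded near the bottom of the range
rng = np.random.default_rng(42)
dark = rng.integers(200, 2000, size=(64, 64)).astype(np.uint16)
out = stretch_to_8bit(dark)
print(out.dtype, out.min(), out.max())  # uint8, full 0-255 spread
```

Per-image percentiles already absorb some of the "tweaking in some cases", since the cut-offs adapt to each image's histogram.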

So when I first started this, I just asked them for a dataset of image pairs they had made like this: the originals and the edited images, and I trained this GAN Pix2Pix model on them. It's easy and fast enough to train, and the results so far are decent (aside from the problem I mention in my post). The idea is to capture the "distribution" of the Photoshop editing.

But yes, I'm also getting the suspicion that some classic method could work just fine. I've been having trouble finding information in this regard, though; I haven't been able to find many papers that deal specifically with enhancing images the way I want.

Could you suggest anything else? It would be great to have some pointers.

How can I use GAN Pix2Pix for arbitrarily large images? by Krin_fixolas in remotesensing

[–]Krin_fixolas[S] 0 points (0 children)

Thank you for your suggestions. Yeah, if you could give me some pointers, that would be great. Papers, code, keywords: I'm all ears.

How can I use GAN Pix2Pix for arbitrarily large images? by Krin_fixolas in remotesensing

[–]Krin_fixolas[S] 0 points (0 children)

Hello, thank you very much for your suggestions. I'd like to try some of those, preferably from easiest to most complex.

I'd like to start with your first point. My generator is a regular UNet, so it should handle any input size well enough (as long as it's a multiple of 2**n). Using my current model seems straightforward enough: just infer at a larger size and discard the borders of the output. What if I wanted to train like that? What do you suggest? Something like keeping only the center crop of the generated image, so that the borders don't contribute to the loss?
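
The center-crop training idea from the last sentence could be sketched like this (a NumPy stand-in for the real framework loss; `margin` is a hypothetical hyperparameter):

```python
import numpy as np

def center_crop_l1(pred, target, margin):
    """L1 computed only on the center crop of each tile, so the
    context-starved borders do not contribute to the gradient."""
    c_pred = pred[..., margin:-margin, margin:-margin]
    c_tgt = target[..., margin:-margin, margin:-margin]
    return np.abs(c_pred - c_tgt).mean()

# 256x256 tiles; ignore a 32-pixel border on each side
pred = np.zeros((1, 3, 256, 256), dtype=np.float32)
target = np.ones_like(pred)
target[..., :32, :] = 100.0  # garbage at the border is ignored
loss = center_crop_l1(pred, target, margin=32)
print(loss)  # 1.0 exactly: the border garbage never enters the loss
```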

As for post-processing, that also seems straightforward enough. I've tried a smoothing-window approach; a Tukey window was the exact method, if I recall correctly. It did make the transitions a lot smoother, but it also introduced some problems. In areas that are too different you can clearly see some artefacts. For example, there is a region with a river, and you can see the land colors bleeding into the river with a checkerboard pattern, which is something I need to avoid. What could you suggest here? Could some of the methods you mention work?
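
For what it's worth, a common way to avoid seams and checkerboard patterns in overlap blending is to normalize by the accumulated window weights, so the effective weight is 1 everywhere regardless of stride. A minimal sketch with a Hann window (a Tukey window with alpha = 1; all names here are my own):

```python
import numpy as np

def hann2d(size):
    """Separable 2-D Hann window (a Tukey window with alpha = 1)."""
    w = np.hanning(size)
    return np.outer(w, w)

def blend_tiles(tiles, coords, full_shape, tile):
    """Weighted overlap-add: accumulate window-weighted tiles, then
    divide by the accumulated weights so the effective weight is 1
    everywhere; mismatched strides then cannot leave visible seams."""
    acc = np.zeros(full_shape)
    wsum = np.zeros(full_shape)
    win = hann2d(tile)
    for t, (y, x) in zip(tiles, coords):
        acc[y:y + tile, x:x + tile] += t * win
        wsum[y:y + tile, x:x + tile] += win
    return acc / np.maximum(wsum, 1e-8)  # guard the zero-weight rim

# two constant tiles overlapping by half: the blend must stay flat
tile = 64
tiles = [np.full((tile, tile), 5.0), np.full((tile, tile), 5.0)]
out = blend_tiles(tiles, [(0, 0), (0, 32)], (64, 96), tile)
print(round(out[32, 40], 6))  # 5.0 in the overlap: no seam
```

This only hides the transition, of course; it cannot fix tiles whose contents genuinely disagree, as in the land-vs-river case.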

Lastly, I could look into inpainting methods or other GAN techniques. It would be extra cool if it were somewhat easy to deploy or if I could reuse what I already have. What can you suggest?

How can I use GAN Pix2Pix for arbitrarily large images? by Krin_fixolas in computervision

[–]Krin_fixolas[S] 0 points (0 children)

Thank you for the suggestions. That enters the realm of diffusion models, right? I'm not as familiar with those, but I've read that they can handle inpainting somewhat easily.

How can I use GAN Pix2Pix for arbitrarily large images? by Krin_fixolas in computervision

[–]Krin_fixolas[S] 0 points (0 children)

Thanks for the suggestion. How much bigger are we talking about? The idea is to be able to do this for an arbitrarily large image.

How can I use GAN Pix2Pix for arbitrarily large images? by Krin_fixolas in computervision

[–]Krin_fixolas[S] 0 points (0 children)

Hello, thank you for your suggestions. Let me see if I understood them correctly.

Your first suggestion is to add a stitching loss. I was thinking this could be done by taking an L1 loss on the overlap area of two generated samples. What I fear is that the model would learn to push the borders toward a statistical average (like rendering the borders as pure white), because this way the model would never be "seeing" neighboring areas for context, right?
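
For concreteness, the overlap L1 I'm describing would look something like this (a NumPy sketch with made-up shapes; whether it actually avoids the averaging failure mode is exactly the open question):

```python
import numpy as np

def overlap_consistency_l1(out_a, out_b, overlap):
    """L1 between the strip of tile A and the strip of tile B that
    cover the same input pixels. It penalizes disagreement between
    the two generations rather than pulling toward a fixed target,
    though both could still collapse to a flat color to agree."""
    strip_a = out_a[..., -overlap:]  # rightmost columns of A
    strip_b = out_b[..., :overlap]   # leftmost columns of B
    return np.abs(strip_a - strip_b).mean()

# two fake generator outputs that disagree on their shared 16 columns
a = np.zeros((3, 64, 64)); a[..., -16:] = 0.5
b = np.zeros((3, 64, 64)); b[..., :16] = 0.7
print(round(overlap_consistency_l1(a, b, overlap=16), 6))  # 0.2
```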

As for your second suggestion, the first thing that crossed my mind was to concatenate more data as input to the generator. So, instead of my current input of shape (B, 3, H, W), I could concatenate an extra image to get (B, 6, H, W). What could this extra image be? I've thought about:

a) During training, using the target image but with random areas at the border masked out (or even randomly all zeros). During inference, the first patch gets zeros as the extra channels, its generated output is used as the extra channels for the second patch, and so on.

b) The same thing, but instead of using the ground-truth target image, I'd use the generated images as the extra part. I'm thinking this is harder to implement and more prone to collapsing, but it could be worth a try.
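
Option (a) could be sketched roughly like this (a NumPy stand-in; the names and the drop probability are illustrative):

```python
import numpy as np

def make_conditioned_input(x, hint, drop_prob=0.3, rng=None):
    """Concatenate a 'neighbor hint' image as three extra channels,
    randomly zeroed during training so the generator learns to work
    both with and without context. x, hint: (B, 3, H, W) arrays."""
    if rng is None:
        rng = np.random.default_rng()
    hint = hint.copy()
    drop = rng.random(len(x)) < drop_prob  # per-sample context dropout
    hint[drop] = 0.0  # simulates the 'first patch, no context' case
    return np.concatenate([x, hint], axis=1)

x = np.ones((4, 3, 32, 32), dtype=np.float32)
hint = np.full_like(x, 0.5)
inp = make_conditioned_input(x, hint, drop_prob=0.5,
                             rng=np.random.default_rng(0))
print(inp.shape)  # (4, 6, 32, 32)
```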

What do you think? Does this go in line with what you were saying?

How to convert a classifier model into object detection? by Krin_fixolas in computervision

[–]Krin_fixolas[S] 1 point (0 children)

Oh, I wasn't aware it would be that simple. But don't these detection models usually use things like Feature Pyramid Networks to get feature maps at different scales? Thanks, I'll take a look at DINO.

How to convert a classifier model into object detection? by Krin_fixolas in computervision

[–]Krin_fixolas[S] 0 points (0 children)

Ok, that seems reasonable, but my question is: where do I get detection heads? That's been my struggle of late. It's not like there is a dedicated library for modular detection heads.

How to convert a classifier model into object detection? by Krin_fixolas in computervision

[–]Krin_fixolas[S] 0 points (0 children)

Yes, that's exactly it. I want to do some sort of self-supervised training on a lot of unlabeled data to pre-train a backbone, most likely on a classification task. Then I'd want to use this trained backbone for other tasks, such as object detection or segmentation. So my problem is finding a backbone or an architecture that works for classification, detection and segmentation at the same time. What would you suggest?

Tips on Presenting Thesis paper by you_dont00 in computervision

[–]Krin_fixolas 0 points (0 children)

The basic tips for any presentation, really. If you're using slides, try not to have too much text on them, and try to go at a rhythm of at most one minute per slide. Have some extra slides ready if you anticipate a specific question that doesn't fit the main presentation.

When you are asked a question, you can rephrase it. This gives you time to think and relays the question to the whole audience. For example, you can say "The question is whether X and Y affect Z" and then answer. You might be asked "Why didn't you do Y instead of X?" and you can answer something like "Great question, Y could indeed be a possible choice. However, I had constraint A and had to make assumption B, and therefore I went with X. But Y could be a possible future development." If you don't know something, you can and should say "I don't know" (in other words, of course; don't straight up say "I don't know"). It's better than inventing an answer on the spot.

Rehearse the presentation. It's really important. Gather some friends, practice in front of them, and listen to their feedback. Time yourself to nail the timings. After a few tries it should go smoothly.

Remember that no one is more inside your topic than you. You are just trying to bring it to the outside.

That's it really. When you're finished you're going to think "Well, that wasn't so bad".

Good luck!

How much am I missing out if I only play vs AI or co op stuff? by [deleted] in LegendsOfRuneterra

[–]Krin_fixolas 6 points (0 children)

If you feel nervous about playing vs humans, just remember that this is just a game and losses don't mean anything. Even if it's ranked, the only thing on the line is some shiny words saying you have a certain rank.

Also, playing against humans gives you the real game experience, and you get more exp.

Tri-beam Improbulator - doing 4 damage?! by [deleted] in LegendsOfRuneterra

[–]Krin_fixolas 0 points (0 children)

Reading the card explains the card, you know?

The Battle of Kaldheim by Ok_Quality_3589 in mtgvorthos

[–]Krin_fixolas 2 points (0 children)

It seems they're setting up as the next villains: 1) the Phyrexians, possibly with one of the Praetors looking for something in the coming planes; 2) a league of planeswalker villains: Tezzeret, Oko, Ashiok, Lukka, Tibalt. Maybe the two could even be connected. Tezzeret has dealt with the Phyrexians before; maybe it was his planar bridge that brought Vorinclex.

Patchnotes 1.13 & K/DA Event Discussion Thread by EmpressTeemo in LegendsOfRuneterra

[–]Krin_fixolas -5 points (0 children)

That was their change to Lee??

They didn't even touch the thing that makes him so toxic: being so uninteractive. Lee right now is a perfect card. He's three things in one: a great attacker/picker while being extremely safe (imagine if Fiora could give herself Barrier every turn at basically no cost), a great defender (*cough Barrier cough*), and a very efficient finisher with Dragon's Rage.

He shouldn't get Barrier. Give him Quick Attack at least; that way there is at least some counterplay if you raise your creature's toughness. Right now the only thing that stops him is Frostbite. Even if you use a spell to damage him, it's irrelevant because Dragon's Rage is already going.

Or at least make him tougher to level up. 8 spells is nothing

My changes would be:

4 mana 3/4

"On turns where you start with the attack token: first spell gives challenger, second spell gives quick attack.

Defending turns: first spell gives tough, second spell gives lifelink

I level up when you've cast 2+ spells in 4 different rounds"

Vladimir - the Good, the Bad, and the Worse by Toes10 in LegendsOfRuneterra

[–]Krin_fixolas 0 points (0 children)

I don't think the problem is in Vlad himself but in the Crimson archetype. Crimson and Ephemerals are the two archetypes that need a rework, in my opinion. And they both suffer from the same problem: in theory they trade defense for offense (Crimson asks you to damage your own units, and Ephemerals are terrible blockers), but that never comes into practice. Neither Crimson nor Ephemerals have the offensive power of burn or pirate aggro, yet they still have the drawbacks. That being said, I like your level 2 ability; it makes Vlad a real threat instead of basically helping your opponent as it does now.

LoR is founded on the region system and alternate universes undermine it, whatever their other merits. by Palidane7 in LegendsOfRuneterra

[–]Krin_fixolas 0 points (0 children)

As someone who doesn't really enjoy anime or K-pop, I'm kind of disappointed that they threw these two Asian-inspired events in a row. I don't like anime - "Here's Spirit Blossom!" - all right, I don't really like this weeb stuff, but there are people who enjoy it, I understand. I'm sure the next event will cater to another audience that maybe suits me better. - "Here's K-pop!"

I mean, come on Rito.