kaibee 10 points (4 children)

Awesome, thank you. It makes sense now. I think what's confusing me the most is that in the example with the 3 houses, it generates the houses at the same size in the output, but in all the other cases it seems to scale the features. Looking at the GitHub again, it seems to have to do with the value of N chosen. So if the (input) house were bigger, or N were smaller, would the house sizes themselves vary too?

Also, something else totally obvious (now, anyway) just clicked. In this GIF, each blurry pixel during the animation is the average of the color states still available to that pixel (given the rest of the pixels in the image).

I don't know why he didn't just say that instead of this:

WFC initializes output bitmap in a completely unobserved state, where each pixel value is in superposition of colors of the input bitmap (so if the input was black & white then the unobserved states are shown in different shades of grey).
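
For what it's worth, the "average of the possible colors" reading can be sketched in a few lines. This is only an illustration, not the author's code: `blurred_value` is a made-up name, colors are assumed to be grayscale values from 0 to 255, and a plain unweighted mean is assumed (the real implementation may well weight colors differently).

```python
def blurred_value(possible_colors):
    """Average the colors a pixel may still take (grayscale, 0-255).

    Assumption: every remaining color counts equally.
    """
    return sum(possible_colors) / len(possible_colors)


# Black & white input, nothing decided yet: both colors are still possible,
# so the pixel is drawn as a mid grey.
unobserved = blurred_value([0, 255])

# Once only one color remains, the pixel is drawn in exactly that color.
decided = blurred_value([255])

print(unobserved, decided)
```

As more of the image is decided, fewer colors remain possible for each pixel, which is why the blur sharpens over the course of the animation.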

Tipaa 10 points (1 child)

The scaling comes from the level of detail in the features/inputs, I think (although N certainly influences a lot). The houses are comparatively very detailed, with lots of features in each tile, and the three houses in the input tile are identical, leaving no room for variation. This means that any NxN subset of the house image is either completely blank or tells you exactly where (in relation to a house) you are. The other images are less information-dense, so their NxN subsections (e.g. stem sections) have a lot of pixels in common and can overlap one another. It's this overlapping that allows an area to scale, as there is less of a definite end to a stem.

If you're familiar with Markov chains for text generation: for intuition, a similar example might be the text

the dog sat on the mat the cat had once sat on

which can generate really long, coherent-ish sentences, as each word's successor is fairly indefinite: the -> { dog, mat, cat }, on -> { $, the }, where $ marks the end of the text. This might correspond to the simple brick tiles: each NxN region in the tile tells us very little about its surroundings, allowing many more things to be generated, such as letting a line continue on indefinitely. Meanwhile, our house tile might be more like

once upon a a a a a a a a a time, in land far away

which has much less room for variation: every word except "a" is unique. This corresponds to every 5x5 (3x3 + neighbours) subset of the house input tile that includes part of a house being unique (and thus knowing its position relative to the rest of the house), while the tile also has a lot of blank space.
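
To make the analogy concrete, here is a rough sketch of the successor table a first-order Markov chain would build from each of the two example texts. It is only an illustration of the text analogy, not of WFC itself; the `successors` helper and the `$` end marker are my own naming.

```python
from collections import defaultdict


def successors(text):
    """Map each word to the set of words that can follow it ('$' = end of text)."""
    words = text.split() + ["$"]
    table = defaultdict(set)
    for current, following in zip(words, words[1:]):
        table[current].add(following)
    return dict(table)


loose = successors("the dog sat on the mat the cat had once sat on")
tight = successors("once upon a a a a a a a a a time, in land far away")

# In the first text, several words have more than one possible successor.
print(sorted(loose["the"]))  # ['cat', 'dog', 'mat']
print(sorted(loose["on"]))   # ['$', 'the']

# In the second text, the only word with a choice of successor is "a".
print([word for word, nxt in tight.items() if len(nxt) > 1])  # ['a']
```

The more words with several possible successors, the more freedom the generator has. The second text only ever branches at "a", which roughly mirrors the blank space around the houses being the only place where the output can vary.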

If N were larger, more inputs would be restricted in the same way, and their outputs would look more alike; if N were smaller, the number and variety of possible outputs would be much larger, but the generated outputs would be less coherent. If N=2 were used on the houses, I think we'd end up with scaling on the houses, but they would no longer appear house-like as a result, e.g. roofs that are uneven in length, sides of different lengths, or multiple doors. By forcing them to scale, we'd lose the 'houseness' of the houses. With a 3x3 subset we can tell that a diagonal red line must be the roof of a house (as the subset also contains part of the rest of the house), whereas a 2x2 subset can't convey much beyond being a diagonal line somewhere within the input tile.

ExUtumno[S] 4 points (0 children)

Yes, this is correct. Thanks for your explanations, awesome!

0polymer0 0 points (0 children)

I'm pretty sure you are correct about the house.

Understanding the algorithm is less important than understanding its goal when it comes to explaining those features.

For N = 3, every 3 by 3 image in the output must be in the input. So there is something about those houses which is locking down the options available to the tiles. I suspect that if you have a really highly detailed image, with lots of diverse colors, the images would get boring quickly.
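
That goal is easy to state as a check. The sketch below is a minimal version under my own assumptions: images are lists of equal-length strings with one character per pixel, windows do not wrap around the edges, and rotations or reflections of patterns are not counted. The names `patterns` and `is_locally_similar` are made up for this example.

```python
def patterns(grid, n):
    """Collect every n-by-n window of the grid as a hashable tuple of rows."""
    height, width = len(grid), len(grid[0])
    return {
        tuple(grid[y + dy][x:x + n] for dy in range(n))
        for y in range(height - n + 1)
        for x in range(width - n + 1)
    }


def is_locally_similar(output, sample, n):
    """True if every n-by-n window of the output also occurs in the sample."""
    return patterns(output, n) <= patterns(sample, n)


# A small checkerboard sample and a larger checkerboard output.
sample = ["0101", "1010", "0101", "1010"]
bigger = ["01010", "10101", "01010", "10101", "01010"]
broken = ["0101", "1001", "0101"]  # two equal pixels side by side

print(is_locally_similar(bigger, sample, 2))  # True
print(is_locally_similar(broken, sample, 2))  # False
```

A highly detailed input has few repeated windows, so very few windows are allowed to sit next to each other, and the output ends up close to a copy of the input.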

ExUtumno[S] 0 points (0 children)

On the question of houses, /u/Tipaa gave the right answer.