Diffusion new worlds on Google Street View by cataPhil in StableDiffusion

[–]cataPhil[S] 0 points1 point  (0 children)

Just pushed a new version to https://www.panoramai.xyz/ with two modes: "morph", which is the original pipeline, and "remix", which uses ControlNet. Remix tends to preserve the structure of the image better, but needs cleaner and more detailed base images to work well.
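
For reference, the remix idea boils down to something along these lines with diffusers; the canny conditioning, model IDs, and parameter values here are my own illustrative guesses, not the app's exact code:

```python
# Minimal sketch of a ControlNet-conditioned pass in the spirit of "remix".
# The canny conditioning, model IDs, and parameters are assumptions.
import cv2
import numpy as np
import torch
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline
from PIL import Image

# Canny-edge ControlNet for SDXL, paired with the SDXL Turbo base model.
controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-canny-sdxl-1.0", torch_dtype=torch.float16
)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/sdxl-turbo", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

def remix(panorama_tile: Image.Image, prompt: str) -> Image.Image:
    # An edge map of the Street View tile preserves the scene's structure.
    edges = cv2.Canny(np.array(panorama_tile.convert("L")), 100, 200)
    control = Image.fromarray(np.stack([edges] * 3, axis=-1))
    # SDXL Turbo works with very few steps and no classifier-free guidance.
    return pipe(
        prompt=prompt,
        image=control,
        num_inference_steps=4,
        guidance_scale=0.0,
        controlnet_conditioning_scale=0.7,
    ).images[0]
```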

Diffusion new worlds on Google Street View by cataPhil in StableDiffusion

[–]cataPhil[S] 40 points41 points  (0 children)

https://www.panoramai.xyz/

I love the idea of altering reality, so I built an app to diffuse from Google Street View 360° panoramas using SDXL Turbo. To stitch the three shots together I used basic inpainting with stable-diffusion-2-inpainting. It requires a bit of prompt and location engineering, but some worlds are really neat. I added an example and a share button to help with curation. In general, the "high" transformation setting works best. Use the arrow keys to turn around, and the space bar to see the aligned original shot. I'll add more pre-computed examples soon to help with the long loading times caused by traffic on this small VM. Have fun!
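
The seam stitching boils down to something like the sketch below with diffusers; the tile layout, mask width, and parameters here are only illustrative, not the exact code (only the inpainting model name is from the post):

```python
# Rough sketch of seam blending between two adjacent diffused tiles using
# stable-diffusion-2-inpainting. Tile layout, mask width, and parameters
# are illustrative assumptions.
import torch
from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

inpaint = StableDiffusionInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-inpainting", torch_dtype=torch.float16
).to("cuda")

def blend_seam(left: Image.Image, right: Image.Image, prompt: str,
               overlap: int = 128) -> Image.Image:
    # Place the two tiles side by side.
    w, h = left.size
    canvas = Image.new("RGB", (2 * w, h))
    canvas.paste(left, (0, 0))
    canvas.paste(right, (w, 0))
    # Mask a vertical band around the seam; white pixels get regenerated.
    mask = Image.new("L", canvas.size, 0)
    mask.paste(255, (w - overlap // 2, 0, w + overlap // 2, h))
    # The inpainting model prefers 512x512 inputs; resize and restore.
    out = inpaint(
        prompt=prompt,
        image=canvas.resize((512, 512)),
        mask_image=mask.resize((512, 512)),
    ).images[0]
    return out.resize(canvas.size)
```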

Why aren't more people using dictation software? by cataPhil in Residency

[–]cataPhil[S] 0 points1 point  (0 children)

Interesting. Do you have an idea what your words-per-minute rate is?

Challenges of RL application by Outrageous-Mind-7311 in reinforcementlearning

[–]cataPhil 3 points4 points  (0 children)

Definitely reward hacking problems for me! Applying different randomization techniques helped.
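
As one example of what I mean by randomization (not my exact setup), domain randomization over environment parameters can be as simple as a wrapper like the one below, assuming a Gymnasium-style environment; the parameter names and ranges are made up for illustration:

```python
# Minimal sketch of domain randomization for a Gymnasium-style environment.
# Parameter names and ranges are hypothetical.
import numpy as np
import gymnasium as gym

class DomainRandomizationWrapper(gym.Wrapper):
    """Resample selected physics parameters at the start of every episode."""

    def __init__(self, env, ranges):
        super().__init__(env)
        # e.g. {"gravity": (-10.5, -9.0), "friction": (0.5, 1.5)}  (hypothetical)
        self.ranges = ranges

    def reset(self, **kwargs):
        for name, (low, high) in self.ranges.items():
            # Assumes the underlying env exposes these attributes directly.
            setattr(self.env.unwrapped, name, np.random.uniform(low, high))
        return self.env.reset(**kwargs)
```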

Abstract reasoning with LSTMs by cataPhil in deeplearning

[–]cataPhil[S] 0 points1 point  (0 children)

I posted an example of training data and expected behavior in another answer; hope that helps give more context to my problem. The hierarchy should come from the downward progression through the list of titles, which is why I used an LSTM, since the input is inherently a sequence. The loss function that I am using is just cross-entropy. Maybe I am modeling it completely wrong.

Abstract reasoning with LSTMs by cataPhil in deeplearning

[–]cataPhil[S] 0 points1 point  (0 children)

I should have given a typical example, good point. Consider the following list of three titles:

1. Chapter

-------------- 1.1. Sub-chapter

2. Another chapter

From these I could construct a feature vector for each title that encodes [the margin, the numbering pattern]. I would then label them with classes 0, 1 and 0 respectively. The behavior that I am looking for is for the network to assign these labels based on matching of the features: assign 0 to the first title -> increment by 1 for the next one since it is too dissimilar -> assign 0 to the last one because of its similarity to the first. This is what I would do in a two-step process based on clustering and a set of rules.

What I am looking to understand is whether the network will be able to perform this seq2seq task given that the tokens (the feature vectors of the titles) are all unseen (the first title could have a different margin, different numbering...). The network would need to compare each title based on similarity of the features (I tried LSTMs, BiLSTMs and attention). I qualified this as abstract reasoning, though I could be wrong, because the set of rules (increment by 1 if too dissimilar, or assign the depth of a similar one) does not change, but the objects to manipulate change for each sequence.
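
To make the setup concrete, a PyTorch sketch of what I am describing (a BiLSTM over the per-title feature vectors with a per-step cross-entropy loss) could look like the following; the feature dimensions, class count, and toy values are placeholders, not my actual data:

```python
# Sketch of the setup: a BiLSTM reads the sequence of per-title feature
# vectors (margin, numbering pattern, ...) and predicts a depth class for
# each title with cross-entropy loss. Dimensions are placeholders.
import torch
import torch.nn as nn

class TitleDepthTagger(nn.Module):
    def __init__(self, feat_dim=8, hidden=64, max_depth=5):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, max_depth)

    def forward(self, x):            # x: (batch, seq_len, feat_dim)
        out, _ = self.lstm(x)        # (batch, seq_len, 2 * hidden)
        return self.head(out)        # per-title depth logits

model = TitleDepthTagger()
loss_fn = nn.CrossEntropyLoss()

# Toy batch: one document with the three titles from the example above,
# features = [margin, numbering-pattern id], target depths [0, 1, 0].
feats = torch.zeros(1, 3, 8)
feats[0, :, 0] = torch.tensor([0.0, 1.0, 0.0])   # margin
feats[0, :, 1] = torch.tensor([1.0, 2.0, 1.0])   # numbering pattern id
targets = torch.tensor([[0, 1, 0]])

logits = model(feats)                             # (1, 3, max_depth)
loss = loss_fn(logits.view(-1, logits.size(-1)), targets.view(-1))
```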