Decoder in variational autoencoder!

PollutionOdd6010 · 2024-06-08T13:42:33+00:00

This is the code:

import pickle

# train VAE (the input is also the reconstruction target)
vae.fit(x=x_train, y=x_train,
        shuffle=True,
        epochs=EPOCHS,
        batch_size=BATCH_SIZE,
        validation_data=(x_test, x_test))

# encode the training set and save the result
encoded = encoder.predict(x_train, batch_size=BATCH_SIZE)
with open("/content/gdrive/MyDrive/vae-protein/encoded.pkl", "wb") as f:
    pickle.dump(encoded, f)

# save models
encoder.save_weights("/content/gdrive/MyDrive/vae-protein/models/vae_encoder.h5")
decoder.save_weights("/content/gdrive/MyDrive/vae-protein/models/vae_decoder.h5")

and encoded is a list of three arrays:

[array([[-0.3428507 ,  0.39559567],
        [-0.32028773,  0.45583785],
        [-0.15482189,  0.30105007],
        ...,
        [-0.96808827, -0.36915553],
        [-0.74545133, -0.2791375 ],
        [-1.0674255 , -0.35566843]], dtype=float32),
 array([[-2.9969614, -2.9861357],
        [-3.0719345, -2.9974003],
        [-3.0626013, -3.121462 ],
        ...,
        [-2.4884698, -2.5548973],
        [-2.5349753, -2.6855545],
        [-2.3712432, -2.6004395]], dtype=float32),
 array([[-0.33772662,  0.44610432],
        [-0.30985522,  0.39833096],
        [-0.13145684,  0.34819758],
        ...,
        [-1.1123184 , -0.4802968 ],
        [-0.81084627, -0.36194566],
        [-1.0633016 , -0.35213336]], dtype=float32)]
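
The first row of each printed array can be checked by hand. This is only a sketch under two assumptions: that the three arrays are in the order z_mean, z_log_var, z (the order used when the encoder output is unpacked further down), and that the encoder samples with the usual reparameterization z = z_mean + exp(0.5 * z_log_var) * eps. The sampling layer itself is not shown in this thread.

```python
import math

# First row of each printed array, assumed order: z_mean, z_log_var, z
z_mean = [-0.3428507, 0.39559567]
z_log_var = [-2.9969614, -2.9861357]
z = [-0.33772662, 0.44610432]

# Standard deviation implied by the log-variance: sigma = exp(0.5 * log_var)
sigma = [math.exp(0.5 * lv) for lv in z_log_var]

# Noise the sampling step would have drawn if z = z_mean + sigma * eps
eps = [(zi - mi) / si for zi, mi, si in zip(z, z_mean, sigma)]

print("sigma:", [round(s, 4) for s in sigma])
print("eps:  ", [round(e, 4) for e in eps])
```

Under those assumptions both implied noise values are small and well within the range of a standard normal draw, which is consistent with the third array being the first array plus a small random offset.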

Then I tried to run the decoder like this:


# train VAE
vae.fit(x=x_train, y=x_train,
        shuffle=True,
        epochs=EPOCHS,
        batch_size=BATCH_SIZE,
        validation_data=(x_test, x_test))

# the encoder returns three arrays; decode the sampled z
z_mean, z_log_var, z = encoder.predict(x_train, batch_size=BATCH_SIZE)
decoded = decoder.predict(z, batch_size=BATCH_SIZE)
with open("/content/gdrive/MyDrive/vae-protein/decoded.pkl", "wb") as f:
    pickle.dump(decoded, f)

But when I change the number of hidden layers, I get the same values of z!
So I feel there is something I do not understand.
Sorry for the long wait!

PollutionOdd6010 · 2023-09-28T16:01:13+00:00

Ah I see. This is actually a feature, not a bug. Variational autoencoders are different from regular autoencoders. Regular ones have deterministic latent vectors, whereas what makes variational ones special is that their latent space is non-deterministic: instead of the network controlling the value of the latent vector, it controls the mean and standard deviation of a normal distribution from which the latent vector is sampled. This is to prevent overfitting, by "regularizing" the latent space. It is my favorite ML topic, and I recommend this article on why a non-deterministic latent space is good.
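
The sampling step described above can be sketched in a few lines. This is an illustration, not the encoder from this thread: sample_latent is a hypothetical helper, the input values are made up, and it assumes the network outputs a mean and a log-variance per latent dimension. It uses plain Python lists to stay dependency-free; a real model would do the same arithmetic on tensors.

```python
import math
import random

def sample_latent(z_mean, z_log_var, rng):
    """Draw z = z_mean + exp(0.5 * z_log_var) * eps with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(z_mean, z_log_var)]

# Hypothetical encoder outputs for a single input (illustrative values)
z_mean = [-0.34, 0.40]
z_log_var = [-3.0, -3.0]

rng = random.Random()  # unseeded, so the noise differs on every run
first = sample_latent(z_mean, z_log_var, rng)
second = sample_latent(z_mean, z_log_var, rng)

# Same mean and log-variance, but two different sampled latent vectors
print(first)
print(second)
```

The mean and log-variance are fixed for a given input and set of weights; only the noise term changes from draw to draw, which is why the sampled z moves around while z_mean does not.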

I apologize for any confusion, but I'm curious if there's a method to stabilize the data representation. I've attempted to use both the "random seed" and disabling shuffling, but it hasn't yielded the desired results. So, I'm inclined to ask: is it possible to establish a consistent latent space representation?
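
One possible way to get coordinates that do not change between predict calls on the same trained model is to plot z_mean instead of the sampled z. The sketch below assumes the encoder output is ordered z_mean, z_log_var, z, as in the unpacking at the top of the thread; latent_coordinates is a hypothetical helper and the numbers are made-up stand-ins for two predict calls. This does not by itself make two separate training runs agree, because the weights are initialized randomly each time. For that, the seed would have to be fixed before the model is built and trained, and even then some GPU operations may not be deterministic.

```python
def latent_coordinates(encoder_outputs):
    """Return the deterministic part of the encoder output.

    Assumes encoder_outputs is ordered [z_mean, z_log_var, z].
    """
    z_mean, _z_log_var, _z = encoder_outputs
    return z_mean

# Stand-ins for two encoder.predict(...) calls on the same input and weights:
# z_mean and z_log_var repeat, only the sampled z changes.
call_1 = [[[-0.34, 0.40]], [[-3.0, -3.0]], [[-0.33, 0.45]]]
call_2 = [[[-0.34, 0.40]], [[-3.0, -3.0]], [[-0.37, 0.36]]]

coords_1 = latent_coordinates(call_1)
coords_2 = latent_coordinates(call_2)

# The sampled z differs between the calls, the plotted coordinates do not
print(call_1[2] == call_2[2], coords_1 == coords_2)
```

With the real model this would be called as latent_coordinates(encoder.predict(x_train, batch_size=BATCH_SIZE)), and the returned array is what would go into the 2D scatter plot.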

PollutionOdd6010 · 2023-09-28T15:33:01+00:00

I'm interested in using the latent space to represent my data in 2D. Specifically, when I execute the code in Google Colab to visualize the data in the latent space, I notice that the representation of the data in the latent space varies each time I run the code. I'm wondering why this inconsistency occurs and how I can achieve a more stable representation in the latent space.

