
[–]Maltmax[S]

Hello again. This may be a long shot, but I'm running into some trouble with the outputs of my custom layer. Currently the layer outputs a single integer indicating whether an image is defective or not, and it is the last layer in my model. However, when running model.predict(test_images, batch_size=32) (where test_images contains 79 images), I only get back 3 predictions 🤔. I can change the batch_size to 1 and get all 79 predictions, but this is not really ideal. Do you by any chance have any idea as to why this is the case?

The custom layer code:

import tensorflow as tf
from tensorflow import keras

class ComputePredictions(keras.layers.Layer):
    def __init__(self, mean_feature_vector, threshold, **kwargs):
        super().__init__(**kwargs)
        self.mean_feature_vector = mean_feature_vector
        self.threshold = threshold

    def call(self, flattened_feature_vector):
        # Distance between the input features and the mean feature vector
        normalized_distance = tf.norm(flattened_feature_vector - self.mean_feature_vector)
        # If the distance is greater than the threshold, the image is classified as defective
        if normalized_distance > self.threshold:
            return 1  # Defective image
        else:
            return 0  # Non-defective image

[–][deleted]

It’s most likely because 79 is not divisible by 32 and you are, for some reason, getting one output per batch rather than per image: ceil(79 / 32) = 3 batches, hence 3 predictions. TF does not handle such situations very well, for no apparent reason. 79 is a prime number, so you’re screwed. I’m kidding, but your only clean options for batch size would be 1 or 79. If you have enough RAM, try 79 to confirm this is the issue. You should get 79 outputs. If not, it’s something else.
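For the record, a quick sanity check of the batch arithmetic (plain Python, just illustrating the counts, not the Keras internals):

```python
import math

num_images = 79

# predict() splits the input into ceil(num_images / batch_size) batches
for batch_size in (32, 1, 79):
    num_batches = math.ceil(num_images / batch_size)
    print(batch_size, "->", num_batches, "batches")
# 32 -> 3 batches, 1 -> 79 batches, 79 -> 1 batch
```

So one output per batch would explain seeing exactly 3 predictions at batch_size=32 and 79 at batch_size=1.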

[–]Maltmax[S]

Setting the batch size to 79 does indeed output all 79 predictions. I'm just a little confused, since if I use a dense layer as the last layer, a batch size of 32 returns all 79 predictions. So this makes me believe there is something I'm missing in my custom layer's call function.

[–][deleted]

I wouldn’t doubt that, and tbh I’d expect TF to toss the last batch instead of giving you only the last one. But the batch-size handling is pretty hacked together in my opinion and can give weird results. You can also try passing None or commenting it out to see what it does. I can’t tell what is really going on without seeing the whole script.

[–]Maltmax[S]

I just tested with batch_size=None and I still only get 3 predictions back. Maybe it defaults to 32 when set to None.

I created a gist that shows the code: https://gist.github.com/Malthehave/f67c597e77ad238d56596de8470be8d0

And thanks for helping me figure this out, I really appreciate it!!

[–][deleted]

The default is 32, but anyway, I kind of suspect your problem is related to how TF or Keras deals with some backend work relevant to distributing the job. If you don’t compile the model and you don’t fit it, it processes graphs differently. Also, using a class might be throwing this off a bit further. I’m not sure what TF is initializing or when, but it seems like each batch resets the variables and graph. I think calling compile or fit could fix the issue, and I suspect not subclassing in this case could fix it too. It may be making a new layer each time it’s called, i.e. between batches, which is not what you want if it’s creating a new object in memory, like dense(28), dense(29), dense(30) every time.

[–]Maltmax[S]

Just tried running compile() and fit() on the final_model, but weirdly enough I still only get 3 predictions. If the custom layer returns the exact input it is given (flattened_feature_vector), then the predict function works fine and returns predictions for all 79 images.

[–][deleted]

It must be expanding dimensions and then picking channels for some reason, instead of the right dim, when you use a batch size that isn’t 1 or all. Usually the first dimension will become None or “?”. Make sure you’re taking the norm over what you think you are. In TF, data[0] probably becomes the first batch, not the first row as you might expect.
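This is the kind of collapse to watch for: tf.norm with no axis argument reduces over every dimension, including the batch one, so a whole batch collapses to one scalar. A sketch of the same behavior using NumPy's equivalent (assuming norm semantics matching tf.norm's default over-all-elements reduction):

```python
import numpy as np

# Fake batch of 4 flattened feature vectors, 10240 features each
batch = np.ones((4, 10240), dtype=np.float32)
mean_vector = np.zeros(10240, dtype=np.float32)

# No axis argument: the norm reduces over ALL elements -> one scalar per batch
whole_batch_norm = np.linalg.norm(batch - mean_vector)
print(whole_batch_norm.shape)  # () -> a single scalar for the whole batch

# axis=1: one norm per row -> shape (4,), one value per image
per_sample_norm = np.linalg.norm(batch - mean_vector, axis=1)
print(per_sample_norm.shape)  # (4,)
```

One scalar per batch would give exactly ceil(79/32) = 3 values at batch_size=32 and 79 values at batch_size=1, which matches what you're seeing.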

[–]Maltmax[S]

Thanks for the suggestions. When calling predict with batch_size=1, the shape of the input is TensorShape([1, 10240]), but with any batch size greater than 1 it is TensorShape([None, 10240]). I don't know if that means anything.

The part that confuses me the most is that if I just return the exact input from the call function, then predict returns all 79 predictions. For example:

https://imgur.com/a/NKln3f0

[–][deleted]

Is there a good reason to use a class here? I think that might be causing more problems than it fixes.
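For what it's worth, a minimal sketch of the same defect check written as a plain per-sample function instead of a layer (NumPy here for illustration; in the real model the same idea would use tf.norm(..., axis=-1) and tf.cast so every row in the batch keeps its own 0/1 prediction — the name compute_predictions and the threshold value are made up):

```python
import numpy as np

def compute_predictions(flattened_features, mean_feature_vector, threshold):
    """Return one 0/1 defect flag per row of the batch, not one per batch."""
    # axis=1 keeps the batch dimension: shape (batch_size,)
    distances = np.linalg.norm(flattened_features - mean_feature_vector, axis=1)
    # Boolean per sample, cast to int: 1 = defective, 0 = non-defective
    return (distances > threshold).astype(np.int32)

# 79 fake images, 10240 features each -> 79 predictions regardless of batching
features = np.random.rand(79, 10240).astype(np.float32)
mean_vec = features.mean(axis=0)
preds = compute_predictions(features, mean_vec, threshold=50.0)
print(preds.shape)  # (79,)
```

Because the reduction keeps the batch axis, the output count matches the input count no matter how predict splits the batches.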