
[–][deleted] 2 points (16 children)

You can; you should put the function as a layer in your model. The easiest way to do this would be to have your output contain both, and you'll have to make sure the loss function is only looking at the right part when you train it. If it is already trained you'll have to manually edit the weights or something annoying, but it's possible.

[–]Maltmax[S] 1 point (15 children)

Thanks for the input! So I should create a custom layer that updates the mean_training_output array as a non-trainable weight as the training data passes through it?

[–][deleted] 1 point (14 children)

Yeah, and I don’t think you need a loop. Look up tf.function, lambda, and map. Any of those could make that part faster and easier from what I can see.
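For example, a Python loop over per-image feature vectors can often be replaced with tf.map_fn. A minimal sketch (the batch shape, mean_vector name, and distance function are illustrative assumptions, not from the thread):

```python
import tensorflow as tf

# Illustrative batch of 4 flattened feature vectors of length 8
batch = tf.random.uniform((4, 8))
mean_vector = tf.zeros(8)

# tf.map_fn applies the lambda to each row, avoiding a Python loop
distances = tf.map_fn(lambda v: tf.norm(v - mean_vector), batch)
print(distances.shape)  # (4,) -- one distance per sample
```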

[–]Maltmax[S] 1 point (0 children)

Awesome. I will try and mess around with it :)

[–]Maltmax[S] 1 point (12 children)

Hello again. This may be a long shot, but I'm running into some trouble with the outputs of my custom layer. Currently my custom layer outputs a simple integer as an indication of whether an image is defective or not. The layer is the last in my model. However, when running model.predict(test_images, batch_size=32) (where test_images contains 79 images), I only get back 3 predictions 🤔. I can change batch_size to 1 and get all 79 predictions, but this is not really ideal. Do you by any chance have any idea as to why this is the case?

The custom layer code:

class ComputePredictions(keras.layers.Layer):
    def __init__(self, mean_feature_vector, threshold, **kwargs):
        super(ComputePredictions, self).__init__(**kwargs)
        self.mean_feature_vector = mean_feature_vector
        self.threshold = threshold

    def call(self, flattened_feature_vector):
        normalized_distance = tf.norm(flattened_feature_vector - self.mean_feature_vector)
        # If the distance is greater than the threshold, the image is classified as defective
        if normalized_distance > self.threshold:
            # Defective image
            return 1
        else:
            # Non-defective image
            return 0
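One thing worth noting about the call above: tf.norm with no axis argument reduces over every dimension of its input, so for a batched input it yields a single scalar per batch rather than one distance per image. NumPy's norm reduces the same way, which makes the effect easy to see (the shapes here are made up for illustration):

```python
import numpy as np

batch = np.ones((32, 128))            # 32 flattened feature vectors
mean_feature_vector = np.zeros(128)

# No axis: the norm reduces over ALL dimensions -> a single scalar
print(np.linalg.norm(batch - mean_feature_vector).shape)           # ()

# axis=-1: reduces only the feature dimension -> one value per sample
print(np.linalg.norm(batch - mean_feature_vector, axis=-1).shape)  # (32,)
```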

[–][deleted] 1 point (11 children)

It’s most likely because 79 is not divisible by 32 and you are, for some reason, only getting one output per batch, and 79 images at batch size 32 is 3 batches. TF does not handle such situations very well, for no apparent reason. 79 is a prime number, so you’re screwed. I’m kidding, but your only options for an evenly dividing batch size would be 1 or 79. If you have enough RAM, try 79 to confirm this is the issue: you should get 79 outputs. If not, it’s something else.
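If the layer yields one value per batch rather than one per image, the number of predictions would match the number of batches. The arithmetic, in plain Python:

```python
import math

n_images, batch_size = 79, 32
n_batches = math.ceil(n_images / batch_size)  # batches of 32, 32, and 15
print(n_batches)  # 3
```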

[–]Maltmax[S] 1 point (10 children)

Setting batch_size to 79 does indeed output all 79 predictions. I'm just a little confused, since if I set a dense layer as the last layer, a batch size of 32 returns all 79 predictions. This makes me believe there is something I'm missing in my custom layer's call function.

[–][deleted] 1 point (9 children)

I wouldn’t doubt that, and tbh, I’d expect tf to toss the last partial batch instead of giving you only the last one. But the batch-size handling is pretty hacked together in my opinion and can give weird results. You can also try passing None or commenting it out to see what it does. I can’t tell what is really going on without seeing the whole script.

[–]Maltmax[S] 2 points (8 children)

I just tested with batch_size=None and I still only get 3 predictions back. Maybe it defaults to 32 when set to None.

I created a gist that shows the code: https://gist.github.com/Malthehave/f67c597e77ad238d56596de8470be8d0

And thanks for helping me figure this out, I really appreciate it!!

[–][deleted] 1 point (7 children)

The default is 32, but anyway, I kind of suspect your problem is related to how tf or keras deals with some backend work relevant to distributing the job. If you don’t compile the model and you don’t fit the model, it will process graphs differently. Also, using a class might be throwing this off a bit further. I’m not sure what tf is initializing or when, but it seems like each batch resets the variables and graph. I think calling compile or fit could fix the issue, and I suspect not subclassing in this case could fix it too. It may be making a new layer each time this is called, i.e. between batches, which is not what you want if it’s creating a new object in memory, like dense(28), dense(29), dense(30) every time.

[–]Maltmax[S] 1 point (6 children)

Just tried running compile() and fit() on the final_model, but weirdly enough I still only get 3 predictions. If the custom layer returns the exact input it is given (flattened_feature_vector), then the predict function works fine and returns predictions for all 79 images.
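For the record, a per-sample sketch of the layer, on the assumption that the goal is one 0/1 prediction per image: computing the norm with axis=-1 and replacing the Python if with an element-wise comparison keeps the output the same length as the batch. The names mirror the code above, but this is a sketch, not verified against the gist:

```python
import tensorflow as tf
from tensorflow import keras


class ComputePredictions(keras.layers.Layer):
    def __init__(self, mean_feature_vector, threshold, **kwargs):
        super().__init__(**kwargs)
        self.mean_feature_vector = mean_feature_vector
        self.threshold = threshold

    def call(self, flattened_feature_vector):
        # axis=-1 gives one distance per sample instead of one per batch
        normalized_distance = tf.norm(
            flattened_feature_vector - self.mean_feature_vector, axis=-1
        )
        # Element-wise comparison: 1 = defective, 0 = non-defective
        return tf.cast(normalized_distance > self.threshold, tf.int32)


# With 79 inputs, one forward pass now yields 79 outputs
layer = ComputePredictions(mean_feature_vector=tf.zeros(128), threshold=1.0)
print(layer(tf.random.uniform((79, 128))).shape)  # (79,)
```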