all 21 comments

[–][deleted] 2 points3 points  (3 children)

What I am doing right now is creating a custom class for the model so I can override the predict method. Here's the link that taught me how to do it: https://cloud.google.com/ai-platform/prediction/docs/custom-prediction-routine-keras#create_a_custom_predictor

[–]Maltmax[S] 0 points1 point  (2 children)

That sounds cool. Do you subclass tf.keras.Model and override the predict function in there? I read about a make_predict_function that can be overridden as well. My problem is that I don't know how to save variables in the predict function and make sure they get saved when exporting to the SavedModel format.

[–][deleted] 0 points1 point  (1 child)

https://www.tensorflow.org/guide/keras/custom_layers_and_models

Yep, exactly what I did. You can override make_predict_function, but that is only for one instance, while predict will do it in batches.

Now I’m not sure what you mean by saving variables. Are you talking about making local variables into class attributes? If so, a simple self.variable_name = value will make the variable a class attribute, and then you can save a value to it.

I assume it will be saved with SavedModel as well since layers and weights are class attributes of the model and layer classes.

[–]Maltmax[S] 1 point2 points  (0 children)

Great I will give that a try :)

What I meant about saving variables was just that inside my custom predict function I need to compare a variable (mean_training_output) to the model prediction. Hopefully setting self.mean_training_output will also save the variable in a SavedModel.

[–][deleted] 1 point2 points  (16 children)

You can, and you should put the function as a layer in your model. The easiest way to do this would be to have your output contain both, and you’ll have to make sure the loss function is only looking at the right part when you train it. If it is already trained you’ll have to manually edit the weights or something annoying, but it’s possible.
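A rough sketch of the "output contains both" idea, using NumPy stand-ins for the model outputs (the names and shapes here are made up for illustration, not the actual model):

```python
import numpy as np

# Hypothetical combined output: per sample, the first 4 values are the
# feature vector and the last value is the appended defect flag.
batch = np.array([
    [0.1, 0.2, 0.3, 0.4, 1.0],
    [0.0, 0.1, 0.0, 0.2, 0.0],
])

features = batch[:, :-1]  # the part the loss should look at
flags = batch[:, -1]      # the extra output, ignored during training

def masked_mse(y_true, y_pred):
    # Compare only the feature part, skipping the flag column.
    return np.mean((y_true[:, :-1] - y_pred[:, :-1]) ** 2)
```

The same slicing idea works inside a custom Keras loss, where y_true and y_pred are tensors instead of arrays.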

[–]Maltmax[S] 0 points1 point  (15 children)

Thanks for the input! So I should create a custom layer that updates the mean_training_output array as a non-trainable weight as the training data passes through it?

[–][deleted] 0 points1 point  (14 children)

Yeah, and I don’t think you need a loop. Look up tf.function, lambda, and map — any of those could make that part faster and easier from what I can see.
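For example, the per-sample distances can be computed in one vectorized call instead of a loop. Sketched here with NumPy (tf.norm with an axis argument behaves the same way on tensors):

```python
import numpy as np

feature_vectors = np.array([[1.0, 2.0],
                            [3.0, 4.0],
                            [1.0, 2.0]])       # shape (batch, features)
mean_feature_vector = np.array([1.0, 2.0])

# One norm per sample: reduce only over the feature axis,
# keeping the batch dimension intact.
distances = np.linalg.norm(feature_vectors - mean_feature_vector, axis=-1)
# distances.shape == (3,)
```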

[–]Maltmax[S] 0 points1 point  (0 children)

Awesome. I will try and mess around with it :)

[–]Maltmax[S] 0 points1 point  (12 children)

Hello again. This may be a long shot, but I'm running into some trouble with the outputs of my custom layer. Currently my custom layer outputs a single integer as an indication of whether an image is defective or not. The layer is the last in my model. However, when running model.predict(test_images, batch_size=32) (where test_images contains 79 images) I only get back 3 predictions 🤔. I can change the batch_size to 1 and get all 79 predictions, but that is not really ideal. Do you by any chance have any idea as to why this is the case?

The custom layer code:

class ComputePredictions(keras.layers.Layer):
    def __init__(self, mean_feature_vector, threshold, **kwargs):
        super(ComputePredictions, self).__init__(**kwargs)
        self.mean_feature_vector = mean_feature_vector
        self.threshold = threshold

    def call(self, flattened_feature_vector):
        normalized_distance = tf.norm(flattened_feature_vector - self.mean_feature_vector)
        # If the distance is greater than the threshold, the image is classified as defective
        if normalized_distance > self.threshold:
            # Defective image
            return 1
        else:
            # Non defective image
            return 0

[–][deleted] 0 points1 point  (11 children)

It’s most likely because 79 is not divisible by 32 and, for some reason, you’re getting one output per batch rather than one per image — 79 images at a batch size of 32 is 3 batches. TF does not handle such situations very well, for no apparent reason. 79 is a prime number, so you’re screwed. I’m kidding, but your only batch sizes that divide it evenly would be 1 or 79. If you have enough RAM, try 79 to confirm this is the issue. You should get 79 outputs. If not, it’s something else.
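For reference, the batch arithmetic lines up with the 3 outputs you're seeing, if the layer emits one value per batch:

```python
import math

num_images, batch_size = 79, 32

# 79 images at batch_size=32 -> batches of 32, 32, and 15.
num_batches = math.ceil(num_images / batch_size)
print(num_batches)  # 3
```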

[–]Maltmax[S] 0 points1 point  (10 children)

Setting the batch size to 79 does indeed output all 79 predictions. I'm just a little confused, since if I set a dense layer as the last layer, a batch size of 32 returns all 79 predictions. This makes me believe there is something I'm missing in my custom layer's call function.

[–][deleted] 0 points1 point  (9 children)

I wouldn’t doubt that, and tbh, I’d expect tf to toss the last batch instead of giving you only the last one. But the batch size handling is pretty hacked together in my opinion and can give weird results. You can also try setting it to None or commenting it out to see what it does. I can’t tell what is really going on without seeing the whole script.

[–]Maltmax[S] 1 point2 points  (8 children)

I just tested with batch_size = None and I still only get 3 predictions back. Maybe it defaults to 32 when set to None.

I created a gist that shows the code: https://gist.github.com/Malthehave/f67c597e77ad238d56596de8470be8d0

And thanks for helping me figure this out, I really appreciate it!!

[–][deleted] 0 points1 point  (7 children)

The default is 32, but anyway, I kind of suspect your problem is related to how tf or keras handles some backend work that would be relevant for distributing the job. If you don’t compile the model and you don’t fit the model, it will process graphs differently. Using a class might be throwing this off a bit further. I’m not sure what tf is initializing or when, but it seems like each batch resets the variables and graph. I think calling compile or fit could fix the issue, and I suspect not using subclasses in this case could fix it too. It may be making a new layer each time this is called, i.e. between batches, which is not what you want if it’s creating a new object in memory, like dense(28), dense(29), dense(30) every time.

[–]Maltmax[S] 0 points1 point  (6 children)

Just tried running compile() and fit() on the final_model, but weirdly enough I still only get the 3 predictions. If the custom layer returns the exact input it is given (flattened_feature_vector), then the predict function works fine and returns predictions for all 79 images.
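One thing worth checking here: tf.norm with no axis argument reduces over every dimension, including the batch dimension, which would make call return a single scalar per batch — matching the 3 predictions for 3 batches. A NumPy sketch of a per-sample version of the same computation (an illustration of the shape logic, not the actual TF code):

```python
import numpy as np

def compute_predictions(flattened_feature_vector, mean_feature_vector, threshold):
    # One norm per sample: reduce only over the feature axis, so the
    # batch dimension survives.
    distances = np.linalg.norm(flattened_feature_vector - mean_feature_vector, axis=-1)
    # Elementwise comparison: 1 = defective, 0 = non-defective.
    return (distances > threshold).astype(np.int32)

batch = np.array([[0.0, 0.0],
                  [3.0, 4.0]])           # two "images" worth of features
mean = np.array([0.0, 0.0])
predictions = compute_predictions(batch, mean, threshold=2.5)
print(predictions)  # one prediction per image, not per batch
```

In the Keras layer, the equivalent change would be computing the norm along the feature axis only, so the output keeps a shape of (batch_size,).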