
Some custom objects are not being serialized with push_to_hub_keras #595

osanseviero opened this issue Jan 14, 2022 · 9 comments

@osanseviero
Contributor

Self-contained code example:

from huggingface_hub import from_pretrained_keras

model = from_pretrained_keras("keras-io/vit-small-ds")

Error

/usr/local/lib/python3.7/dist-packages/keras/utils/generic_utils.py in class_and_config_for_serialized_keras_object(config, module_objects, custom_objects, printable_module_name)
560 if cls is None:
561 raise ValueError(
--> 562 f'Unknown {printable_module_name}: {class_name}. Please ensure this '
563 'object is passed to the custom_objects argument. See '
564 'https://www.tensorflow.org/guide/keras/save_and_serialize'

ValueError: Unknown optimizer: Addons>AdamW. Please ensure this object is passed to the custom_objects argument. See https://www.tensorflow.org/guide/keras/save_and_serialize#registering_the_custom_object for details.

@osanseviero
Contributor Author

Similarly with

from huggingface_hub import from_pretrained_keras

model = from_pretrained_keras("carlosaguayo/vit-base-patch16-224-in21k-euroSat")

@ariG23498
Contributor

Hey @osanseviero @merveenoyan

I was successful in uploading the custom objects with push_to_hub_keras API. The main steps are:

  • Have a get_config method for all the custom layers.
  • Serialization of tensors that are used in the custom layers.
import math
import tensorflow as tf

class MultiHeadAttentionLSA(tf.keras.layers.MultiHeadAttention):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # The trainable temperature term. The initial value is
        # the square root of the key dimension.
        self.tau = tf.Variable(
            math.sqrt(float(self._key_dim)),
            trainable=True
        )
        # Build the diagonal attention mask (NUM_PATCHES is defined in the notebook)
        diag_attn_mask = 1 - tf.eye(NUM_PATCHES)
        self.diag_attn_mask = tf.cast([diag_attn_mask], dtype=tf.int8)
    
    def get_config(self):
        config = super().get_config()
        config.update({
            "tau": self.tau.numpy(),                       #<---- IMPORTANT
            "diag_attn_mask": self.diag_attn_mask.numpy(), #<---- IMPORTANT
        })
        return config
  • Not using augmentation layers inside the model. TensorFlow 2.7 has a problem serializing augmentation layers. Here I have used tf.data's map function to apply the augmentation as a preprocessing step rather than inside the model.
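The last bullet's approach can be sketched as follows (a minimal sketch; the actual augmentation pipeline and dataset come from the notebook, so the layers and shapes below are placeholders):

```python
import tensorflow as tf

# Hypothetical stand-in for the notebook's augmentation pipeline.
augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
])

# Toy dataset of 32x32 RGB images with integer labels.
images = tf.random.uniform((8, 32, 32, 3))
labels = tf.zeros((8,), dtype=tf.int32)
ds = tf.data.Dataset.from_tensor_slices((images, labels))

# Apply augmentation via map() as a preprocessing step,
# keeping the augmentation layers out of the saved model.
train_ds = (
    ds.batch(4)
      .map(lambda x, y: (augmentation(x, training=True), y),
           num_parallel_calls=tf.data.AUTOTUNE)
      .prefetch(tf.data.AUTOTUNE)
)
```

Since the augmentation lives in the tf.data pipeline, nothing about it needs to be serialized with the model.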

Colab Notebook used

Colab Notebook

Usage of the pre-trained vit-ds-small model

loaded_model = from_pretrained_keras("keras-io/vit-small-ds")

_, accuracy, top_5_accuracy = loaded_model.evaluate(test_ds)
print(f"Test accuracy: {round(accuracy * 100, 2)}%")
print(f"Test top 5 accuracy: {round(top_5_accuracy * 100, 2)}%")

Note: You have to apply the test-time augmentation to preprocess the input images before sending them to the model.

@osanseviero
Contributor Author

osanseviero commented Jan 17, 2022

Hi @ariG23498! This is great and very insightful! 🤗

The augmentation layer issue is TF 2.7 specific and hopefully it will work with upcoming versions. That way the end user does not need to know any pre/post-processing steps, and everything is contained within the saved model.

I wonder if there's any way to achieve this programmatically instead of expecting users to implement it themselves (cc @Rocketknight1 or @gante might have some ideas). Worst case, we could throw a warning when custom layers are being saved and point to documentation. WDYT?

@merveenoyan
Contributor

@osanseviero I feel like this might not be specific to 2.7. I saw @ariG23498 downgraded TF to 2.6 to save the model, so the model was still saved (his first issue was related to that, and we fixed it that way), but the custom layer still needed to be registered for us to load the model. The AdamW error we got (see below) was related to that, and it has nothing to do with 2.7.
If you have a custom object, you need to register it using the methods he mentioned. See here.
We might indeed ask the user, with a warning, to register the custom object if they'd like to host their model on the Hub. Regardless of this, I'm looking for ways to see if we can infer it from the SavedModel format.
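Registration can be done with Keras's serialization decorator, roughly like this (a minimal sketch with a toy layer, not the actual model code; the package name is arbitrary):

```python
import tensorflow as tf

@tf.keras.utils.register_keras_serializable(package="MyPackage")
class ScaleLayer(tf.keras.layers.Layer):
    """Toy custom layer that multiplies its input by a configurable scalar."""
    def __init__(self, scale=2.0, **kwargs):
        super().__init__(**kwargs)
        self.scale = scale

    def call(self, inputs):
        return inputs * self.scale

    def get_config(self):
        # Include every constructor argument so the layer
        # can be rebuilt from its config on load.
        config = super().get_config()
        config.update({"scale": self.scale})
        return config
```

Once registered, Keras can deserialize the layer from its config without the user passing custom_objects explicitly.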

@ariG23498
Contributor

Worst case, we could throw some warning when there are custom layers being saved and pointing to documentation. WDYT?

Yep. This is the only way to go. Even TensorFlow throws an error when trying to save a custom model. It is a platform-specific error that needs to be checked by the user, not the HF team, IMO.

@gante
Member

gante commented Jan 17, 2022

I agree with what @ariG23498 wrote about custom layers: their flexibility (which makes them hard to automate) can be a boon. And the creators of custom layers are mostly power users anyway, in my experience :D

For completeness, there is one point in this discussion yet to be addressed. The error that @osanseviero originally points at can be avoided by importing tfa, which is needed to load the optimizer. In other words:

  • this works
import tensorflow_addons as tfa
from huggingface_hub import from_pretrained_keras
loaded_model = from_pretrained_keras("keras-io/vit-small-ds")
loaded_model.summary()
  • this doesn't work
from huggingface_hub import from_pretrained_keras
loaded_model = from_pretrained_keras("keras-io/vit-small-ds")
loaded_model.summary()

Digging deeper, we can see that push_to_hub_keras uses tf.keras.models.save_model under the hood (here). We might want to set its include_optimizer argument to False, which removes the optimizer object before serialization, preventing errors like this from optimizers that are not in the standard TensorFlow library.
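The proposed change can be sketched as follows (a minimal local illustration with a toy model and the H5 format, not the actual push_to_hub_keras code):

```python
import os
import tempfile
import tensorflow as tf

# Toy compiled model; in push_to_hub_keras the model comes from the user.
model = tf.keras.Sequential([tf.keras.Input(shape=(4,)), tf.keras.layers.Dense(1)])
model.compile(optimizer="adam", loss="mse")

path = os.path.join(tempfile.mkdtemp(), "model.h5")
# Dropping the optimizer avoids deserialization errors for optimizers
# that live outside core TensorFlow (e.g. Addons>AdamW).
tf.keras.models.save_model(model, path, include_optimizer=False)

# compile=False skips restoring the training config entirely,
# so the optimizer class is never needed at load time.
loaded = tf.keras.models.load_model(path, compile=False)
```

The trade-off is that the loaded model cannot resume training from the saved optimizer state; it must be recompiled first.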

What do you think?

EDIT: at the very least, we can throw a warning when the user is pushing a model to the hub with these kinds of optimizers.

@ariG23498
Contributor

Hey @gante

I love the insights that you bring to the table.

I think #598 covers the issue that you are talking about. Do let me know what you think.

@merveenoyan
Contributor

@gante @ariG23498 @osanseviero
save_traces registers every custom layer in the SavedModel by default, so we don't need to register custom objects; my bad. I thought the error @ariG23498 got was related to that because the error message indicated so.
Now we only need to change include_optimizer, and maybe we could change signatures for TF Lite users, related to #598.

@merveenoyan
Contributor

For this one, I'm planning to test from_pretrained_keras() on models with custom objects to see the error pattern, and if it's raised, prompt the user to instead pass the custom objects via the custom_objects entry of **kwargs when loading. That seems like the only reasonable way, given the user has to implement things.

BTW, weirdly enough, when you import AdamW without actually compiling the model again, this issue
ValueError: Unknown optimizer: Addons>AdamW. Please ensure this object is passed to the custom_objects argument. See https://www.tensorflow.org/guide/keras/save_and_serialize#registering_the_custom_object for details.
goes away.
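Passing custom objects at load time can be sketched with a local round trip (a minimal sketch with a toy layer; the assumption, not confirmed here, is that from_pretrained_keras forwards a custom_objects kwarg to keras.models.load_model the same way):

```python
import os
import tempfile
import tensorflow as tf

class PlainScaleLayer(tf.keras.layers.Layer):
    """Toy custom layer, deliberately NOT registered,
    so loading the saved model needs custom_objects."""
    def __init__(self, scale=2.0, **kwargs):
        super().__init__(**kwargs)
        self.scale = scale

    def call(self, inputs):
        return inputs * self.scale

    def get_config(self):
        config = super().get_config()
        config.update({"scale": self.scale})
        return config

model = tf.keras.Sequential([tf.keras.Input(shape=(2,)), PlainScaleLayer(scale=3.0)])
path = os.path.join(tempfile.mkdtemp(), "model.h5")
model.save(path)

# Loading the H5 file without custom_objects would fail with
# "Unknown layer: PlainScaleLayer"; supplying the mapping lets
# Keras rebuild the layer from its saved config.
loaded = tf.keras.models.load_model(
    path, custom_objects={"PlainScaleLayer": PlainScaleLayer}, compile=False
)
```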
