Add demo for run_models experimental method. #365


Open
wants to merge 1 commit into
base: master
2 changes: 2 additions & 0 deletions g3doc/_book.yaml
@@ -22,6 +22,8 @@ upper_tabs:
path: /cloud/tutorials/distributed_training_nasnet_with_tensorflow_cloud
- title: Hyperparameter tuning on Google Cloud
path: /cloud/tutorials/hp_tuning_cifar10_using_google_cloud
- title: Running vision models from TF Model Garden on GCP with TF Cloud
path: /cloud/tutorials/experimental/running_vision_models_from_tf_model_garden_on_gcp_with_tf_cloud
- heading: "Guides"
- title: Cloud `run` guide
path: /cloud/guides/run_guide
@@ -0,0 +1,341 @@
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "Running vision models from TF Model Garden on GCP with TF Cloud",
"provenance": [],
"collapsed_sections": [],
"toc_visible": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "ApxORpbFShVP"
},
"source": [
"##### Copyright 2021 The TensorFlow Cloud Authors."
]
},
{
"cell_type": "code",
"metadata": {
"id": "eR70XKMMmC8I",
"cellView": "form"
},
"source": [
"#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
"# you may not use this file except in compliance with the License.\n",
"# You may obtain a copy of the License at\n",
"#\n",
"# https://www.apache.org/licenses/LICENSE-2.0\n",
"#\n",
"# Unless required by applicable law or agreed to in writing, software\n",
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
"# See the License for the specific language governing permissions and\n",
"# limitations under the License."
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "wKcTRRxsAmDl"
},
"source": [
"# Running vision models from TF Model Garden on GCP with TF Cloud\n",
"\n",
"<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
" <td>\n",
" <!-- <a target=\"_blank\" href=\"https://www.tensorflow.org/cloud/tutorials/overview.ipynb\"><img src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" />View on TensorFlow.org</a> MISSING HREF -->\n",
" </td>\n",
" <td>\n",
" <!-- <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/cloud/blob/master/g3doc/tutorials/overview.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a> MISSING HREF -->\n",
" </td>\n",
" <td>\n",
" <!-- <a target=\"_blank\" href=\"https://github.com/tensorflow/cloud/blob/master/g3doc/tutorials/overview.ipynb\"><img src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" />View on GitHub</a> MISSING HREF -->\n",
" </td>\n",
" <td>\n",
" <!-- <a href=\"https://storage.googleapis.com/tensorflow_docs/cloud/tutorials/overview.ipynb\"><img src=\"https://www.tensorflow.org/images/download_logo_32px.png\" />Download notebook</a> MISSING HREF -->\n",
" </td>\n",
" <td>\n",
" <!-- <a href=\"https://kaggle.com/kernels/welcome?src=https://github.com/tensorflow/cloud/blob/master/g3doc/tutorials/overview.ipynb\" target=\"blank\"> <img width=\"90\" src=\"https://www.kaggle.com/static/images/site-logo.png\" alt=\"Kaggle logo\" />Run in Kaggle</a> MISSING HREF -->\n",
" </td>\n",
"</table>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "FAUbwFuJB3bw"
},
"source": [
"In this example we will use [run_models](https://github.com/tensorflow/cloud/blob/690c3eee65dadee8af260a19341ff23f42f1f070/src/python/tensorflow_cloud/core/experimental/models.py#L30) from the experimental module of TF Cloud to train a ResNet model from [TF Model Garden](https://github.com/tensorflow/models/tree/master/official) on [imagenette from TFDS](https://www.tensorflow.org/datasets/catalog/imagenette)."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "EFCSAVDbC8-W"
},
"source": [
"## Install Packages\n",
"\n",
"We need the nightly version of tensorflow-cloud that we can get from github, the official release of tf-models-official, and keras 2.6.0rc0 for compatibility."
]
},
{
"cell_type": "code",
"metadata": {
"id": "r4sSs1azu-Ti"
},
"source": [
"!pip install -q 'git+https://github.com/tensorflow/cloud.git@refs/pull/352/head#egg=tensorflow-cloud&subdirectory=src/python' tf-models-official keras==2.6.0rc0"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "N3NC5vrDslsf"
},
"source": [
"## Import required modules"
]
},
{
"cell_type": "code",
"metadata": {
"id": "sdkgm_6PvHkk",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "c17384b4-07f1-493c-edf8-5eadde79524f"
},
"source": [
"import os\n",
"import sys\n",
"\n",
"import tensorflow_cloud as tfc\n",
"from tensorflow_cloud.core.experimental.models import run_models\n",
"\n",
"print(tfc.__version__)"
],
"execution_count": 2,
"outputs": [
{
"output_type": "stream",
"text": [
"0.1.17.dev\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Ka6MHtF-tTU1"
},
"source": [
"## Project Configurations\n",
"Set the project parameters. For more details on Google Cloud specific parameters, refer to [Google Cloud Project Setup Instructions](https://www.kaggle.com/nitric/google-cloud-project-setup-instructions/)."
]
},
{
"cell_type": "code",
"metadata": {
"id": "OFPPSLF9vx4H"
},
"source": [
"# Set Google Cloud Specific parameters\n",
"\n",
"# TODO: Please set GCP_PROJECT_ID to your own Google Cloud project ID.\n",
"GCP_PROJECT_ID = 'YOUR_PROJECT_ID' #@param {type:\"string\"}\n",
"\n",
"# TODO: set GCS_BUCKET to your own Google Cloud Storage (GCS) bucket.\n",
"GCS_BUCKET = 'YOUR_GCS_BUCKET_NAME' #@param {type:\"string\"}\n",
"\n",
"# DO NOT CHANGE: Currently only the 'us-central1' region is supported.\n",
"REGION = 'us-central1'\n",
"\n",
"# OPTIONAL: You can change the job name to any string.\n",
"JOB_NAME = 'run_models_demo' #@param {type:\"string\"}"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "F1_shlH4tUM5"
},
"source": [
"## Authenticating the notebook to use your Google Cloud Project\n",
"\n",
"This code authenticates the notebook by checking your Google Cloud credentials and identity. It is inside the `if not tfc.remote()` block so that it runs only in the notebook and not when the notebook code is sent to Google Cloud.\n",
"\n",
"Note: For Kaggle Notebooks click on \"Add-ons\"->\"Google Cloud SDK\" before running the cell below."
]
},
{
"cell_type": "code",
"metadata": {
"id": "EeW7IHBgtPJD"
},
"source": [
"if not tfc.remote():\n",
"\n",
" # Authentication for Kaggle Notebooks\n",
" if \"kaggle_secrets\" in sys.modules:\n",
" from kaggle_secrets import UserSecretsClient\n",
" UserSecretsClient().set_gcloud_credentials(project=GCP_PROJECT_ID)\n",
"\n",
" # Authentication for Colab Notebooks\n",
" if \"google.colab\" in sys.modules:\n",
" from google.colab import auth\n",
" auth.authenticate_user()\n",
" os.environ[\"GOOGLE_CLOUD_PROJECT\"] = GCP_PROJECT_ID"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "EQrVntO2twh1"
},
"source": [
"## Set up TensorFlow Cloud run\n",
"\n",
"Set up parameters for `tfc.run()`. The `chief_config`, `worker_count` and `worker_config` are set up individually for each distribution strategy; this example sets only `chief_config`. For more details, refer to the [TensorFlow Cloud overview tutorial](https://colab.research.google.com/github/tensorflow/cloud/blob/master/g3doc/tutorials/overview.ipynb)."
]
},
{
"cell_type": "code",
"metadata": {
"id": "o539iLTKv9a3"
},
"source": [
"with open('requirements.txt','w') as f:\n",
" f.write('git+https://github.com/tensorflow/cloud.git@refs/pull/352/head#egg=tensorflow-cloud&subdirectory=src/python\\n'+\n",
" 'tf-models-official\\n'+\n",
" 'keras==2.6.0rc0')\n",
"\n",
"run_kwargs = dict(\n",
" requirements_txt = 'requirements.txt',\n",
" docker_config=tfc.DockerConfig(\n",
" parent_image=\"gcr.io/deeplearning-platform-release/tf2-gpu.2-5\",\n",
" image_build_bucket=GCS_BUCKET\n",
" ),\n",
" chief_config=tfc.COMMON_MACHINE_CONFIGS[\"P100_4X\"],\n",
" job_labels={'job': JOB_NAME}\n",
")"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "hd4luG7nt3_0"
},
"source": [
"## Run the training using run_models"
]
},
{
"cell_type": "code",
"metadata": {
"id": "_aVt71qpxHUe"
},
"source": [
"values = run_models(\n",
" 'imagenette',\n",
" 'resnet',\n",
" GCS_BUCKET,\n",
" 'train',\n",
" 'validation',\n",
" **run_kwargs,\n",
" )"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "Ku7oBH8iuc2X"
},
"source": [
"## Training Results\n",
"\n",
"### Reconnect your Colab instance\n",
"\n",
"Most remote training jobs are long-running. If you are using Colab, it may time out before the training results are available.\n",
"\n",
"In that case, **rerun the following sections in order** to reconnect and configure your Colab instance to access the training results.\n",
"\n",
"1. Import required modules\n",
"2. Project Configurations\n",
"3. Authenticating the notebook to use your Google Cloud Project\n",
"\n",
"**DO NOT** rerun the rest of the code.\n",
"\n",
"### Load TensorBoard\n",
"While the training is in progress, you can use TensorBoard to view the results. Note that the results will show only after your training has started, which may take a few minutes."
]
},
{
"cell_type": "code",
"metadata": {
"id": "rhVCh8x9upRY"
},
"source": [
"if not tfc.remote():\n",
" %load_ext tensorboard\n",
" tensorboard_logs_dir = values['tensorboard_logs']\n",
" %tensorboard --logdir $tensorboard_logs_dir"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "kOU5Gu4Ku1Qc"
},
"source": [
"### Load your trained model\n",
"\n",
"Once training is complete, you can retrieve your model from the GCS Bucket you specified above."
]
},
{
"cell_type": "code",
"metadata": {
"id": "rHoQnqKhu2Y8"
},
"source": [
"import tensorflow as tf\n",
"if not tfc.remote():\n",
" trained_model = tf.keras.models.load_model(values['saved_model'])\n",
" trained_model.summary()"
],
"execution_count": null,
"outputs": []
}
]
}
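
A possible follow-up for the "Project Configurations" and "Set up TensorFlow Cloud run" cells, offered as a rough sketch rather than as part of this change: the notebook ships with placeholder values (`YOUR_PROJECT_ID`, `YOUR_GCS_BUCKET_NAME`) and writes `requirements.txt` by string concatenation. The helper names below (`PLACEHOLDERS`, `check_project_config`, `write_requirements`) are hypothetical and do not exist in tensorflow-cloud; they only illustrate failing early before a remote job is submitted and writing one requirement per line.

```python
# Hypothetical helpers for the notebook; not part of tensorflow-cloud.

# Placeholder values that appear in the "Project Configurations" cell.
PLACEHOLDERS = {"YOUR_PROJECT_ID", "YOUR_GCS_BUCKET_NAME"}


def check_project_config(project_id, bucket):
    """Raise ValueError if a setting is empty or still a placeholder."""
    settings = (("GCP_PROJECT_ID", project_id), ("GCS_BUCKET", bucket))
    for name, value in settings:
        if not value or value in PLACEHOLDERS:
            raise ValueError(f"{name} is not set: {value!r}")


def write_requirements(path, packages):
    """Write one requirement per line, ending the file with a newline."""
    with open(path, "w") as f:
        f.write("\n".join(packages) + "\n")
    return path


if __name__ == "__main__":
    # The same three requirements the notebook installs.
    write_requirements(
        "requirements.txt",
        [
            "git+https://github.com/tensorflow/cloud.git@refs/pull/352/head"
            "#egg=tensorflow-cloud&subdirectory=src/python",
            "tf-models-official",
            "keras==2.6.0rc0",
        ],
    )
```

In the notebook, `check_project_config(GCP_PROJECT_ID, GCS_BUCKET)` could run just before the `run_models` cell, so a forgotten placeholder surfaces locally instead of after the job has been submitted.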