This repository contains the Terraform scripts that build and deploy the BigQuery Antipattern Recognition tool to Cloud Run Jobs. On each execution, the tool is configured to perform antipattern recognition for all jobs run during the previous 24 hours and to write the results to a BigQuery table. Optionally, a Cloud Scheduler cron job can be deployed to run the tool on a schedule.
The following resources are created when you run the code:
- Cloud Run Job
- Service Account for Cloud Run Job
- Cloud Scheduler (optional)
- Service Account for Cloud Scheduler
- Artifact Registry
- Table in BigQuery Dataset (optional)
Follow the instructions below to set up the tool through Google Cloud Shell or a local terminal.
Before you begin, ensure you have met the following requirements:
- Terraform: Terraform is preinstalled in Google Cloud Shell. If you are running locally, install it by following the official Terraform installation instructions. Make sure you're using Terraform version 1.3.0 or later.
- Google Cloud SDK: You should have the Google Cloud SDK installed and configured on your local terminal. If you don't have it installed, follow the official Google Cloud SDK installation instructions.
- Google Cloud Project: You should have a Google Cloud project. If you don't have one, create one in the Google Cloud console.
- Permissions: Ensure that you have the necessary permissions to create and manage resources in the Google Cloud project.
- BigQuery: You should have BigQuery enabled and an existing dataset in your Google Cloud project.
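The tool-related prerequisites above can be checked from a terminal before you start. The following is a minimal sketch, not part of the repository: the function names `version_ge` and `check_prerequisites` are hypothetical, and the version comparison assumes a `sort` that supports `-V` (GNU coreutils). It only checks that `terraform` and `gcloud` are on the `PATH` and that Terraform meets the 1.3.0 minimum; it does not verify project permissions or the BigQuery dataset.

```shell
#!/usr/bin/env bash
# Sketch: check the command-line prerequisites before running Terraform.
# Function names are illustrative and not part of the repository.

MIN_TERRAFORM_VERSION="1.3.0"   # minimum version stated in the prerequisites

# Succeeds when version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Prints one line per problem found; returns non-zero if any check fails.
check_prerequisites() {
  local status=0 tf_version

  if ! command -v terraform >/dev/null 2>&1; then
    echo "terraform not found on PATH"
    status=1
  else
    # The first line of `terraform version` looks like "Terraform v1.5.7".
    tf_version="$(terraform version | head -n 1 | grep -Eo '[0-9]+\.[0-9]+\.[0-9]+')"
    if ! version_ge "$tf_version" "$MIN_TERRAFORM_VERSION"; then
      echo "terraform ${tf_version:-unknown} is older than $MIN_TERRAFORM_VERSION"
      status=1
    fi
  fi

  if ! command -v gcloud >/dev/null 2>&1; then
    echo "gcloud not found on PATH"
    status=1
  fi

  return "$status"
}
```

After sourcing the snippet, `check_prerequisites && echo "Prerequisites found"` reports any missing tool.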
- Clone the repository:

      git clone https://github.com/GoogleCloudPlatform/bigquery-antipattern-recognition.git
- Navigate to the Terraform directory:

      cd bigquery-antipattern-recognition/terraform/
- Update the variables in the tfvars file:
  Open the terraform.tfvars file in your preferred text editor and update the values to match your requirements.
  Here's an example of the variable declarations in terraform.tfvars:

      project_id            = "" # The ID of the Google Cloud Project where all resources will be created
      region                = "" # The region in which the Artifact Registry, Cloud Run and Cloud Scheduler services will be deployed
      repository            = "" # The name of the Artifact Registry repository
      cloud_run_job_name    = "" # The name of the Cloud Run job that will be created
      output_table          = "" # The BigQuery table that will be used for storing the results from the antipattern recognition tool
      apply_scheduler       = "" # Whether to apply scheduler or not (true or false)
      scheduler_frequency   = "" # Schedule frequency for the Cloud Scheduler job, in cron format. Default value is "0 5 * * *"
      bigquery_dataset_name = "" # Name of the existing BigQuery dataset where output table will be created
      create_output_table   = "" # Determines whether the output table is created in the BigQuery Dataset. The default value is true.

  For example:

      project_id            = "demo-prj-873454"
      region                = "us-central1"
      repository            = "bigquery-antipattern-recognition"
      cloud_run_job_name    = "bigquery-antipattern-recognition"
      output_table          = "antipattern_output_table"
      apply_scheduler       = true
      scheduler_frequency   = "0 5 * * *"
      bigquery_dataset_name = "antipattern"
      create_output_table   = true
- Initialize Terraform:
  Run the terraform init command to download and initialize the necessary provider plugins for Terraform.

      terraform init
- Apply Terraform configuration:
  Apply the Terraform configuration using the following command. This will create all the required resources in Google Cloud.

      terraform apply

  To review the changes before applying, you can use terraform plan.
  Note: Make sure to confirm the action by typing yes when Terraform asks for approval.
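The steps above, from editing terraform.tfvars through terraform apply, can be sketched as one script that refuses to deploy while variables are still unset. This is an illustrative sketch rather than part of the repository: `find_unset_tfvars` and `deploy` are hypothetical names, and the list of variables treated as required is an assumption (it contains the variables for which the tfvars example states no default).

```shell
#!/usr/bin/env bash
# Sketch: validate terraform.tfvars, then run init / plan / apply.
# Function names and REQUIRED_VARS are assumptions, not part of the repository.

# Variables from the tfvars example that have no stated default.
REQUIRED_VARS="project_id region repository cloud_run_job_name output_table apply_scheduler bigquery_dataset_name"

# Prints each required variable that is missing from the tfvars file or still
# set to the empty string "". Returns non-zero if any name was printed.
find_unset_tfvars() {
  local file="$1" name missing=0
  for name in $REQUIRED_VARS; do
    # Accept either a non-empty quoted value or an unquoted value such as true.
    if ! grep -Eq \
        -e "^[[:space:]]*${name}[[:space:]]*=[[:space:]]*\"[^\"]+\"" \
        -e "^[[:space:]]*${name}[[:space:]]*=[[:space:]]*[^[:space:]\"#]" \
        "$file"; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
}

# Runs the documented sequence from inside the terraform/ directory.
# terraform apply still prompts for approval; type yes to confirm.
deploy() {
  if ! find_unset_tfvars terraform.tfvars; then
    echo "Set the variables listed above in terraform.tfvars first." >&2
    return 1
  fi
  terraform init && terraform plan && terraform apply
}
```

After sourcing the snippet, run `deploy` from the terraform/ directory. Running `find_unset_tfvars terraform.tfvars` on its own lists the variables that still need values without touching any cloud resources.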
If you have suggestions or improvements, feel free to submit a pull request or create an issue.