The YAML configuration and template system is what makes Azure DMS Batch uniquely powerful and flexible.
With just a simple YAML file, you can define and submit complex Azure Batch jobs without writing any code:
```yaml
# Basic job configuration
resource_group: my_resource_group
job_name: test_job
batch_account_name: my_batch_account
storage_account_name: my_storage_account
template_name: win_dsm2

# VM configuration
vm_size: standard_ds2_v2
num_hosts: 1

# Command to execute
command: 'echo "This is a test job"'
```
Submit your job with a single command:
```shell
dmsbatch submit-job --file my_config.yml
```
- Simple Job Definition - Define all aspects of your job in a clean, readable YAML format
- Pre-built Templates - Use specialized templates for different workloads (DSM2, SCHISM, MPI, etc.)
- Minimal Configuration - Override only the parameters you need; inherit sensible defaults from templates
- Dynamic Substitution - Use variable references and custom tags for flexible configurations
- Template-Based Architecture - Standardized approach for different model types
- Job YAML Configuration Guide - How to write job configuration YAML files
- Template System Documentation - Deep dive into how the template system works
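To illustrate the minimal-configuration idea, a job file can carry only the values that differ from the template's defaults. This sketch reuses the field names from the example above; the job name and command are placeholders, not values from this project:

```yaml
# Everything not listed here -- pool setup, application packages, and
# other defaults -- is inherited from the win_dsm2 template.
resource_group: my_resource_group
batch_account_name: my_batch_account
storage_account_name: my_storage_account
template_name: win_dsm2

job_name: hydro_version_check   # placeholder job name
command: 'hydro -v'             # placeholder command
```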
Models are processes that take input via files and environment variables, run an executable, and produce output:

(input(s) --> EXE --> output(s))

An Azure Batch run wraps a model, i.e., an executable that runs independently given a set of input files and environment variables and produces a set of output files.
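This contract can be sketched with a toy stand-in (not a real model executable): an input file plus an environment variable go in, and an output file comes out:

```shell
#!/bin/sh
# Toy stand-in for a model run:
# input file(s) + environment variable(s) -> EXE -> output file(s).
echo "timestep=15min" > model_input.txt    # the input file

# The "executable": consumes the input file and an environment variable,
# and produces an output file.
export MODEL_STAGE=calibration
{
  cat model_input.txt
  echo "stage=$MODEL_STAGE"
} > model_output.txt

cat model_output.txt
```

An Azure Batch task follows the same shape: stage the inputs, set the environment, run the executable, and collect the outputs.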
Use `environment.yml` with conda to create an environment called `azure`:

```shell
conda env create -f environment.yml
```

or install the dependencies with pip:

```shell
pip install -r requirements.txt
```
Git clone this project:

```shell
git clone https://github.com/CADWRDeltaModeling/azure_dms_batch.git
```

Change directory to the location of this project and then install using:

```shell
pip install --no-deps -e .
```
Setup can be done via `az` commands. Here we set up a Batch account with associated storage. First, log in:

```shell
az login
```
See the Azure docs for details. To use the commands below, replace the angle-bracketed placeholders with your own values:

```shell
az group create --name <resource_group_name> --location <location_name>
az storage account create --resource-group <resource_group_name> --name <storage_account_name> --location <location_name> --sku Standard_LRS
az batch account create --name <batch_account_name> --storage-account <storage_account_name> --resource-group <resource_group_name> --location <location_name>
```
You can also create the Batch account and its associated storage account via the Azure portal, as explained here: https://docs.microsoft.com/en-us/azure/batch/batch-account-create-portal
This is needed later when deciding what machine sizes to use:

```shell
az batch location list-skus --location <location_name> --output table
```

You can also browse availability by region, since not all VMs are available in every region. This page guides the selection of VMs by different attributes.
```shell
set AZ_BATCH_ACCOUNT=<batch_account_name>
set AZ_BATCH_ACCESS_KEY=<batch_account_key>
set AZ_BATCH_ENDPOINT=<batch_account_url>
```
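On Linux or macOS shells, the equivalent uses `export`. The variable names come from above; the values here are placeholders of my choosing — substitute your own account's name, key, and URL:

```shell
# POSIX-shell equivalent of the Windows `set` commands above.
export AZ_BATCH_ACCOUNT=my_batch_account                                    # placeholder name
export AZ_BATCH_ACCESS_KEY=xxxxxxxxxxxx                                     # placeholder key
export AZ_BATCH_ENDPOINT=https://my_batch_account.westus2.batch.azure.com   # placeholder URL
echo "$AZ_BATCH_ACCOUNT"
```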
```shell
az batch pool supported-images list --output table
```

A sample output is included for quick reference.
Azure allows you to do most things via the command-line interface (CLI) or the web console. However, I have found the following desktop apps useful for working with these services.
Batch Explorer is a desktop tool for managing Batch jobs, pools, and application packages.
Storage Explorer is a desktop tool for working with storage containers.
For all new projects, we recommend using the YAML configuration system instead of writing Python code directly:
```shell
# Submit a job using a YAML configuration file
dmsbatch submit-job --file my_config.yml
```
This approach is much simpler than the notebook examples and provides all the same capabilities. We have several example configuration files in the sample_configs directory, such as:
- sample_dsm2_ptm.yml - DSM2 Particle Tracking Model
- sample_container_echo.yml - Container-based job
- sample_schism_pp.yml - SCHISM post-processing
See the detailed documentation for SCHISM specific run setup in README-schism-batch.md
Note: For MPI workloads, YAML configuration is now the recommended approach. See the SCHISM-specific configuration guide and the template system documentation for details.
Note: For information on how to submit parameterized runs, see the architecture documentation.
An example notebook for PTM batch runs that vary based on environment variables demonstrates this capability. It also shows an example where a large file needs to be uploaded and shared with all the running tasks.
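As a rough illustration of that pattern (this is not the notebook's actual code), each task runs the same command but sees a different environment-variable value:

```shell
#!/bin/sh
# Sketch: parameterized runs where tasks differ only by an environment
# variable -- here a particle count, by analogy with the PTM example.
for N in 100 500 1000; do
  NPARTICLES=$N sh -c 'echo "simulating $NPARTICLES particles" > "run_$NPARTICLES.txt"'
done
cat run_*.txt
```

In a real batch run, each iteration would correspond to a task whose environment is set at submission time, with the shared large file staged once and mounted or downloaded by every task.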
Note: For information on how BeoPEST is implemented, see the architecture documentation.
A notebook implementing the BeoPEST run scheme demonstrates how this works.
See the sample notebooks for examples. The samples explain each step and can be used as templates for writing your own batch runs.
See the simplest example notebook for running DSM2 hydro and outputting its version.
See the slightly more involved example notebook for running DSM2 hydro with input and output file handling; it uploads the input files as a zip and, at the end of the run, uploads the output directory next to the uploaded input files.
Note: While these notebooks demonstrate how to use the Azure DMS Batch API directly, we recommend using the YAML configuration system for new projects.
- Job YAML Configuration Guide - How to write job configuration YAML files
- Template System Documentation - Details on how the template system works
- Script Templates Documentation - In-depth information on script templates
- SCHISM-specific Configuration - For SCHISM model workloads
- Architecture Documentation - Implementation details for developers