AzFleet (Azure Fleet) is a set of tools for running IO tests on a fleet of Linux or Windows VMs in Azure. It provides PowerShell scripts for creating pools of VMs and executing jobs that simulate IO workloads, and is implemented in PowerShell, Bash, and Python scripts as well as ARM templates.
You only need a client machine that can connect to your Azure environment. Ensure that you have installed the Azure PowerShell module and can connect to the Azure cloud. Follow the guidance in the Azure documentation.
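For example, a minimal sketch of installing the Azure PowerShell module and signing in (adjust the subscription selection to your environment):

```powershell
# Install the Az module for the current user (skip if it is already installed)
Install-Module -Name Az -Scope CurrentUser -Repository PSGallery -Force

# Sign in and select the subscription to use for the test fleet
Connect-AzAccount
Set-AzContext -Subscription "<your subscription name or id>"
```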
From GitHub, download all the files from the tools directory of this repository. You can run the following script to download the tools:
```powershell
[Net.ServicePointManager]::SecurityProtocol = "tls12, tls11, tls"
Invoke-WebRequest -Uri https://github.com/bekimd-ms/azfleet/archive/master.zip -OutFile azfleet.zip
Expand-Archive -Path .\azfleet.zip -DestinationPath .\azfleet
Copy-Item -Path .\azfleet\azfleet-master\toolsvmss\* .\azfleet -Recurse -Force
Remove-Item -Recurse -Path .\azfleet\azfleet-master\
Remove-Item .\azfleet.zip
```
TODO: Running the tools in disconnected mode is not currently supported. If there is enough interest, the tools and the process can easily be modified to support it.
You can start your first test workload by following this sequence of PowerShell commands.
Open a PowerShell console and log in to your Azure environment as described in the Azure documentation referenced above.
Set the location, group name, username, and password variables:
```powershell
$location = [your Azure region name]
$groupname = [name of the resource group that will contain all the resources]
$username = [name of the admin user for the VMs]
$password = [password of the admin user for the VMs]
```
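For example (the values below are illustrative; substitute your own):

```powershell
$location  = "eastus"
$groupname = "azfleet"
$username  = "azfleetadmin"
$password  = "<a strong password>"
```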
Create a resource group and deploy the controller VM. The template will deploy a vnet that all VMs will share.
.\deploygroup.ps1 -GroupName $groupname -Location $location
After you deploy the group, create a configuration file that contains the name of the group and the name of the storage account used for test data. The name of the storage account is "grp" + [name of the group] + "sa". Here is an example of a config file:
```
{
    resourcegroup: "azfleet",
    storageaccount: "grpazfleetsa"
}
```
Add the following environment variable with the name of your config file:
$env:AZFLEET_CONFIG="azfleet.conf"
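For example, a minimal sketch that creates the config file from the example above and points the environment variable at it:

```powershell
# Write the configuration file next to the AzFleet scripts
@'
{
    resourcegroup: "azfleet",
    storageaccount: "grpazfleetsa"
}
'@ | Set-Content -Path .\azfleet.conf

# Tell AzFleet which configuration file to use for this session
$env:AZFLEET_CONFIG = "azfleet.conf"
```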
Create two pools of 2 VMs each: one pool of Linux VMs and one pool of Windows VMs.
```powershell
.\deploypool.ps1 -vmPool lin1 -vmCount 2 -vmOS linux -vmSize Standard_F2s_v2 -vmDataDisks 1 -vmDataDiskGB 128 -vmAdminUserName $username -vmAdminPassword $password
.\deploypool.ps1 -vmPool win1 -vmCount 2 -vmOS windows -vmSize Standard_F2s_v2 -vmDataDisks 1 -vmDataDiskGB 128 -vmAdminUserName $username -vmAdminPassword $password
```
Use the "-site" parameter if you want to deploy to a EdgeZone site.
After the template deployments complete, check that the VMs are ready to execute jobs.
.\control.ps1 pool get
This command returns a list of all VMs in each pool and their status.
```
Pool: lin1 READY
Name State IP OS Size Timestamp
---- ----- -- -- ---- ---------
lin1-vm0 READY 10.0.0.6 Linux Standard_F2s_v2 10/23/2018 12:07:19 +00:00
lin1-vm1 READY 10.0.0.5 Linux Standard_F2s_v2 10/23/2018 12:07:25 +00:00

Pool: win1 READY
Name State IP OS Size Timestamp
---- ----- -- -- ---- ---------
win1-vm0 READY 10.0.0.8 Windows Standard_F2s_v2 10/23/2018 12:07:21 +00:00
win1-vm1 READY 10.0.0.7 Windows Standard_F2s_v2 10/23/2018 12:07:25 +00:00
```
When all the VMs are ready, you can start your first job:
.\control.ps1 job start "lin1=randrw8k-lin.job win1=randrw8k-win.job"
This command will start the job and output the job ID and other information.
```
Creating job: 20181023-200756
Copying file: randrw8k-lin.job to storage: .\\workload\\randrw8k-lin.job
Executing job: randrw8k-lin.job on pool: lin1
Starting task: EXECUTE|fio| on node: lin1-vm0
Starting task: EXECUTE|fio| on node: lin1-vm1
Copying file: randrw8k-win.job to storage: .\\workload\\randrw8k-win.job
Executing job: randrw8k-win.job on pool: win1
Starting task: EXECUTE|fio| on node: win1-vm0
Starting task: EXECUTE|fio| on node: win1-vm1
```
To get the status of the job, use the job ID from the output of the previous command:
.\control.ps1 job get 20181023-200756
While the jobs are executing, this command shows the status of each VM.
```
Job 20181023-200756 is: EXECUTING
Node State LastUpdateTime Output
---- ----- -------------- ------
lin1_lin1-vm0 EXECUTING 10/23/2018 12:09:20 +00:00 20181023-200756lin1_lin1-vm0
lin1_lin1-vm1 EXECUTING 10/23/2018 12:09:26 +00:00 20181023-200756lin1_lin1-vm1
win1_win1-vm0 EXECUTING 10/23/2018 12:09:21 +00:00 20181023-200756win1_win1-vm0
win1_win1-vm1 EXECUTING 10/23/2018 12:09:25 +00:00 20181023-200756win1_win1-vm1
```
When the jobs are completed, summary results for the run are shown.
```
Job 20181023-200756 is: COMPLETED

JobParams: wl=randrw 60:40; bs=64K; iodepth=64; jobs=4; filesize=4G; runtime=120; engine=libaio
Node State RIOPSmean RMbsmean RLatmean RLat50p RLat90p RLat99p WIOPSmean WMbsmean WLatmean WLat50p WLat90p WLat99p UsrCPU SysCPU
---- ----- --------- -------- -------- ------- ------- ------- --------- -------- -------- ------- ------- ------- ------ ------
lin1_lin1-vm0 COMPLETED 4097 262 37.194 1.942 110.625 124.256 2736 175 37.815 1.909 111.673 126.353 1 4
lin1_lin1-vm1 COMPLETED 4100 262 37.153 1.843 110.625 124.256 2737 175 37.820 2.114 111.673 126.353 1 4

JobParams: wl=randrw 60:40; bs=64K; iodepth=64; jobs=4; filesize=4G; runtime=120; engine=windowsaio
Node State RIOPSmean RMbsmean RLatmean RLat50p RLat90p RLat99p WIOPSmean WMbsmean WLatmean WLat50p WLat90p WLat99p UsrCPU SysCPU
---- ----- --------- -------- -------- ------- ------- ------- --------- -------- -------- ------- ------- ------- ------ ------
win1_win1-vm0 COMPLETED 1905 122 80.167 44.827 200.278 471.859 1270 81 81.061 45.351 202.375 476.054 0 1
win1_win1-vm1 COMPLETED 1939 124 78.961 52.167 202.375 387.973 1292 83 79.326 52.167 202.375 387.973 0 1
```
Run the deploygroup.ps1 script to create the resource group and deploy a virtual network and other shared objects for all the VMs that you will use for workload tests.
.\deploygroup.ps1 -GroupName azfleet -Location $location
You only need to run this once for a new resource group.
### Pools
Pools are sets of VMs with identical configuration. Each pool is created as a VM scale set with a load balancer.
All the VMs in a pool have the same OS, size, number of data disks and sizes of data disks.
You can increase or decrease the size of the pool. You can stop and restart all the VMs in the pool.
When executing a workload job you target a pool. All the VMs in the pool will execute the same job with the same parameters.
#### Deploy a new pool
To deploy a new pool run the deploypool.ps1 script.
```powershell
.\deploypool.ps1 -vmPool pool1 `
                 -vmCount 2 `
                 -vmOS linux `
                 -vmSize Standard_F2s_v2 `
                 -vmDataDisks 4 `
                 -vmDataDiskGB 128 `
                 -vmAdminUserName $username `
                 -vmAdminPassword $password
```
To get information about the VMs in a pool, run the getpool.ps1 script.
.\getpool.ps1 -vmPool pool1
To stop (deallocate) all the VMs in a pool, run the stoppool.ps1 script.
.\stoppool.ps1 -vmPool pool1
To start all the VMs in a pool, run the startpool.ps1 script.
.\startpool.ps1 -vmPool pool1
You can scale the pool with the scalepool.ps1 script.
To set the size of the pool to 5 VMs:
.\scalepool.ps1 -vmPool pool1 -vmCount 5
To remove all the VMs in the pool and the associated load balancer, run the removepool.ps1 script.
.\removepool.ps1 -vmPool pool1
AzFleet uses fio to run tests on both Windows and Linux VMs.
The tool uses fio job definitions as described in the fio documentation (see, for example, https://github.com/axboe/fio/tree/master/examples).
Two default jobs, one for Windows and one for Linux, are available in the workloads directory. You can create additional job definitions; they must be stored in the workloads directory.
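For reference, a minimal sketch of what such a job definition might look like, modeled on the parameters shown in the sample report above (the section name and target file path are assumptions; see the fio examples linked above for the full syntax):

```ini
; Random read/write job modeled on the parameters in the sample report above.
; Use ioengine=windowsaio in job files that target Windows pools.
[global]
ioengine=libaio
rw=randrw
rwmixread=60
bs=64k
iodepth=64
numjobs=4
size=4G
runtime=120
time_based
direct=1
group_reporting

; The section name and file path below are placeholders for illustration.
[randrw-test]
filename=/data/fio-testfile
```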
To execute and control jobs use the control.ps1 script.
You can get the status of the VMs in the pools by executing the following command:
.\control.ps1 pool get
This will output a list of the VMs registered in each pool and their current status.
Execute the workload by using the control.ps1 script and passing a workload definition.
.\control.ps1 job start "pool1=jobfile1.job pool2=jobfile2.job"
This command executes jobfile1.job on pool1 and jobfile2.job on pool2. It adds a new job, and the VM agents in the respective pools start executing it.
You can use the control.ps1 script to list all previously completed and currently active jobs.
.\control.ps1 job get
You can use the control.ps1 script to check the status of a specific job by passing its job ID.
.\control.ps1 job get [job id]
While the VM agents are still executing the job, the script outputs a list of the VMs and their current status. When all VM agents have completed the job, the script outputs the summarized results for all the VMs.
The following table describes the columns in the job report.
| Column | Description |
|:------------- |:---------------------- |
| Node | Name of the VM executing the job |
| State | Current state of the VM |
| RIOPSmean | Read IOPS mean value |
| RMbsmean | Read throughput mean value in MB/s |
| RLatmean | Read latency mean value (us) |
| RLat50p | Read latency 50th percentile |
| RLat90p | Read latency 90th percentile |
| RLat99p | Read latency 99th percentile |
| WIOPSmean | Write IOPS mean value |
| WMbsmean | Write throughput mean value in MB/s |
| WLatmean | Write latency mean value (us) |
| WLat50p | Write latency 50th percentile |
| WLat90p | Write latency 90th percentile |
| WLat99p | Write latency 99th percentile |
| UsrCPU | User-mode CPU utilization |
| SysCPU | System-mode CPU utilization |