| 1 | +# Custom Windows CI Agent |
| 2 | + |
| 3 | +We maintain our own custom virtual machine image in Azure that we use in Azure Pipelines to run tests on a Windows machine. |
| 4 | +The default Windows agents provided by Azure Pipelines are not legally allowed to have Docker Desktop installed, which is why we need to maintain our own agent image. |
| 5 | + |
| 6 | +These are the components that are used to manage our Windows agent: |
| 7 | +* A virtual machine scale set that is registered with Azure Pipelines so that new agents can be created as needed. |
| 8 | +* A generalized image from which new virtual machines can be created. |
| 9 | + The scale set is configured to create new agents using this image. |
| 10 | +* The original virtual disk that the generalized image was created from. |
| 11 | + We keep this around so that it's easier to update the agent image over time, without having to repeat the entire vm setup. |
| 12 | + Don't keep the entire vm around; that's much more expensive. |
| 13 | + |
| 14 | +## Configuration |
| 15 | +The image has the following additional software: |
| 16 | + |
| 17 | +* WSL2 |
| 18 | +* Docker Desktop for Windows |
| 19 | +* Azure Pipelines Agent |
| 20 | + |
| 21 | +It is configured to automatically start Docker Desktop when the agent starts (normally Docker Desktop only starts when a user logs in). |
| 22 | +The service is called "Docker Desktop" and is equivalent to double-clicking on the Docker Desktop icon, and it runs as our CI user, porterci. |
| 23 | +There is another service called "Docker Desktop Service", but that only runs the backend and isn't sufficient for commands like `docker ps` to work. |
| 24 | +Both are needed. |
| 25 | + |
| 26 | +The agent has a custom user defined, porterci, which has been configured with access to the Docker engine. |
| 27 | +When the Azure Pipelines agent executes jobs, the jobs run under the porterci user account. |
| 28 | + |
| 29 | +The vm is configured with environment variables and scripts so that Azure Pipelines can manage the virtual machine and start jobs. |
| 30 | + |
| 31 | +## Maintaining the image |
| 32 | +Right now we only update the image when it stops working for us. |
| 33 | +For example, we update it when we need a newer version of Docker Desktop installed or need to adjust a configuration setting. |
| 34 | +We do not regularly re-image the agent with security updates, and instead have the agent configured to install updates as needed. |
| 35 | + |
| 36 | +NOTE: Only Microsoft employees can update the image, because our custom Windows agent infrastructure is all hosted in Azure on an internal subscription. |
| 37 | + |
| 38 | +1. Log into the Azure subscription and locate the disk used to generate the current agent image. |
| 39 | + For example, porter-windows-agent-20220810. |
| 40 | +2. Create a virtual machine from the disk using Standard D4s v3 (4 vcpus, 16 GiB memory). |
| 41 | + * It doesn't matter what you use for the admin account, since it will be removed when the vm is generalized later. |
| 42 | + Use your name and preferred password. |
| 43 | + * Select the Windows Client model. |
| 44 | +3. Log into the machine as the administrator account that you specified when you created the virtual machine. |
| 45 | + Use Bastion from inside your web browser to connect, not RDP. |
| 46 | +4. To get into the Docker Desktop user interface, go to "Services" and first stop the "Docker Desktop" service. |
| 47 | + Then double-click on the Docker Desktop icon on the desktop to start a new instance with the user interface attached. |
| 48 | +5. Make any necessary changes to the virtual machine. |
| 49 | +6. Restart the machine, log in as porterci, and validate that you can still run `docker ps`. |
| 50 | +7. Shut down the machine and go to the virtual machine's disk in the Azure Portal. |
| 51 | + Create a snapshot of the disk named after the virtual machine. |
| 52 | + This snapshot is what you will use to create a vm the next time you need to update the agent image. |
| 53 | +8. Start the machine, log in as the administrator, and run the following command. |
| 54 | + ⚠️ The machine will log you out after the command runs, and you cannot log into the machine again afterwards! |
| 55 | + ``` |
| 56 | + C:\Windows\System32\Sysprep\sysprep.exe /unattend:C:\unattend.xml /oobe /generalize /mode:vm /shutdown |
| 57 | + ``` |
| 58 | +9. Stop the vm. Its name will look like `porter-windows-agent-DATE`, where DATE is YYYYMMDD. |
| 59 | + ``` |
| 60 | + az vm stop --resource-group $RESOURCE_GROUP --name $NAME |
| 61 | + ``` |
| 62 | +10. Generalize the vm so that it can be used as a template for making new agents. |
| 63 | + ``` |
| 64 | + az vm generalize --resource-group $RESOURCE_GROUP --name $NAME |
| 65 | + ``` |
| 66 | +11. Create a managed image from the vm. |
| 67 | + ``` |
| 68 | + az image create --resource-group $RESOURCE_GROUP \ |
| 69 | + --name $NAME --os-type windows \ |
| 70 | + --source "/subscriptions/$SUBSCRIPTION/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.Compute/virtualMachines/$NAME" |
| 71 | + ``` |
| 72 | +12. Update the virtual machine scale set to use the new managed image. |
| 73 | + ``` |
| 74 | + az vmss update --resource-group $RESOURCE_GROUP \ |
| 75 | + --name porter-windows \ |
| 76 | + --set virtualMachineProfile.storageProfile.imageReference.id=/subscriptions/$SUBSCRIPTION/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.Compute/images/$NAME |
| 77 | + ``` |
| 78 | +13. Update any existing agents to use the new image. |
| 79 | + ``` |
| 80 | + az vmss update-instances --resource-group $RESOURCE_GROUP \ |
| 81 | + --name porter-windows --instance-ids="*" |
| 82 | + ``` |
| 83 | + This command takes about 15 minutes to complete. |
| 84 | + You can watch the progress by viewing the instances of the vmss in the portal. |
| 85 | +
|
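The az commands in steps 9 through 13 share the same variables and resource IDs, so it can help to review them together before running anything. The following is a minimal sketch, not a script we actually use: it only prints the commands (a dry run), the default subscription and resource group values are hypothetical placeholders, and the `vm_id` and `image_id` helper names are made up for illustration.

```shell
#!/bin/sh
# Dry-run sketch: prints the az commands from steps 9-13 without running them.
set -eu

# Hypothetical placeholders; override them in the environment with real values.
SUBSCRIPTION="${SUBSCRIPTION:-00000000-0000-0000-0000-000000000000}"
RESOURCE_GROUP="${RESOURCE_GROUP:-example-resource-group}"
# The vm name follows porter-windows-agent-DATE, where DATE is YYYYMMDD.
NAME="${NAME:-porter-windows-agent-$(date +%Y%m%d)}"

# Build the resource ID of a virtual machine (args: subscription, group, name).
vm_id() {
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Compute/virtualMachines/%s' "$1" "$2" "$3"
}

# Build the resource ID of a managed image (args: subscription, group, name).
image_id() {
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Compute/images/%s' "$1" "$2" "$3"
}

VM_ID="$(vm_id "$SUBSCRIPTION" "$RESOURCE_GROUP" "$NAME")"
IMAGE_ID="$(image_id "$SUBSCRIPTION" "$RESOURCE_GROUP" "$NAME")"

# Print the commands in the order they should be run.
cat <<EOF
az vm stop --resource-group $RESOURCE_GROUP --name $NAME
az vm generalize --resource-group $RESOURCE_GROUP --name $NAME
az image create --resource-group $RESOURCE_GROUP --name $NAME --os-type windows --source "$VM_ID"
az vmss update --resource-group $RESOURCE_GROUP --name porter-windows --set virtualMachineProfile.storageProfile.imageReference.id=$IMAGE_ID
az vmss update-instances --resource-group $RESOURCE_GROUP --name porter-windows --instance-ids="*"
EOF
```

Set `SUBSCRIPTION`, `RESOURCE_GROUP`, and `NAME` in the environment before running it, check the printed commands against the steps above, and then run them one at a time.
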
| 86 | +## Initial Creation |
| 87 | +
|
| 88 | +These are only **notes** from when I initially created the first vm and vmss. |
| 89 | +I don't remember all the steps anymore, but they may be helpful if we ever need to start over again. |
| 90 | +
|
| 91 | +**Create the virtual machine scale set** |
| 92 | +``` |
| 93 | +az vmss create \ |
| 94 | + --resource-group $RESOURCE_GROUP --name porter-windows \ |
| 95 | + --image "/subscriptions/$SUBSCRIPTION/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.Compute/images/$NAME" \ |
| 96 | + --vm-sku Standard_D4s_v3 \ |
| 97 | + --public-ip-per-vm \ |
| 98 | + --admin-username porter --admin-password "$PASSWORD" \ |
| 99 | + --instance-count 0 --disable-overprovision \ |
| 100 | + --single-placement-group false --platform-fault-domain-count 1 \ |
| 101 | + --upgrade-policy-mode manual --load-balancer "" |
| 102 | +``` |
| 103 | +
|
| 104 | +I think I could do this better by using an image gallery, so that when the gallery is updated with a new image, the vmss would automatically use it. |
| 105 | +But I had trouble getting that to work. |
| 106 | +
|
| 107 | +**Configure Azure Pipelines Agent** |
| 108 | +See https://docs.microsoft.com/en-us/azure/devops/pipelines/agents/v2-windows?view=azure-devops |
| 109 | +
|
| 110 | +Allow PowerShell scripts to run on the machine. |
| 111 | +```powershell |
| 112 | +Set-ExecutionPolicy -ExecutionPolicy Unrestricted -Scope LocalMachine -Force |
| 113 | +``` |
| 114 | + |
| 115 | +Change who the Azure Pipelines agent service runs as with unattended configuration (this is the user the jobs run as) |
| 116 | +https://docs.microsoft.com/en-us/azure/devops/pipelines/agents/v2-windows?view=azure-devops#windows-only-startup |
| 117 | +```powershell |
| 118 | +[Environment]::SetEnvironmentVariable("VSTS_AGENT_INPUT_WINDOWSLOGONACCOUNT", "porterci", 'Machine') |
| 119 | +[Environment]::SetEnvironmentVariable("VSTS_AGENT_INPUT_WINDOWSLOGONPASSWORD", "$PASSWORD", 'Machine') |
| 120 | +``` |
| 121 | + |
| 122 | +I think this was a useful snippet for getting a service to run as a particular user, but it isn't the actual command that I ran. |
| 123 | +```powershell |
| 124 | +$svc=Get-CimInstance win32_service -Filter 'Name="browser"' |
| 125 | +$svc|Invoke-CimMethod -MethodName Change -Arguments @{StartName='domain\user';StartPassword='Pass@W0rd'} |
| 126 | +``` |
| 127 | + |
| 128 | +Configure the agent with the porter administrator account. |
| 129 | +```powershell |
| 130 | +.\config.cmd --unattended --url https://dev.azure.com/getporter ` |
| 131 | + --auth PAT --token $TOKEN --pool windows --agent manual-agent ` |
| 132 | + --runasservice --windowslogonaccount porter --windowslogonpassword "$PASSWORD" |
| 133 | +``` |