Summary
roc interrupt (v3.1.0) fails when the FIS role aws-fis-itn does not yet exist. The auto-create step submits a malformed IAM trust/assume-role policy document, so iam:CreateRole returns MalformedPolicyDocument: Syntax error at position (8,9) and the command aborts before starting the FIS experiment.
Version
- roc v3.1.0 (roc_v3.1.0_darwin_arm64)
- Region: eu-central-1
- RunsOn Flex (Terraform) deployment
Repro
Run roc interrupt against a job on a running spot instance, in an account where the aws-fis-itn role doesn't already exist:
roc interrupt <JOB_URL> --stack <stack> --delay 30s --debug
Observed output
Found instance i-xxxx for job xxxx
Using AWS region: eu-central-1
Testing basic AWS connectivity...
✓ AWS connectivity verified (Account: xxxx)
Performing pre-flight checks...
Testing FIS service access...
✓ FIS service access verified
Verifying instance i-xxxx details...
Instance lifecycle: spot, state: running
✓ Instance i-xxxx is a running spot instance
Triggering spot interruption on instance i-xxxx with 30s delay in region eu-central-1...
Creating IAM role: aws-fis-itn
failed to trigger spot interruption in region eu-central-1: failed to create FIS role: failed to create role: operation error IAM: CreateRole, https response error StatusCode: 400, RequestID: xxxx, MalformedPolicyDocument: Syntax error at position (8,9)
All pre-flight checks pass (FIS access is verified and the instance is confirmed as a running spot instance); the command fails only at CreateRole. The (8,9) position points at the assume-role policy document that roc generates for the role.
Workaround
Pre-create the role manually; roc reuses it if it already exists. A minimal working definition:
- Trust policy: allow fis.amazonaws.com to sts:AssumeRole
- Permissions: ec2:SendSpotInstanceInterruptions on arn:aws:ec2:*:*:instance/*
After creating it by hand, roc interrupt proceeds and the FIS experiment runs as expected.
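For anyone else applying the workaround, here is a minimal sketch of the two policy documents described above, built in Python so the JSON is guaranteed to be well-formed. The variable names and the render helper are illustrative only, and this is not necessarily the exact document roc intends to generate.

```python
import json

# Trust (assume-role) policy: lets the FIS service assume the role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "fis.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# Permissions policy: the single action named in the workaround above.
permissions_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "ec2:SendSpotInstanceInterruptions",
            "Resource": "arn:aws:ec2:*:*:instance/*",
        }
    ],
}


def render(policy: dict) -> str:
    """Serialize a policy dict to JSON text (hypothetical helper)."""
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    print(render(trust_policy))
    print(render(permissions_policy))
```

The printed documents can be saved to files and supplied when creating the role, for example via aws iam create-role with --assume-role-policy-document for the trust policy and aws iam put-role-policy with --policy-document for the permissions. The file names and the inline policy name are up to you.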
Likely cause
The trust-policy JSON generated for the aws-fis-itn role appears malformed (possibly an encoding/escaping issue, e.g. an empty/invalid Principal or a stray character at the document start). Worth checking the literal string passed to CreateRole's AssumeRolePolicyDocument.
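One way to narrow this down, once the literal AssumeRolePolicyDocument string has been captured, is to run it through a strict JSON parser and see where it stops. The sketch below does that in Python. The function name and the broken sample document are made up for illustration; the sample is not roc's actual output. It also assumes that (8,9) is a line/column pair, which I have not confirmed, and Python's parser may report a different position than IAM does for the same input.

```python
import json
from typing import Optional, Tuple


def find_json_syntax_error(document: str) -> Optional[Tuple[int, int, str]]:
    """Return (line, column, message) for the first JSON syntax error,
    or None if the document parses cleanly."""
    try:
        json.loads(document)
    except json.JSONDecodeError as err:
        return (err.lineno, err.colno, err.msg)
    return None


# Hypothetical malformed trust policy: the Principal value is missing.
broken_sample = "\n".join([
    "{",
    '  "Statement": [',
    "    {",
    '      "Principal": ,',
    "    }",
    "  ]",
    "}",
])

if __name__ == "__main__":
    print(find_json_syntax_error(broken_sample))
```

This only catches JSON-level syntax problems. A document that is valid JSON but violates the IAM policy grammar (for instance, an unrecognized Principal) would pass this check and still be rejected by CreateRole.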