The monitors catch specific errors and, in some cases, do not re-queue the job when one occurs. All other errors fall through to the Job Queue service's default handling, which also does not re-queue the job.
For every error case that leads to the job being discarded, we should consider sending a standard FAILED result for the Infrastructure or Lifecycle task, so that Brent knows the task has finished in error. Otherwise, Brent keeps waiting for a response and only marks the transition as failed when a timeout occurs, with no knowledge of the underlying error.
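A rough sketch of the proposed behaviour, to make the idea concrete. This is not the actual monitor code: `TaskResult`, `RetryableError`, `handle_job`, `check_task` and `send_result` are all hypothetical names invented for illustration, and the real monitors and Job Queue service will have different signatures. The only point is that any path which discards the job also sends a FAILED result first.

```python
from dataclasses import dataclass
from typing import Callable, Optional

STATUS_FAILED = "FAILED"


@dataclass
class TaskResult:
    # Hypothetical result message sent back to Brent.
    request_id: str
    status: str
    failure_reason: Optional[str] = None


class RetryableError(Exception):
    """Hypothetical marker for errors where the job should be re-queued."""


def handle_job(
    job: dict,
    check_task: Callable[[str], Optional[TaskResult]],
    send_result: Callable[[TaskResult], None],
) -> bool:
    """Return True if the job is finished (discarded), False to re-queue it."""
    request_id = job["request_id"]
    try:
        result = check_task(request_id)
    except RetryableError:
        # Temporary problem: keep the job on the queue, send nothing yet.
        return False
    except Exception as exc:
        # The job is about to be discarded, so report the failure
        # instead of leaving Brent to wait for a timeout.
        send_result(TaskResult(request_id, STATUS_FAILED, str(exc)))
        return True
    if result is None:
        # Task still in progress: check again later.
        return False
    send_result(result)
    return True
```

With something like this in place, the default handler would only ever see jobs that have already been reported, so a discarded job never leaves the transition hanging.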
Raising this as a bug rather than a new feature request, as it appears to be a mistake in the implementation of the monitors.