FleetAutoscaler webhook and Wasm handlers panic when response body omits 'response' field #4555

@MusGaas

Summary

applyWebhookPolicy and applyWasmPolicy in
pkg/fleetautoscalers/fleetautoscalers.go both dereference
Response.Scale without first checking if Response is nil.

When a webhook or Wasm policy returns a JSON body that omits the
response field (e.g., {}), json.Unmarshal succeeds with
Response as a nil pointer. The subsequent dereference panics,
crashing the controller.
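The decoding behaviour described above can be reproduced in isolation. The sketch below is not the Agones code: `scaleResponse`, `review`, and `decode` are hypothetical stand-ins that only assume the reply struct holds its `response` payload behind a pointer, as the summary describes.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// scaleResponse is a hypothetical stand-in for the payload nested under "response".
type scaleResponse struct {
	Scale    bool  `json:"scale"`
	Replicas int32 `json:"replicas"`
}

// review is a hypothetical stand-in for the decoded policy reply.
// The pointer field stays nil when the JSON object has no "response" key.
type review struct {
	Response *scaleResponse `json:"response"`
}

// decode unmarshals a policy reply body into a review.
func decode(body []byte) (*review, error) {
	var r review
	if err := json.Unmarshal(body, &r); err != nil {
		return nil, err
	}
	return &r, nil
}

func main() {
	r, err := decode([]byte(`{}`))
	// Decoding succeeds, yet Response is nil, so reading
	// r.Response.Scale at this point would panic.
	fmt.Println("decode error:", err)
	fmt.Println("Response is nil:", r.Response == nil)
}
```

An empty object is valid JSON, so `json.Unmarshal` reports no error and leaves the pointer field at its zero value. The missing field only surfaces at the later dereference.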

Both HA controller replicas pull the same FleetAutoscaler from the
shared workqueue and crash on the same item, producing CrashLoopBackOff
on both replicas. FleetAutoscaler reconciliation halts cluster-wide
until the resource is removed by an admin.

Affected code

pkg/fleetautoscalers/fleetautoscalers.go:

  • applyWebhookPolicy: faResp.Response.Scale accessed without nil check
  • applyWasmPolicy: review.Response.Scale accessed without nil check

Reproduction

Tested against Agones release 1.57.0, installed with helm install in a kind cluster.

  1. Deploy any minimal Fleet.
  2. Deploy an HTTP server that returns {} to all POST requests:
from http.server import HTTPServer, BaseHTTPRequestHandler

class H(BaseHTTPRequestHandler):
    def do_POST(self):
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(b'{}')

HTTPServer(('', 8888), H).serve_forever()
  3. Apply a FleetAutoscaler pointing at that server:
apiVersion: autoscaling.agones.dev/v1
kind: FleetAutoscaler
metadata:
  name: panic-poc
spec:
  fleetName: <fleet-name>
  policy:
    type: Webhook
    webhook:
      url: "http://<server-service>.<namespace>.svc.cluster.local:8888/"
  4. Both agones-controller pods enter CrashLoopBackOff within 30s.

Stack trace

panic: runtime error: invalid memory address or nil pointer dereference
agones.dev/agones/pkg/fleetautoscalers.applyWebhookPolicy(...)
.../pkg/fleetautoscalers/fleetautoscalers.go:355 +0x974
[recovered, repanicked]

Fix

Add a nil check on Response before dereferencing it at both sites.
A PR with the fix is forthcoming.
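One possible shape for such a guard is sketched below. It is not the forthcoming PR or the actual Agones code: the types, the `desiredReplicas` helper, and the `errMissingResponse` value are hypothetical, and returning an error that leaves the replica count unchanged is only one reasonable choice.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the decoded policy reply.
type scaleResponse struct {
	Scale    bool  `json:"scale"`
	Replicas int32 `json:"replicas"`
}

type review struct {
	Response *scaleResponse `json:"response"`
}

// errMissingResponse is a hypothetical error for replies without a "response" field.
var errMissingResponse = errors.New("policy reply has no 'response' field")

// desiredReplicas decodes a policy reply and returns the replica count to use.
// On any error it returns the current count, so the caller can log and retry
// instead of crashing.
func desiredReplicas(body []byte, current int32) (int32, error) {
	var r review
	if err := json.Unmarshal(body, &r); err != nil {
		return current, fmt.Errorf("decoding policy reply: %w", err)
	}
	// The guard: check the pointer before touching any of its fields.
	if r.Response == nil {
		return current, errMissingResponse
	}
	if !r.Response.Scale {
		return current, nil
	}
	return r.Response.Replicas, nil
}

func main() {
	n, err := desiredReplicas([]byte(`{}`), 3)
	fmt.Println(n, err)
}
```

With a guard of this kind, a malformed reply becomes an ordinary reconciliation error for that one FleetAutoscaler rather than a process-wide panic.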
