feat: Support Complex Custom Structures for Property Values #1015

Description

@mgazz

Is your feature request related to a problem? Please describe.

Currently, ado supports scalar values (strings, numbers), flat lists, and binary blobs for property values. Flat lists, however, lack semantic structure: they are ordered sequences of values with no named fields and no type information for individual elements. This becomes a problem when an actuator needs a property with several related attributes that should be kept together as a cohesive, self-documenting unit.

For example, in the vLLM performance actuator, a Docker image property needs to carry multiple pieces of information:

  • The actual image reference (e.g., icr.io/drl-nextgen/mgazz/vllm:v0.18.0-tt.v1.2.5)
  • The vLLM version (e.g., 0.18.0)
  • Potentially other metadata like supported features, compatibility flags, etc.

Encoding these related attributes as positional elements in a flat list (e.g., ["icr.io/...", "0.18.0"]) is:

  • Not self-documenting: Users must know that position 0 is the image reference and position 1 is the version
  • Error-prone: Easy to mix up the order or forget which position means what
  • Limited: Cannot easily extend with additional attributes without breaking existing code
  • Fragile: Actuator code must use positional indexing (image_value[0], image_value[1]) throughout
  • Untyped: No way to specify that position 0 should be a string (image ref) and position 1 a version string with a specific format

Describe the solution you'd like

Allow property values in discovery spaces to be complex, structured objects with named fields. The structure would be defined and validated by the actuator that introduces the property.

Example YAML Syntax

entitySpace:
  - identifier: "image"
    metadata:
      description: "Docker image with vLLM + terratorch"
    propertyDomain:
      variableType: "CATEGORICAL_VARIABLE_TYPE"
      values:
        - name: vllm-v0-18-0-tt-v1-2-5
          imageRef: "icr.io/drl-nextgen/mgazz/vllm:v0.18.0-tt.v1.2.5"
          vllmVersion: "0.18.0"
        - name: vllm-v0-20-1-tt-main
          imageRef: "icr.io/drl-nextgen/mgazz/vllm:v0.20.1-tt.main"
          vllmVersion: "0.20.1"
          supportsThreadpool: true

Benefits

  1. Self-documenting: Field names make it clear what each attribute represents
  2. Type-safe: Actuators can define schemas for their custom structures using Pydantic models
  3. Extensible: New fields can be added without breaking existing code (with proper defaults)
  4. Maintainable: Actuator code can access fields by name (image_value.imageRef, image_value.vllmVersion)
  5. Validated: The actuator can validate that all required fields are present and have correct types
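
A minimal sketch of what such a structured value could look like on the actuator side. It uses a standard-library dataclass as a stand-in for the Pydantic model mentioned above, so it runs without extra dependencies. The class name `ImageValue` and the helper `from_mapping` are hypothetical; the field names are taken from the YAML example.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping


@dataclass(frozen=True)
class ImageValue:
    """Hypothetical structured value for the `image` property."""

    name: str
    imageRef: str
    vllmVersion: str
    # Optional field with a default, so values that omit it stay valid.
    supportsThreadpool: bool = False

    @classmethod
    def from_mapping(cls, raw: Mapping[str, Any]) -> "ImageValue":
        """Build a value from one parsed YAML entry, checking names and types."""
        known = {f.name for f in fields(cls)}
        unknown = set(raw) - known
        if unknown:
            raise ValueError(f"unknown fields: {sorted(unknown)}")
        value = cls(**raw)  # raises TypeError if a required field is missing
        for f in fields(cls):
            if not isinstance(getattr(value, f.name), f.type):
                raise TypeError(f"{f.name} must be {f.type.__name__}")
        return value


image_value = ImageValue.from_mapping(
    {
        "name": "vllm-v0-20-1-tt-main",
        "imageRef": "icr.io/drl-nextgen/mgazz/vllm:v0.20.1-tt.main",
        "vllmVersion": "0.20.1",
        "supportsThreadpool": True,
    }
)
# Fields are accessed by name rather than by position.
print(image_value.imageRef, image_value.vllmVersion)
```

A real Pydantic model would add type coercion and custom validators on top of this; the sketch only illustrates named, typed fields with defaults.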

Describe alternatives you've considered

Current Workaround: Positional Lists

As implemented in PR #1014, we currently encode multiple attributes as positional elements in a list:

- identifier: "image"
  metadata:
    description: "Docker image with vLLM + terratorch"
  propertyDomain:
    variableType: "CATEGORICAL_VARIABLE_TYPE"
    values:
      - ["icr.io/drl-nextgen/mgazz/vllm:v0.18.0-tt.v1.2.5", "0.18.0"]
      - ["icr.io/drl-nextgen/mgazz/vllm:v0.20.1-tt.main", "0.20.1"]

Actuator code must then extract values positionally:
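
A minimal sketch of that positional extraction, assuming each categorical value arrives as a two-element list in the order shown above. The function name `unpack_image_value` is hypothetical and is not taken from PR #1014.

```python
from typing import List, Tuple


def unpack_image_value(image_value: List[str]) -> Tuple[str, str]:
    """Split a positional image value into (image reference, vLLM version)."""
    # Position 0 is the image reference and position 1 is the vLLM version.
    # Nothing in the data itself records this convention.
    image_ref = image_value[0]
    vllm_version = image_value[1]
    return image_ref, vllm_version


value = ["icr.io/drl-nextgen/mgazz/vllm:v0.18.0-tt.v1.2.5", "0.18.0"]
image_ref, vllm_version = unpack_image_value(value)

# Swapping the two elements is not detected: the version string silently
# ends up where the image reference is expected.
swapped_ref, swapped_version = unpack_image_value(list(reversed(value)))
```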

Critical Design Consideration: Validation Without Actuator

Challenge: A discovery space with custom structured properties might be stored in the database, but the actuator that defines those structures might not be installed in the current ado environment. This creates a validation and portability problem.

Proposed Solutions:

Option A: Embedded Schema

  • Store the Pydantic JSON schema alongside the structured values in the discovery space
  • When the actuator is present: validate against the actuator's current schema
  • When the actuator is absent: validate against the embedded schema and issue a warning
  • Critical Limitation: JSON schema cannot capture custom validators (e.g., "image must be from quay.io or docker.io"). Custom validation logic in Pydantic validators, field validators, or model validators cannot be serialized.
  • Pros: Basic type validation without actuator
  • Cons:
    • Incomplete validation (missing custom business logic)
    • Schema duplication
    • False sense of security (passes basic validation but might fail actuator's custom rules)
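
The limitation of Option A can be illustrated with a small sketch. It does not use Pydantic or a real JSON Schema validator: `EMBEDDED_SCHEMA` is a hypothetical, type-only stand-in for what an embedded schema can express, and the registry rule mirrors the "quay.io or docker.io" example above. The image reference under `example.com` is made up for illustration.

```python
from typing import Any, Mapping

# Hypothetical embedded schema: field name -> expected type.
# This mimics the type-level information a serialized schema can carry.
EMBEDDED_SCHEMA = {"imageRef": str, "vllmVersion": str}

# Custom rule that exists only in actuator code and is not serialized.
ALLOWED_REGISTRIES = ("quay.io/", "docker.io/")


def passes_embedded_schema(value: Mapping[str, Any]) -> bool:
    """Type-only check that is possible without the actuator installed."""
    return all(
        isinstance(value.get(field), expected)
        for field, expected in EMBEDDED_SCHEMA.items()
    )


def passes_actuator_rules(value: Mapping[str, Any]) -> bool:
    """Type check plus the custom registry rule."""
    return passes_embedded_schema(value) and value["imageRef"].startswith(
        ALLOWED_REGISTRIES
    )


value = {"imageRef": "example.com/team/vllm:v0.18.0", "vllmVersion": "0.18.0"}
schema_ok = passes_embedded_schema(value)   # types are correct
actuator_ok = passes_actuator_rules(value)  # registry rule is violated
```

Here the value passes the embedded check but fails the actuator's rule, which is the "false sense of security" case listed under Cons.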

Option B: Graceful Degradation

  • When actuator is absent, treat structured values as opaque dictionaries
  • No validation, but preserve structure for later use
  • Issue clear warnings that validation cannot be performed without actuator
  • Pros:
    • Simple and honest about limitations
    • No false sense of security
    • Preserves data for later validation
  • Cons:
    • No validation without actuator
    • Potential for invalid data to persist

I propose combining Option A and Option B to tackle this problem.
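
One possible reading of this combination, sketched under assumptions: use the actuator's validation when it is installed, fall back to the embedded schema with a warning, and otherwise keep the value as an opaque dictionary with a warning. The function `validate_structured_value`, its parameters, and the returned mode strings are hypothetical and not part of ado.

```python
import warnings
from typing import Any, Callable, Mapping, Optional

# A validator raises an exception when the value is invalid.
Validator = Callable[[Mapping[str, Any]], None]


def validate_structured_value(
    value: Mapping[str, Any],
    actuator_validator: Optional[Validator] = None,
    embedded_schema: Optional[Mapping[str, type]] = None,
) -> str:
    """Run the strongest available check and report which one was used."""
    if actuator_validator is not None:
        # Actuator installed: full validation, including custom rules.
        actuator_validator(value)
        return "actuator"
    if embedded_schema is not None:
        # Option A fallback: type-only check; every schema field is
        # treated as required in this sketch.
        for field, expected in embedded_schema.items():
            if not isinstance(value.get(field), expected):
                raise TypeError(f"{field} must be {expected.__name__}")
        warnings.warn(
            "actuator not installed: only embedded schema types were checked"
        )
        return "embedded-schema"
    # Option B fallback: no validation, structure preserved as-is.
    warnings.warn(
        "actuator not installed and no embedded schema: value kept as opaque dict"
    )
    return "opaque"
```

Returning the mode lets callers record how strongly a stored value was checked, so it can be revalidated once the actuator becomes available.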
