Fix ( colang v2 serialization ): prevent Unknown d_type by encoding non-registered dataclasses as dicts #1429
+49
−6
Summary
colang/v2_x state serialization was type-tagging all dataclasses (e.g., {"__type":"Foo","value":...}), which breaks decoding for classes outside Guardrails’ known types (name_to_class is built from colang_ast + flows). When such JSON is round-tripped via json_to_state, decoding raises Unknown d_type: Foo.

This PR keeps type tags for known Guardrails classes (so they still round-trip as structured objects) and encodes unknown dataclasses as plain dicts ({"__type":"dict","value":...}), ensuring robust decoding for user-land or third-party dataclasses. Adds a unit test to prevent regressions.

What’s affected (scope in the framework)
Module: nemoguardrails/colang/v2_x/runtime/serialization.py
encode_to_dict (encoding path): changed
decode_from_dict (decoding path): unchanged, but now protected from unknown dataclass tags
state_to_json / json_to_state: behavior preserved; round-trip is more resilient
Runtime surfaces that rely on state JSON:
State JSON

Changes
Encoding rule for dataclasses:
If type(obj).__name__ is in name_to_class (i.e., Guardrails’ own Colang/flows types), retain the type tag ({"__type":"ClassName","value":...}) to enable full object reconstruction.
If type(obj).__name__ is not in name_to_class (unknown/user-land dataclass), encode as a dict ({"__type":"dict","value":...}) to avoid Unknown d_type on decode.
Tests:
tests/test_serialization_dataclass.py ensures an unknown dataclass is encoded as a dict payload and decodes safely.

Rationale
Previously, an unknown dataclass was serialized with its own class tag, e.g. {"__type":"CustomClass"}, which decode_from_dict cannot map back (since name_to_class is limited), causing hard failures when logs are reloaded or states are restored.

Testing
Unit test (new):
python -m pytest tests/v2_x/test_serialization_dataclass.py -q
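
For reviewers who want to see the rule in isolation, here is a minimal, self-contained sketch of the behavior described above. It is not the code in serialization.py: the toy name_to_class registry, the KnownSpec and CustomClass dataclasses, and the simplified encode_to_dict / decode_from_dict bodies are all assumptions made for illustration, and only the tagging rule and the error message mirror this PR.

```python
import json
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Dict


@dataclass
class KnownSpec:
    """Hypothetical stand-in for a registered Guardrails type."""
    name: str


@dataclass
class CustomClass:
    """Hypothetical user-land dataclass that is NOT registered."""
    value: int


# Toy registry (assumption); the real mapping is built from colang_ast + flows.
name_to_class: Dict[str, type] = {"KnownSpec": KnownSpec}


def encode_to_dict(obj: Any) -> Any:
    """Simplified encoder: tag known dataclasses, downgrade unknown ones to dict."""
    if is_dataclass(obj) and not isinstance(obj, type):
        payload = {f.name: encode_to_dict(getattr(obj, f.name)) for f in fields(obj)}
        cls_name = type(obj).__name__
        tag = cls_name if cls_name in name_to_class else "dict"
        return {"__type": tag, "value": payload}
    if isinstance(obj, dict):
        return {"__type": "dict", "value": {k: encode_to_dict(v) for k, v in obj.items()}}
    return obj  # primitives pass through unchanged in this sketch


def decode_from_dict(data: Any) -> Any:
    """Simplified decoder: rebuilds dicts and registered classes only."""
    if isinstance(data, dict) and "__type" in data:
        d_type, value = data["__type"], data["value"]
        if d_type == "dict":
            return {k: decode_from_dict(v) for k, v in value.items()}
        if d_type in name_to_class:
            kwargs = {k: decode_from_dict(v) for k, v in value.items()}
            return name_to_class[d_type](**kwargs)
        raise ValueError(f"Unknown d_type: {d_type}")
    return data


# Round-trip through JSON, as state_to_json / json_to_state would.
encoded = encode_to_dict({"known": KnownSpec("a"), "custom": CustomClass(3)})
restored = decode_from_dict(json.loads(json.dumps(encoded)))
```

In this sketch the registered class comes back as a KnownSpec instance, while the unregistered CustomClass comes back as a plain dict of its fields. Feeding the decoder a payload tagged "CustomClass" directly still raises Unknown d_type, which is the pre-fix failure mode.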
Backward compatibility & risk
Known Guardrails types keep their type tags. Unknown dataclasses are emitted in the dict form ({"__type":"dict","value":...}), so downstream consumers that already tolerate dict-encoded nodes remain compatible.

Developer notes
name_to_class is populated from colang_ast_module and flows_module. The new rule relies solely on that mapping to decide when to keep a class tag vs. downgrade to dict.

Links