refactor: distinguish between init and attribute types in testing state classes#2331

Open
tonyandrewmeyer wants to merge 6 commits into canonical:main from tonyandrewmeyer:scenario-state-init

@tonyandrewmeyer
Collaborator

@tonyandrewmeyer tonyandrewmeyer commented Feb 17, 2026

When designing the Scenario 7 API we introduced keyword-only arguments, and originally had a custom __init__ for each state class to support that. We decided to change that because it felt busy and like a lot of maintenance.

However, we currently have an unfortunate mismatch between some of the types accepted to create an instance of a state class and the type the corresponding attribute will be. For example, the init might accept any Mapping but we know the attribute will always be a dict. It would be nice to provide that information to users.

Now that we are using Python 3.10+, we do have some classes without this issue that can continue to use the dataclass-generated __init__. However, there are many that would be better with an explicit __init__, and I am not convinced it's too much work to maintain.

We opened the door to this in #2274, which adjusted CheckInfo. This PR applies the same improvement to the rest of the state classes.

Fixes #2152

Contributor

@james-garner-canonical james-garner-canonical left a comment

I'm a big fan of making this change, thanks for taking care of this. I have a number of suggestions around typing and defaults, which I've made on individual lines, though they typically apply to more lines across the PR -- but I figured I'd keep the comments fewer than they'd otherwise be ...

Do you think this change warrants some additional unit tests, or are you happy that the existing tests would catch any errors in this PR? If the latter, please mention the relevant test suites.

object.__setattr__(self, 'rotate', rotate)
object.__setattr__(self, '_tracked_revision', _tracked_revision)
object.__setattr__(self, '_latest_revision', _latest_revision)
_deepcopy_mutable_fields(self)
Contributor

WDYT about inlining the deepcopy calls above? Looks like it would just be tracked_content, latest_content, and remote_grants.

I'm OK with leaving it as-is to keep the PR simpler if that's your preference.

Collaborator Author

I'd prefer to leave it as-is, to keep the PR simpler and to keep the large comment that explains things.

Contributor

I'm increasingly skeptical of this. It would now only matter for tracked_content and latest_content if we make the remote_grants arg's values Iterable[str]. The idea of a one-liner that makes everything safe is nice, but I think the abstraction makes this a lot less clear, and it would be much better to have things inlined.

For example, I was surprised to notice just now that _deepcopy_mutable_fields only copies dict and list, not the other built-in mutable collection, set (which I think would ideally be made a frozenset in that method, but probably can't be right now in the general case for backwards compatibility, with the same reasoning that it doesn't convert list to tuple).

My point is, you have to read it to check that we're making things immutable correctly anyway, so it doesn't actually make this simpler for readers.
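
A small sketch of the gap described here. The real `_deepcopy_mutable_fields` is not shown in this thread, so `deepcopy_dicts_and_lists` below is a hypothetical approximation based only on the description above (it copies `dict` and `list` fields and nothing else), and `Example` is an illustrative class.

```python
import copy
import dataclasses


def deepcopy_dicts_and_lists(obj):
    """Hypothetical helper: deep-copy only the dict and list fields of obj."""
    for field in dataclasses.fields(obj):
        value = getattr(obj, field.name)
        if isinstance(value, (dict, list)):
            # object.__setattr__ bypasses the frozen dataclass check.
            object.__setattr__(obj, field.name, copy.deepcopy(value))


@dataclasses.dataclass(frozen=True)
class Example:
    names: list
    tags: set


names, tags = ['a'], {'x'}
example = Example(names=names, tags=tags)
deepcopy_dicts_and_lists(example)

assert example.names is not names  # the list was copied
assert example.tags is tags  # the set is still shared with the caller
tags.add('y')
assert example.tags == {'x', 'y'}  # so a caller's mutation leaks in
```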

Contributor

And we don't actually need deep copies for tracked_content and latest_content since they're flat Mapping[str, str], so we can just call dict in-line. This would be consistent with what we're now doing in Network.__init__.
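
As a quick illustration of why a shallow copy is enough for a flat mapping of strings but not for nested values, here is a sketch using plain dicts (the keys and values are made up for the example):

```python
import copy

# Flat Mapping[str, str]: the values are immutable, so dict() isolates the copy.
flat = {'username': 'admin'}
flat_copy = dict(flat)
flat_copy['username'] = 'changed'
assert flat['username'] == 'admin'

# Nested values: dict() copies only the outer mapping.
nested = {'grants': ['app']}
shallow = dict(nested)
shallow['grants'].append('other')
assert nested['grants'] == ['app', 'other']  # the inner list is shared

# copy.deepcopy copies the inner list as well.
deep = copy.deepcopy(nested)
deep['grants'].append('third')
assert nested['grants'] == ['app', 'other']
```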

@tonyandrewmeyer
Collaborator Author

> Do you think this change warrants some additional unit tests, or are you happy that the existing tests would catch any errors in this PR? If the latter, please mention the relevant test suites.

In terms of not breaking the __init__ signature, I feel like the tests from here downwards should cover that (right number of positional/keyword-only arguments in particular), and they should also cover the behaviour of __init__ in terms of forcing immutability (same file, the tests following on from the previous ones). So I feel comfortable that any regressions in this PR would be caught.

In terms of tests for the changes, I'm not super keen on having tests like:

c = CloudCredential(auth_type="foo", redacted=['a', 'b', 'c'])
assert isinstance(c.redacted, list)

I know we have some tests where we expect pyright to find issues, but I'm not sure it's the right move to add something like that for this either.

Do you have any suggestions in terms of tests?

@james-garner-canonical
Contributor

> > Do you think this change warrants some additional unit tests, or are you happy that the existing tests would catch any errors in this PR? If the latter, please mention the relevant test suites.
>
> In terms of not breaking the __init__ signature, I feel like the tests from here downwards should cover that (right number of positional/keyword-only arguments in particular), and they should also cover the behaviour of __init__ in terms of forcing immutability (same file, the tests following on from the previous ones). So I feel comfortable that any regressions in this PR would be caught.

Nice!

> In terms of tests for the changes, I'm not super keen on having tests like:
>
> c = CloudCredential(auth_type="foo", redacted=['a', 'b', 'c'])
> assert isinstance(c.redacted, list)
>
> I know we have some tests where we expect pyright to find issues, but I'm not sure it's the right move to add something like that for this either.
>
> Do you have any suggestions in terms of tests?

I wouldn't mind seeing tests a bit like the one you're not keen on, explicitly encoding (from a user perspective) the type conversion and copying behaviour that we're implementing.

redacted = ['a', 'b', 'c']
...
c = CloudCredential(auth_type="foo", redacted=redacted, ...)
assert isinstance(c.redacted, list)  # or tuple if we go that way
assert c.redacted == redacted
assert c.redacted is not redacted
...

The existing tests probably do cover a lot of this, but a lot of them are hard to follow at a glance due to the parametrization and abstraction.

Contributor

@dimaqq dimaqq left a comment

I kinda like this.
If it works, it's fair to merge :)

Happy to leave the details to James.

@tonyandrewmeyer
Collaborator Author

@james-garner-canonical brought up the excellent point that these are frozen dataclasses that we want people to treat as immutable. So giving their type checker information that they have a list (rather than an immutable Sequence) or a dict (rather than an immutable Mapping) leads them to where we don't want to go.

So rejecting this instead.

@tonyandrewmeyer tonyandrewmeyer deleted the scenario-state-init branch February 19, 2026 23:26
@tonyandrewmeyer tonyandrewmeyer restored the scenario-state-init branch March 21, 2026 08:20
@tonyandrewmeyer tonyandrewmeyer marked this pull request as draft March 21, 2026 08:20
@tonyandrewmeyer tonyandrewmeyer marked this pull request as ready for review March 26, 2026 21:48
@tonyandrewmeyer
Collaborator Author

@james-garner-canonical I've adjusted per the discussion we had earlier in the week, and this should be good for reviewing again now, thanks!

Contributor

@james-garner-canonical james-garner-canonical left a comment

I really like the direction here, and I definitely think it's worth making these changes to decouple the __init__ argument typing from the attribute typing.

I've flagged a number of items that I think require some further thought before merging.

"""

- remote_grants: Mapping[int, set[str]] = dataclasses.field(default_factory=dict)
+ remote_grants: Mapping[int, set[str]]
Contributor

I think this would ideally be frozenset[str] for immutability, but that would be a backwards incompatible change.

Suggested change
- remote_grants: Mapping[int, set[str]]
+ remote_grants: Mapping[int, set[str]]  # ideally frozenset[str] but set[str] for backwards compatibility

And/or add this to the list of things to fix in the next breaking release?

Contributor

Though it would be neat if we could get away with making it Collection[str] in typing and frozenset at runtime without breaking anyone ...
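
A rough sketch of which usage would and would not survive such a change, using plain built-in types only (the grant names are made up):

```python
from collections.abc import Collection

runtime_value = frozenset({'app-a', 'app-b'})

# Read-only usage keeps working if a set is replaced by a frozenset.
assert isinstance(runtime_value, Collection)
assert runtime_value == {'app-a', 'app-b'}  # equal to a set with the same items
assert 'app-a' in runtime_value
assert len(runtime_value) == 2
assert sorted(runtime_value) == ['app-a', 'app-b']

# Code that depends on the concrete type or on mutation would break.
assert not isinstance(runtime_value, set)
assert not hasattr(runtime_value, 'add')
```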

latest_content: RawSecretRevisionContents | None = None,
id: str | None = None,
owner: Literal['unit', 'app'] | None = None,
remote_grants: Mapping[int, set[str]] = {},
Contributor

We shouldn't enforce set[str] for the init argument IMO, since we'd really like it to be frozenset on the attribute in future.

Suggested change
- remote_grants: Mapping[int, set[str]] = {},
+ remote_grants: Mapping[int, Iterable[str]] = {},

This would require us to change how we set the remote_grants attribute below, like this:

        object.__setattr__(self, 'remote_grants', {k: set(v) for k, v in remote_grants.items()})
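
The conversion in that line can be tried in isolation. `copy_remote_grants` is a hypothetical helper name used only for this sketch; it wraps the same dict comprehension.

```python
from collections.abc import Iterable, Mapping


def copy_remote_grants(
    remote_grants: Mapping[int, Iterable[str]],
) -> dict[int, set[str]]:
    """Hypothetical helper: build a new dict whose values are new sets."""
    return {k: set(v) for k, v in remote_grants.items()}


# Lists, tuples, and sets are all accepted as values.
original = {1: ['app', 'app'], 2: ('other',), 3: {'third'}}
grants = copy_remote_grants(original)

assert grants == {1: {'app'}, 2: {'other'}, 3: {'third'}}
# Even a value that was already a set is copied, not shared.
assert grants[3] is not original[3]
```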

)
object.__setattr__(self, 'id', id if id is not None else _generate_secret_id())
object.__setattr__(self, 'owner', owner)
object.__setattr__(self, 'remote_grants', dict(remote_grants))
Contributor

Suggested change
- object.__setattr__(self, 'remote_grants', dict(remote_grants))
+ # Ideally we'd use frozenset(v) but changing from set would be backwards incompatible.
+ object.__setattr__(self, 'remote_grants', {k: set(v) for k, v in remote_grants.items()})

object.__setattr__(self, 'service_statuses', dict(service_statuses))
object.__setattr__(self, 'mounts', dict(mounts))
object.__setattr__(self, 'execs', frozenset(execs))
object.__setattr__(self, 'notices', list(notices))
Contributor

We're taking our typing cue from the original default factory here. What's not immediately clear to me is whether this is a case like the other list ones where we typed as Sequence but really want to guarantee list (at least for equality comparisons?), so a local comment would be nice IMO.
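
On the equality point: a list and a tuple with the same items do not compare equal, so two otherwise identical dataclass instances differ if one stores a list and the other a tuple. A minimal sketch, with `ExampleContainer` as a hypothetical stand-in rather than the real Container:

```python
import dataclasses
from collections.abc import Sequence


@dataclasses.dataclass(frozen=True)
class ExampleContainer:
    """Hypothetical stand-in with a Sequence-typed field and no conversion."""

    notices: Sequence[str]


assert ['a'] != ('a',)

# Without conversion, the input's concrete type leaks into equality.
assert ExampleContainer(notices=['a']) != ExampleContainer(notices=('a',))

# Converting to list first makes the comparison independent of the input type.
assert ExampleContainer(notices=list(('a',))) == ExampleContainer(notices=['a'])
```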

def __post_init__(self):
    if not isinstance(self.execs, frozenset):
        # Allow passing a regular set (or other iterable) of Execs.
        object.__setattr__(self, 'execs', frozenset(self.execs))
Contributor

Interestingly it looks like we previously didn't convert check_infos huh?

Comment on lines +1519 to +1521
object.__setattr__(self, 'content', dict(content))
object.__setattr__(self, '_data_type_name', _data_type_name)
_deepcopy_mutable_fields(self)
Contributor

Suggested change
- object.__setattr__(self, 'content', dict(content))
- object.__setattr__(self, '_data_type_name', _data_type_name)
- _deepcopy_mutable_fields(self)
+ object.__setattr__(self, 'content', copy.deepcopy(content))
+ object.__setattr__(self, '_data_type_name', _data_type_name)

Comment on lines -1575 to +1698
- relations: Iterable[RelationBase] = dataclasses.field(default_factory=frozenset)
+ relations: frozenset[RelationBase]
Contributor

Should we use Collection instead for all these attributes?

Comment on lines +479 to +523
@pytest.mark.parametrize(
    'component,attribute,expected_type,input_value,required_args',
    [
        # Mapping -> dict
        (CloudCredential, 'attributes', dict, {'a': 'b'}, {'auth_type': 'foo'}),
        (Secret, 'remote_grants', dict, {1: {'app'}}, {'tracked_content': {'k': 'v'}}),
        (Notice, 'last_data', dict, {'k': 'v'}, {'key': 'foo'}),
        (Container, 'layers', dict, {}, {'name': 'foo'}),
        (Container, 'service_statuses', dict, {}, {'name': 'foo'}),
        (Container, 'mounts', dict, {}, {'name': 'foo'}),
        (StoredState, 'content', dict, {'k': 'v'}, {}),
        # Iterable -> list
        (CloudCredential, 'redacted', list, ('a', 'b'), {'auth_type': 'foo'}),
        (CloudSpec, 'ca_certificates', list, ('a', 'b'), {'type': 'foo'}),
        (
            Network,
            'bind_addresses',
            list,
            iter([BindAddress([Address('192.0.2.0')])]),
            {'binding_name': 'foo'},
        ),
        (Network, 'ingress_addresses', list, ('1.2.3.4',), {'binding_name': 'foo'}),
        (Network, 'egress_subnets', list, ('1.2.3.0/24',), {'binding_name': 'foo'}),
        (Container, 'notices', list, (Notice(key='foo'),), {'name': 'foo'}),
        (State, 'deferred', list, (), {}),
        # Iterable -> frozenset
        (Container, 'execs', frozenset, (), {'name': 'foo'}),
        (Container, 'check_infos', frozenset, (), {'name': 'foo'}),
        (State, 'relations', frozenset, (Relation(endpoint='foo'),), {}),
        (State, 'networks', frozenset, (Network(binding_name='foo'),), {}),
        (State, 'containers', frozenset, (Container(name='foo'),), {}),
        (State, 'secrets', frozenset, (Secret(tracked_content={'k': 'v'}),), {}),
        (State, 'stored_states', frozenset, (), {}),
    ],
)
def test_init_converts_to_concrete_type(
    component: type[object],
    attribute: str,
    expected_type: type,
    input_value: Any,
    required_args: dict[str, Any],
):
    """Verify that __init__ converts broader input types to concrete attribute types."""
    obj = component(**required_args, **{attribute: input_value})
    assert isinstance(getattr(obj, attribute), expected_type)
Contributor

I understand the intent here, but reading it breaks my brain a little.

Minor suggestion: reorder params to component, required_args, attribute, input_value, expected_type to follow the logic of the test.

More radical suggestion: have AI unroll this into a series of unparametrised tests like:

def test_cloud_credential_init_converts_args():
    obj = CloudCredential(
        auth_type='foo',  # required
        attributes={'a': 'b'},
        redacted=('a', 'b'),
    )
    assert obj.attributes == {'a': 'b'}
    assert obj.redacted == ['a', 'b']

Development

Successfully merging this pull request may close these issues.

ops[testing] State.config type, should it be dict or Mapping?

3 participants