Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support development of new checks #101

Open
iay opened this issue Nov 20, 2024 · 8 comments
Open

Support development of new checks #101

iay opened this issue Nov 20, 2024 · 8 comments
Assignees

Comments

@iay
Copy link
Member

iay commented Nov 20, 2024

New checks on metadata have historically been developed in a rather ad hoc fashion in a variety of places. One of the goals of the testbed is to provide a single place to do this in a more consistent way. This has been hard to do in the initial testbed setup because the emphasis has been on handling the 0.9/0.10/Xalan/no-Xalan combinations, but once we have progressed through at least part of #100 that will be less of a concern and it's worth coming back to this.

As a basic approach, I think we should assume that new checks are initially developed on a branch in this repository. That means that they can be developed alongside their corresponding tests and, once the check has been implemented in the production deployment, the tests developed here (as well as any required changes to other tests) can be cherry-picked onto the main branch.

(In principle, I'd prefer that to be a merge operation but you'd need to manually reconcile the two versions of the check and that seems like more complication for little benefit. It may be worth revisiting the details once we have worked through a real example or two.)

I'd like this to be compatible with a longer term move towards using policy files (essentially, lists of checks) taken directly from the production deployment, replacing the current setup where we construct such lists locally in a duplicative way which may be error-prone. My thought is therefore that the pipeline configuration is broken up into a couple of components which can be locally overridden in a development branch, along with some conventions as to where to place the development content.

The current scheme has:

  • default-validator.xml only exists in the all overlay, and defines pipeline to include only default_validator_stages.
  • default-validator.xml includes default-validator-stages.xml (again, currently only defined in the all overlay) which defines default_validator_stages as a list of checks. This is the part most likely to be replaced by a reference to a policy file from the production deployment, once we have one which fits the bill.

The simplest way of extending this current system to allow for new check development would, I think, be to change the pipeline definition from calling the single default_validator_stages to calling that and then a second CompositeStage which would contain a list of checks under development. Purely as a straw man, let's call that development_stages.

To provide a clean separation of the under-development checks within the testbed, I think a good approach would be to not define that second list in the all overlay but in a new overlay. A straw man for the overlay name would be development and it would be applied in every validator, probably quite far down the build process if not actually at the end. In the main branch, the overlay would include a single XML configuration defining the CompositeStage bean referenced by pipeline. It would be empty.

In a development branch, you'd put all of the definitions for the new check or checks into an override version of the file defining development_stages within the development overlay. If they are XSL-based checks, the XSL files can live in the appropriate place in the same overlay. The tests associated with the checks live outside the overlay in the usual place (/tests/xml/...). I hope this would make it easier to pull just those into the main branch when the production code picks up the code for the actual check.

(Long term, the production code will also pick up a reference to the new check at this point; shorter term that needs to be added into default_validator_stages manually when the tests are pulled in.)

The development branch is discarded at this point. As I mentioned earlier, in principle I'd prefer a merge here rather than a cherry-pick/delete operation but it's not yet clear to me how to get there from where we currently are. Once we have appropriate policy files we can look at that again.

@philsmart
Copy link
Member

philsmart commented Dec 5, 2024

To check. Once development has finished, (from the development branch) we pull in the tests from /test/xml/..., pull in any XSL files (if there are any) from the development overlay, and then manually update the default_validator_stages list with the new checks (always leaving the default development-stages XML as a template).

@iay
Copy link
Member Author

iay commented Dec 5, 2024

I think that's almost right. Let me clarify the process a little.

To develop a new check, you git checkout -b my-new-check a new topic branch. On that branch, under the validators/overlays/development/classes directory, you:

  • Add any new XSL files required
  • Add bean definitions for those in development-stages.xml
  • Add references to the new beans in the stages list in the development_stages bean in development-stages.xml

You also add new checks under tests/xml somewhere.

That's four buckets of changes and they have different dispositions when the new check is integrated into the production deployment:

  • The new XSL files are copied upstream under mdx/_rules or perhaps to mdx/uk. This appears in ukf-meta's main branch. It gets into ukf-testbed when the prod overlay is re-imported.
  • The new bean definitions are copied upstream, probably into mdx/validation-beans.xml. This appears in ukf-meta's main branch. It gets into ukf-testbed when the prod overlay is re-imported.
  • The references to the new check beans are:
    • Added in appropriate places upstream so that they are invoked when we want them to be. This appears in ukf-meta's main branch. As things stand, although these will appear in ukf-testbed when the prod overlay is re-imported, we won't make use of that. [X]
    • Added into the stages list in the default_validator_stages bean in validators/overlays/all/classes/default-validator-stages.xml. This is a change that needs to make it into the ukf-testbed's main branch.
  • The new checks need to make it across to the tests/xml tree in ukf-testbed's main branch.

This probably involves two new topic branches as an intermediate state: one on ukf-meta and another in ukf-testbed.

This is way more complicated than I'd like to see longer term, but it's probably as good as we can do until we have isolated the checking policy (list of stages) in ukf-meta. Then [X] above would become unnecessary.

Medium term, we might also be able to come up with a script which collapses the my-new-check branch into "just the tests" by rebasing out the other changes to reduce the amount of manual work in pulling those over.

@philsmart
Copy link
Member

philsmart commented Dec 5, 2024

Ideal, thanks. I forgot to include the ukf-meta updates (which is the whole point!); this makes it concrete. I should go through this as a test. I'll approve PR #104 so we are in the right state to test it.

@alexstuart
Copy link
Member

Ideal, thanks. I forgot to include the ukf-meta updates (which is the whole point!); this makes it concrete. I should go through this as a test. I'll approve PR #104 so we are in the right state to test it.

If you need a new check to go through the process, I suggest Add metadata checks for Pseudonymous Authorization and Anonymous Authorization entity categories (ukf-meta#388)

@iay
Copy link
Member Author

iay commented Dec 16, 2024

I should go through this as a test.

Phil volunteered, so assigning him for now. Holding this open until we've run at least one development cycle.

@philsmart
Copy link
Member

I'll try that this week.

@philsmart
Copy link
Member

@iay if we are going to add development beans (new or overrides) to development-stages.xml, should we add the p and c namespaces to it? (obviously, we do not need to, it just keeps it consistent with other bean configs).

@iay
Copy link
Member Author

iay commented Dec 20, 2024

Yes, that sounds sensible. There's no downside that I can think of.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants