RFD for Software Identifiers #407

Open
wants to merge 4 commits into
base: develop


@alilleybrinker alilleybrinker commented May 7, 2025

This adds an RFD describing a design for adding support for more software identifier types to the CVE Record Format, including a deep dive on the thinking behind the design.


For QWG members, this is a lightly-edited version of the document which had previously been shared directly with the QWG. Building off of the RFD proposal (#405), this follows the RFD template.

If this gets merged before #405, I ask that this RFD get ID number 0002, so 0001 can be reserved for the RFD process proposal.

This proposal is based on "Option 2" for how to add more identifier types, building on the affected array, and the draft implementation may be seen in a pair of pull requests:

Normally, in a future RFD process, these implementation PRs would not happen until after the RFD was approved by the QWG and the CVE Board. As we're in the process of considering changes to the QWG's process, they have been done out of order here.

There are also leftover pull requests for "Option 1" based on the cpeApplicability object. These are no longer under consideration but are left for reference to prior designs:


This adds an RFD describing a design for adding support for more
software identifier types to the CVE Record Format, including
a deep dive on the thinking behind the design.

Signed-off-by: Andrew Lilley Brinker <[email protected]>
@alilleybrinker alilleybrinker force-pushed the alilleybrinker/software-id-rfd branch from d61f397 to c1b0d8e Compare May 7, 2025 20:28
@alilleybrinker
Author

alilleybrinker commented May 7, 2025

First comment reserved for open questions (will be edited)

  • Whether to vendor the purl specification.
    • Current sense of the group: yes, vendor the spec
  • Whether to vendor the OmniBOR specification.
    • Current sense of the group: yes, vendor the spec
  • Whether to permit the "generic" and "swid" types for purl.
    • Current sense of the group: disallow the "generic" and "swid" types
  • Whether to limit the number of software identifiers in a single CVE record.
    • Current sense of the group: do not limit the number, rely on overall CVE record size limit
  • Whether to permit versions directly in a purl. They are currently allowed in a CPE, but by convention if provided in a CPE then version constraint fields should not be used.
    • Current sense of the group: match existing CPE behavior
  • Whether to use the vers spec for specifying version constraints instead of reusing the cpeMatch version fields.
    • Current sense of the group: do not use the vers spec
  • Whether to add more software identifiers to the affected array rather than modifying the cpeApplicability object.
    • Current sense of the group: use the affected array

Issues considered resolved are marked with a checkmark. If you believe an issue is not resolved, please raise it in a comment below.

@alilleybrinker alilleybrinker changed the title feat: Add RFD for Software Identifiers RFD for Software Identifiers May 7, 2025
alilleybrinker added a commit to alilleybrinker/cve-schema that referenced this pull request May 9, 2025
The `affected` array is an array containing `product` objects, which
must at minimum include an "identifier" (which may be a composite
identifier composed of multiple fields) along with a set of version
bounds or a default status. Products may also specify an assortment
of additional fields which further constrain the applicability of the
CVE to its intended target hardware or software.

Previously, the set of identifiers available were:

- A `vendor` and `product`
- A `collectionURL` and `packageName`

This commit adds support for a new identifier, called `packageURL`,
which uses the purl (Package URL) specification. The contents of the
commit add this as a new field on the `product` type, with a description
and examples, and also update the data constraints on the `product`
type, both to make `packageURL` an option to fulfill the identifier
requirement already in place on the type, and to ensure that the new
`packageURL` field is not mixed with the existing `collectionURL` or
`packageName` fields, as they are redundant with `packageURL` and
including both increases the possibility of data inconsistency within
a single CVE record.
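
The constraints described in this commit message can be sketched in Python as follows. This is only an illustrative check, not the schema change itself: the function name `check_product_identifiers` is hypothetical, and the `versions` and `defaultStatus` keys are assumed to be the fields that carry the "version bounds" and "default status" mentioned above.

```python
def check_product_identifiers(product: dict) -> list[str]:
    """Return constraint violations for one `product` object (illustrative only)."""
    problems = []

    # The two pre-existing composite identifiers, plus the new single-field one.
    has_vendor_product = "vendor" in product and "product" in product
    has_collection_package = "collectionURL" in product and "packageName" in product
    has_purl = "packageURL" in product

    # At least one identifier must be present.
    if not (has_vendor_product or has_collection_package or has_purl):
        problems.append("no identifier present")

    # packageURL is redundant with collectionURL/packageName, so they must not mix.
    if has_purl and ("collectionURL" in product or "packageName" in product):
        problems.append("packageURL must not be mixed with collectionURL/packageName")

    # Assumed field names for "version bounds or a default status".
    if "versions" not in product and "defaultStatus" not in product:
        problems.append("needs version bounds or a default status")

    return problems
```

For example, a product carrying both `packageURL` and `packageName` would be reported as a conflict, while one carrying only `packageURL` and `defaultStatus` would pass.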

This inclusion of a new `packageURL` type which can be used instead of
the existing pair of `collectionURL` and `packageName` would require
consumers of CVE records to update their logic both to accept the new
field, and to use it in places where they may today use the pair of
`collectionURL` and `packageName`.

This commit does not include a regular expression to parse Package URLs
specifically. Rather, it reuses the existing `uriType` schema. So we
can be sure after validating CVE records against this updated record
format that the `packageURL` field is a URL, but not that it is a valid
Package URL per the Package URL specification. It would be the
responsibility of CVE Services to further validate the field to ensure
values match the Package URL specification. We do not perform this
validation in-schema due to the complexity of expressing the validation
in the form of a regular expression.
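
As a rough illustration of the kind of out-of-schema check this paragraph leaves to CVE Services, the sketch below tests whether a string has the overall shape of a Package URL. It is an assumption-laden approximation, not the Package URL specification's parsing algorithm and not anything CVE Services is known to implement: the function name `looks_like_purl` is hypothetical, and the exclusion of the "generic" and "swid" types simply mirrors the group's current sense recorded in the open questions above.

```python
import re

# Types the QWG currently leans toward disallowing (per the open questions list).
DISALLOWED_TYPES = {"generic", "swid"}

# Approximation of a purl type: starts with a letter, then letters, digits, ".", "+", "-".
_TYPE_RE = re.compile(r"[A-Za-z][A-Za-z0-9.+-]*")


def looks_like_purl(value: str) -> bool:
    """Shallow shape check for pkg:type/namespace/name@version?qualifiers#subpath."""
    scheme, sep, rest = value.partition(":")
    if sep != ":" or scheme.lower() != "pkg":
        return False

    # Drop the optional subpath and qualifiers, then any leading slashes.
    rest = rest.split("#", 1)[0].split("?", 1)[0].lstrip("/")

    purl_type, sep, remainder = rest.partition("/")
    if sep != "/" or not _TYPE_RE.fullmatch(purl_type):
        return False
    if purl_type.lower() in DISALLOWED_TYPES:
        return False

    # Drop the optional version, then require a non-empty final name segment.
    name_path = remainder.rsplit("@", 1)[0]
    name = name_path.strip("/").rsplit("/", 1)[-1]
    return bool(name)
```

A string such as `https://example.com/pkg` passes a plain URI check but fails here, which is the gap between `uriType` validation and purl validation that the commit message describes.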

This work is submitted as an alternative formulation of the design
proposed in the draft RFD on software identifiers [1], and as an
alternative to the existing proposals for making the `cpeApplicability`
structure generic [2] (instead of it being CPE-specific) and enhancing
this new generic applicability structure with support for Package
URLs [3].

If this change is accepted, then [2] and [3] should not be accepted.

[1]: CVEProject#407
[2]: CVEProject#391
[3]: CVEProject#397

Signed-off-by: Andrew Lilley Brinker <[email protected]>
alilleybrinker added a commit to alilleybrinker/cve-schema that referenced this pull request May 9, 2025
The `affected` array is an array containing `product` objects, which
must at minimum include an "identifier" (which may be a composite
identifier composed of multiple fields) along with a set of version
bounds or a default status. Products may also specify an assortment
of additional fields which further constrain the applicability of the
CVE to its intended target hardware or software.

Previously, the set of identifiers available were:

- A `vendor` and `product`
- A `collectionURL` and `packageName`

This commit adds support for a new pair of fields to support
using OmniBOR Artifact IDs as identifiers in the `affected` array:

- `artifactID`: The OmniBOR Artifact ID for an artifact.
- `artifactType`: An enum indicating whether the `artifactID` is for
  an artifact to search in a file system for, or whether it's a
  build input to search against OmniBOR Input Manifests.

The commit also adds data constraints to ensure this new identifier
pair is not used alongside fields that don't make sense to use with
OmniBOR, including the other identifier schemes, further decomposition
information like `programFiles` or `programRoutines`, and version
information.
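
A minimal Python sketch of the exclusion rules described above follows. It is an illustration under stated assumptions rather than the schema itself: the function name `check_omnibor_fields` is hypothetical, `versions` and `defaultStatus` are assumed to be the "version information" fields, and requiring `artifactID` and `artifactType` to appear together is an assumption drawn from the commit calling them a pair. The allowed values of `artifactType` are not listed in this message, so they are not checked.

```python
# Fields the commit says should not accompany the OmniBOR identifier pair.
CONFLICTING_FIELDS = (
    "vendor", "product",             # existing identifier scheme
    "collectionURL", "packageName",  # existing identifier scheme
    "packageURL",                    # proposed purl identifier
    "programFiles", "programRoutines",
    "versions", "defaultStatus",     # assumed names for version information
)


def check_omnibor_fields(product: dict) -> list[str]:
    """Return constraint violations related to the OmniBOR fields (illustrative only)."""
    problems = []
    has_id = "artifactID" in product
    has_type = "artifactType" in product

    # Assumption: the two fields are only meaningful as a pair.
    if has_id != has_type:
        problems.append("artifactID and artifactType must appear together")

    # If either OmniBOR field is present, none of the conflicting fields may be.
    if has_id or has_type:
        for field in CONFLICTING_FIELDS:
            if field in product:
                problems.append(f"{field} must not be combined with OmniBOR fields")

    return problems
```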

This work is submitted as an alternative formulation of the design
proposed in the draft RFD on software identifiers [1], and as an
alternative to the existing proposals for making the `cpeApplicability`
structure generic [2] (instead of it being CPE-specific) and enhancing
this new generic applicability structure with support for OmniBOR
Artifact IDs [3].

If this change is accepted, then [2] and [3] should not be accepted.

[1]: CVEProject#407
[2]: CVEProject#391
[3]: CVEProject#396

Signed-off-by: Andrew Lilley Brinker <[email protected]>
@alilleybrinker
Author

I've now opened PRs reflecting an alternate design to the one proposed in this RFD, #409 and #410. If the QWG's consensus is to advance with those designs, I will close this RFD PR and open a new PR with an alternate RFD describing those designs in detail.

@darakian

darakian commented May 29, 2025

Having both of these approaches open may not have been the best idea (for me anyway 😄 ). I've added my comment about how to encode version ranges in the other PR, though I suspect it would be relevant here as well.

I will also be totally honest that I have not dug into OmniBOR yet. I've been stuck in a loop where a QWG meeting happens and OmniBOR is mentioned, I think "Oh shoot, I forgot to read up on that," I take a quick look at it and get intimidated by the spec, put it down, and swear that I'll get back to it tomorrow, the next day, the day after that, etc., and then there's another QWG. I won't be read up on OmniBOR by tomorrow's QWG, but please yell at me if I'm not by next week's meeting 👍

So, with that personal failing out of the way, I'll continue to be self-centered and pose the question: would it make sense to drop both purl and OmniBOR from this RFD and discuss the idea of how we should consider software identifiers more generally? I've been waffling on whether or not to propose a set of what I'll call the identifiers that GitHub uses, which are similar to a subset of purls. I lean more toward not at this point, but I do think we should consider the generic question of how we assess a new identifier should Omnibor2, cpeButBetterThisTime, or whatever else come along. If nothing else, this vetting process could be used as a way to provide feedback upstream. Success metrics could also be easier to consider if the RFD is broken up. That said, I'm happy to hear pushback on that too, as there really are not many software identifiers worth considering today and it may just be easier to have the full conversation all at once.


With respect to this RFD as is: I am on board with the general direction of it. I think I understand the synonym problem to be one that is more operational than schema design, and I would suggest that allowing for more methods to capture affected product information may enable/embolden more CNAs to publish affected product information, period. In my opinion, an indicator of success could be a rise in the proportion of CVEs published with valid affected product information populated. This could also be broken out per identifier type (and for the structure itself) to indicate success for each.
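
One way the proportion suggested here could be computed is sketched below in Python. This is a rough illustration, not an agreed metric: the function name `identifier_uptake` is hypothetical, the `containers.cna.affected` path is assumed to be where a record's `affected` array lives, and "populated" is simplified to mean that a field is present at all (no validity check is attempted). `vendor` and `collectionURL` stand in for their respective identifier pairs.

```python
from collections import Counter

# One marker field per identifier scheme (assumed field names).
ID_FIELDS = ("vendor", "collectionURL", "packageURL", "artifactID")


def identifier_uptake(records: list[dict]) -> dict[str, float]:
    """Fraction of records using each identifier field, plus "any" for at least one."""
    counts = Counter()
    for record in records:
        affected = record.get("containers", {}).get("cna", {}).get("affected", [])
        # Collect the identifier fields that appear on any product in this record.
        present = {field for product in affected for field in ID_FIELDS if field in product}
        counts.update(present)
        if present:
            counts["any"] += 1

    total = len(records)
    if total == 0:
        return {}
    return {key: counts[key] / total for key in (*ID_FIELDS, "any")}
```

Running this over snapshots of the record corpus taken before and after adoption would give the per-identifier breakdown described above.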

On the Related Issues or Proposals section: this could be a case of "yes, and." CPE has its issues today and is currently bottlenecked with NIST as a centralized naming authority. I see no reason why CPE supporters couldn't continue development of CPE and federate the namespace on a per-vendor basis or whatever, and have that be compatible with the adoption of purls/OmniBOR IDs. I suspect that orgs will pick whichever namespace fits them and their needs best, and IMO the spec should be equipped to accept whatever high-quality identifiers can be produced.

@alilleybrinker
Author

@darakian, for reading up on OmniBOR, the project website has a more accessible introduction to how the identifiers work: https://omnibor.io/docs/artifact-ids/

As for the question of whether to split this RFD into parts: one advancing a general set of provisions for how new software IDs should be incorporated, and then others advancing specific software IDs to incorporate. My team is open to doing that, but didn't as the initial ask because we were concerned it would be too granular for the QWG. In particular, we felt that offering concrete examples with real-world identifiers helps crystallize understanding of the trade-offs of any particular design in a way that a purely abstract proposal can't.

For the success metric question, I agree there's value in assessing adoption via a statistical analysis of uptake of the new fields or general enrichment of identity information in CVE records, though we didn't choose that as the go-to metric because we wanted to leave room to also consider adoption by CVE consumers, which is fuzzier to measure and thus easier to ignore when assessing success. That said, I'd love to see the kind of analysis you mention done maybe six months after adoption.

Finally, on the related issues, these are listed more to identify problems we are explicitly not solving in the RFD but which the QWG could take up and pursue outside of the RFD. I agree that there's interest in and a need for some form of improvement to CPE, likely via federation, to let it scale beyond the limits of NIST's resources; we just don't solve that question here.

@alilleybrinker
Author

@darakian we could amend the RFD text to include a clear subsection which describes that the design proposed here for adding Package URLs and OmniBOR Artifact Identifiers is intended as a template for addition of any future identifier types, which may help satisfy the desire for a clear reference-point on how to add those types in future proposals, without needing to split the RFD into multiple separate documents.

@darakian

darakian commented Jun 2, 2025

> @darakian, for reading up on OmniBOR, the project website has a more accessible introduction to how the identifiers work: https://omnibor.io/docs/artifact-ids/

Thank you, thank you!

> As for the question of whether to split this RFD into parts: one advancing a general set of provisions for how new software IDs should be incorporated, and then others advancing specific software IDs to incorporate. My team is open to doing that, but didn't as the initial ask because we were concerned it would be too granular for the QWG.

Ya, that's totally fair. I was very much waffling back and forth on whether to raise the issue or not.

> @darakian we could amend the RFD text to include a clear subsection which describes that the design proposed here for adding Package URLs and OmniBOR Artifact Identifiers is intended as a template for addition of any future identifier types, which may help satisfy the desire for a clear reference-point on how to add those types in future proposals, without needing to split the RFD into multiple separate documents.

That could work for sure. 👍

This rewrites the core content of the RFD to base the
proposed new fields on the `affected` array instead of basing
them on the `cpeApplicability` object as the prior version of
the RFD did. The motivation and outcomes are generally unchanged,
but the specifics of the proposed edits are now different.

Signed-off-by: Andrew Lilley Brinker <[email protected]>
@alilleybrinker
Author

@darakian, thanks! I've amended the RFD to be based on the affected array and to include more explicit commentary on how it is intended to function as a template for the inclusion of future identifier types.

@Chris-Turner-NIST

Finally getting a moment to read through this and realized that this may not have been covered in discussions yet...

If the intent is to create more generic places for various identifiers, it would make sense that part of this proposal should include deprecating the existing cpes array and include a new property (cpeMatchString?) that aligns with the approach proposed for PURL and OmniBOR.

I recognize that this would create two locations for CPE related data due to the current support for hasCPEApplicability, however, it would be a step in the right direction of normalizing the current structures and methodologies available within the affected array.

@alilleybrinker
Author

@Chris-Turner-NIST I agree that it would be good both to eventually deprecate the cpes array and to introduce a field for CPEs similar to the support added for OmniBOR and Package URLs in this RFD. However, we purposefully omitted that issue in this RFD for a couple of reasons:

  1. Deprecations are harder to justify, and it would likely take longer to reach consensus on how to handle a deprecation proposal.
  2. The cpeApplicability block, added last year as an NVD-compatible mechanism for adding CPEs which is also semantically clearer than the cpes field in the affected array's product objects, complicates the story around "where CPEs go" in a CVE record. If the cpes field were deprecated and a new cpeMatch (or some other name) field were added in the same object, CNAs would still be presented with three places to put CPEs, one of them deprecated.

All this to say, I fully endorse improving and simplifying handling of CPEs in the record format, and my personal preference is to do exactly what you propose. I just think it would make the most sense in a follow-up RFD.

@alilleybrinker
Author

@Chris-Turner-NIST, I've opened an Issue recommending the creation of an RFD for improving CPE handling, based on your comment here. Happy for any additional input you may have on that! #421

alilleybrinker and others added 2 commits June 25, 2025 08:35
Co-authored-by: Andrew Pollock <[email protected]>
Per discussion in the QWG, this amends the RFD to clarify that the new
identifier fields being proposed are not able to fulfill the "identifier-like"
requirement in the `product` object inside the `affected` array. While this
may be changed in the future, for today it is the easiest path forward for
CVE data consumers, who could adopt the new fields if _desirable_ but would
not be obligated to do so.

Signed-off-by: Andrew Lilley Brinker <[email protected]>
@alilleybrinker
Author

Note

Final Comment Period

A Final Comment Period (FCP) has been called for this proposal. This is a final opportunity to raise new concerns with the proposal.

The FCP will close at 2pm PDT / 5pm EDT July 3rd, at the end of the Quality Working Group Meeting.

@alilleybrinker
Author

Note

Final Comment Period Has Closed

The Final Comment Period (FCP) for this proposal has closed, and the proposal has been accepted by the QWG.

Per the RFD process rules, it will now advance to the CVE Board for consideration. The Board will make the final determination as to whether to adopt or reject the proposal.
