allow to use libxml2-wasm for XML validation #1184

Open
wants to merge 10 commits into
base: main


@SierraNL SierraNL commented Nov 26, 2024

libxmljs2 is not maintained and contains a vulnerability, so a replacement needed to be found. This commit replaces it with libxml2-wasm, a new but maintained library that serves the purpose of validating XML.

The implementation stays as close as possible to the previous one with regard to the flags passed to libxml2; it is only adapted to the different interface and to the recommendation to dispose of all objects.

This is my first contribution to this project, and TypeScript isn't my usual language, so comments are welcome.

related to: #1079

Resolves: CycloneDX#1079
Signed-off-by: Leon Grave <[email protected]>
@SierraNL SierraNL requested a review from a team as a code owner November 26, 2024 10:45
@jkowalleck
Member

thanks for donating this feature, @SierraNL .

let me clarify some things:

Due to libxmljs2 [...] contains a vulnerability [...]

this is not the case. The current libxml2 library contains a feature that, if used incorrectly downstream, could lead to a vulnerability there. The CycloneDX-JS-lib does not use it incorrectly, so no vulnerability exists.

[...] a replacement needed to be found

This is true in the long term, but we do not intend to replace libxmljs2 right away. Instead, we want to allow alternatives.
Therefore, some of your changes need to be reverted.

const schema = XmlDocument.fromString(
  await readFile(schemaPath, 'utf-8'),
  {
    option: ParseOption.XML_PARSE_NONET | ParseOption.XML_PARSE_COMPACT,
Member

Author

The interface of this wrapper is somewhat different: parse options are built by combining the flags you want enabled. In the other implementation, it's an object in which each flag can be turned on or off explicitly. So this should result in the same options.

I also added this implementation to the xmlValidator tests, and that includes an XXE test.
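
To illustrate the point about the two option styles, here is a minimal sketch of how an on/off options object maps onto a combined bitmask. The `ParseFlags` shape and the `toBitmask` helper are hypothetical and belong to neither library; the numeric values are those of libxml2's C `xmlParserOption` enum, and it is an assumption that libxml2-wasm's `ParseOption` mirrors them.

```typescript
// Bit values as defined in libxml2's C header (parser.h). That libxml2-wasm's
// ParseOption enum uses the same numbers is an assumption of this sketch.
const XML_PARSE_NOENT = 1 << 1;    // substitute entities
const XML_PARSE_NONET = 1 << 11;   // forbid network access
const XML_PARSE_COMPACT = 1 << 16; // compact small text nodes

// Hypothetical object-style options; the key names are illustrative only.
interface ParseFlags {
  noent?: boolean;
  nonet?: boolean;
  compact?: boolean;
}

// Hypothetical helper: fold the enabled switches into one bitmask.
// A switch that is false or absent simply contributes no bit.
function toBitmask(flags: ParseFlags): number {
  let mask = 0;
  if (flags.noent === true) { mask |= XML_PARSE_NOENT; }
  if (flags.nonet === true) { mask |= XML_PARSE_NONET; }
  if (flags.compact === true) { mask |= XML_PARSE_COMPACT; }
  return mask;
}

const mask = toBitmask({ nonet: true, noent: false, compact: true });
console.log(mask);                                           // 67584
console.log(mask === (XML_PARSE_NONET | XML_PARSE_COMPACT)); // true
console.log((mask & XML_PARSE_NOENT) === 0);                 // true
```

In the bitmask style there is no way to state "off" explicitly, so leaving a flag out of the combination is what corresponds to `false` in the object style.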

Member

let's see how the tests turn out.

}

doc.dispose();
validator.dispose();
Member

really free/dispose the validator and schema here?

Author

@SierraNL SierraNL Nov 26, 2024

Good point, this will go wrong on the second call. I could just not dispose the validator and the schema, but the library emphasises proper disposal (https://jameslan.github.io/libxml2-wasm/v0.4/documents/Memory_Management.html). Here I'm really lacking the TypeScript knowledge to solve this. Could I use a `using` declaration here?

Member

regarding using, read here: https://www.totaltypescript.com/typescript-5-2-new-keyword-using

regarding manually disposing/freeing: maybe just try it out. in the end, it all is javascript - just see what you can do.
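
The second-call problem discussed in this thread can be reproduced without libxml2-wasm. The sketch below uses a hypothetical `FakeResource` class in place of the real schema and validator objects, and contrasts disposing the shared validator on every call with disposing only the per-call document in a `finally` block. It is an illustration under those assumptions, not the code of this PR.

```typescript
// Hypothetical stand-in for a WASM-backed object that must be freed manually.
class FakeResource {
  name: string;
  disposed = false;

  constructor(name: string) {
    this.name = name;
  }

  dispose(): void {
    this.disposed = true;
  }

  use(): string {
    if (this.disposed) {
      throw new Error(`${this.name} was already disposed`);
    }
    return `${this.name} ok`;
  }
}

type Validate = (data: string) => string;

// Problematic variant: the shared validator is freed at the end of every call.
function makeDisposingValidator(): Validate {
  const validator = new FakeResource('validator');
  return (data) => {
    const doc = new FakeResource(`doc(${data})`);
    try {
      doc.use();
      return validator.use();
    } finally {
      doc.dispose();
      validator.dispose(); // frees the shared object, so the next call fails
    }
  };
}

// Reusable variant: only the object created per call is freed.
function makeReusableValidator(): Validate {
  const validator = new FakeResource('validator');
  return (data) => {
    const doc = new FakeResource(`doc(${data})`);
    try {
      doc.use();
      return validator.use();
    } finally {
      doc.dispose();
    }
  };
}

const reusable = makeReusableValidator();
console.log(reusable('a'), reusable('b')); // validator ok validator ok

const disposing = makeDisposingValidator();
disposing('a');
try {
  disposing('b');
} catch (err) {
  console.log((err as Error).message); // validator was already disposed
}
```

The trade-off of the reusable variant is that the shared validator stays allocated for as long as the returned function is in use, so it would have to be released by some other means.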

Member

@SierraNL , could you see if changing this makes the tests pass?

@jkowalleck jkowalleck changed the title from "Switch to libxml2-wasm for XML validation" to "allow to use libxml2-wasm for XML validation" Dec 2, 2024
Member

@jkowalleck jkowalleck left a comment

please modify the README to reflect this new optional dependency:
see

* Validation of XML on _Node.js_ requires all of:
  * [`libxmljs2`](https://www.npmjs.com/package/libxmljs2)
  * the system might need to meet the requirements for [`node-gyp`](https://github.com/TooTallNate/node-gyp#installation), in certain cases.

- * Validation of XML on _Node.js_ requires all of:
+ * Validation of XML on _Node.js_ requires any of: 
    * [`libxmljs2`](https://www.npmjs.com/package/libxmljs2)
+   * [`libxml2-wasm`](https://www.npmjs.com/package/libxml2-wasm)
    * the system might need to meet the requirements for [`node-gyp`](https://github.com/TooTallNate/node-gyp#installation), in certain cases.

HISTORY.md Outdated
@@ -22,6 +22,7 @@ All notable changes to this project will be documented in this file.
* Apply latest code style guide (via [#1170], [#1181])
* Dependencies
* Support `libxmljs2@^0.35` (via [#1173])
* Support `libxml2-wasm@^0.41` as an alternative for `libxmljs2` (via [#1184])
Member

Since 7.0.0 was released in the meantime, followed by a forward-merge,
this needs to be moved to the unreleased category.

Author

Moved the line (and a header) to unreleased.

@jkowalleck
Member

jkowalleck commented Jan 9, 2025

@SierraNL, do you have any updates on this?

@SierraNL
Author

@SierraNL, do you have any updates on this?

Was a bit busy over the holidays, processed your comments. Can't run the tests locally, so need to see if the Github build succeeds.

@SierraNL SierraNL requested a review from jkowalleck January 11, 2025 10:06
@jkowalleck
Member

could you rebase/merge this branch with the latest master?

@jkowalleck
Member

@SierraNL, do you have any updates on this?

Was a bit busy over the holidays, processed your comments. Can't run the tests locally, so need to see if the Github build succeeds.

you may want to dispatch the QA workflow on your fork, when needed.
https://github.com/SierraNL/cyclonedx-javascript-library/actions/workflows/nodejs.yml should have a button "run workflow" at the top right, which then lets you select on which branch you want to run the tests.
image

@SierraNL
Author

SierraNL commented Jan 12, 2025

@SierraNL, do you have any updates on this?

Was a bit busy over the holidays, processed your comments. Can't run the tests locally, so need to see if the Github build succeeds.

you may want to dispatch the QA workflow on your fork, when needed. https://github.com/SierraNL/cyclonedx-javascript-library/actions/workflows/nodejs.yml should have a button "run workflow" at the top right, which then lets you select on which branch you want to run the tests. image

Thanks, I also managed to get the tests running locally, and the underlying reason for the test failure is this message:
Uncaught Error [ERR_REQUIRE_ASYNC_MODULE]: require() cannot be used on an ESM graph with top-level await. Use import() instead. To see where the top-level await comes from, use --experimental-print-required-tla.

I'll have a look at how to fix this, but I have the feeling that the libxml2-wasm library is not compatible with the module structure of the CycloneDX library, or with the way TypeScript is transpiled, since imports are transpiled to require.

*/

import { readFile } from 'fs/promises';
import { ParseOption, XmlDocument, XsdValidator } from 'libxml2-wasm';
Member

@jkowalleck jkowalleck Jan 12, 2025

did not look into details of libxml2-wasm yet.

untested code for #1184 (comment)

Suggested change
- import { ParseOption, XmlDocument, XsdValidator } from 'libxml2-wasm';
+ const { ParseOption, XmlDocument, XsdValidator } = await import('libxml2-wasm');

Member

@jkowalleck jkowalleck Jan 12, 2025

or, we could see if we can get libxml2-wasm CJS-compatible, which i'd prefer.
will drop them a PR as soon as possible.

Member

or you could ask the library authors for help.

Author

@SierraNL SierraNL Jan 12, 2025

did not look into details of libxml2-wasm yet.

untested code for #1184 (comment)

This isn't possible with module set to CommonJS in the tsconfig:
image

Member

@jkowalleck jkowalleck Jan 13, 2025

would it be possible to move the ... = await import('libxml2-wasm');
down into the exported async function? this should be a fitting solution.

export default (async function (schemaPath: string): Promise<Validator> {
  const { ParseOption, XmlDocument, XsdValidator } = await import('libxml2-wasm');
  // ...
}) satisfies Functionality

Author

I also tried that, but it always gets transpiled into a require, which triggers the error.
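
One commonly used workaround for `import()` being rewritten to `require()` under `module: CommonJS` is to hide the dynamic import inside a string, so the compiler never sees it. The sketch below combines that with a cached loader; `nativeImport` and `lazyOnce` are hypothetical helper names, and this has not been tested against this project's build or against libxml2-wasm.

```typescript
// Hypothetical helper: the import() call lives inside a string, so the
// TypeScript compiler cannot rewrite it into require() for CommonJS output.
const nativeImport = new Function(
  'specifier',
  'return import(specifier)'
) as (specifier: string) => Promise<unknown>;

// Hypothetical helper: cache the promise so the loader runs at most once.
function lazyOnce<T>(loader: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      cached = loader();
    }
    return cached;
  };
}

// Nothing is imported until the returned function is called for the first time.
const loadLibxml2Wasm = lazyOnce(async () => await nativeImport('libxml2-wasm'));
```

Inside the exported async function, the module would then be obtained with `await loadLibxml2Wasm()`. Because `new Function` behaves like `eval`, it may be rejected by bundlers or strict content-security policies; setting `module` to `node16` in the tsconfig, which leaves `import()` untouched in CommonJS output, might be a cleaner alternative.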

Member

@jkowalleck jkowalleck Jan 13, 2025

I see.

So we might need to drop Node 14 support,
change some TS compiler options, and modify some code here and there
to make this work, right?

This will cause breaking changes, which is no blocker, just a remark.

I would be happy working with you to make this happen. 👍

You have carte blanche - change whatever is needed to make this feature work. Just don't rush, good things may take a while.
I will take care of the change management and processes, so that this feature can be integrated.

Author

I can give it a shot. I've done some tests locally, and it seems most of the work is in the Mocha tests, since they're written in JavaScript with requires. They would all need to be rewritten to use imports.

@jkowalleck jkowalleck self-requested a review January 13, 2025 09:39