allow to use libxml2-wasm for XML validation #1184

Open
wants to merge 10 commits into base: main
7 changes: 6 additions & 1 deletion docs/dev/decisions/XmlValidator.md
@@ -22,7 +22,8 @@ There are several implementations for this:
* [`libxmljs3`](https://www.npmjs.com/package/libxmljs3)
* unmaintained copy of `libxmljs2`
* ! DO NOT USE !
- * Any alternative? Please open a pull-request to add them.
+ * [`libxml2-wasm`](https://www.npmjs.com/package/libxml2-wasm)
+ * maintained WASM implementation of a libxml2 wrapper

At the moment of writing (2023-04-21),
`libxmljs` and `libxmljs2` are both working on several test environments. Both had the needed capabilities.
@@ -38,6 +39,10 @@ as it was more popular/used and had a more active community.

Decided to replace `libxmljs2`, as it is end of life.

#### 2024-11-26

Decided to replace `libxmljs2` with `libxml2-wasm`, since it's maintained and a functioning XML validator.

## WebBrowsers

there seems to exist no solution for validating XML according to XSD
2 changes: 1 addition & 1 deletion package.json
@@ -87,7 +87,7 @@
"ajv": "^8.12.0",
"ajv-formats": "^3.0.1",
"ajv-formats-draft2019": "^1.6.1",
- "libxmljs2": "^0.31 || ^0.32 || ^0.33 || ^0.35",
+ "libxml2-wasm": "^0.4.1",
"xmlbuilder2": "^3.0.2"
},
"devDependencies": {
53 changes: 53 additions & 0 deletions src/_optPlug.node/__xmlValidators/libxml2-wasm.ts
@@ -0,0 +1,53 @@
/*!
This file is part of CycloneDX JavaScript Library.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

SPDX-License-Identifier: Apache-2.0
Copyright (c) OWASP Foundation. All Rights Reserved.
*/

import { readFile } from 'fs/promises';
import { ParseOption, XmlDocument, XsdValidator } from 'libxml2-wasm';
Member

@jkowalleck jkowalleck Jan 12, 2025

did not look into details of libxml2-wasm yet.

untested code for #1184 (comment)

Suggested change
- import { ParseOption, XmlDocument, XsdValidator } from 'libxml2-wasm';
+ const { ParseOption, XmlDocument, XsdValidator } = await import('libxml2-wasm');

Member

@jkowalleck jkowalleck Jan 12, 2025

Or we could see whether we can get libxml2-wasm CJS-compatible, which I'd prefer.
I will drop them a PR as soon as possible.

Member

or you could ask the library authors for help.

Author

@SierraNL SierraNL Jan 12, 2025

> did not look into details of libxml2-wasm yet.
>
> untested code for #1184 (comment)

This isn't possible with module set to CommonJS in the tsconfig:
image

Member

@jkowalleck jkowalleck Jan 13, 2025

Would it be possible to move the `... = await import('libxml2-wasm');`
down into the exported async function? This should be a fitting solution.

export default (async function (schemaPath: string): Promise<Validator> {
  const { ParseOption, XmlDocument, XsdValidator } = await import('libxml2-wasm');
  // ...
}) satisfies Functionality

Author

I also tried that, but it always gets transpiled into a require, which triggers the error.
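
For context on the behaviour described above: with `"module": "CommonJS"`, TypeScript downlevels a dynamic `import()` into a `require()` call, so a package that is not CJS-compatible is still loaded through `require`. One workaround sometimes used (an assumption here, not something this PR does) is to build the `import()` call at runtime so the compiler cannot rewrite it; switching `module` to `node16` or `nodenext` is the other common route. A minimal sketch, where `dynamicImport` and `loadModule` are hypothetical names and the built-in `path` module stands in for `libxml2-wasm`:

```typescript
// Hypothetical helper: the function body is a string, so the TypeScript
// compiler never sees (and never rewrites) the `import()` expression.
const dynamicImport = new Function(
  'specifier',
  'return import(specifier)'
) as (specifier: string) => Promise<unknown>;

// Load a module lazily from inside an async function,
// in the spirit of the suggestion made in the review.
async function loadModule<T>(specifier: string): Promise<T> {
  return (await dynamicImport(specifier)) as T;
}

// Usage: the built-in `path` module stands in for the real dependency.
loadModule<{ sep: string }>('path').then((path) => {
  console.log(typeof path.sep);
});
```

This keeps the rest of the build on CommonJS, at the cost of hiding the import from the type checker and from bundlers, so it may not fit every setup.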

Member

@jkowalleck jkowalleck Jan 13, 2025

I see.

So we might need to drop node14 support,
change some TS compiler options, and modify some code here and there
to make this work, right?

This will cause breaking-changes - which is no blocker, just a remark.

I would be happy working with you to make this happen. 👍

You have carte blanche - change whatever is needed to make this feature work. Just don't rush, good things may take a while.
I will take care of the change management and processes, so that this feature can be integrated.

Author

I can give it a shot. I've done some tests locally, and it seems most of the work is in the Mocha tests, since they're written in JavaScript with `require` calls. They all need to be rewritten to use `import`.

import { pathToFileURL } from 'url';

import type { ValidationError } from '../../validation/types';
import type { Functionality, Validator } from '../xmlValidator';

/** @internal */
export default (async function (schemaPath: string): Promise<Validator> {
  const schema = XmlDocument.fromString(
    await readFile(schemaPath, 'utf-8'),
    {
      option: ParseOption.XML_PARSE_NONET | ParseOption.XML_PARSE_COMPACT,
Member

Author

The interface for this wrapper is somewhat different: parse options are built by combining the flags you want on. In the other implementation it's an object where they can be turned on and off explicitly. So this should result in the same options.
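
To illustrate the difference between the two option styles, here is a minimal sketch. `ParseFlag`, `ParseSwitches`, `toFlags`, and `hasFlag` are hypothetical names for illustration only; the numeric values mirror libxml2's `xmlParserOption` constants. The equivalence only holds if every switch that is left unset also defaults to off in the object-style wrapper, which is an assumption here.

```typescript
// Stand-in for the wrapper's flag enum; values mirror libxml2's
// XML_PARSE_NONET (1 << 11) and XML_PARSE_COMPACT (1 << 16).
enum ParseFlag {
  NONET = 1 << 11,
  COMPACT = 1 << 16,
}

// Hypothetical object-style options with explicit on/off switches.
interface ParseSwitches {
  nonet?: boolean;
  compact?: boolean;
}

// Translate on/off switches into one bitmask: a flag is on only if OR-ed in.
function toFlags(switches: ParseSwitches): number {
  let flags = 0;
  if (switches.nonet) flags |= ParseFlag.NONET;
  if (switches.compact) flags |= ParseFlag.COMPACT;
  return flags;
}

// Check whether a given flag is set in a bitmask.
function hasFlag(flags: number, flag: ParseFlag): boolean {
  return (flags & flag) !== 0;
}

const combined = toFlags({ nonet: true, compact: true });
console.log(combined === (ParseFlag.NONET | ParseFlag.COMPACT)); // true
console.log(hasFlag(toFlags({ compact: true }), ParseFlag.NONET)); // false
```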

I also added this implementation to the xmlValidator tests, and that includes an XXE test.

Member

Let's see how the tests turn out.

      url: pathToFileURL(schemaPath).toString()
    });
  const validator = XsdValidator.fromDoc(schema);

  return function (data: string): null | ValidationError {
    const doc = XmlDocument.fromString(data, { option: ParseOption.XML_PARSE_NONET | ParseOption.XML_PARSE_COMPACT });
    let errors = null;
    try {
      validator.validate(doc);
    }
    catch (validationErrors) {
      errors = validationErrors;
    }

    doc.dispose();
    validator.dispose();
Member

really free/dispose the validator and schema here?

Author

@SierraNL SierraNL Nov 26, 2024

Good point, this will go wrong on the second call. I could just not dispose the validator and the schema, but the library emphasises proper disposing (https://jameslan.github.io/libxml2-wasm/v0.4/documents/Memory_Management.html). Here I'm really lacking the TypeScript knowledge to solve this; could I use a `using` here?

Member

Regarding `using`, read here: https://www.totaltypescript.com/typescript-5-2-new-keyword-using

Regarding manually disposing/freeing: maybe just try it out. In the end, it is all JavaScript; just see what you can do.

Member

@SierraNL , could you see if changing this makes the tests pass?
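
One possible shape for the lifetime problem discussed in this thread, as a minimal sketch rather than the PR's actual fix: keep the validator alive for as long as the returned closure exists, and release only the per-call document in a `finally` block. `FakeDoc`, `FakeValidator`, `makeValidator`, and `createdDocs` are hypothetical stand-ins so the sketch runs without `libxml2-wasm`; only their `dispose()` behaviour matters.

```typescript
// Stand-in for a parsed document that must be released after use.
class FakeDoc {
  disposed = false;
  constructor(readonly text: string) {}
  dispose(): void { this.disposed = true; }
}

// Stand-in for a schema validator; using it after dispose() is an error.
class FakeValidator {
  disposed = false;
  validate(doc: FakeDoc): void {
    if (this.disposed) throw new Error('validator used after dispose');
    if (!doc.text.startsWith('<')) throw new Error('not well-formed');
  }
  dispose(): void { this.disposed = true; }
}

// Records every document created, so disposal can be checked afterwards.
const createdDocs: FakeDoc[] = [];

function makeValidator(): (data: string) => null | Error {
  const validator = new FakeValidator(); // shared by all calls, never disposed here
  return function (data: string): null | Error {
    const doc = new FakeDoc(data);
    createdDocs.push(doc);
    try {
      validator.validate(doc);
      return null;
    } catch (err) {
      return err as Error;
    } finally {
      doc.dispose(); // per-call resource: released on success and on failure
    }
  };
}

const validate = makeValidator();
console.log(validate('<a/>'));          // null
console.log(validate('<b/>'));          // null, the second call still works
console.log(validate('oops') !== null); // true
```

The trade-off is that the validator is never released explicitly; whether that is acceptable depends on how the library manages WASM memory, which this sketch does not address.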

    schema.dispose();

    return errors;
  }
}) satisfies Functionality
48 changes: 0 additions & 48 deletions src/_optPlug.node/__xmlValidators/libxmljs2.ts

This file was deleted.

2 changes: 1 addition & 1 deletion src/_optPlug.node/xmlValidator.ts
@@ -27,7 +27,7 @@ export default opWrapper<Functionality>('XmlValidator', [
/* eslint-disable @typescript-eslint/no-unsafe-member-access, @typescript-eslint/no-unsafe-return, @typescript-eslint/no-require-imports
-- needed */

- ['libxmljs2', () => require('./__xmlValidators/libxmljs2').default]
+ ['libxml2-wasm', () => require('./__xmlValidators/libxml2-wasm').default]
// ... add others here, pull-requests welcome!

/* eslint-enable @typescript-eslint/no-unsafe-member-access, @typescript-eslint/no-unsafe-return, @typescript-eslint/no-require-imports */
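
The list of `[name, loader]` pairs above suggests a "first implementation that loads wins" pattern. The following is a hedged sketch of that general idea only; `firstAvailable` and `Loader` are hypothetical names, and this is not the library's actual `opWrapper` implementation.

```typescript
// A loader returns an implementation or throws if its dependency is missing.
type Loader<T> = () => T;

// Hypothetical helper: try each candidate in order, return the first that loads.
function firstAvailable<T>(what: string, candidates: Array<[string, Loader<T>]>): T {
  const failures: string[] = [];
  for (const [name, load] of candidates) {
    try {
      return load();
    } catch (err) {
      // Remember why this candidate failed, then fall through to the next one.
      failures.push(`${name}: ${String(err)}`);
    }
  }
  throw new Error(`No ${what} available. Tried: ${failures.join('; ')}`);
}

// Usage: the first loader simulates a missing optional dependency.
const impl = firstAvailable<(s: string) => number>('Demo', [
  ['missing-package', () => { throw new Error('Cannot find module'); }],
  ['fallback', () => (s) => s.length],
]);
console.log(impl('abc')); // 3
```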