Implement "output.hashCharacters" option to define character set for file hashes (#5371)

* Add documentation

* Add new hashing functions and update hashes

The hashes changed due to how they are now encoded

* Add new hashing functions in JavaScript

* Implement new output.hashCharacters option
lukastaegert authored Feb 9, 2024
1 parent 63a91a6 commit 57277bf
Showing 476 changed files with 852 additions and 440 deletions.
2 changes: 1 addition & 1 deletion browser/src/wasm.ts
@@ -1,5 +1,5 @@
 // eslint-disable-next-line import/no-unresolved
-export { parse, xxhashBase64Url } from '../../wasm/bindings_wasm.js';
+export { parse, xxhashBase64Url, xxhashBase36, xxhashBase16 } from '../../wasm/bindings_wasm.js';

 // eslint-disable-next-line import/no-unresolved
 import { parse } from '../../wasm/bindings_wasm.js';
1 change: 1 addition & 0 deletions cli/help.md
@@ -48,6 +48,7 @@ Basic options:
 --generatedCode.objectShorthand Use shorthand properties in generated code
 --no-generatedCode.reservedNamesAsProps Always quote reserved names as props
 --generatedCode.symbols Use symbols in generated code
+--hashCharacters <name> Use the specified character set for file hashes
 --no-hoistTransitiveImports Do not hoist transitive imports into entry chunks
 --no-indent Don't indent result
 --inlineDynamicImports Create single bundle when using dynamic imports
2 changes: 2 additions & 0 deletions docs/command-line-interface/index.md
@@ -90,6 +90,7 @@ export default {
 externalImportAttributes,
 footer,
 generatedCode,
+hashCharacters,
 hoistTransitiveImports,
 inlineDynamicImports,
 interop,
@@ -398,6 +399,7 @@ Many options have command line equivalents. In those cases, any arguments passed
 --generatedCode.objectShorthand Use shorthand properties in generated code
 --no-generatedCode.reservedNamesAsProps Always quote reserved names as props
 --generatedCode.symbols Use symbols in generated code
+--hashCharacters <name> Use the specified character set for file hashes
 --no-hoistTransitiveImports Do not hoist transitive imports into entry chunks
 --no-indent Don't indent result
 --inlineDynamicImports Create single bundle when using dynamic imports
22 changes: 18 additions & 4 deletions docs/configuration-options/index.md
@@ -539,7 +539,7 @@ The pattern to use for naming custom emitted assets to include in the build outp
 - `[extname]`: The file extension of the asset including a leading dot, e.g. `.css`.
 - `[ext]`: The file extension without a leading dot, e.g. `css`.
-- `[hash]`: A hash based on the content of the asset. You can also set a specific hash length via e.g. `[hash:10]`.
+- `[hash]`: A hash based on the content of the asset. You can also set a specific hash length via e.g. `[hash:10]`. By default, it will create a base-64 hash. If you need a reduced character set, see [`output.hashCharacters`](#output-hashcharacters).
 - `[name]`: The file name of the asset excluding any extension.
 Forward slashes `/` can be used to place files in sub-directories. When using a function, `assetInfo` is a reduced version of the one in [`generateBundle`](../plugin-development/index.md#generatebundle) without the `fileName`. See also [`output.chunkFileNames`](#output-chunkfilenames), [`output.entryFileNames`](#output-entryfilenames).
@@ -585,7 +585,7 @@ See also [`output.intro/output.outro`](#output-intro-output-outro).
 The pattern to use for naming shared chunks created when code-splitting, or a function that is called per chunk to return such a pattern. Patterns support the following placeholders:
 - `[format]`: The rendering format defined in the output options, e.g. `es` or `cjs`.
-- `[hash]`: A hash based only on the content of the final generated chunk, including transformations in [`renderChunk`](../plugin-development/index.md#renderchunk) and any referenced file hashes. You can also set a specific hash length via e.g. `[hash:10]`.
+- `[hash]`: A hash based only on the content of the final generated chunk, including transformations in [`renderChunk`](../plugin-development/index.md#renderchunk) and any referenced file hashes. You can also set a specific hash length via e.g. `[hash:10]`. By default, it will create a base-64 hash. If you need a reduced character set, see [`output.hashCharacters`](#output-hashcharacters).
 - `[name]`: The name of the chunk. This can be explicitly set via the [`output.manualChunks`](#output-manualchunks) option or when the chunk is created by a plugin via [`this.emitFile`](../plugin-development/index.md#this-emitfile). Otherwise, it will be derived from the chunk contents.
 Forward slashes `/` can be used to place files in sub-directories. When using a function, `chunkInfo` is a reduced version of the one in [`generateBundle`](../plugin-development/index.md#generatebundle) without properties that depend on file names and no information about the rendered modules as rendering only happens after file names have been generated. You can however access a list of included `moduleIds`. See also [`output.assetFileNames`](#output-assetfilenames), [`output.entryFileNames`](#output-entryfilenames).
@@ -661,7 +661,7 @@ Promise.resolve()
 The pattern to use for chunks created from entry points, or a function that is called per entry chunk to return such a pattern. Patterns support the following placeholders:
 - `[format]`: The rendering format defined in the output options, e.g. `es` or `cjs`.
-- `[hash]`: A hash based only on the content of the final generated entry chunk, including transformations in [`renderChunk`](../plugin-development/index.md#renderchunk) and any referenced file hashes. You can also set a specific hash length via e.g. `[hash:10]`.
+- `[hash]`: A hash based only on the content of the final generated entry chunk, including transformations in [`renderChunk`](../plugin-development/index.md#renderchunk) and any referenced file hashes. You can also set a specific hash length via e.g. `[hash:10]`. By default, it will create a base-64 hash. If you need a reduced character set, see [`output.hashCharacters`](#output-hashcharacters).
 - `[name]`: The file name (without extension) of the entry point, unless the object form of input was used to define a different name.
 Forward slashes `/` can be used to place files in sub-directories. When using a function, `chunkInfo` is a reduced version of the one in [`generateBundle`](../plugin-development/index.md#generatebundle) without properties that depend on file names and no information about the rendered modules as rendering only happens after file names have been generated. You can however access a list of included `moduleIds`. See also [`output.assetFileNames`](#output-assetfilenames), [`output.chunkFileNames`](#output-chunkfilenames).
@@ -863,6 +863,20 @@ const foo = 42;
 exports.foo = foo;
 ```

+### output.hashCharacters
+
+| | |
+| -------: | :------------------------------ |
+| Type: | `"base64" \| "base36" \| "hex"` |
+| CLI: | `--hashCharacters <name>` |
+| Default: | `"base64"` |
+
+This determines the character set that Rollup is allowed to use in file hashes.
+
+- the default `"base64"` will use URL-safe base-64 hashes with potential characters `ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_`.
+- `"base36"` will only use lower-case letters and numbers `abcdefghijklmnopqrstuvwxyz0123456789`.
+- `"hex"` will create hexadecimal hashes with characters `abcdef0123456789`.
+
 ### output.hoistTransitiveImports

 | | |
@@ -1480,7 +1494,7 @@ The location of the generated bundle. If this is an absolute path, all the `sour
 The pattern to use for sourcemaps, or a function that is called per sourcemap to return such a pattern. Patterns support the following placeholders:

 - `[format]`: The rendering format defined in the output options, e.g. `es` or `cjs`.
-- `[hash]`: A hash based only on the content of the final generated sourcemap. You can also set a specific hash length via e.g. `[hash:10]`.
+- `[hash]`: A hash based only on the content of the final generated sourcemap. You can also set a specific hash length via e.g. `[hash:10]`. By default, it will create a base-64 hash. If you need a reduced character set, see [`output.hashCharacters`](#output-hashcharacters).
 - `[chunkhash]`: The same hash as the one used for the corresponding generated chunk (if any).
 - `[name]`: The file name (without extension) of the entry point, unless the object form of input was used to define a different name.

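To illustrate the option documented in this file, a configuration might opt into the reduced `base36` character set. The following is a minimal sketch and not part of the commit: the `ConfigSketch` type, the input path, the output directory, and the file name patterns are assumptions made for illustration, and in a real `rollup.config` file the object would be the default export.

```typescript
// Local stand-in types so the snippet is self-contained; a real config
// would rely on Rollup's own option types instead.
type HashCharacters = 'base64' | 'base36' | 'hex';

interface ConfigSketch {
  input: string;
  output: {
    dir: string;
    hashCharacters?: HashCharacters;
    entryFileNames?: string;
    chunkFileNames?: string;
    assetFileNames?: string;
  };
}

const config: ConfigSketch = {
  input: 'src/main.js', // hypothetical entry point
  output: {
    dir: 'dist', // hypothetical output directory
    // Restrict file hashes to lower-case letters and digits.
    hashCharacters: 'base36',
    entryFileNames: '[name]-[hash].js',
    chunkFileNames: 'chunks/[name]-[hash:10].js',
    assetFileNames: 'assets/[name]-[hash][extname]'
  }
};
```

According to the help text changed in this commit, the command line equivalent is `--hashCharacters <name>`, for example `--hashCharacters base36`.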
1 change: 1 addition & 0 deletions docs/javascript-api/index.md
@@ -165,6 +165,7 @@ const outputOptions = {
 externalImportAttributes,
 footer,
 generatedCode,
+hashCharacters,
 hoistTransitiveImports,
 inlineDynamicImports,
 interop,
6 changes: 6 additions & 0 deletions docs/repl/stores/options.ts
@@ -242,6 +242,11 @@ export const useOptions = defineStore('options2', () => {
   name: 'output.globals',
   required: () => true
 });
+const optionOutputHashCharacters = getSelect({
+  defaultValue: 'base64',
+  name: 'output.hashCharacters',
+  options: () => ['base64', 'base36', 'hex']
+});
 const optionOutputHoistTransitiveImports = getBoolean({
   available: alwaysTrue,
   defaultValue: true,
@@ -432,6 +437,7 @@ export const useOptions = defineStore('options2', () => {
 optionOutputGeneratedCodeReservedNamesAsProperties,
 optionOutputGeneratedCodeSymbols,
 optionOutputGlobals,
+optionOutputHashCharacters,
 optionOutputHoistTransitiveImports,
 optionOutputIndent,
 optionOutputInlineDynamicImports,
2 changes: 2 additions & 0 deletions native.d.ts
@@ -6,3 +6,5 @@
 export function parse(code: string, allowReturnOutsideFunction: boolean): Buffer
 export function parseAsync(code: string, allowReturnOutsideFunction: boolean, signal?: AbortSignal | undefined | null): Promise<Buffer>
 export function xxhashBase64Url(input: Uint8Array): string
+export function xxhashBase36(input: Uint8Array): string
+export function xxhashBase16(input: Uint8Array): string
4 changes: 3 additions & 1 deletion native.js
@@ -93,10 +93,12 @@ const requireWithFriendlyError = id => {
   }
 };

-const { parse, parseAsync, xxhashBase64Url } = requireWithFriendlyError(
+const { parse, parseAsync, xxhashBase64Url, xxhashBase36, xxhashBase16 } = requireWithFriendlyError(
   existsSync(join(__dirname, localName)) ? localName : `@rollup/rollup-${packageBase}`
 );

 module.exports.parse = parse;
 module.exports.parseAsync = parseAsync;
 module.exports.xxhashBase64Url = xxhashBase64Url;
+module.exports.xxhashBase36 = xxhashBase36;
+module.exports.xxhashBase16 = xxhashBase16;
9 changes: 8 additions & 1 deletion native.wasm.js
@@ -1,6 +1,13 @@
-const { parse, xxhashBase64Url } = require('./wasm-node/bindings_wasm.js');
+const {
+  parse,
+  xxhashBase64Url,
+  xxhashBase36,
+  xxhashBase16
+} = require('./wasm-node/bindings_wasm.js');

 exports.parse = parse;
 exports.parseAsync = async (code, allowReturnOutsideFunction, _signal) =>
   parse(code, allowReturnOutsideFunction);
 exports.xxhashBase64Url = xxhashBase64Url;
+exports.xxhashBase36 = xxhashBase36;
+exports.xxhashBase16 = xxhashBase16;
8 changes: 7 additions & 1 deletion rust/Cargo.lock

Some generated files are not rendered by default.

10 changes: 10 additions & 0 deletions rust/bindings_napi/src/lib.rs
@@ -52,3 +52,13 @@ pub fn parse_async(
 pub fn xxhash_base64_url(input: Uint8Array) -> String {
   xxhash::xxhash_base64_url(&input)
 }
+
+#[napi]
+pub fn xxhash_base36(input: Uint8Array) -> String {
+  xxhash::xxhash_base36(&input)
+}
+
+#[napi]
+pub fn xxhash_base16(input: Uint8Array) -> String {
+  xxhash::xxhash_base16(&input)
+}
10 changes: 10 additions & 0 deletions rust/bindings_wasm/src/lib.rs
@@ -11,3 +11,13 @@ pub fn parse(code: String, allow_return_outside_function: bool) -> Vec<u8> {
 pub fn xxhash_base64_url(input: Uint8Array) -> String {
   xxhash::xxhash_base64_url(&input.to_vec())
 }
+
+#[wasm_bindgen(js_name=xxhashBase36)]
+pub fn xxhash_base36(input: Uint8Array) -> String {
+  xxhash::xxhash_base36(&input.to_vec())
+}
+
+#[wasm_bindgen(js_name=xxhashBase16)]
+pub fn xxhash_base16(input: Uint8Array) -> String {
+  xxhash::xxhash_base16(&input.to_vec())
+}
2 changes: 1 addition & 1 deletion rust/xxhash/Cargo.toml
@@ -6,5 +6,5 @@ edition = "2021"
 # See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

 [dependencies]
-base64 = '0.21.7'
+base-encode = "0.3.1"
 xxhash-rust = { version = "0.8.8", features = ["xxh3"] }
20 changes: 17 additions & 3 deletions rust/xxhash/src/lib.rs
@@ -1,7 +1,21 @@
-use base64::{engine::general_purpose, Engine as _};
+use base_encode::to_string;
 use xxhash_rust::xxh3::xxh3_128;

+const CHARACTERS_BASE64: &[u8; 64] =
+  b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";
+
+const CHARACTERS_BASE36: &[u8; 36] = b"abcdefghijklmnopqrstuvwxyz0123456789";
+
+const CHARACTERS_BASE16: &[u8; 16] = b"abcdef0123456789";
+
 pub fn xxhash_base64_url(input: &[u8]) -> String {
-  let hash = xxh3_128(input).to_le_bytes();
-  general_purpose::URL_SAFE_NO_PAD.encode(hash)
+  to_string(&xxh3_128(input).to_le_bytes(), 64, CHARACTERS_BASE64).unwrap()
 }
+
+pub fn xxhash_base36(input: &[u8]) -> String {
+  to_string(&xxh3_128(input).to_le_bytes(), 36, CHARACTERS_BASE36).unwrap()
+}
+
+pub fn xxhash_base16(input: &[u8]) -> String {
+  to_string(&xxh3_128(input).to_le_bytes(), 16, CHARACTERS_BASE16).unwrap()
+}
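The three alphabets in this file differ only in their base and in which character each digit maps to. As a rough sketch of that idea, the bytes can be treated as one large number that is repeatedly divided by the base. This is not the `base-encode` crate's actual algorithm, whose byte order and handling of leading zeros may differ. The names `encodeBytes` and `maxHashLength` are hypothetical; only the three alphabets are taken from the diff.

```typescript
const CHARACTERS_BASE64 = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_';
const CHARACTERS_BASE36 = 'abcdefghijklmnopqrstuvwxyz0123456789';
const CHARACTERS_BASE16 = 'abcdef0123456789';

// Hypothetical helper: interprets `bytes` as one big-endian unsigned integer
// (an assumption) and writes it in base `alphabet.length`, mapping digit d
// to alphabet[d].
function encodeBytes(bytes: ArrayLike<number>, alphabet: string): string {
  const base = alphabet.length;
  let digits: number[] = [];
  for (let index = 0; index < bytes.length; index++) digits.push(bytes[index]);
  let encoded = '';
  while (digits.length > 0) {
    // Long division of the base-256 digit list by `base`.
    const quotient: number[] = [];
    let remainder = 0;
    for (const digit of digits) {
      const accumulator = remainder * 256 + digit;
      const quotientDigit = Math.floor(accumulator / base);
      remainder = accumulator % base;
      // Dropping leading zeros shrinks the list so the loop terminates.
      if (quotient.length > 0 || quotientDigit > 0) quotient.push(quotientDigit);
    }
    encoded = alphabet.charAt(remainder) + encoded;
    digits = quotient;
  }
  return encoded;
}

// Hypothetical helper: the smallest number of characters that can represent
// every value of a hash with the given number of bits.
function maxHashLength(base: number, bits = 128): number {
  const limit = Math.pow(2, bits);
  let capacity = 1;
  let length = 0;
  while (capacity < limit) {
    capacity *= base;
    length++;
  }
  return length;
}

console.log(encodeBytes([1, 0], CHARACTERS_BASE36)); // "he" (256 = 7 * 36 + 4)
console.log(maxHashLength(64), maxHashLength(36), maxHashLength(16)); // 22 25 32
```

Under this digit mapping, the `abcdef0123456789` alphabet assigns digit 0 to `a` rather than `0`, so the byte `0xff` becomes `99`: the result uses hexadecimal characters without being the conventional hex rendering of the bytes.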
8 changes: 5 additions & 3 deletions src/Chunk.ts
@@ -37,7 +37,7 @@ import getIndentString from './utils/getIndentString';
 import { getNewArray, getOrCreate } from './utils/getOrCreate';
 import { getStaticDependencies } from './utils/getStaticDependencies';
 import type { HashPlaceholderGenerator } from './utils/hashPlaceholders';
-import { replacePlaceholders } from './utils/hashPlaceholders';
+import { DEFAULT_HASH_SIZE, replacePlaceholders } from './utils/hashPlaceholders';
 import { makeLegal } from './utils/identifierHelpers';
 import {
   defaultInteropHelpersByInteropType,
@@ -533,7 +533,8 @@ export default class Chunk {
 {
   format: () => format,
   hash: size =>
-    hashPlaceholder || (hashPlaceholder = this.getPlaceholder(patternName, size)),
+    hashPlaceholder ||
+    (hashPlaceholder = this.getPlaceholder(patternName, size || DEFAULT_HASH_SIZE)),
   name: () => this.getChunkName()
 }
 );
@@ -566,7 +567,8 @@ export default class Chunk {
   chunkhash: () => this.getPreliminaryFileName().hashPlaceholder || '',
   format: () => format,
   hash: size =>
-    hashPlaceholder || (hashPlaceholder = this.getPlaceholder(patternName, size)),
+    hashPlaceholder ||
+    (hashPlaceholder = this.getPlaceholder(patternName, size || DEFAULT_HASH_SIZE)),
   name: () => this.getChunkName()
 }
 );
4 changes: 4 additions & 0 deletions src/rollup/types.d.ts
@@ -683,6 +683,8 @@ type AddonFunction = (chunk: RenderedChunk) => string | Promise<string>;

 type OutputPluginOption = MaybePromise<OutputPlugin | NullValue | false | OutputPluginOption[]>;

+type HashCharacters = 'base64' | 'base36' | 'hex';
+
 export interface OutputOptions {
   amd?: AmdOptions;
   assetFileNames?: string | ((chunkInfo: PreRenderedAsset) => string);
@@ -708,6 +710,7 @@ export interface OutputOptions {
   freeze?: boolean;
   generatedCode?: GeneratedCodePreset | GeneratedCodeOptions;
   globals?: GlobalsOption;
+  hashCharacters?: HashCharacters;
   hoistTransitiveImports?: boolean;
   indent?: string | boolean;
   inlineDynamicImports?: boolean;
@@ -758,6 +761,7 @@ export interface NormalizedOutputOptions {
   freeze: boolean;
   generatedCode: NormalizedGeneratedCodeOptions;
   globals: GlobalsOption;
+  hashCharacters: HashCharacters;
   hoistTransitiveImports: boolean;
   indent: true | string;
   inlineDynamicImports: boolean;
18 changes: 11 additions & 7 deletions src/utils/FileEmitter.ts
@@ -11,9 +11,10 @@ import type {
   OutputChunk
 } from '../rollup/types';
 import { BuildPhase } from './buildPhase';
-import { getXxhash } from './crypto';
+import type { GetHash } from './crypto';
+import { getHash64, hasherByType } from './crypto';
 import { getOrCreate } from './getOrCreate';
-import { defaultHashSize } from './hashPlaceholders';
+import { DEFAULT_HASH_SIZE } from './hashPlaceholders';
 import { LOGLEVEL_WARN } from './logging';
 import {
   error,
@@ -50,7 +51,7 @@ function generateAssetFileName(
 {
   ext: () => extname(emittedName).slice(1),
   extname: () => extname(emittedName),
-  hash: size => sourceHash.slice(0, Math.max(0, size || defaultHashSize)),
+  hash: size => sourceHash.slice(0, Math.max(0, size || DEFAULT_HASH_SIZE)),
   name: () =>
     emittedName.slice(0, Math.max(0, emittedName.length - extname(emittedName).length))
 }
@@ -155,6 +156,7 @@ interface FileEmitterOutput {
   bundle: OutputBundleWithPlaceholders;
   fileNamesBySource: Map<string, string>;
   outputOptions: NormalizedOutputOptions;
+  getHash: GetHash;
 }

 export class FileEmitter {
@@ -254,9 +256,11 @@ export class FileEmitter {
   bundle: OutputBundleWithPlaceholders,
   outputOptions: NormalizedOutputOptions
 ): void => {
+  const getHash = hasherByType[outputOptions.hashCharacters];
   const output = (this.output = {
     bundle,
     fileNamesBySource: new Map<string, string>(),
+    getHash,
     outputOptions
   });
   for (const emittedFile of this.filesByReferenceId.values()) {
@@ -270,7 +274,7 @@ export class FileEmitter {
   if (consumedFile.fileName) {
     this.finalizeAdditionalAsset(consumedFile, consumedFile.source, output);
   } else {
-    const sourceHash = getXxhash(consumedFile.source);
+    const sourceHash = getHash(consumedFile.source);
     getOrCreate(consumedAssetsByHash, sourceHash, () => []).push(consumedFile);
   }
 } else if (consumedFile.type === 'prebuilt-chunk') {
@@ -290,7 +294,7 @@ export class FileEmitter {
 let referenceId = idBase;

 do {
-  referenceId = getXxhash(referenceId).slice(0, 8).replaceAll('-', '$');
+  referenceId = getHash64(referenceId).slice(0, 8).replaceAll('-', '$');
 } while (
   this.filesByReferenceId.has(referenceId) ||
   this.outputFileEmitters.some(({ filesByReferenceId }) => filesByReferenceId.has(referenceId))
@@ -439,13 +443,13 @@ export class FileEmitter {
 private finalizeAdditionalAsset(
   consumedFile: Readonly<ConsumedAsset>,
   source: string | Uint8Array,
-  { bundle, fileNamesBySource, outputOptions }: FileEmitterOutput
+  { bundle, fileNamesBySource, getHash, outputOptions }: FileEmitterOutput
 ): void {
   let { fileName, needsCodeReference, referenceId } = consumedFile;

   // Deduplicate assets if an explicit fileName is not provided
   if (!fileName) {
-    const sourceHash = getXxhash(source);
+    const sourceHash = getHash(source);
     fileName = fileNamesBySource.get(sourceHash);
     if (!fileName) {
       fileName = generateAssetFileName(
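The `hash` replacement in `generateAssetFileName` truncates the full hash to the requested size and falls back to `DEFAULT_HASH_SIZE` when no size is given. A simplified sketch of that behaviour follows. `renderHashPlaceholders` is a hypothetical helper rather than Rollup's implementation, the regular expression is an assumption based on the `[hash]` and `[hash:10]` syntax described in the documentation, and the value 8 for `DEFAULT_HASH_SIZE` is assumed because its definition is not part of the hunks shown.

```typescript
// Assumed value; the constant's definition is not visible in this diff.
const DEFAULT_HASH_SIZE = 8;

// Hypothetical helper: replaces `[hash]` and `[hash:N]` in a file name pattern
// with a prefix of an already computed hash string.
function renderHashPlaceholders(pattern: string, fullHash: string): string {
  return pattern.replace(/\[hash(?::(\d+))?\]/g, (_match: string, size?: string) => {
    const requestedSize = size ? Number(size) : 0;
    // Same expression shape as in the diff: a missing or zero size uses the default.
    return fullHash.slice(0, Math.max(0, requestedSize || DEFAULT_HASH_SIZE));
  });
}

// Made-up base-36 style hash, used only to show the truncation.
const exampleHash = 'k3j9x0a7bq2m5c8d1e4f6g';

console.log(renderHashPlaceholders('assets/logo-[hash].png', exampleHash)); // assets/logo-k3j9x0a7.png
console.log(renderHashPlaceholders('assets/logo-[hash:10].png', exampleHash)); // assets/logo-k3j9x0a7bq.png
```

Because truncation keeps a prefix of the encoded string, a character set with fewer symbols carries fewer bits per character, so the same `[hash:N]` length distinguishes fewer files with `hex` than with `base64`.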
28 changes: 19 additions & 9 deletions src/utils/crypto.ts
@@ -1,17 +1,27 @@
-import { xxhashBase64Url } from '../../native';
+import { xxhashBase16, xxhashBase36, xxhashBase64Url } from '../../native';
+import type { HashCharacters } from '../rollup/types';

 let textEncoder: TextEncoder;
-export function getXxhash(input: string | Uint8Array) {
-  let buffer: Uint8Array;
+
+export type GetHash = (input: string | Uint8Array) => string;
+
+export const getHash64: GetHash = input => xxhashBase64Url(ensureBuffer(input));
+export const getHash36: GetHash = input => xxhashBase36(ensureBuffer(input));
+export const getHash16: GetHash = input => xxhashBase16(ensureBuffer(input));
+
+export const hasherByType: Record<HashCharacters, GetHash> = {
+  base36: getHash36,
+  base64: getHash64,
+  hex: getHash16
+};
+
+function ensureBuffer(input: string | Uint8Array): Uint8Array {
   if (typeof input === 'string') {
     if (typeof Buffer === 'undefined') {
       textEncoder ??= new TextEncoder();
-      buffer = textEncoder.encode(input);
-    } else {
-      buffer = Buffer.from(input);
+      return textEncoder.encode(input);
     }
-  } else {
-    buffer = input;
+    return Buffer.from(input);
   }
-  return xxhashBase64Url(buffer);
+  return input;
 }
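`hasherByType` is a lookup table keyed by the option value, so selecting a hasher is a single property access, and the `Record<HashCharacters, GetHash>` type lets the compiler reject a table that misses one of the three character sets. The sketch below shows the pattern in isolation. The three hashers are deliberately trivial stand-ins that do not compute xxHash, and `checksum` and `pickHasher` are hypothetical names introduced for illustration.

```typescript
type HashCharacters = 'base64' | 'base36' | 'hex';
type GetHash = (input: string) => string;

// Stand-in for the native hash functions: a small rolling checksum,
// used only so that the example runs without native bindings.
function checksum(input: string): number {
  let sum = 0;
  for (let index = 0; index < input.length; index++) {
    sum = (sum * 31 + input.charCodeAt(index)) >>> 0;
  }
  return sum;
}

// One entry per allowed option value; omitting a key is a type error.
const hasherByType: Record<HashCharacters, GetHash> = {
  base36: input => checksum(input).toString(36),
  base64: input => `b64-${checksum(input)}`, // placeholder encoding, not real base-64
  hex: input => checksum(input).toString(16)
};

// Hypothetical helper mirroring the lookup done once per output.
function pickHasher(hashCharacters: HashCharacters): GetHash {
  return hasherByType[hashCharacters];
}

const getHash = pickHasher('hex');
console.log(getHash('ab')); // "c21" (97 * 31 + 98 = 3105 = 0xc21)
```

Resolving the hasher once and passing it along, as the diff does with `getHash` on `FileEmitterOutput`, avoids re-reading the option for every asset.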