[code-scanning-fix] Fix js/http-to-file-access: validate Content-Type and size for LFS PDF download #41635
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -112,7 +112,25 @@ async function readPdfBytes() { | |
| throw new Error(`Failed to download slide deck PDF: ${response.status} ${response.statusText}`); | ||
| } | ||
|
|
||
| // Validate the Content-Type header before consuming the body to ensure we | ||
| // are actually receiving a PDF and not arbitrary data. | ||
| const contentType = response.headers.get("content-type") ?? ""; | ||
| if (!contentType.startsWith("application/pdf") && !contentType.startsWith("application/octet-stream")) { | ||
|
[/diagnose] 💡 Suggested fix:
const mimeType = (response.headers.get("content-type") ?? "").split(";")[0].trim();
if (mimeType !== "application/pdf" && mimeType !== "application/octet-stream") {
  throw new Error(`Unexpected content-type for slide deck: ${mimeType}`);
}
This correctly handles
💡 Notes: This may be intentional if |
||
| throw new Error(`Unexpected content-type for slide deck: ${contentType}`); | ||
| } | ||
|
Comment on lines +115 to +120
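To illustrate the difference between the prefix check in the diff and the exact-match check in the suggested fix, here is a minimal sketch. The helper names `parseMimeType` and `isAllowedContentType` are hypothetical and not part of the PR, and the lowercase normalization is an addition beyond the suggested fix (media type names are case-insensitive).

```javascript
// Hypothetical helper (not part of the PR): extract the bare media type from a
// Content-Type header value, dropping parameters such as "; charset=binary".
function parseMimeType(headerValue) {
  // Media type names are case-insensitive, so normalize to lowercase.
  return (headerValue ?? "").split(";")[0].trim().toLowerCase();
}

// The two types the diff accepts for the slide deck download.
const ALLOWED_TYPES = new Set(["application/pdf", "application/octet-stream"]);

// Hypothetical helper: exact match against the allow-list.
function isAllowedContentType(headerValue) {
  return ALLOWED_TYPES.has(parseMimeType(headerValue));
}

// A startsWith() check would accept the second value; the exact match does not.
console.log(isAllowedContentType("application/pdf; charset=binary")); // true
console.log(isAllowedContentType("application/pdf-evil"));            // false
console.log(isAllowedContentType(null));                              // false
```

Whether the stricter match is wanted depends on what the LFS host actually sends, so treat this as one option rather than the required change.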
|
||
|
|
||
| // Guard against unexpectedly large downloads. | ||
| const MAX_BYTES = 50 * 1024 * 1024; // 50 MB | ||
| const contentLength = response.headers.get("content-length"); | ||
| if (contentLength !== null && Number(contentLength) > MAX_BYTES) { | ||
|
[/diagnose] 💡 Suggested fix:
const parsedLength = contentLength !== null ? Number(contentLength) : NaN;
if (Number.isFinite(parsedLength) && parsedLength > MAX_BYTES) {
  throw new Error(`Slide deck download size ${parsedLength} exceeds limit of ${MAX_BYTES} bytes`);
}
Using
Content-Length guard silently skipped for malformed header values:
💡 Suggested fix: Replace with an explicit finite-number check:
const contentLengthNum = contentLength !== null ? Number(contentLength) : null;
if (contentLengthNum !== null && (!Number.isFinite(contentLengthNum) || contentLengthNum > MAX_BYTES)) {
  throw new Error(`Slide deck download size ${contentLength} exceeds limit of ${MAX_BYTES} bytes`);
}
This rejects both non-numeric headers and values exceeding the cap. Without this, a server returning |
||
| throw new Error(`Slide deck download size ${contentLength} exceeds limit of ${MAX_BYTES} bytes`); | ||
| } | ||
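The two suggestions above differ on what to do with a malformed header: one skips the guard, the other rejects the download. A minimal sketch of the rejecting variant follows. The helper names `parseContentLength` and `assertDeclaredSizeWithinLimit` are hypothetical, and the digits-only regular expression is an assumption that goes beyond both suggestions, since `Number()` also accepts inputs such as `""`, `"0x10"`, and `"1e3"`.

```javascript
const MAX_BYTES = 50 * 1024 * 1024; // 50 MB, as in the diff

// Hypothetical helper: returns the declared size in bytes, or null when the
// header is absent. Throws on malformed values instead of skipping the guard.
function parseContentLength(headerValue) {
  if (headerValue === null || headerValue === undefined) return null;
  const trimmed = headerValue.trim();
  // Accept only plain decimal digits; Number() alone is more permissive.
  if (!/^\d+$/.test(trimmed)) {
    throw new Error(`Malformed content-length header: ${headerValue}`);
  }
  return Number(trimmed);
}

// Hypothetical helper: enforce the cap on the declared size, if one is given.
function assertDeclaredSizeWithinLimit(headerValue, maxBytes = MAX_BYTES) {
  const declared = parseContentLength(headerValue);
  if (declared !== null && declared > maxBytes) {
    throw new Error(`Slide deck download size ${declared} exceeds limit of ${maxBytes} bytes`);
  }
  return declared;
}

// Why the original comparison is skipped for junk input: NaN compares false.
console.log(Number("abc") > MAX_BYTES); // false
```

An absent header still passes here, so a cap on the bytes actually read is needed as well.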
|
|
||
| const downloadedBytes = Buffer.from(await response.arrayBuffer()); | ||
|
50 MB guard fires only after the full body is already in memory:
💡 Suggested fix: Stream the response and count bytes as chunks arrive:
const chunks = [];
let totalBytes = 0;
const reader = response.body.getReader();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  totalBytes += value.length;
  if (totalBytes > MAX_BYTES) {
    await reader.cancel();
    throw new Error(`Downloaded slide deck size exceeds limit of ${MAX_BYTES} bytes`);
  }
  chunks.push(value);
}
const downloadedBytes = Buffer.concat(chunks.map(c => Buffer.from(c)));
This caps memory usage at MAX_BYTES regardless of whether |
||
| if (downloadedBytes.length > MAX_BYTES) { | ||
| throw new Error(`Downloaded slide deck size ${downloadedBytes.length} exceeds limit of ${MAX_BYTES} bytes`); | ||
| } | ||
|
Comment on lines +122 to +132
|
||
|
|
||
| if (!isPdf(downloadedBytes)) { | ||
| throw new Error(`Downloaded slide deck from ${url} is not a real PDF.`); | ||
| } | ||
|
|
||
[/tdd] The three new validation paths (Content-Type mismatch, oversized Content-Length, oversized post-download body) each throw distinct errors, but no regression tests cover them. The project has a .test.js pattern for scripts (changeset.test.js, generate-schema-docs.test.js) that a companion ensure-docs-slide-pdf.test.js could follow.
💡 Sketch of missing test cases
The core function would need to export readPdfBytes (or a testable inner function), then:
These tests confirm that each guard correctly triggers the fallback rather than writing invalid data.
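One possible shape for such tests, as a minimal sketch under stated assumptions: `readValidatedBody` is a hypothetical inner function that mirrors the three guards from the diff, `stubResponse` and `expectRejects` are hypothetical helpers, and the project's actual test runner is not assumed. The sketch relies only on the `Response` and `Buffer` globals available in Node 18 or later, and it checks that each guard throws; it does not exercise the fallback path itself.

```javascript
const MAX_BYTES = 50 * 1024 * 1024; // 50 MB, as in the diff

// Hypothetical testable inner function mirroring the three guards in the diff.
async function readValidatedBody(response, maxBytes = MAX_BYTES) {
  const contentType = response.headers.get("content-type") ?? "";
  if (!contentType.startsWith("application/pdf") && !contentType.startsWith("application/octet-stream")) {
    throw new Error(`Unexpected content-type for slide deck: ${contentType}`);
  }
  const contentLength = response.headers.get("content-length");
  if (contentLength !== null && Number(contentLength) > maxBytes) {
    throw new Error(`Slide deck download size ${contentLength} exceeds limit of ${maxBytes} bytes`);
  }
  const bytes = Buffer.from(await response.arrayBuffer());
  if (bytes.length > maxBytes) {
    throw new Error(`Downloaded slide deck size ${bytes.length} exceeds limit of ${maxBytes} bytes`);
  }
  return bytes;
}

// Build an in-memory response so the tests need no network access.
function stubResponse(body, headers) {
  return new Response(body, { headers });
}

// Resolve only if the promise rejects with a message matching the pattern.
async function expectRejects(promise, pattern) {
  try {
    await promise;
  } catch (err) {
    if (!pattern.test(err.message)) throw new Error(`Unexpected error message: ${err.message}`);
    return;
  }
  throw new Error("Expected the promise to reject, but it resolved");
}

async function runGuardTests() {
  const pdf = new TextEncoder().encode("%PDF-1.7"); // 8 bytes
  // 1. Content-Type mismatch.
  await expectRejects(
    readValidatedBody(stubResponse(pdf, { "content-type": "text/html" })),
    /Unexpected content-type/
  );
  // 2. Oversized declared Content-Length (small cap keeps the test cheap).
  await expectRejects(
    readValidatedBody(stubResponse(pdf, { "content-type": "application/pdf", "content-length": "999" }), 16),
    /download size 999 exceeds/
  );
  // 3. Oversized body with no Content-Length header.
  await expectRejects(
    readValidatedBody(stubResponse(pdf, { "content-type": "application/pdf" }), 4),
    /Downloaded slide deck size 8 exceeds/
  );
  // Happy path: a small body within the cap is returned unchanged.
  const ok = await readValidatedBody(stubResponse(pdf, { "content-type": "application/pdf" }), 16);
  if (ok.toString("latin1") !== "%PDF-1.7") throw new Error("Happy path returned unexpected bytes");
  return "all guards verified";
}

runGuardTests().then(console.log, (err) => {
  console.error(err);
  process.exitCode = 1;
});
```

Passing the cap as a parameter lets the tests use a few bytes instead of allocating 50 MB. In the real script the same cases would be ported to whatever runner the existing .test.js files use.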