-
Notifications
You must be signed in to change notification settings - Fork 0
Recognize more base64 URIs to match Chrome behavior #2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors the DataUri class to handle base64 encoding detection more flexibly by checking the entire metadata section for "base64", not just the formal ;base64 extension. The refactoring also improves error handling for invalid base64 data.
Key changes:
- Splits base64 detection into two concepts:
claims_to_be_base64?(has;base64extension) vsmetadata_contains_base64?(anywhere in metadata) - Invalid base64 data now returns the original data instead of raising an error, except when using the formal
;base64extension - Adds memoization for base64 decoding to avoid repeated decode attempts
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| lib/facet_rails_common/data_uri.rb | Refactors base64 detection and decoding logic with new helper methods and flexible metadata checking |
| test/data_uri_test.rb | Adds comprehensive test coverage for various base64 detection scenarios and error cases |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
lib/facet_rails_common/data_uri.rb
Outdated
| header_end = match.begin(:data) | ||
| return false if header_end.nil? | ||
|
|
||
| uri[0...header_end].match?(/base64/i) |
Copilot
AI
Oct 30, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The regex /base64/i could match 'base64' as part of unrelated words (e.g., 'mybase64file'). Consider using a word boundary or more specific pattern like /;base64/i or /[;=]base64/i to avoid false positives in MIME types or parameter values.
| header_end = match.begin(:data) | |
| return false if header_end.nil? | |
| uri[0...header_end].match?(/base64/i) | |
| # Check if the extension group is present and equals ';base64' | |
| extension == ';base64' |
- Added handling for implicit data URIs, defaulting mimetype to 'text/plain'. - Introduced methods for JSON detection and parsing. - Updated tests to cover new functionality and validate behavior for various data URIs.
…arsing - Cache the result of JSON detection to improve performance. - Change the default max nesting for JSON parsing from 200 to 100. - Add a new test to verify behavior of implicit data URIs with commas.
Enhance DataUri class to support implicit data URIs and JSON parsing
| "#{mimetype}#{parameters}" | ||
| end | ||
|
|
||
| def base64? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Mediatype Output Incorrectly Includes Parameters Array
The mediatype method concatenates mimetype with the @parameters array directly. Since @parameters is an array (set on line 35), string interpolation will produce incorrect output like "text/plain[\"foo=bar\", \"baz=qux\"]" instead of "text/plain;foo=bar;baz=qux". The parameters array should be joined with semicolons: "#{mimetype}#{parameters.map { |p| ";#{p}" }.join}" or similar.
- Added :match attribute to the DataUri class for improved functionality. - Removed unnecessary initialization of @match to nil in the constructor.
- Simplified the initialization of instance variables for mimetype, parameters, extension, and data. - Consolidated logic for handling implicit data URIs to improve readability and maintainability. - Added a new test to ensure proper error handling for legacy base64 extensions.
| end | ||
| return unless claims_to_be_base64? | ||
|
|
||
| raise ArgumentError, 'malformed base64 content' unless data_valid_base64? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Base64 validation misses implicit claims.
The validation in validate_base64_content only triggers when claims_to_be_base64? returns true (checking for the ;base64 extension). For data:, URIs where "base64" appears in the data portion (like data:,text/plain;charset=utf-8;base64,invaliddata), the extension is nil, so claims_to_be_base64? returns false and validation doesn't happen. This causes the test test_implicit_form_with_legacy_base64_extension_raises to fail since it expects malformed base64 content to be detected and rejected even in this legacy format.
- Introduced tests for implicit form with legacy base64 to ensure valid decoding and default behavior. - Added validation tests for invalid base64 content and uppercase base64 parameters. - Included handling for empty base64 content and cases with no mimetype, ensuring correct defaults and decoding.
- Updated the esip6? method to utilize legacy behavior for Ethscriptions. - Added new tests to validate the handling of legacy implicit forms with the 'rule=esip6' parameter and other rules.
| parameters.include?("rule=esip6") | ||
| # Use legacy behavior to support Ethscriptions | ||
| raw_parameters = String(data_uri.match[:parameters]).split(';') | ||
| raw_parameters.include?("rule=esip6") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: esip6? method fails for legacy implicit form URIs
The esip6? method accesses match[:parameters] directly to check for the rule=esip6 parameter. However, for implicit form URIs like data:,text/plain;rule=esip6,..., the regex captures everything after data:, as the data group, leaving match[:parameters] empty. This causes esip6? to return false even when rule=esip6 appears in the content, causing the test_esip6_legacy_implicit_form test to fail.
Note
Expand
DataUriparsing to recognize more base64 forms, handle implicit data URIs, add JSON helpers, and support ESIP6 detection with comprehensive tests.lib/facet_rails_common/data_uri.rb):;base64extension is present; return original data when invalid.data:,form: defaultmimetypetotext/plain, clear parameters/extension, preserve payload (including commas).mimetype,parameters,extension,data; refinebase64?,decoded_data, and addclaims_to_be_base64?,data_valid_base64?.json?(by mimetype or payload prefix) andparse_jsonwith options.esip6?to use legacy raw parameter parsing for compatibility.test/data_uri_test.rb):Written by Cursor Bugbot for commit b2ecfd1. This will update automatically on new commits. Configure here.