Skip to content

Conversation

@RogerPodacter
Copy link
Member

@RogerPodacter RogerPodacter commented Oct 30, 2025

Note

Expand DataUri parsing to recognize more base64 forms, handle implicit data URIs, add JSON helpers, and support ESIP6 detection with comprehensive tests.

  • Library (lib/facet_rails_common/data_uri.rb):
    • Broaden base64 detection to any header mention (case-insensitive) and validate only when ;base64 extension is present; return original data when invalid.
    • Handle implicit data:, form: default mimetype to text/plain, clear parameters/extension, preserve payload (including commas).
    • Expose mimetype, parameters, extension, data; refine base64?, decoded_data, and add claims_to_be_base64?, data_valid_base64?.
    • Add JSON helpers: json? (by mimetype or payload prefix) and parse_json with options.
    • Update esip6? to use legacy raw parameter parsing for compatibility.
  • Tests (test/data_uri_test.rb):
    • Add extensive unit tests covering base64 detection/validation, implicit form behavior, JSON parsing, and ESIP6 parameter handling.

Written by Cursor Bugbot for commit b2ecfd1. This will update automatically on new commits. Configure here.

@RogerPodacter RogerPodacter requested a review from Copilot October 30, 2025 18:13
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the DataUri class to handle base64 encoding detection more flexibly by checking the entire metadata section for "base64", not just the formal ;base64 extension. The refactoring also improves error handling for invalid base64 data.

Key changes:

  • Splits base64 detection into two concepts: claims_to_be_base64? (has ;base64 extension) vs metadata_contains_base64? (anywhere in metadata)
  • Invalid base64 data now returns the original data instead of raising an error, except when using the formal ;base64 extension
  • Adds memoization for base64 decoding to avoid repeated decode attempts

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
lib/facet_rails_common/data_uri.rb Refactors base64 detection and decoding logic with new helper methods and flexible metadata checking
test/data_uri_test.rb Adds comprehensive test coverage for various base64 detection scenarios and error cases

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@RogerPodacter RogerPodacter requested a review from Copilot October 30, 2025 18:16
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@RogerPodacter RogerPodacter requested a review from Copilot October 30, 2025 18:22
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 107 to 110
header_end = match.begin(:data)
return false if header_end.nil?

uri[0...header_end].match?(/base64/i)
Copy link

Copilot AI Oct 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regex /base64/i could match 'base64' as part of unrelated words (e.g., 'mybase64file'). Consider using a word boundary or more specific pattern like /;base64/i or /[;=]base64/i to avoid false positives in MIME types or parameter values.

Suggested change
header_end = match.begin(:data)
return false if header_end.nil?
uri[0...header_end].match?(/base64/i)
# Check if the extension group is present and equals ';base64'
extension == ';base64'

Copilot uses AI. Check for mistakes.
- Added handling for implicit data URIs, defaulting mimetype to 'text/plain'.
- Introduced methods for JSON detection and parsing.
- Updated tests to cover new functionality and validate behavior for various data URIs.
…arsing

- Cache the result of JSON detection to improve performance.
- Change the default max nesting for JSON parsing from 200 to 100.
- Add a new test to verify behavior of implicit data URIs with commas.
Enhance DataUri class to support implicit data URIs and JSON parsing
"#{mimetype}#{parameters}"
end

def base64?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Mediatype Output Incorrectly Includes Parameters Array

The mediatype method concatenates mimetype with the @parameters array directly. Since @parameters is an array (set on line 35), string interpolation will produce incorrect output like "text/plain[\"foo=bar\", \"baz=qux\"]" instead of "text/plain;foo=bar;baz=qux". The parameters array should be joined with semicolons: "#{mimetype}#{parameters.map { |p| ";#{p}" }.join}" or similar.

Fix in Cursor Fix in Web

- Added :match attribute to the DataUri class for improved functionality.
- Removed unnecessary initialization of @match to nil in the constructor.
- Simplified the initialization of instance variables for mimetype, parameters, extension, and data.
- Consolidated logic for handling implicit data URIs to improve readability and maintainability.
- Added a new test to ensure proper error handling for legacy base64 extensions.
end
return unless claims_to_be_base64?

raise ArgumentError, 'malformed base64 content' unless data_valid_base64?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Base64 validation misses implicit claims.

The validation in validate_base64_content only triggers when claims_to_be_base64? returns true (checking for the ;base64 extension). For data:, URIs where "base64" appears in the data portion (like data:,text/plain;charset=utf-8;base64,invaliddata), the extension is nil, so claims_to_be_base64? returns false and validation doesn't happen. This causes the test test_implicit_form_with_legacy_base64_extension_raises to fail since it expects malformed base64 content to be detected and rejected even in this legacy format.

Fix in Cursor Fix in Web

- Introduced tests for implicit form with legacy base64 to ensure valid decoding and default behavior.
- Added validation tests for invalid base64 content and uppercase base64 parameters.
- Included handling for empty base64 content and cases with no mimetype, ensuring correct defaults and decoding.
- Updated the esip6? method to utilize legacy behavior for Ethscriptions.
- Added new tests to validate the handling of legacy implicit forms with the 'rule=esip6' parameter and other rules.
parameters.include?("rule=esip6")
# Use legacy behavior to support Ethscriptions
raw_parameters = String(data_uri.match[:parameters]).split(';')
raw_parameters.include?("rule=esip6")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: esip6? method fails for legacy implicit form URIs

The esip6? method accesses match[:parameters] directly to check for the rule=esip6 parameter. However, for implicit form URIs like data:,text/plain;rule=esip6,..., the regex captures everything after data:, as the data group, leaving match[:parameters] empty. This causes esip6? to return false even when rule=esip6 appears in the content, causing the test_esip6_legacy_implicit_form test to fail.

Fix in Cursor Fix in Web

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants