Skip to content

Conversation

@2color
Copy link
Member

@2color 2color commented Jul 17, 2025

Background

Arkadiy from the Internet Archive mentioned he wanted more control over the default HTTPGatewayRouter.

This demonstrates how you would add active probing to ensure CORS.

Description

This PR adds active CORS probing functionality to the HTTPGatewayRouter to ensure only CORS-compatible gateways are returned as providers. The implementation includes caching of probe results to avoid repeated checks.

Key changes:

  • Introduces CORS probing with configurable cache duration
  • Transforms gateway storage from simple PeerInfo array to GatewayStatus objects with metadata
  • Filters providers based on CORS compatibility before yielding them

@2color 2color changed the title feat: add gatewaey status probing in router feat: add gateway status probing in router Jul 21, 2025
@2color 2color changed the title feat: add gateway status probing in router feat: add active probing to HTTPGatewayRouter Jul 21, 2025
@2color 2color requested a review from Copilot July 21, 2025 10:05
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds active CORS probing functionality to the HTTPGatewayRouter to ensure only CORS-compatible gateways are returned as providers. The implementation includes caching of probe results to avoid repeated checks.

Key changes:

  • Introduces CORS probing with configurable cache duration
  • Transforms gateway storage from simple PeerInfo array to GatewayStatus objects with metadata
  • Filters providers based on CORS compatibility before yielding them

})

const corsHeaders = response.headers.get('access-control-allow-origin')
const hasCors = corsHeaders === '*' || corsHeaders?.includes(window.location.origin)
Copy link

Copilot AI Jul 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code assumes it's running in a browser environment by accessing window.location.origin, but this router may be used in Node.js environments where window is undefined. This will cause a runtime error in non-browser environments.

Suggested change
const hasCors = corsHeaders === '*' || corsHeaders?.includes(window.location.origin)
const origin = (typeof window !== 'undefined' && window.location?.origin) || ''
const hasCors = corsHeaders === '*' || corsHeaders?.includes(origin)

Copilot uses AI. Check for mistakes.
return gateway.corsSupported
}

const gatewayUrl = gateway.peerInfo.multiaddrs[0]?.toString().replace('/http', 'http').replace('/https', 'https')
Copy link

Copilot AI Jul 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The string replacement logic is fragile and could produce incorrect URLs. For example, a multiaddr containing '/http' in the middle would be incorrectly transformed. Consider using proper multiaddr parsing or URL construction methods.

Suggested change
const gatewayUrl = gateway.peerInfo.multiaddrs[0]?.toString().replace('/http', 'http').replace('/https', 'https')
const multiaddr = gateway.peerInfo.multiaddrs[0]
const gatewayUrl = multiaddr ? uriToMultiaddr(multiaddr).toString() : null

Copilot uses AI. Check for mistakes.

private async checkCorsSupport (gatewayUrl: string): Promise<boolean> {
try {
const response = await fetch(`${gatewayUrl}/ipfs/bafkqaaa`, {
Copy link

Copilot AI Jul 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hardcoded CID 'bafkqaaa' should be defined as a named constant to improve maintainability and make its purpose clear (appears to be an empty file CID for testing).

Suggested change
const response = await fetch(`${gatewayUrl}/ipfs/bafkqaaa`, {
const response = await fetch(`${gatewayUrl}/ipfs/${EMPTY_FILE_CID}`, {

Copilot uses AI. Check for mistakes.

const corsHeaders = response.headers.get('access-control-allow-origin')
const hasCors = corsHeaders === '*' || corsHeaders?.includes(window.location.origin)

Copy link

Copilot AI Jul 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic for accepting 404 status as a valid response should be documented. It's not immediately clear why a 404 response indicates CORS support.

Suggested change
// Some HTTP gateways may return a 404 status for non-existent resources while still supporting CORS.
// The presence of CORS headers is the primary determinant of CORS support, and a 404 status is treated
// as valid in this context to account for such behavior.

Copilot uses AI. Check for mistakes.
@achingbrain
Copy link
Member

@parkan I'm missing a little context here - do we need this in the HTTP Gateway routing? I ask because the gateways it will return are from a hard-coded list supplied by the user (defaults), so if they aren't compatible with the client (due to lack of CORS support), they should be omitted from the list.

Checking CORS support for providers found via the delegated/libp2p routing makes more sense to me as these are arbitrary peers found by searching the network and not from a pre-supplied list?

@parkan
Copy link

parkan commented Oct 12, 2025

@parkan I'm missing a little context here - do we need this in the HTTP Gateway routing? I ask because the gateways it will return are from a hard-coded list supplied by the user (defaults), so if they aren't compatible with the client (due to lack of CORS support), they should be omitted from the list.

Checking CORS support for providers found via the delegated/libp2p routing makes more sense to me as these are arbitrary peers found by searching the network and not from a pre-supplied list?

the specific situation that surfaced this feature was a transient misconfiguration of one of the specified http peers that broke CORS, which resulted in spamming them on every content load despite all requests failing

this happens periodically in our experience (likewise with TLS certificate expirations) and results in degradation from the user perspective, even when content is available with other peers

I'm not 100% sure specifically CORS probing (versus let's say denylisting a peer more broadly in case of fatal errors) is the right solution here, but it would have helped

FWIW ipfs/helia-verified-fetch#261 has reduced the chattiness of ls operations significantly, which lowered the impact of events like the above

@parkan
Copy link

parkan commented Oct 12, 2025

@achingbrain I should also note that I'm going to come back and review our delegated routing setup in the coming days/weeks, we're now running our own someguy which would in theory be where the error propagation ends (excepting cached entries)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants