
♻️ Simplify the proxying scheme. #404

Merged
merged 1 commit into from
Dec 16, 2024
Conversation

cowboyd
Member

@cowboyd cowboyd commented Dec 16, 2024

Motivation

Our proxy scheme was based on the idea that the proxy path is passed to the proxied website via the X-Base-Url header. That way, the proxied site can generate itself with the correct base URLs.

So, for example if the site linked to:

<a href="/images/file.gif" />

Then it would generate the html:

<a href="/basepath/images/file.gif" />

The same applies to links, images, stylesheets, scripts, etc.

The idea was that this would save computation since the HTML could be "rebased" at the source. That way we wouldn't have to read the entire stream, parse it as HTML, rewrite the links, and then re-serialize it.
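Under the old scheme, a framework we control could do this rebasing itself at URL-generation time. A minimal sketch of that idea (the `rebase` helper and its signature are illustrative, not code from this repository):

```typescript
// Illustrative sketch of the old approach: the upstream site reads the
// base path from the X-Base-Url header and prefixes root-relative URLs
// as it generates them. `rebase` is a hypothetical helper.
function rebase(href: string, baseUrl: string): string {
  // Absolute and protocol-relative URLs are left untouched.
  if (!href.startsWith("/") || href.startsWith("//")) return href;
  // Avoid a double slash if the base path has a trailing slash.
  return baseUrl.replace(/\/$/, "") + href;
}
```

So `rebase("/images/file.gif", "/basepath")` yields `"/basepath/images/file.gif"` — but only if every place that generates a URL calls the helper.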

The only problem is that links get generated from a lot of different contexts that we may or may not be able to control, depending on which framework we use. In the case where we control the framework, we can access those places and generate the URLs correctly.

However, for things like Docusaurus and Fresh, it ain't so easy.

Approach

This sidesteps all that, and just fetches the upstream website as-is, parses the HTML, and rewrites the links in the proxy itself. While nominally more expensive because it involves a parse and re-serialization of the HTML, it is much less complex, and requires ZERO changes to the upstream site. It serves every request the way it always would.

In summary, we're moving logic like this
https://github.com/thefrontside/effection/blob/v3/www/plugins/rebase.ts#L39-L53 from the upstream site over onto the consumer. It's safer, and just much more robust.
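The shape of the proxy-side rewrite can be sketched roughly as follows. This is regex-based for brevity; the actual implementation parses the HTML and walks the document, and the names here are illustrative:

```typescript
// Sketch of the new approach: the proxy fetches the upstream HTML
// unmodified, then prefixes root-relative URLs in href/src attributes
// with the proxy's base path before serving the response.
function rewriteLinks(html: string, basePath: string): string {
  const prefix = basePath.replace(/\/$/, "");
  // Match href="/... or src='/..., but skip protocol-relative "//" URLs.
  return html.replace(
    /\b(href|src)=("|')\/(?!\/)/g,
    (_match, attr, quote) => `${attr}=${quote}${prefix}/`,
  );
}
```

The upstream site emits `<a href="/images/file.gif">` exactly as it always did, and the proxy serves `<a href="/basepath/images/file.gif">`.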

Since we're going to be capturing a static snapshot of the website anyway, it doesn't matter if the dynamic version is a bit less efficient. And even in the case where we do care, it would be much easier just to put a proper HTTP caching mechanism in place.
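That caching option could be as simple as having the proxy attach standard HTTP caching headers to the rewritten response, letting downstream caches absorb the parse/serialize cost. A hypothetical sketch, not part of this PR:

```typescript
// Hypothetical: wrap a rewritten HTML body in a Response carrying a
// Cache-Control header, so shared caches can reuse it for a while
// instead of hitting the proxy (and its HTML rewrite) on every request.
function withCaching(body: string, maxAgeSeconds: number): Response {
  return new Response(body, {
    headers: {
      "content-type": "text/html; charset=utf-8",
      "cache-control": `public, max-age=${maxAgeSeconds}`,
    },
  });
}
```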

@cowboyd cowboyd requested review from taras, elrickvm and jbolda December 16, 2024 20:54

netlify bot commented Dec 16, 2024

Deploy Preview for frontside canceled.

Name Link
🔨 Latest commit 968cbe5
🔍 Latest deploy log https://app.netlify.com/sites/frontside/deploys/67609af73c8aa40008cbdcf7

@cowboyd cowboyd merged commit 870c7c2 into production Dec 16, 2024
7 checks passed
@cowboyd cowboyd deleted the cl/simple-proxy branch December 16, 2024 22:55