## Motivation
Our proxy scheme was based on the idea that the proxy path is passed to the proxied website via the `X-Base-Url` header. That way, the proxied site can generate itself with the correct base URLs. So, for example, if the site linked to:
```html
<a href="/images/file.gif" />
```
Then it would generate the HTML:
```html
<a href="/basepath/images/file.gif" />
```
The same applies to links, images, stylesheets, scripts, and so on.
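For reference, here is a minimal sketch of what that scheme demanded of the upstream site, assuming a hypothetical web-standard `Request`/`Response` handler; the `rebase` helper and the handler shape are illustrative only, not the actual framework integration:

```ts
// Hypothetical upstream-side handler illustrating the old scheme.
function rebase(href: string, basePath: string): string {
  // Prefix site-absolute paths with wherever the proxy mounted us;
  // leave relative and external URLs alone.
  return href.startsWith("/") ? basePath.replace(/\/$/, "") + href : href;
}

function handler(request: Request): Response {
  // The proxy announces the mount point via the X-Base-Url header.
  const basePath = request.headers.get("X-Base-Url") ?? "/";
  const html = `<a href="${rebase("/images/file.gif", basePath)}" />`;
  return new Response(html, { headers: { "content-type": "text/html" } });
}
```

The catch is that every place that emits a URL has to route through something like `rebase`, which is exactly the problem described below.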
The idea was that this would save computation, since the HTML could be "rebased" at the source. That way we wouldn't have to read the entire stream, parse it as HTML, rewrite the links, and then re-serialize it.
The only problem is that links get generated from many different contexts, which we may or may not be able to control depending on which frameworks we use. In the case where we control the framework, we can access those places and generate the URLs correctly.
However, for things like Docusaurus and Fresh, it isn't so easy.
## Approach
This sidesteps all of that: the proxy fetches the upstream website as-is, parses the HTML, and rewrites the links itself. While nominally more expensive, because it involves parsing and re-serializing the HTML, it is much less complex and requires ZERO changes to the upstream site, which serves every request exactly the way it always would.
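A minimal sketch of that proxy-side rewrite, assuming deno-dom for the parse step (any HTML parser would do); the `proxy` function, the attribute list, and the `basePath` parameter are illustrative, not the actual implementation:

```ts
import {
  DOMParser,
  Element,
} from "https://deno.land/x/deno_dom/deno-dom-wasm.ts";

// Selector/attribute pairs whose URLs need rebasing.
const REWRITES: Array<[selector: string, attr: string]> = [
  ["a[href]", "href"],
  ["img[src]", "src"],
  ["link[href]", "href"],
  ["script[src]", "src"],
];

async function proxy(upstream: string, basePath: string): Promise<Response> {
  // Fetch the upstream page exactly as the site would normally serve it.
  const response = await fetch(upstream);
  const doc = new DOMParser()
    .parseFromString(await response.text(), "text/html")!;
  for (const [selector, attr] of REWRITES) {
    for (const node of doc.querySelectorAll(selector)) {
      const el = node as Element;
      const value = el.getAttribute(attr);
      // Only rebase site-absolute paths; leave relative and external URLs alone.
      if (value?.startsWith("/")) {
        el.setAttribute(attr, basePath.replace(/\/$/, "") + value);
      }
    }
  }
  return new Response("<!DOCTYPE html>" + doc.documentElement!.outerHTML, {
    headers: { "content-type": "text/html; charset=utf-8" },
  });
}
```

All the rewriting complexity lives in one place, and the upstream site never has to know it is being proxied.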
In summary, we're moving logic like this
https://github.com/thefrontside/effection/blob/v3/www/plugins/rebase.ts#L39-L53 from the site itself over onto the consumer. It's safer, and just much more robust.
Since we're going to be capturing a static snapshot of the website anyway, it doesn't matter if the dynamic version is a bit less efficient. And even in the case where we do care, it would be much easier just to put a proper HTTP caching mechanism in place.
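For example, a sketch of the kind of caching that would recover that cost, reusing the hypothetical `proxy` function from the sketch above; the in-memory cache shape and the fixed TTL are illustrative assumptions:

```ts
// Naive in-memory cache of rewritten pages, keyed by upstream URL.
// A real implementation would honor upstream Cache-Control headers.
const cache = new Map<string, { body: string; expires: number }>();

async function cachedProxy(url: string, basePath: string): Promise<Response> {
  const hit = cache.get(url);
  if (hit && hit.expires > Date.now()) {
    return new Response(hit.body, {
      headers: { "content-type": "text/html; charset=utf-8" },
    });
  }
  // Cache miss: fetch, parse, and rewrite as usual, then remember the result.
  const response = await proxy(url, basePath);
  const body = await response.text();
  cache.set(url, { body, expires: Date.now() + 60_000 }); // e.g. 60s TTL
  return new Response(body, {
    headers: { "content-type": "text/html; charset=utf-8" },
  });
}
```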