Commit 09a3b2f

Editorial: make serialize/parse roundtrip examples linkable
I've had occasion to link to these a few times, and so making them more easily linkable seems like a good idea. Also fix some wrapping and use modern Infra syntax for code points while in the area.
1 parent 3c6cb6c commit 09a3b2f

1 file changed: +26 -31 lines changed

source

Lines changed: 26 additions & 31 deletions
@@ -136868,37 +136868,34 @@ document.body.appendChild(text);
136868136868
<li><p>Return <var>s</var>.</p></li>
136869136869
</ol>
136870136870

136871-
<p class="warning">It is possible that the output of this algorithm, if parsed with an <span>HTML
136872-
parser</span>, will not return the original tree structure. Tree structures that do not roundtrip
136873-
a serialize and reparse step can also be produced by the <span>HTML parser</span> itself, although
136874-
such cases are typically non-conforming.</p>
136875-
136876-
<div class="example">
136871+
<p class="warning" id="warning-html-serializer-roundtrip">It is possible that the output of this
136872+
algorithm, if parsed with an <span>HTML parser</span>, will not return the original tree
136873+
structure. Tree structures that do not roundtrip a serialize and reparse step can also be produced
136874+
by the <span>HTML parser</span> itself, although such cases are typically non-conforming.</p>
136877136875

136876+
<div class="example" id="example-html-serializer-roundtrip-comments-and-script">
136878136877
<p>For instance, if a <code>textarea</code> element to which a <code data-x="">Comment</code>
136879136878
node has been appended is serialized and the output is then reparsed, the comment will end up
136880136879
being displayed in the text control. Similarly, if, as a result of DOM manipulation, an element
136881-
contains a comment that contains "<code data-x="">--&gt;</code>", then when
136882-
the result of serializing the element is parsed, the comment will be truncated at that point and
136883-
the rest of the comment will be interpreted as markup. More examples would be making a
136884-
<code>script</code> element contain a <code>Text</code> node with the text string "<code
136880+
contains a comment that contains "<code data-x="">--&gt;</code>", then when the result of
136881+
serializing the element is parsed, the comment will be truncated at that point and the rest of
136882+
the comment will be interpreted as markup. More examples would be making a <code>script</code>
136883+
element contain a <code>Text</code> node with the text string "<code
136885136884
data-x="">&lt;/script></code>", or having a <code>p</code> element that contains a
136886136885
<code>ul</code> element (as the <code>ul</code> element's <span data-x="syntax-start-tag">start
136887136886
tag</span> would imply the end tag for the <code>p</code>).</p>
136888136887

136889136888
<p>This can enable cross-site scripting attacks. An example of this would be a page that lets the
136890136889
user enter some font family names that are then inserted into a CSS <code>style</code> block via
136891-
the DOM and which then uses the <code data-x="dom-element-innerHTML">innerHTML</code> IDL attribute to get
136892-
the HTML serialization of that <code>style</code> element: if the user enters
136890+
the DOM and which then uses the <code data-x="dom-element-innerHTML">innerHTML</code> IDL
136891+
attribute to get the HTML serialization of that <code>style</code> element: if the user enters
136893136892
"<code data-x="">&lt;/style>&lt;script>attack&lt;/script></code>" as a font family name, <code
136894-
data-x="dom-element-innerHTML">innerHTML</code> will return markup that, if parsed in a different context,
136895-
would contain a <code>script</code> node, even though no <code>script</code> node existed in the
136896-
original DOM.</p>
136897-
136893+
data-x="dom-element-innerHTML">innerHTML</code> will return markup that, if parsed in a different
136894+
context, would contain a <code>script</code> node, even though no <code>script</code> node
136895+
existed in the original DOM.</p>
136898136896
</div>
136899136897

136900-
<div class="example">
136901-
136898+
<div class="example" id="example-html-serializer-roundtrip-nested-form">
136902136899
<p>For example, consider the following markup:</p>
136903136900

136904136901
<pre><code class="html">&lt;form id="outer">&lt;div>&lt;/form>&lt;form id="inner">&lt;input></code></pre>
@@ -136915,11 +136912,9 @@ document.body.appendChild(text);
136915136912
<pre><code class="html">&lt;html>&lt;head>&lt;/head>&lt;body>&lt;form id="outer">&lt;div><mark>&lt;form id="inner"></mark>&lt;input>&lt;/form>&lt;/div>&lt;/form>&lt;/body>&lt;/html></code></pre>
136916136913

136917136914
<ul class="domTree"><li class="t1"><code>html</code><ul><li class="t1"><code>head</code></li><li class="t1"><code>body</code><ul><li class="t1"><code>form</code> <span class="t2" data-x=""><code class="attribute name" data-x="attr-id">id</code>="<code class="attribute value" data-x="">outer</code>"</span><ul><li class="t1"><code>div</code><ul><li class="t1"><code>input</code></li></ul></li></ul></li></ul></li></ul></li></ul>
136918-
136919136915
</div>
136920136916

136921-
<div class="example">
136922-
136917+
<div class="example" id="example-html-serializer-roundtrip-foster-parenting">
136923136918
<p>As another example, consider the following markup:</p>
136924136919

136925136920
<pre><code class="html">&lt;a>&lt;table>&lt;a></code></pre>
@@ -136937,17 +136932,16 @@ document.body.appendChild(text);
136937136932
<pre><code class="html">&lt;html>&lt;head>&lt;/head>&lt;body>&lt;a><mark>&lt;a></mark>&lt;/a>&lt;table>&lt;/table>&lt;/a>&lt;/body>&lt;/html></code></pre>
136938136933

136939136934
<ul class="domTree"><li class="t1"><code>html</code><ul><li class="t1"><code>head</code></li><li class="t1"><code>body</code><ul><li class="t1"><code>a</code></li><li class="t1"><code>a</code></li><li class="t1"><code>table</code></li></ul></li></ul></li></ul>
136940-
136941136935
</div>
136942136936

136943-
<p>For historical reasons, this algorithm does not round-trip an initial U+000A LINE FEED (LF)
136944-
character in <code>pre</code>, <code>textarea</code>, or <code>listing</code> elements, even
136945-
though (in the first two cases) the markup being round-tripped can be conforming. The <span>HTML
136946-
parser</span> will drop such a character during parsing, but this algorithm does <em>not</em>
136947-
serialize an extra U+000A LINE FEED (LF) character.</p>
136937+
<p>For historical reasons, this algorithm does not round-trip an initial U+000A (LF) character in
136938+
<code>pre</code>, <code>textarea</code>, or <code>listing</code> elements, even though (in the
136939+
first two cases) the markup being round-tripped can be conforming. The <span>HTML parser</span>
136940+
will drop such a character during parsing, but this algorithm does <em>not</em> serialize an extra
136941+
U+000A (LF) character.</p>
136948136942
<!-- https://github.com/whatwg/html/issues/944 -->
136949136943

136950-
<div class="example">
136944+
<div class="example" id="example-html-serializer-roundtrip-linefeed">
136951136945
<p>For example, consider the following markup:</p>
136952136946

136953136947
<pre><code class="html">&lt;pre>
@@ -136968,9 +136962,10 @@ Hello.&lt;/pre></code></pre>
136968136962
<span data-x="concept-element-is-value"><code data-x="">is</code> value</span> is preserved
136969136963
through serialize-parse roundtrips.</p>
136970136964

136971-
<div class="example">
136972-
<p>When creating a <span>customized built-in element</span> via the parser, a developer uses the <code
136973-
data-x="attr-is">is</code> attribute directly; in such cases serialize-parse roundtrips work fine.</p>
136965+
<div class="example" id="example-html-serializer-roundtrip-is-attribute">
136966+
<p>When creating a <span>customized built-in element</span> via the parser, a developer uses the
136967+
<code data-x="attr-is">is</code> attribute directly; in such cases serialize-parse roundtrips
136968+
work fine.</p>
136974136969

136975136970
<pre><code class="html">&lt;script>
136976136971
window.SuperP = class extends HTMLParagraphElement {};
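
The comment case described in the first example of this diff (a comment whose data contains "-->") can be sketched outside a browser. The snippet below is a rough illustration only, not the spec's algorithm: `serialize_comment`, `NodeCollector`, `reparse`, and `roundtrip_comment` are hypothetical helper names, and Python's `html.parser` is used here as a stand-in for a conforming HTML parser, which it is not in general. It is assumed to match the spec's behavior only for this simple input.

```python
from html.parser import HTMLParser


def serialize_comment(data: str) -> str:
    # Hypothetical helper: a Comment node is written out as "<!--" + data + "-->"
    # with no escaping of the data.
    return "<!--" + data + "-->"


class NodeCollector(HTMLParser):
    """Records comments and text as (kind, data) tuples, merging adjacent text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.nodes = []

    def handle_comment(self, data):
        self.nodes.append(("comment", data))

    def handle_data(self, data):
        # The parser may deliver text in several chunks; join them into one node.
        if self.nodes and self.nodes[-1][0] == "text":
            self.nodes[-1] = ("text", self.nodes[-1][1] + data)
        else:
            self.nodes.append(("text", data))


def reparse(markup: str):
    collector = NodeCollector()
    collector.feed(markup)
    collector.close()
    return collector.nodes


def roundtrip_comment(data: str):
    # Serialize a comment created "via the DOM", then parse the result again.
    return reparse(serialize_comment(data))


if __name__ == "__main__":
    print(roundtrip_comment("plain"))
    print(roundtrip_comment("safe-->injected"))
```

Under these assumptions, a comment whose data is `safe-->injected` does not come back as one comment: the reparse ends the comment at the first `-->` and treats the remainder as ordinary content, which is the truncation the warning describes.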
