Commit 1faeb78

add @join to the att.linguistic.dependency examples
1 parent dc230dd commit 1faeb78

2 files changed, +24 -15 lines changed

P5/Source/Guidelines/en/AI-AnalyticMechanisms.xml

+22 -13
@@ -1204,16 +1204,19 @@ sake of readability).
 <term>CoNLL-U</term>:
 <eg xml:space="preserve">1 They they PRON _ _ 2 nsubj _ _
 2 buy buy VERB _ _ 0 root _ _
-3 books book NOUN _ _ 2 obj _ _
+3 books book NOUN _ _ 2 obj _ SpaceAfter=No
 4 . . PUNCT _ _ 2 punct _ _</eg>
 In this example, the first column gives a numerical, one-based, node index
 of the tokens in the current sentence, the second column a token of a word
 form or a symbol, the third column the corresponding lemma, the forth
 column a UD part-of-speech tag, the seventh column the node index of the
 syntactic head of the current token, and the eighth column a label of the
-type of the dependency relation between the token and its head. (Empty
-columns are marked with <code>_</code>.) Note that, by convention, the
-index for the (unrepresented) root node is 0.</p>
+type of the dependency relation between the token and its head. The very
+last, tenth, column can be used for miscellaneous information, such as
+whether or not there is space to add after the token when joining them.
+Empty columns are marked with <code>_</code>. Note that, by convention,
+the index for the unrepresented, <soCalled>virtual</soCalled> root node
+is 0.</p>
 <p>A graphical rendition of this example is given below in terms of an
 annotated dependency graph.</p>
 <p><graphic width="300px" url="Images/dependency1.png"/></p>
@@ -1233,7 +1236,7 @@ sake of readability).
 <s n="0">
 <w n="1" head="2" deprel="nsubj" pos="PRON" lemma="they">They</w>
 <w n="2" head="0" deprel="root" pos="VERB" lemma="buy">buy</w>
-<w n="3" head="2" deprel="obj" pos="NOUN" lemma="book">books</w>
+<w n="3" head="2" deprel="obj" pos="NOUN" lemma="book" join="right">books</w>
 <pc n="4" head="2" deprel="punct" pos="PUNCT" lemma=".">.</pc>
 </s>
 </egXML>
@@ -1245,22 +1248,28 @@ sake of readability).
 index of their syntactic head. Labels for the types of the dependency
 relation are provided as the value of the <att>deprel</att> attributes on
 <gi>w</gi> and <gi>pc</gi> elements. Part-of-speech tags and lemmas are
-given as values of <att>pos</att> and <att>lemma</att> attributes.</p>
+given as values of <att>pos</att> and <att>lemma</att> attributes. Last,
+but not least, the <att>join</att> attribute can be used for information
+on whether a token is adjacent to the tokens on its left-hand or
+right-hand side when joining them.</p>

 <p>A more complex example in the CoNLL-U format is given below:<note
 place="bottom">The example is taken from the documentation of the CoNLL-U
-format at <ptr target="https://universaldependencies.org/format.html"/>.</note>
+format at <ptr target="https://universaldependencies.org/format.html"/>,
+with the addition of <code>SpaceAfter=No</code> <!-- in the fifth row -->
+on the last column for miscellaneous information.</note>
 <eg xml:space="preserve">1 They they PRON PRP Case=Nom|Number=Plur 2 nsubj 2:nsubj|4:nsubj _
 2 buy buy VERB VBP Number=Plur|Person=3|Tense=Pres 0 root 0:root _
 3 and and CCONJ CC _ 4 cc 4:cc _
 4 sell sell VERB VBP Number=Plur|Person=3|Tense=Pres 2 conj 0:root|2:conj _
-5 books book NOUN NNS Number=Plur 2 obj 2:obj|4:obj _
+5 books book NOUN NNS Number=Plur 2 obj 2:obj|4:obj SpaceAfter=No
 6 . . PUNCT . _ 2 punct 2:punct _</eg>
 In this grammatical annnotation of a sentence with coordination ellipsis,
-there are additional columns. The fifth column provides a concurrent
-part-of-speech tagging using a non-UD tagset. The sixth column adds a
-pipe-separated list of morphosyntactic features. The ninth column encodes
-an extended dependency structure with additional dependency relations.</p>
+there are additional non-empty columns. The fifth column provides a
+concurrent part-of-speech tagging using a non-UD tagset. The sixth column
+adds a pipe-separated list of morphosyntactic features. And the ninth
+column encodes an extended dependency structure with additional dependency
+relations.</p>
 <p>As can be seen from the graphical rendition of this example below, there
 are dependent nodes with multiple arcs, pointing to more than one head
 node.</p>
@@ -1273,7 +1282,7 @@ sake of readability).
 <w n="2" head="0" deprel="root" pos="VERB VBP" msd="Number=Plur|Person=3|Tense=Pres" lemma="buy">buy</w>
 <w n="3" head="4" deprel="cc" pos="CCONJ CC" lemma="and">and</w>
 <w n="4" head="2 0" deprel="conj root" pos="VERB VBP" msd="Number=Plur|Person=3|Tense=Pres" lemma="sell">sell</w>
-<w n="5" head="2 4" deprel="obj obj" pos="NOUN NNS" msd="Number=Plur" lemma="book">books</w>
+<w n="5" head="2 4" deprel="obj obj" pos="NOUN NNS" msd="Number=Plur" lemma="book" join="right">books</w>
 <pc n="6" head="2" deprel="punct" pos="PUNCT ." lemma=".">.</pc>
 </s>
 </egXML>

P5/Source/Specs/att.linguistic.dependency.xml

+2 -2
@@ -18,7 +18,7 @@
 <s n="0">
 <w n="1" head="2" deprel="nsubj" pos="PRON" lemma="they">They</w>
 <w n="2" head="0" deprel="root" pos="VERB" lemma="buy">buy</w>
-<w n="3" head="2" deprel="obj" pos="NOUN" lemma="book">books</w>
+<w n="3" head="2" deprel="obj" pos="NOUN" lemma="book" join="right">books</w>
 <pc n="4" head="2" deprel="punct" pos="PUNCT" lemma=".">.</pc>
 </s>
 </egXML>
@@ -32,7 +32,7 @@
 <w n="2" head="0" deprel="root" pos="VERB VBP" msd="Number=Plur|Person=3|Tense=Pres" lemma="buy">buy</w>
 <w n="3" head="4" deprel="cc" pos="CCONJ CC" lemma="and">and</w>
 <w n="4" head="2 0" deprel="conj root" pos="VERB VBP" msd="Number=Plur|Person=3|Tense=Pres" lemma="sell">sell</w>
-<w n="5" head="2 4" deprel="obj obj" pos="NOUN NNS" msd="Number=Plur" lemma="book">books</w>
+<w n="5" head="2 4" deprel="obj obj" pos="NOUN NNS" msd="Number=Plur" lemma="book" join="right">books</w>
 <pc n="6" head="2" deprel="punct" pos="PUNCT ." lemma=".">.</pc>
 </s>
 </egXML>
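
The correspondence this commit sets up between the two notations (SpaceAfter=No in the tenth CoNLL-U column, join="right" on the TEI token) can be sketched in Python. This is an illustrative sketch only and not part of the commit. The function name conllu_row_to_tei is hypothetical. It assumes whitespace-separated columns as rendered in the examples above (real CoNLL-U files are tab-separated), covers only the simple single-head example, and assumes that PUNCT is the only tag rendered as <pc>.

```python
from xml.sax.saxutils import escape, quoteattr


def conllu_row_to_tei(row: str) -> str:
    """Render one 10-column CoNLL-U row as a TEI <w> or <pc> element.

    Hypothetical helper for illustration; the attribute mapping follows
    the simple example in the diff above.
    """
    cols = row.split()
    if len(cols) != 10:
        raise ValueError(f"expected 10 columns, got {len(cols)}")
    idx, form, lemma, upos, _xpos, _feats, head, deprel, _deps, misc = cols

    # Assumption: punctuation tokens become <pc>, everything else <w>.
    tag = "pc" if upos == "PUNCT" else "w"

    attrs = [
        ("n", idx),
        ("head", head),
        ("deprel", deprel),
        ("pos", upos),
        ("lemma", lemma),
    ]

    # The MISC column holds pipe-separated items; "_" marks an empty column.
    misc_items = [] if misc == "_" else misc.split("|")
    if "SpaceAfter=No" in misc_items:
        # No space follows the token, so it joins the token on its right.
        attrs.append(("join", "right"))

    attr_text = " ".join(f"{name}={quoteattr(value)}" for name, value in attrs)
    return f"<{tag} {attr_text}>{escape(form)}</{tag}>"


if __name__ == "__main__":
    for line in (
        "3 books book NOUN _ _ 2 obj _ SpaceAfter=No",
        "4 . . PUNCT _ _ 2 punct _ _",
    ):
        print(conllu_row_to_tei(line))
```

Run on rows 3 and 4 of the first example, the sketch reproduces the <w> and <pc> lines added in the diff, including join="right" on "books".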
