
Commit e3960f0

Release package to PyPI (#2)
* convert readme to RST
* prepare pypi release. version bump to 0.2.3
* automate build for pypi and pypitest
* change install instructions to fetch package directly from pypi
1 parent 6b65951 commit e3960f0

6 files changed: +158 -120 lines changed

.gitignore (+2)

@@ -1,3 +1,5 @@
+MANIFEST
+
 # Byte-compiled / optimized / DLL files
 __pycache__/
 *.py[cod]

Makefile (+8)

@@ -13,3 +13,11 @@ test-coverage:
 
 lint:
 	pylint nereval.py || exit 0
+
+pypi:
+	python setup.py sdist bdist_wheel
+	twine upload dist/*
+
+pypi-test:
+	python setup.py sdist bdist_wheel
+	twine upload --repository pypitest dist/*

README.md (-118)

This file was deleted.

README.rst (+132, new file)

nereval
=======
.. image:: https://travis-ci.org/jantrienes/nereval.svg?branch=master
   :target: https://travis-ci.org/jantrienes/nereval

Evaluation script for named entity recognition (NER) systems based on entity-level F1 score.

Definition
----------
The metric as implemented here has been described by Nadeau and Sekine (2007) and was widely used as part of the Message Understanding Conferences (Grishman and Sundheim, 1996). It evaluates an NER system along two axes: whether it assigns the right type to an entity, and whether it finds the exact entity boundaries. For both axes, the number of correct predictions (COR), the number of actual predictions (ACT) and the number of possible predictions (POS) are computed. From these statistics, precision and recall can be derived:

::

    precision = COR/ACT
    recall = COR/POS

The final score is the micro-averaged F1 measure of precision and recall over both the type and the boundary axis.
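As a worked illustration of these formulas, the 0.33 reported in the Usage section below can be reconstructed as a sketch. The COR/ACT/POS counts are an assumed reading of the metric for that example (one true entity, two predictions), not the implementation's exact bookkeeping:

.. code-block:: python

    # Assumed counts, pooled over the type and boundary axes.
    COR = 1        # only one prediction credited (type axis), none on text
    ACT = 2 * 2    # two predictions x two axes
    POS = 1 * 2    # one true entity x two axes

    precision = COR / ACT   # 0.25
    recall = COR / POS      # 0.50
    f1 = 2 * precision * recall / (precision + recall)
    print('F1-score: %.2f' % f1)  # F1-score: 0.33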
Installation
------------
.. code-block:: bash

    pip install nereval

Usage
-----
The script can either be used from within Python, or from the command line when classification results have been written to a JSON file.

Usage from Command Line
~~~~~~~~~~~~~~~~~~~~~~~
Assume we have the following classification results in ``input.json``:

.. code-block:: json

    [
      {
        "text": "CILINDRISCHE PLUG",
        "true": [
          {
            "text": "CILINDRISCHE PLUG",
            "type": "Productname",
            "start": 0
          }
        ],
        "predicted": [
          {
            "text": "CILINDRISCHE",
            "type": "Productname",
            "start": 0
          },
          {
            "text": "PLUG",
            "type": "Productname",
            "start": 13
          }
        ]
      }
    ]

Then the script can be executed as follows:

.. code-block:: bash

    python nereval.py input.json
    F1-score: 0.33
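For reference, a minimal sketch of how such a file could be produced from Python. The field names simply mirror the example above; everything else is illustrative:

.. code-block:: python

    import json

    # Each item holds the sentence text plus its true and predicted entities.
    results = [
        {
            'text': 'CILINDRISCHE PLUG',
            'true': [
                {'text': 'CILINDRISCHE PLUG', 'type': 'Productname', 'start': 0},
            ],
            'predicted': [
                {'text': 'CILINDRISCHE', 'type': 'Productname', 'start': 0},
                {'text': 'PLUG', 'type': 'Productname', 'start': 13},
            ],
        },
    ]

    with open('input.json', 'w') as f:
        json.dump(results, f, indent=2)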
Usage from Python
~~~~~~~~~~~~~~~~~
Alternatively, the evaluation metric can be invoked directly from within Python. Example:

.. code-block:: python

    import nereval
    from nereval import Entity

    # Ground-truth:
    # CILINDRISCHE PLUG
    # B_PROD       I_PROD
    y_true = [
        Entity('CILINDRISCHE PLUG', 'Productname', 0)
    ]

    # Prediction:
    # CILINDRISCHE PLUG
    # B_PROD       B_PROD
    y_pred = [
        # correct type, wrong text
        Entity('CILINDRISCHE', 'Productname', 0),
        # correct type, wrong text
        Entity('PLUG', 'Productname', 13)
    ]

    score = nereval.evaluate([y_true], [y_pred])
    print('F1-score: %.2f' % score)
    # prints: F1-score: 0.33
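Note the extra list nesting in ``evaluate([y_true], [y_pred])``: the entity lists are wrapped once more, which suggests one list of true/predicted entities per sentence. A small sketch under that assumption (the second sentence and its offsets are purely illustrative):

.. code-block:: python

    import nereval
    from nereval import Entity

    # One list of true/predicted entities per sentence.
    y_true = [
        [Entity('CILINDRISCHE PLUG', 'Productname', 0)],  # sentence 1
        [Entity('DIN908', 'Productname', 0)],             # sentence 2
    ]
    y_pred = [
        [Entity('CILINDRISCHE', 'Productname', 0)],
        [Entity('DIN908', 'Productname', 0)],
    ]

    print('F1-score: %.2f' % nereval.evaluate(y_true, y_pred))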
Note on Symmetry
----------------
The metric itself is not symmetric due to the inherent problem of word overlaps in NER, so ``evaluate(y_true, y_pred) != evaluate(y_pred, y_true)``. This becomes apparent if we consider the following example (the tagger uses a BIO scheme):

.. code-block:: bash

    # Example 1:
    Input:     CILINDRISCHE PLUG   DIN908 M10X1  Foo
    Truth:     B_PROD       I_PROD B_PROD B_DIM  O
    Predicted: B_PROD       B_PROD B_PROD B_PROD B_PROD

    Correct Text: 2
    Correct Type: 2

    # Example 2 (inverted):
    Input:     CILINDRISCHE PLUG   DIN908 M10X1  Foo
    Truth:     B_PROD       B_PROD B_PROD B_PROD B_PROD
    Predicted: B_PROD       I_PROD B_PROD B_DIM  O

    Correct Text: 2
    Correct Type: 3
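In code, the asymmetry can be checked by evaluating the same pair of annotations in both directions. The sketch below encodes Example 1 with ``Entity`` objects; the entity type names and character offsets are illustrative assumptions, not part of the example above:

.. code-block:: python

    import nereval
    from nereval import Entity

    # Sentence: CILINDRISCHE PLUG DIN908 M10X1 Foo
    truth = [
        Entity('CILINDRISCHE PLUG', 'Productname', 0),
        Entity('DIN908', 'Productname', 18),
        Entity('M10X1', 'Dimension', 25),
    ]
    predicted = [  # every token tagged B_PROD
        Entity('CILINDRISCHE', 'Productname', 0),
        Entity('PLUG', 'Productname', 13),
        Entity('DIN908', 'Productname', 18),
        Entity('M10X1', 'Productname', 25),
        Entity('Foo', 'Productname', 31),
    ]

    # Example 1 vs. Example 2 (the inverted direction); per the counts
    # above, the two directions yield different scores.
    print(nereval.evaluate([truth], [predicted]))
    print(nereval.evaluate([predicted], [truth]))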
Notes and References
--------------------
Used in a student research project on natural language processing at the `University of Twente, Netherlands <https://www.utwente.nl>`_.

**References**

* Grishman, R., & Sundheim, B. (1996). `Message understanding conference-6: A brief history <http://www.aclweb.org/anthology/C96-1079>`_. In *COLING 1996 Volume 1: The 16th International Conference on Computational Linguistics* (Vol. 1).
* Nadeau, D., & Sekine, S. (2007). `A survey of named entity recognition and classification <http://www.jbe-platform.com/content/journals/10.1075/li.30.1.03nad>`_. *Lingvisticae Investigationes*, 30(1), 3-26.

setup.cfg (+5, new file)

[bdist_wheel]
universal=1

[metadata]
description-file = README.rst

setup.py (+11 -2)

@@ -1,9 +1,18 @@
-from distutils.core import setup
+import os
+from setuptools import setup
+
+def read(fname):
+    return open(os.path.join(os.path.dirname(__file__), fname)).read()
 
 setup(
     name='nereval',
-    version='0.2.2',
+    version='0.2.3',
+    author='Jan Trienes',
+    author_email='[email protected]',
+    url='https://github.com/jantrienes/nereval',
     description='Evaluation script for named entity recognition systems based on F1 score.',
+    long_description=read('README.rst'),
+    keywords=['ner', 'nlp', 'evaluation', 'f1_score', 'f1', 'linguistics', 'machine_learning'],
     license='MIT',
     py_modules=['nereval'],
     tests_require=[
