Commit 45fcc4c

Grace Muzny authored and Stanford NLP committed
merge master in
1 parent 6252707 commit 45fcc4c

946 files changed

+190493
-194615
lines changed

CONTRIBUTING.md

+5 -1
Original file line number | Diff line number | Diff line change
@@ -8,9 +8,13 @@ However, Stanford CoreNLP is copyright by Stanford. (Technically, by The Board o
88
In order for us to continue to be able to dual-license Stanford CoreNLP, we need to make sure that contributions from others do not restrict Stanford from separately licensing the code.
99

1010
Therefore, we can accept contributions on any of the following terms:
11+
1112
* If your contribution is a bug fix of 6 lines or less of new code, we will accept it on the basis that both you and us regard the contribution as de minimis, and not requiring further hassle.
1213
* You can declare that the contribution is in the public domain (in your commit message or pull request).
1314
* You can make your contribution available under a non-restrictive open source license, such as the Revised (or 3-clause) BSD license, with appropriate licensing information included with the submitted code.
14-
* You can sign and return to us a contributor license agreement (CLA), explicitly licensing us to be able to use the code. You can find these agreements at http://nlp.stanford.edu/software/CLA/ . You can send them to us or contact us at: [email protected] .
15+
* You can sign and return to us a contributor license agreement (CLA), explicitly licensing us to be able to use the code.
16+
There is a [Contributor License Agreement for Individuals](http://nlp.stanford.edu/software/CLA/individual.html) and
17+
a [Contributor License Agreement for Corporations](http://nlp.stanford.edu/software/CLA/corporate.html).
18+
You can send them to us or contact us at: [email protected] .
1519

1620
You should do development against our master branch. The project's source code is in utf-8 character encoding. You should make sure that all unit tests still pass. (In general, you will not be able to run our integration tests, since they rely on resources in our filesystem.)

LICENSE.txt

+617 -282
Large diffs are not rendered by default.

README.md

+12
@@ -5,12 +5,24 @@ Stanford CoreNLP provides a set of natural language analysis tools written in Ja
55

66
The Stanford CoreNLP code is written in Java and licensed under the GNU General Public License (v3 or later). Note that this is the full GPL, which allows many free uses, but not its use in proprietary software that you distribute to others.
77

8+
#### How To Compile (with ant)
9+
10+
1. cd CoreNLP ; ant
11+
12+
#### How To Create A Jar
13+
14+
1. compile the code
15+
2. cd CoreNLP/classes ; jar -cf ../stanford-corenlp.jar edu
16+
817
You can find releases of Stanford CoreNLP on [Maven Central](http://search.maven.org/#browse%7C11864822).
918

1019
You can find more explanation and documentation on [the Stanford CoreNLP homepage](http://nlp.stanford.edu/software/corenlp.shtml#Demo).
1120

1221
The most recent models associated with the code in the HEAD of this repository can be found [here](http://nlp.stanford.edu/software/stanford-corenlp-models-current.jar).
1322

23+
Some of the larger (English) models -- like the shift-reduce parser and WikiDict -- are not distributed with our default models jar.
24+
The most recent version of these models can be found [here](http://nlp.stanford.edu/software/stanford-english-corenlp-models-current.jar).
25+
1426
For information about making contributions to Stanford CoreNLP, see the file [CONTRIBUTING.md](CONTRIBUTING.md).
1527

1628
Questions about CoreNLP can either be posted on StackOverflow with the tag [stanford-nlp](http://stackoverflow.com/questions/tagged/stanford-nlp),

build.gradle

+1 -1
@@ -11,7 +11,7 @@ sourceCompatibility = 1.8
1111
targetCompatibility = 1.8
1212
compileJava.options.encoding = 'UTF-8'
1313

14-
version = '3.4.1'
14+
version = '3.6.0'
1515

1616
// Gradle application plugin
1717
mainClassName = "edu.stanford.nlp.pipeline.StanfordCoreNLP"

build.xml

+8 -3
@@ -133,6 +133,11 @@
133133
<exclude name="**/*.java"/>
134134
</fileset>
135135
</copy>
136+
<copy todir="${build.path}/edu/stanford/nlp/pipeline">
137+
<fileset dir="${source.path}/edu/stanford/nlp/pipeline">
138+
<exclude name="**/*.java"/>
139+
</fileset>
140+
</copy>
136141
</target>
137142

138143
<target name="test" depends="classpath,compile"
@@ -173,7 +178,7 @@
173178
<target name="slowitest" depends="classpath,compile"
174179
description="Run really slow integration tests">
175180
<echo message="${ant.project.name}" />
176-
<junit fork="yes" maxmemory="8g" printsummary="off" outputtoformatters="false" forkmode="perTest" haltonfailure="true">
181+
<junit fork="yes" maxmemory="12g" printsummary="off" outputtoformatters="false" forkmode="perTest" haltonfailure="true">
177182
<classpath refid="classpath"/>
178183
<classpath path="${build.path}"/>
179184
<classpath path="${data.path}"/>
@@ -308,7 +313,7 @@
308313
<include name="commons-lang3-3.1.jar"/>
309314
<include name="xom-1.2.10.jar"/>
310315
<include name="joda-time.jar"/>
311-
<include name="jollyday-0.4.7.jar"/>
316+
<include name="jollyday-0.4.9.jar"/>
312317
</lib>
313318
<zipfileset prefix="WEB-INF/data"
314319
file="/u/nlp/data/pos-tagger/distrib/english-left3words-distsim.tagger"/>
@@ -441,7 +446,7 @@
441446
<include name="xom-1.2.10.jar"/>
442447
<include name="xml-apis.jar"/>
443448
<include name="joda-time.jar"/>
444-
<include name="jollyday-0.4.7.jar"/>
449+
<include name="jollyday-0.4.9.jar"/>
445450
</lib>
446451
<!-- note for John: c:/Users/John Bauer/nlp/stanford-releases -->
447452
<lib dir="/u/nlp/data/StanfordCoreNLPModels">

data/edu/stanford/nlp/upos/ENUniversalPOS.tsurgeon

+16 -6
@@ -124,7 +124,7 @@ RB=target <... {/.*/}
124124
relabel target ADV
125125

126126
% DT -> PRON (pronominal this/that/these/those)
127-
@NP <: (DT=target < /^[Tt]h(is|at|ose|ese)$/)
127+
@NP <: (DT=target < /^(?i:th(is|at|ose|ese))$/)
128128

129129
relabel target PRON
130130

@@ -133,6 +133,21 @@ DT=target < __
133133

134134
relabel target DET
135135

136+
% WDT -> PRON (pronominal that/which)
137+
@WHNP|NP <: (WDT=target < /^(?i:(that|which))$/)
138+
139+
relabel target PRON
140+
141+
% WDT->SCONJ (incorrectly tagged subordinating conjunctions)
142+
@SBAR < (WDT=target < /^(?i:(that|which))$/)
143+
144+
relabel target SCONJ
145+
146+
% WDT -> DET
147+
WDT=target <... {/.*/}
148+
149+
relabel target DET
150+
136151
% ------------------------------
137152
% 1 to 1 mappings
138153
%
@@ -227,11 +242,6 @@ UH=target <... {/.*/}
227242

228243
relabel target INTJ
229244

230-
% WDT -> DET
231-
WDT=target <... {/.*/}
232-
233-
relabel target DET
234-
235245
% WP -> PRON
236246
WP=target <... {/.*/}
237247
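
The change to the `DT -> PRON` rule above swaps `[Tt]h(is|at|ose|ese)` for `(?i:th(is|at|ose|ese))`. The following is a minimal sketch of what that buys, assuming the node patterns behave like ordinary `java.util.regex` patterns; the class and method names are hypothetical and are not part of CoreNLP.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; it only illustrates the two regular expressions
// that appear on the removed and added lines of the DT -> PRON rule.
final class DeterminerRegexDemo {
    // Old pattern: only the first letter may be upper case.
    static final Pattern OLD = Pattern.compile("^[Tt]h(is|at|ose|ese)$");
    // New pattern: the inline flag group (?i:...) ignores case for every letter.
    static final Pattern NEW = Pattern.compile("^(?i:th(is|at|ose|ese))$");

    static boolean matchesOld(String word) {
        return OLD.matcher(word).find();
    }

    static boolean matchesNew(String word) {
        return NEW.matcher(word).find();
    }

    public static void main(String[] args) {
        // Compare both patterns on lower-case, capitalized, and all-caps tokens.
        for (String word : new String[] {"this", "These", "THAT", "tHoSe", "the"}) {
            System.out.println(word + " old=" + matchesOld(word) + " new=" + matchesNew(word));
        }
    }
}
```

Under these assumptions, both patterns accept `this` and `These`, only the new one accepts all-caps or mixed-case tokens such as `THAT`, and neither accepts `the`. The new `WDT` rules use the same `(?i:...)` form for `that` and `which`.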

doc/classify/README.txt

+4 -2
@@ -1,4 +1,4 @@
1-
Stanford Classifier v3.5.2 - 2015-04-20
1+
Stanford Classifier v3.6.0 - 2015-12-09
22
-------------------------------------------------
33

44
Copyright (c) 2003-2012 The Board of Trustees of
@@ -24,7 +24,7 @@ QUICKSTART
2424
COMMAND LINE INTERFACE
2525
To classify the included example dataset cheeseDisease (in the examples directory), type the following at the command line while in the main classifier directory:
2626

27-
java -jar stanford-classifier.jar -prop examples/cheese2007.prop
27+
java -cp "*:." edu.stanford.nlp.classify.ColumnDataClassifier -prop examples/cheese2007.prop
2828

2929
This will classify the included test data, cheeseDisease.test, based on the probability that each example is a cheese or a disease, as calculated by a linear classifier trained on cheeseDisease.train.
3030

@@ -76,6 +76,8 @@ LICENSE
7676
CHANGES
7777
-------------------------
7878

79+
2015-12-09 3.6.0 Update for compatibility
80+
7981
2015-04-20 3.5.2 Update for compatibility
8082

8183
2015-01-29 3.5.1 New input/output options, support for GloVe

doc/corenlp/META-INF/MANIFEST.MF

+6
@@ -0,0 +1,6 @@
1+
Manifest-Version: 1.0
2+
Implementation-Version: 3.7.0
3+
Built-Date: 2016-04-27
4+
Created-By: Stanford JavaNLP
5+
Main-class: edu.stanford.nlp.pipeline.StanfordCoreNLP
6+
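
The new manifest spells its entry-point header `Main-class` rather than the more common `Main-Class`. Manifest attribute names are case-insensitive, so the standard `java.util.jar` API should still resolve it. The sketch below parses a copy of the added lines from a string to show this; the class name `ManifestDemo` and its helper are hypothetical and exist only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical demo class; it parses a copy of the manifest text added in this commit.
final class ManifestDemo {
    // The five headers from the diff, each terminated by a newline.
    static final String MANIFEST_TEXT =
        "Manifest-Version: 1.0\n"
        + "Implementation-Version: 3.7.0\n"
        + "Built-Date: 2016-04-27\n"
        + "Created-By: Stanford JavaNLP\n"
        + "Main-class: edu.stanford.nlp.pipeline.StanfordCoreNLP\n"
        + "\n";

    // Parse the main section of a manifest given as text.
    static Attributes parseMainAttributes(String text) {
        try {
            Manifest manifest = new Manifest(
                new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
            return manifest.getMainAttributes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Attributes attrs = parseMainAttributes(MANIFEST_TEXT);
        // Attributes.Name.MAIN_CLASS is "Main-Class"; lookup ignores case.
        System.out.println(attrs.getValue(Attributes.Name.MAIN_CLASS));
        System.out.println(attrs.getValue(Attributes.Name.IMPLEMENTATION_VERSION));
    }
}
```

A jar built with this manifest can therefore be started with `java -jar`, provided its dependencies and models are also on the classpath.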

doc/corenlp/README.txt

+4 -1
@@ -42,8 +42,11 @@ LICENSE
4242
CHANGES
4343
---------------------------------
4444

45+
2015-12-09 3.6.0 Improved coreference, OpenIE integration,
46+
Stanford CoreNLP server
47+
4548
2015-04-20 3.5.2 Switch to Universal dependencies, add Chinese
46-
coreference systemCore NLP
49+
coreference system to CoreNLP
4750

4851
2015-01-29 3.5.1 NER, dependency parser, SPIED improvements;
4952
general bugfixes

doc/corenlp/corenlp.sh

+2 -2
@@ -18,5 +18,5 @@ else
1818
scriptdir=$(dirname "$scriptpath")
1919
fi
2020

21-
echo java -mx3g -cp \"$scriptdir/*\" edu.stanford.nlp.pipeline.StanfordCoreNLP $*
22-
java -mx3g -cp "$scriptdir/*" edu.stanford.nlp.pipeline.StanfordCoreNLP $*
21+
echo java -mx5g -cp \"$scriptdir/*\" edu.stanford.nlp.pipeline.StanfordCoreNLP $*
22+
java -mx5g -cp "$scriptdir/*" edu.stanford.nlp.pipeline.StanfordCoreNLP $*

doc/corenlp/pom-full.xml

+12 -7
@@ -2,7 +2,7 @@
22
<modelVersion>4.0.0</modelVersion>
33
<groupId>edu.stanford.nlp</groupId>
44
<artifactId>stanford-corenlp</artifactId>
5-
<version>3.5.2</version>
5+
<version>3.6.0</version>
66
<packaging>jar</packaging>
77
<name>Stanford CoreNLP</name>
88
<description>Stanford CoreNLP provides a set of natural language analysis tools which can take raw English language text input and give the base forms of words, their parts of speech, whether they are names of companies, people, etc., normalize dates, times, and numeric quantities, mark up the structure of sentences in terms of phrases and word dependencies, and indicate which noun phrases refer to the same entities. It provides the foundational building blocks for higher level text understanding applications.</description>
@@ -14,8 +14,8 @@
1414
</license>
1515
</licenses>
1616
<scm>
17-
<url>http://nlp.stanford.edu/software/stanford-corenlp-2015-04-21.zip</url>
18-
<connection>http://nlp.stanford.edu/software/stanford-corenlp-2015-04-21.zip</connection>
17+
<url>http://nlp.stanford.edu/software/stanford-corenlp-2015-12-06.zip</url>
18+
<connection>http://nlp.stanford.edu/software/stanford-corenlp-2015-12-06.zip</connection>
1919
</scm>
2020
<developers>
2121
<developer>
@@ -24,9 +24,9 @@
2424
<email>[email protected]</email>
2525
</developer>
2626
<developer>
27-
<id>john.bauer</id>
28-
<name>John Bauer</name>
29-
<email>[email protected]</email>
27+
<id>jason.bolton</id>
28+
<name>Jason Bolton</name>
29+
<email>[email protected]</email>
3030
</developer>
3131
</developers>
3232
<properties>
@@ -65,6 +65,11 @@
6565
<artifactId>slf4j-api</artifactId>
6666
<version>1.7.12</version>
6767
</dependency>
68+
<dependency>
69+
<groupId>com.google.protobuf</groupId>
70+
<artifactId>protobuf-java</artifactId>
71+
<version>2.6.1</version>
72+
</dependency>
6873
</dependencies>
6974
<build>
7075
<sourceDirectory>src</sourceDirectory>
@@ -83,7 +88,7 @@
8388
<configuration>
8489
<artifacts>
8590
<artifact>
86-
<file>${project.basedir}/stanford-corenlp-3.5.2-models.jar</file>
91+
<file>${project.basedir}/stanford-corenlp-3.6.0-models.jar</file>
8792
<type>jar</type>
8893
<classifier>models</classifier>
8994
</artifact>

doc/corenlp/pom-light.xml

+6 -6
@@ -2,7 +2,7 @@
22
<modelVersion>4.0.0</modelVersion>
33
<groupId>edu.stanford.nlp</groupId>
44
<artifactId>stanford-corenlp</artifactId>
5-
<version>3.3.0</version>
5+
<version>3.6.0</version>
66
<packaging>jar</packaging>
77
<name>Stanford CoreNLP</name>
88
<description>Stanford CoreNLP provides a set of natural language analysis tools which can take raw English language text input and give the base forms of words, their parts of speech, whether they are names of companies, people, etc., normalize dates, times, and numeric quantities, mark up the structure of sentences in terms of phrases and word dependencies, and indicate which noun phrases refer to the same entities. It provides the foundational building blocks for higher level text understanding applications.</description>
@@ -14,8 +14,8 @@
1414
</license>
1515
</licenses>
1616
<scm>
17-
<url>http://nlp.stanford.edu/software/stanford-corenlp-2013-11-12.zip</url>
18-
<connection>http://nlp.stanford.edu/software/stanford-corenlp-2013-11-12.zip</connection>
17+
<url>http://nlp.stanford.edu/software/stanford-corenlp-2015-12-06.zip</url>
18+
<connection>http://nlp.stanford.edu/software/stanford-corenlp-2015-12-06.zip</connection>
1919
</scm>
2020
<developers>
2121
<developer>
@@ -24,9 +24,9 @@
2424
<email>[email protected]</email>
2525
</developer>
2626
<developer>
27-
<id>john.bauer</id>
28-
<name>John Bauer</name>
29-
<email>[email protected]</email>
27+
<id>jason.bolton</id>
28+
<name>Jason Bolton</name>
29+
<email>[email protected]</email>
3030
</developer>
3131
</developers>
3232
<properties>

doc/lexparser/README.txt

+3 -1
@@ -1,4 +1,4 @@
1-
Stanford Lexicalized Parser v3.5.2 - 2015-04-20
1+
Stanford Lexicalized Parser v3.6.0 - 2015-12-09
22
-----------------------------------------------
33

44
Copyright (c) 2002-2015 The Board of Trustees of The Leland Stanford Junior
@@ -224,6 +224,8 @@ LICENSE
224224
CHANGES
225225
---------------------------------
226226

227+
2015-12-09 3.6.0 Updated for compatibility
228+
227229
2015-04-20 3.5.2 Switch to universal dependencies
228230

229231
2015-01-29 3.5.1 Dependency parser improvements; general

doc/lexparser/pom.xml

+12 -7
@@ -2,7 +2,7 @@
22
<modelVersion>4.0.0</modelVersion>
33
<groupId>edu.stanford.nlp</groupId>
44
<artifactId>stanford-parser</artifactId>
5-
<version>3.5.2</version>
5+
<version>3.6.0</version>
66
<packaging>jar</packaging>
77
<name>Stanford Parser</name>
88
<description>Stanford Parser processes raw text in English, Chinese, German, Arabic, and French, and extracts constituency parse trees.</description>
@@ -14,8 +14,8 @@
1414
</license>
1515
</licenses>
1616
<scm>
17-
<url>http://nlp.stanford.edu/software/stanford-parser-2015-04-20.zip</url>
18-
<connection>http://nlp.stanford.edu/software/stanford-parser-2015-04-20.zip</connection>
17+
<url>http://nlp.stanford.edu/software/stanford-parser-2015-12-08.zip</url>
18+
<connection>http://nlp.stanford.edu/software/stanford-parser-2015-12-08.zip</connection>
1919
</scm>
2020
<developers>
2121
<developer>
@@ -24,9 +24,9 @@
2424
<email>[email protected]</email>
2525
</developer>
2626
<developer>
27-
<id>john.bauer</id>
28-
<name>John Bauer</name>
29-
<email>[email protected]</email>
27+
<id>jason.bolton</id>
28+
<name>Jason Bolton</name>
29+
<email>[email protected]</email>
3030
</developer>
3131
<developer>
3232
<id>spence.green</id>
@@ -45,6 +45,11 @@
4545
<artifactId>ejml</artifactId>
4646
<version>0.23</version>
4747
</dependency>
48+
<dependency>
49+
<groupId>org.slf4j</groupId>
50+
<artifactId>slf4j-api</artifactId>
51+
<version>1.7.12</version>
52+
</dependency>
4853
</dependencies>
4954
<build>
5055
<sourceDirectory>src</sourceDirectory>
@@ -63,7 +68,7 @@
6368
<configuration>
6469
<artifacts>
6570
<artifact>
66-
<file>${project.basedir}/stanford-parser-3.5.2-models.jar</file>
71+
<file>${project.basedir}/stanford-parser-3.6.0-models.jar</file>
6772
<type>jar</type>
6873
<classifier>models</classifier>
6974
</artifact>
