Lots of changes happened since the last version; see the CHANGELOG for details.
git-svn-id: https://pet.opendfki.de/repos/pet/main@222 4200e16c-5112-0410-ac55-d7fb557a720a
kiefer committed Oct 4, 2004
1 parent 2ea8791, commit 1804dc6
Showing 136 changed files with 10,638 additions and 4,701 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
- the bound on the number of inflection rules (setting max-inflections) does | ||
not work | ||
- flop is not able to dump cyclic structures | ||
- Berthold: packing vs. relative clauses (what is the exact error?) | |
|
||
- wrong/no characterization when unfilling is used | ||
:-( No clean way to implement this; in fact, characterization should be | ||
implemented in the grammar, not in the processing engine. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,88 @@ | ||
v0.99.0 | ||
Since this is the first entry, change descriptions are quite coarse | ||
grained. This should maybe change in the future. | ||
|
||
- Added doxygen compatible comments to most of the .h files, and some | ||
comments to source files too. | ||
- New input/lexical processing stage to allow more modularization and | ||
flexible exchange of tokenization, morphology, etc. | ||
- "japanese multiword bug" fixed | ||
- Application of inflection and lexical rules can now be completed before any | ||
syntactical processing takes place (which might be beneficial for | ||
chart dependencies in German) | |
- fixed bug in acyclic transitive reduction in the boost version of flop | ||
- expansion failures (flop) now report the failure path | ||
|
||
- XML input mode (also as a replacement for the whiteboard version) | |
- first version of fragmentary results in case of parse failure, maybe needs | ||
more flexibility for better heuristics. | ||
- activation of packing without restrictor setting no longer leads to a | |
segmentation fault; packing is simply not activated. | |
- translation of iso chars to isomorphix in YY input mode | ||
- incr(tsdb[]) file dump mode | ||
- version string now included in flop and cheap binaries. version number is | ||
printed with usage information | ||
- printer for hierarchies in VCG tool style, can be used in cheap and flop | ||
- support for dynamic symbols | ||
- dag_expand now does the job correctly using a scheme similar to delta | ||
expansion. | ||
- moved the whole agenda code into the .h file with the hope of some positive | ||
inlining effects (and, besides, to get rid of another file). | ||
- more flexible restrictor functionality | ||
|
||
- lots of minor cleanup issues | ||
- first attempt at CHANGELOG, TODO, BUGS, version.h | |
|
||
Done previously (from old ToDo file, partially redundant) | ||
|
||
+- XML input mode | ||
+ complete DTD specification (Uli S. and me did this) | ||
+ build SAX parser | ||
+ supersedes integration of bernd's (whiteboard) version | ||
|
||
+ fragmentary results in case of parse failure (v1.0) | ||
fragment search/output in case of parse failures | |
|
||
+ integrate ecls LISP (seems to work now) | ||
+ unfilling in PET leads to wrong / incomplete results (German grammar) | ||
re-expansion (dag_expand) fixed | |
+ packing/unpacking does characterization too | ||
|
||
+ leda -> boost migration done and checked | ||
|
||
+ CFROM/CTO fix: toplevel errors | ||
|
||
+ packing without packing-restrictor: segfault, now: warning & disable | |
|
||
+ a null error in MRS must produce output | |
+ YY mode does not do translate-iso-chars | |
|
||
+ writing TSDB tables from PET | |
+ create item, parse and result tables when PET runs in the HOG. | |
cannibalize yy.cpp: TSDBFILEAPI !! | |
+ add option descriptions | |
+ counts for lexical ambiguity | |
|
||
+ correct sorting of results according to score | ||
+ -results=n option to get only the best n results | ||
+ fullform-morphology also outputs the stem when printing | |
+ Restricting the number of inflection rule applications | ||
|
||
+ positions and counts for YY and XML tokenizer | ||
+ bring perforce main branch up to date: | |
removed: | |
cheap: | ||
agenda.cpp inputchart.cpp/h inputtoken.cpp/h chartpositions.h | ||
tokenizer.cpp/h parser.cpp/h mrs.cpp/h | ||
common: | ||
errors.cpp | ||
|
||
new: | |
cheap: | ||
xmlparser* xml-tokenizer* pic-handler* pic-states.h lexparser.* | ||
common: | ||
hashing.h vcg_print.h version.h | ||
|
||
+ japanese multiword bug (requires input chart redesign) | ||
+? implement mrs/rmrs code - processor interface ?Is this implemented or not? | ||
|
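The CHANGELOG entries "correct sorting of results according to score" and "-results=n option to get only the best n results" describe a best-n selection. A minimal C++ sketch of that idea follows. The `Result` struct and `best_results` function are hypothetical names introduced for illustration and are not taken from the PET sources; treating `n == 0` as "no limit" is likewise an assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a parse result; PET's real item classes differ.
struct Result {
    std::string derivation;
    double score;
};

// Return the best n results in descending score order.
// Assumption: n == 0 means "no limit", so all results are returned sorted.
std::vector<Result> best_results(std::vector<Result> results, std::size_t n) {
    auto by_score_desc = [](const Result& a, const Result& b) {
        return a.score > b.score;
    };
    if (n == 0 || n >= results.size()) {
        std::stable_sort(results.begin(), results.end(), by_score_desc);
        return results;
    }
    // Only the first n positions need to be fully ordered.
    auto middle = results.begin() + static_cast<std::ptrdiff_t>(n);
    std::partial_sort(results.begin(), middle, results.end(), by_score_desc);
    results.erase(middle, results.end());
    return results;
}
```

`std::partial_sort` avoids ordering the tail that is discarded anyway, which may matter when many readings are produced but only a few are requested.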
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
.PHONY: all flop cheap doc flopdoc cheapdoc | |
|
||
all: flop cheap doc | |
|
||
flop: | |
cd flop && $(MAKE) flop | |
|
||
cheap: | |
cd cheap && $(MAKE) cheap | |
|
||
doc: flopdoc cheapdoc | ||
|
||
flopdoc: | ||
doxygen doxyconfig.flop | ||
|
||
cheapdoc: | ||
doxygen doxyconfig.cheap |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,50 +1,86 @@ | ||
- build process: auto(config|make) | ||
|
||
- cheap dynamic library / API | ||
|
||
- flop returns zero even in the presence of errors like non-unique feature | ||
introduction | ||
- Better error handling in flop for use with external applications | ||
- emacs compatible error messages for flop ? | ||
|
||
- Documentation | ||
- flop & cheap user doc | ||
- missing header file documentation (oe, please help here, if possible) | ||
itsdb.h, extdict.h, psqllex.h, tsdb++.h | ||
|
||
- unreleased memory? (see valgrind-errors-15-apr-04) | ||
|
||
- more flexible way to do selection of generic entries, e.g., based only on a | ||
(highly scored) subset of POS, or combined clues from morphology | ||
|
||
- more flexible heuristics / better selection of partial results | ||
|
||
- separate switches for unification and subsumption quickcheck computation | ||
|
||
- cleaning up: | ||
- option handling | ||
- YY references; split yy.cpp module into seperate modules | ||
- runtime selection of online-morphology vs full-forms | ||
- YY references; split yy.cpp module into separate modules | ||
+ yy_tokenizer removed from yy.cpp | ||
- server mode still unused, yy.cpp/h should become socket.cpp/h | ||
+ runtime selection of online-morphology vs full-forms | ||
- logging / debugging info: get rid of global verbosity, | ||
implement some central logging facility | ||
|
||
- build process | ||
- autoconfig | ||
- version.h mechanism | ||
implement some central logging facility (take log4cxx) | ||
|
||
- complete lexical database (postgres) integration | ||
- integrate silo | ||
|
||
- integrate ecls LISP | ||
- implement mrs/rmrs code - processor interface | ||
|
||
- leda -> boost migration | ||
|
||
- lsl completion - minimal | ||
- integrate silo | ||
|
||
- integrate bernd's (whiteboard) version | ||
- lsl completion - minimal ?? What does that mean ? | ||
|
||
- scoring: | ||
- offline scoring | ||
- simplified model for compatibility | ||
- simplified model for compatibility ?? What does that mean ? | ||
|
||
- packing: | ||
- fix & integrate subsumption quickcheck | ||
currently, it gives incorrect results for non-existing paths | ||
- generalise restrictor | ||
+- generalise restrictor: new restrictor interface implemented | ||
- simplify/optimise subsume | ||
- subtype caching | ||
- re-enable unfilling as far as possible | ||
|
||
- documentation | ||
|
||
- japanese multiword bug (requires input chart redesign) | ||
|
||
- defaults | ||
|
||
- generator | ||
|
||
- whenever dag_get_path_value is called, structure should be filled, at least | ||
under that path. | ||
|
||
- apply chart dependencies after lexical processing | ||
+ apply chart dependencies after lexical processing | ||
- chart dependencies after lex lookup (1) AND lex processing (2) | ||
-+ still to be tested, Berthold will try it | ||
|
||
- extend chart dependencies to allow a dependency to be conditioned on | ||
a specified path-value pair. | ||
a specified path-value pair. chart dependencies could take a variety of | ||
forms: (OP could be unifies, subsumes, is_subsumed_by, equals) | ||
- val(path1) OP val(path2) | ||
- val(path1) OP const1 && val(path2) OP const2 | ||
- val(path1) OP const1 && val(path1) OP val(path2) | ||
- val(path1) OP val(path2) && val(path2) OP const2 (??) | ||
|
||
- restrictors that are paths instead of features | ||
|
||
refactoring: | ||
- make tAgenda a template | ||
- make the unification engine(s) more modular | ||
- better decoupling of the dag allocation mechanism | ||
- replace item print routines by item printers where possible | ||
|
||
- diagnostic messages for errors in the MRS construction | ||
- performance loss compared to ~kiefer/duo/public/pet-730.tgz is 30% -- | |
because of the data structures in the chart that are necessary for packing, | ||
like _Cp_span? this has to be checked. | ||
|
||
- performance loss flop Leda vs. flop boost: seems to stem from a huge amount | ||
of minor page faults. How can the code be found that is responsible for this | ||
behaviour ? | ||
|
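The TODO item on extending chart dependencies lists conditions of the form `val(path) OP val(path)` and `val(path) OP const`, joined by `&&`, with OP one of unifies, subsumes, is_subsumed_by, equals. The C++ sketch below shows one way such conjunctions could be represented and evaluated. It is a toy model under stated assumptions: values are atomic type names, the type hierarchy is a tree stored as a child-to-parent map, and all paths are looked up in one flat map, which glosses over which chart item each path belongs to. None of the names (`Condition`, `holds`, and so on) come from the PET sources.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy model (assumption): a feature structure maps path strings to atomic
// type names; the hierarchy is a tree given as child -> parent.
using FeatureStructure = std::map<std::string, std::string>;
using Hierarchy = std::map<std::string, std::string>;

enum class Op { Unifies, Subsumes, IsSubsumedBy, Equals };

// True if `general` equals `specific` or is one of its ancestors.
bool subsumes(const Hierarchy& h, const std::string& general, std::string specific) {
    for (;;) {
        if (specific == general) return true;
        auto it = h.find(specific);
        if (it == h.end()) return false;
        specific = it->second;
    }
}

bool apply(const Hierarchy& h, Op op, const std::string& a, const std::string& b) {
    switch (op) {
        case Op::Equals:       return a == b;
        case Op::Subsumes:     return subsumes(h, a, b);
        case Op::IsSubsumedBy: return subsumes(h, b, a);
        // In a tree hierarchy two atomic types unify iff one subsumes the other.
        case Op::Unifies:      return subsumes(h, a, b) || subsumes(h, b, a);
    }
    return false;
}

// One conjunct: val(lhs_path) OP val(rhs) if rhs_is_path, else val(lhs_path) OP rhs.
struct Condition {
    std::string lhs_path;
    Op op;
    std::string rhs;
    bool rhs_is_path;
};

// Returns a pointer to the value under `path`, or nullptr if the path is absent.
const std::string* value_at(const FeatureStructure& fs, const std::string& path) {
    auto it = fs.find(path);
    return it == fs.end() ? nullptr : &it->second;
}

// A dependency holds if every conjunct holds; a missing path makes it fail.
bool holds(const Hierarchy& h, const FeatureStructure& fs,
           const std::vector<Condition>& conjuncts) {
    for (const Condition& c : conjuncts) {
        const std::string* lhs = value_at(fs, c.lhs_path);
        const std::string* rhs = c.rhs_is_path ? value_at(fs, c.rhs) : &c.rhs;
        if (!lhs || !rhs || !apply(h, c.op, *lhs, *rhs)) return false;
    }
    return true;
}
```

Storing each condition as a flat list of conjuncts covers all four forms listed in the TODO item without needing a general expression tree; whether a missing path should fail or be ignored is a design choice the TODO leaves open.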