forked from dbpedia/extraction-framework
New Extractor
Dimitris Kontokostas edited this page Mar 1, 2016 · 2 revisions
According to https://github.com/dbpedia/extraction-framework/pull/35/#issuecomment-16187074, the current design is the following.
To create a new Extractor you need to extend the `Extractor[T]` trait, in particular one of:
- `WikiPageExtractor` when you want to use only the page metadata, e.g. `RedirectExtractor`, `LabelExtractor`, ...
- `PageNodeExtractor` when you want to work with the Wikitext AST (most common case)
- `JsonNodeExtractor` when you want to work with Wikidata pages
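The choice between these traits can be illustrated with a minimal, self-contained sketch. Everything in it is a simplified stand-in written for illustration: the `WikiPage`, `Quad`, `Extractor[T]` and `WikiPageExtractor` definitions below are assumptions that only mimic the shape of the framework's types, and the real `extract` signature and constructor arguments in the `core` module may differ. `MyNewExtractor` is a hypothetical extractor that only needs page metadata, so it extends the `WikiPageExtractor` stand-in.

```scala
// Stand-in types: simplified placeholders, NOT the framework's real classes.
final case class WikiPage(id: Long, title: String)
final case class Quad(subject: String, predicate: String, value: String)

// Simplified stand-in for the framework's Extractor[T] trait (assumed shape).
trait Extractor[T] {
  def extract(input: T, subjectUri: String): Seq[Quad]
}

// Stand-in for the metadata-only variant: its input is the page itself.
trait WikiPageExtractor extends Extractor[WikiPage]

// Hypothetical extractor: emits one rdfs:label statement per titled page.
class MyNewExtractor extends WikiPageExtractor {
  private val labelProperty = "http://www.w3.org/2000/01/rdf-schema#label"

  override def extract(page: WikiPage, subjectUri: String): Seq[Quad] =
    if (page.title.isEmpty) Seq.empty
    else Seq(Quad(subjectUri, labelProperty, page.title))
}

object Demo {
  def main(args: Array[String]): Unit = {
    val extractor = new MyNewExtractor
    val quads = extractor.extract(
      WikiPage(1L, "Berlin"),
      "http://dbpedia.org/resource/Berlin"
    )
    quads.foreach(println)
  }
}
```

An extractor that needs the parsed Wikitext or a Wikidata document would follow the same pattern with `PageNodeExtractor` or `JsonNodeExtractor` and the corresponding input type.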
Examples of Extractors can be found in the `org/dbpedia/extraction/mappings` package in the `core` module.
If you want to test your new extractor you can do it in two ways:
- For a full dump extraction, add `.MyNewExtractor` to the extraction property files in the `dump` module.
- Add your extractor to `server.default.properties` in the `server` module and start the mapping server with `../run server`. Open `http://localhost:{PORT}/server/extraction/{LANG}/` and try your extractor on a specific page. (You can also run this from your IDE and debug it.)
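For the dump option, an entry in one of the extraction property files might look like the fragment below. This is a sketch under assumptions: the `extractors` key name and the meaning of the leading dot are recalled from the property files shipped with the framework and should be checked against the actual files in the `dump` module; `.MyNewExtractor` is the hypothetical class name used on this page.

```properties
# Assumed key name; verify against the property files in the dump module.
# A leading "." is taken here as shorthand for the
# org.dbpedia.extraction.mappings package.
extractors=.LabelExtractor,.MyNewExtractor
```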