This repository was archived by the owner on Jan 17, 2022. It is now read-only.

Cannot use rule file: Unable to create Field Extractor #1

@jure

I get this error:

ERROR: edu.isi.bmkeg.lapdf.classification.ruleBased.RuleBasedChunkClassifier - Unable to create Field Extractor for 'isMostPopularFontModifierBold' of '[ClassObjectType class=edu.isi.bmkeg.lapdf.features.ChunkFeatures]' in rule 'Title' : [Rule name='Title']
Unable to create Field Extractor for 'getMostPopularFontSize' of '[ClassObjectType class=edu.isi.bmkeg.lapdf.features.ChunkFeatures]' in rule 'Title' : [Rule name='Title']
Unable to create Field Extractor for 'isAlignedMiddle' of '[ClassObjectType class=edu.isi.bmkeg.lapdf.features.ChunkFeatures]' in rule 'Title' : [Rule name='Title']
Unable to create Field Extractor for 'isAllCapitals' of '[ClassObjectType class=edu.isi.bmkeg.lapdf.features.ChunkFeatures]' in rule 'Title' : [Rule name='Title']
Unable to create Field Extractor for 'isMostPopularFontModifierBold' of '[ClassObjectType class=edu.isi.bmkeg.lapdf.features.ChunkFeatures]' in rule 'DrugName' : [Rule name='DrugName']
Unable to create Field Extractor for 'getMostPopularFontSize' of '[ClassObjectType class=edu.isi.bmkeg.lapdf.features.ChunkFeatures]' in rule 'DrugName' : [Rule name='DrugName']
Unable to create Field Extractor for 'isAlignedMiddle' of '[ClassObjectType class=edu.isi.bmkeg.lapdf.features.ChunkFeatures]' in rule 'DrugName' : [Rule name='DrugName']
Unable to create Field Extractor for 'isAllCapitals' of '[ClassObjectType class=edu.isi.bmkeg.lapdf.features.ChunkFeatures]' in rule 'DrugName' : [Rule name='DrugName']
Unable to create Field Extractor for 'isMostPopularFontModifierBold' of '[ClassObjectType class=edu.isi.bmkeg.lapdf.features.ChunkFeatures]' in rule 'Subtitle' : [Rule name='Subtitle']
Unable to create Field Extractor for 'getMostPopularFontSize' of '[ClassObjectType class=edu.isi.bmkeg.lapdf.features.ChunkFeatures]' in rule 'Subtitle' : [Rule name='Subtitle']
Unable to create Field Extractor for 'isAlignedLeft' of '[ClassObjectType class=edu.isi.bmkeg.lapdf.features.ChunkFeatures]' in rule 'Subtitle' : [Rule name='Subtitle']
Unable to create Field Extractor for 'isAllCapitals' of '[ClassObjectType class=edu.isi.bmkeg.lapdf.features.ChunkFeatures]' in rule 'Subtitle' : [Rule name='Subtitle']
Unable to create Field Extractor for 'isMostPopularFontModifierBold' of '[ClassObjectType class=edu.isi.bmkeg.lapdf.features.ChunkFeatures]' in rule 'Subsubtitle' : [Rule name='Subsubtitle']
Unable to create Field Extractor for 'getMostPopularFontSize' of '[ClassObjectType class=edu.isi.bmkeg.lapdf.features.ChunkFeatures]' in rule 'Subsubtitle' : [Rule name='Subsubtitle']
Unable to create Field Extractor for 'isAlignedLeft' of '[ClassObjectType class=edu.isi.bmkeg.lapdf.features.ChunkFeatures]' in rule 'Subsubtitle' : [Rule name='Subsubtitle']
Unable to create Field Extractor for 'isAllCapitals' of '[ClassObjectType class=edu.isi.bmkeg.lapdf.features.ChunkFeatures]' in rule 'Subsubtitle' : [Rule name='Subsubtitle']

when trying to parse a PDF with this ruleset:

package edu.isi.bmkeg.pdf.classification.rules
import edu.isi.bmkeg.lapdf.features.ChunkFeatures;
import edu.isi.bmkeg.lapdf.model.ChunkBlock;

global ChunkBlock chunk;

rule "Title"
    activation-group "blockClassification"
    salience 4
    when
        ChunkFeatures(pageNumber==1)
        ChunkFeatures(isMostPopularFontModifierBold==true)
        ChunkFeatures(getMostPopularFontSize==11)
        ChunkFeatures(isAlignedMiddle==true)
        ChunkFeatures(isAllCapitals==true)
        eval(chunk.readNumberOfLine()<=3)
    then
        chunk.setType(chunk.TYPE_TITLE);
end

rule "DrugName"
  activation-group "blockClassification"
  salience 4
  when
    ChunkFeatures(pageNumber==1)
    ChunkFeatures(isMostPopularFontModifierBold==true)
    ChunkFeatures(getMostPopularFontSize==11)
    ChunkFeatures(isAlignedMiddle==true)
    ChunkFeatures(isAllCapitals==false)
    eval(chunk.readNumberOfLine()<=4)
  then
    chunk.setType(chunk.TYPE_TITLE);
end

rule "Subtitle"
    activation-group "blockClassification"
    salience 4
    when
        ChunkFeatures(isMostPopularFontModifierBold==true)
        ChunkFeatures(getMostPopularFontSize==11)
        ChunkFeatures(isAlignedLeft==true)
        ChunkFeatures(isAllCapitals==true)
        eval(chunk.isMatchingRegularExpression("^[1-9]")==true)
    then
        chunk.setType(chunk.TYPE_TITLE);
end

rule "Subsubtitle"
  activation-group "blockClassification"
  salience 4
  when
    ChunkFeatures(isMostPopularFontModifierBold==true)
    ChunkFeatures(getMostPopularFontSize==11)
    ChunkFeatures(isAlignedLeft==true)
    ChunkFeatures(isAllCapitals==false)
    eval(chunk.isMatchingRegularExpression("^[1-9]")==true)
  then
    chunk.setType(chunk.TYPE_TITLE);
end

Any idea what could possibly be going wrong? I'm using the latest 1.7.2-SNAPSHOT version.
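
One thing that may be worth checking, offered as a guess rather than a confirmed diagnosis: Drools pattern constraints are normally written against JavaBean property names, not accessor method names. The names in the error messages (`isAllCapitals`, `getMostPopularFontSize`, and so on) look like accessor method names, whereas `pageNumber`, which is written as a property name, does not appear in the errors. The sketch below uses only the JDK's `java.beans.Introspector` to show which property names such accessors map to. `SampleFeatures` is a hypothetical stand-in; the real `ChunkFeatures` class may declare its methods differently.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Set;
import java.util.TreeSet;

public class BeanPropertyNames {

    // Hypothetical stand-in for ChunkFeatures (assumption: the real class
    // exposes accessors with these names, as the error messages suggest).
    public static class SampleFeatures {
        public boolean isAllCapitals() { return false; }
        public boolean isMostPopularFontModifierBold() { return true; }
        public int getMostPopularFontSize() { return 11; }
    }

    // Returns the JavaBean property names derived from the public accessors
    // of the given class, excluding those inherited from Object.
    public static Set<String> propertyNames(Class<?> type) {
        try {
            BeanInfo info = Introspector.getBeanInfo(type, Object.class);
            Set<String> names = new TreeSet<>();
            for (PropertyDescriptor descriptor : info.getPropertyDescriptors()) {
                names.add(descriptor.getName());
            }
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalStateException("Could not introspect " + type, e);
        }
    }

    public static void main(String[] args) {
        // Prints the property names, which drop the "is"/"get" prefix.
        System.out.println(propertyNames(SampleFeatures.class));
    }
}
```

If this is the cause, constraints of the form `ChunkFeatures(allCapitals==true)` or `ChunkFeatures(mostPopularFontSize==11)` would be the variant to try. This has not been verified against 1.7.2-SNAPSHOT.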
