Argument types_src not effective in get_dbpedia_uris() for CWB corpora when not also defining an s-attribute #61

@ChristophLeonhardt

For CWB subcorpora, the columns defined in types_src are not returned by get_dbpedia_uris(). I think this is due to the hard-coded selection of columns here:

dbpedia/R/dbpedia.R

Lines 857 to 867 in 9392164

tab <- links[,
  list(
    cpos_left = expand_fun(.SD, direction = "left"),
    cpos_right = expand_fun(.SD, direction = "right"),
    dbpedia_uri = .SD[["dbpedia_uri"]],
    text = .SD[["text"]],
    types = .SD[["types"]]
  ),
  by = c("start", "end"),
  .SDcols = c("start", "end", "dbpedia_uri", "text", "types")
]

If I understand this correctly, the call adds two columns, cpos_left and cpos_right, to the links object created earlier. All other columns are meant to pass through unchanged, but they are listed by name in both the list() call and .SDcols.

This results in two issues. First, additional columns returned in links, such as "DBpedia_type" or "Wikidata_type", are not kept. Second, columns that might be absent cause errors: recent versions of dbpedia allow you to drop the types column, but if you drop it in this scenario, the call fails because the column is expected but not available.
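Both symptoms can be reproduced outside the package. This is only a minimal sketch on a toy data.table: expand_stub() is a hypothetical stand-in for the internal expand_fun() (it just returns start or end unchanged), hard_coded() is a hypothetical wrapper around the quoted call, and the column values are made up for illustration.

```r
library(data.table)

# Hypothetical stand-in for the package-internal expand_fun();
# it simply returns the start or end value of the group unchanged.
expand_stub <- function(.SD, direction) {
  if (direction == "left") .SD[["start"]] else .SD[["end"]]
}

# Toy links table with one extra column ("DBpedia_type"), as types_src might add.
links <- data.table(
  start = c(1L, 5L),
  end = c(2L, 7L),
  dbpedia_uri = c("http://dbpedia.org/resource/A", "http://dbpedia.org/resource/B"),
  text = c("a", "b"),
  types = c("x", "y"),
  DBpedia_type = c("Person", "Place")
)

# Same shape as the hard-coded call quoted above.
hard_coded <- function(dt) {
  dt[,
    list(
      cpos_left = expand_stub(.SD, direction = "left"),
      cpos_right = expand_stub(.SD, direction = "right"),
      dbpedia_uri = .SD[["dbpedia_uri"]],
      text = .SD[["text"]],
      types = .SD[["types"]]
    ),
    by = c("start", "end"),
    .SDcols = c("start", "end", "dbpedia_uri", "text", "types")
  ]
}

# Issue 1: the extra column is not part of the result.
tab <- hard_coded(links)
"DBpedia_type" %in% names(tab)

# Issue 2: without a "types" column the call fails, because .SDcols
# names a column that does not exist.
links_no_types <- links[, setdiff(names(links), "types"), with = FALSE]
res <- tryCatch(hard_coded(links_no_types), error = function(e) e)
inherits(res, "error")
```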

I was wondering whether it is possible to simplify the code above to really only add cpos_left and cpos_right as follows:

links[, "cpos_left" := expand_fun(.SD, direction = "left"), by = c("start", "end"), .SDcols = c("start", "end")]
links[, "cpos_right" := expand_fun(.SD, direction = "right"), by = c("start", "end"), .SDcols = c("start", "end")]
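To illustrate the intended effect, here is a minimal sketch of the same two assignments on a toy table that has an extra "DBpedia_type" column and no types column. As an assumption for illustration, expand_stub() is a hypothetical stand-in for the internal expand_fun() and simply returns start or end; the data are made up.

```r
library(data.table)

# Hypothetical stand-in for the package-internal expand_fun().
expand_stub <- function(.SD, direction) {
  if (direction == "left") .SD[["start"]] else .SD[["end"]]
}

# Toy links table: extra "DBpedia_type" column, "types" column dropped.
links <- data.table(
  start = c(1L, 5L),
  end = c(2L, 7L),
  dbpedia_uri = c("http://dbpedia.org/resource/A", "http://dbpedia.org/resource/B"),
  text = c("a", "b"),
  DBpedia_type = c("Person", "Place")
)

# Add only the two new columns by reference; every other column is untouched.
links[, "cpos_left" := expand_stub(.SD, direction = "left"),
      by = c("start", "end"), .SDcols = c("start", "end")]
links[, "cpos_right" := expand_stub(.SD, direction = "right"),
      by = c("start", "end"), .SDcols = c("start", "end")]

names(links)
```

Two things differ from the current code and may matter downstream. `:=` modifies links by reference rather than creating a new tab object, so a copy() would be needed if the original table has to stay unchanged. The new columns are also appended at the end instead of directly after start and end; setcolorder() could restore the previous order if other code relies on it.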
