create a standard way within corppa to combine poetry metadata and text from all reference corpora #204

@mnaydan

Description

  • n/a tiny corpus
  • Internet Poems
  • Chadwyck-Healey

primary goal / must have:

  • add tiny corpus
  • method to generate merged poem metadata for use with excerpts
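One possible shape for the merged-metadata method, as a minimal sketch only. The function name `merge_poem_metadata`, the `COMMON_FIELDS` tuple, the field names, and the sample rows are all hypothetical placeholders, not existing corppa code or real corpus data. The idea is that each reference corpus yields its own metadata records, and the merge step projects them onto one shared field set and tags each row with its source corpus.

```python
from collections.abc import Iterable, Iterator

# Hypothetical shared field set; the real list of fields is not specified in this issue.
COMMON_FIELDS = ("poem_id", "ref_corpus", "author", "title")


def merge_poem_metadata(corpora: dict[str, Iterable[dict]]) -> Iterator[dict]:
    """Yield one metadata row per poem across all corpora, with a common set of fields."""
    for corpus_id, records in corpora.items():
        for record in records:
            # Fields a corpus does not provide are filled with None.
            row = {field: record.get(field) for field in COMMON_FIELDS}
            # Always record which reference corpus the poem came from.
            row["ref_corpus"] = corpus_id
            yield row


# Placeholder records for illustration only (not real corpus data).
merged = list(
    merge_poem_metadata(
        {
            "tiny-corpus": [{"poem_id": "t1", "author": "Author A", "title": "Example A"}],
            "internet-poems": [{"poem_id": "ip1", "title": "Example B"}],
        }
    )
)
```

Keeping the merge as a generator over plain dictionaries would let the same rows feed a data frame or a CSV writer without committing to either here.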

revise configurations / setup for reuse:

  • tar.gz config options should be named text_dir and accept EITHER a tarball or a directory path
  • internet poems uses the tarball file list (or directory path) to generate metadata
  • other poems config metadata_path is a URL or a local file
  • update shared config file
  • simplify polars schema handling (disable inference)
  • raise NotImplementedError if a tar file is not handled
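The tarball-or-directory handling in the list above could look roughly like the following sketch. The helper name `list_text_files` and the `.txt` filter are assumptions made for illustration, and this version raises NotImplementedError for any source that is neither a directory nor a readable tar archive, which may be broader than what the issue intends.

```python
import tarfile
import tempfile
from pathlib import Path


def list_text_files(text_dir: Path) -> list[str]:
    """List .txt file names from a text_dir that is either a directory or a tarball."""
    text_dir = Path(text_dir)
    if text_dir.is_dir():
        # Directory: walk it recursively and report paths relative to the root.
        return sorted(p.relative_to(text_dir).as_posix() for p in text_dir.rglob("*.txt"))
    if text_dir.is_file() and tarfile.is_tarfile(text_dir):
        # Tarball: read the member list without extracting anything.
        with tarfile.open(text_dir) as archive:
            return sorted(
                m.name for m in archive.getmembers() if m.isfile() and m.name.endswith(".txt")
            )
    # Anything else is not handled in this sketch.
    raise NotImplementedError(f"Unsupported text source: {text_dir}")


# Build a throwaway directory and tarball to show both paths give the same listing.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "poems"
    source.mkdir()
    (source / "poem1.txt").write_text("placeholder text")
    tarball = Path(tmp) / "poems.tar.gz"
    with tarfile.open(tarball, "w:gz") as archive:
        archive.add(source / "poem1.txt", arcname="poem1.txt")
    from_dir = list_text_files(source)
    from_tar = list_text_files(tarball)
```

Reading only the member list means a file listing for metadata generation does not require unpacking the archive.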

review/testing notes:

  • able to run locally
  • able to run on prosody staging vm

follow up work (later):
