papaya

IIIF Presentation API Application

Configuration

Environment Variables

  • PAPAYA_URL Public facing base URL of this application.
  • PAPAYA_FCREPO_ENDPOINT URL of the Fedora repository. This is not directly accessed, but is used when translating between URIs and IIIF identifiers.
  • PAPAYA_FCREPO_PREFIX Prefix string to use in IIIF identifiers for resources in the Fedora repository.
  • PAPAYA_SOLR_ENDPOINT URL of the Solr server that will provide the metadata about the resources.
  • PAPAYA_SOLR_TEXT_MATCH_FIELD Field name in Solr to use for text queries that will return hit highlight annotation lists.
  • PAPAYA_IIIF_IMAGE_ENDPOINT URL of the IIIF Image API server that provides additional metadata about the images.
  • PAPAYA_IIIF_IMAGE_ORIGIN Actual request URL to use for the IIIF Image API server, if it differs from PAPAYA_IIIF_IMAGE_ENDPOINT.
  • PAPAYA_THUMBNAIL_WIDTH Maximum width of thumbnail images included in the manifest.
  • PAPAYA_LOGO_URL URL of an image file to be used as the logo in the manifest.
  • PAPAYA_METADATA_QUERIES_FILE YAML or JSON formatted file that contains a mapping from metadata field label to a jq query to retrieve the value or values for that field from the Solr document for a resource.

Files

  • METADATA_QUERIES_FILE The YAML or JSON file named by PAPAYA_METADATA_QUERIES_FILE. It maps metadata field labels to jq queries, and also includes the queries needed to build the IIIF Manifest structure (see papaya.source.Resource for more info). For example:

    Title: .object__title__display[]?
    Date: .object__date__edtf
    Bibliographic Citation: .object__bibliographic_citation__display[]?
    Creator: .object__creator[]?.agent__label__display[]
    Contributor: .object__contributor[]?.agent__label__display[]?
    Subject: .object__subject[]?.subject__label__display[]
    # structural metadata fields
    # used by Papaya to generate canvases, sequences, etc.
    $uri: .id
    $label: .object__title__display[]?
    $date: .object__date__dt?
    $license_uri: .object__rights__same_as__uris[0]
    $page_uris: .page_uri_sequence__uris[]?
    $page_image_ids: .iiif_thumbnail_sequence__ids[]?
    $*page_doc: .object__has_member[]|select(.id == $uri)
    $*page_label: .object__has_member[]|select(.id == $uri).page__title__txt
    $*file_page_uri: .object__has_member[]|select(.page__has_file[].id == $uri).id
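
The split between display fields and structural fields in this file can be sketched in Python. This is only an illustration, not Papaya's own loader (see papaya.source.Resource for that). It assumes the file has already been parsed into a dict and that every label beginning with `$` is a structural query; the function name split_queries is hypothetical.

```python
def split_queries(queries: dict[str, str]) -> tuple[dict[str, str], dict[str, str]]:
    """Separate display-field queries from '$'-prefixed structural queries."""
    descriptive: dict[str, str] = {}
    structural: dict[str, str] = {}
    for label, query in queries.items():
        # Assumption: a leading '$' marks a structural field.
        target = structural if label.startswith('$') else descriptive
        target[label] = query
    return descriptive, structural


# A few entries from the example above, as they would look once parsed.
queries = {
    'Title': '.object__title__display[]?',
    'Date': '.object__date__edtf',
    '$uri': '.id',
    '$page_uris': '.page_uri_sequence__uris[]?',
}
descriptive, structural = split_queries(queries)
print(list(descriptive))
print(list(structural))
```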

IIIF Image Service Endpoint vs. Origin

The PAPAYA_IIIF_IMAGE_ENDPOINT is the canonical base URI for the IIIF Image server associated with this instance of Papaya. For many cases, it will also be the base URL that is used to make requests to that service.

However, there are cases where it makes sense to separate the canonical base URI from the request base URL. For instance, consider the case where both Papaya and the IIIF Image server are running inside a Kubernetes cluster and can connect to each other directly without leaving the internal Kubernetes network. Here it is beneficial to use the cluster-internal base URL to make the HTTP connections, while retaining the canonical URI for any links in the generated manifest.

In this case, use PAPAYA_IIIF_IMAGE_ORIGIN to set the request base URL for the IIIF Image service. When this value is set, Papaya will use it instead of PAPAYA_IIIF_IMAGE_ENDPOINT to generate request URLs. In addition, Papaya will create a set of X-Forwarded-* headers to add to requests that reflect the canonical URI.

For example, given:

  • PAPAYA_IIIF_IMAGE_ENDPOINT is https://iiif.example.com/images/iiif/2
  • PAPAYA_IIIF_IMAGE_ORIGIN is http://papaya:3001/iiif/2

The headers would be:

  • X-Forwarded-Proto: https
  • X-Forwarded-Host: iiif.example.com
  • X-Forwarded-Path: /images

The X-Forwarded-Path is calculated by removing the path of the origin URL (e.g., /iiif/2) from the end of the path of the endpoint URI (e.g., /images/iiif/2).
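
The calculation described above can be sketched with Python's standard library. This illustrates the described behavior and is not Papaya's own code; the function name forwarded_headers is hypothetical, and the sketch assumes the endpoint path ends with the origin path.

```python
from urllib.parse import urlsplit


def forwarded_headers(endpoint: str, origin: str) -> dict[str, str]:
    """Build X-Forwarded-* headers that reflect the canonical endpoint URI."""
    endpoint_parts = urlsplit(endpoint)
    origin_path = urlsplit(origin).path
    # Remove the origin path from the end of the endpoint path.
    # If the endpoint path does not end with it, the path is left unchanged.
    forwarded_path = endpoint_parts.path.removesuffix(origin_path)
    return {
        'X-Forwarded-Proto': endpoint_parts.scheme,
        'X-Forwarded-Host': endpoint_parts.netloc,
        'X-Forwarded-Path': forwarded_path,
    }


headers = forwarded_headers(
    endpoint='https://iiif.example.com/images/iiif/2',
    origin='http://papaya:3001/iiif/2',
)
for name, value in headers.items():
    print(f'{name}: {value}')
```

Requests would still be sent to the origin URL; the headers only tell the image server which canonical URI to use in its responses.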

Development Setup

Requires Python 3.14

These setup instructions also assume that you are running the development stacks for both umd-fcrepo and umd-iiif.

git clone git@github.com:umd-lib/papaya.git
cd papaya
python -m venv --prompt "papaya-py$(cat .python-version)" .venv
source .venv/bin/activate
pip install -e . --group test

Create a .env file with the following contents:

FLASK_DEBUG=1
PAPAYA_URL=http://localhost:3001/manifests
PAPAYA_FCREPO_ENDPOINT=http://fcrepo-local:8080/fcrepo/rest
PAPAYA_FCREPO_PREFIX=fcrepo:
PAPAYA_SOLR_ENDPOINT=http://localhost:8985/solr/fcrepo
PAPAYA_SOLR_TEXT_MATCH_FIELD=extracted_text__dps_txt
PAPAYA_IIIF_IMAGE_ENDPOINT=http://localhost:8182/iiif/2
PAPAYA_THUMBNAIL_WIDTH=250
PAPAYA_LOGO_URL=https://www.lib.umd.edu/images/wrapper/liblogo.png
PAPAYA_METADATA_QUERIES_FILE=metadata-queries.yml

Running

flask --app papaya.web run

The application will be available at http://localhost:5000

To listen on a different port, supply the --port option:

flask --app papaya.web run --port 3001

Tests

pytest

With coverage information:

pytest --cov src --cov-report term-missing tests

API Documentation

pip install -e . --group docs
pdoc papaya

API documentation generated by pdoc will be available at http://localhost:8080/.

To serve the documentation on an alternate port:

pdoc -p 8888 papaya

Now the documentation will be at http://localhost:8888/.

Docker Image

Build the image:

docker build -t docker.lib.umd.edu/papaya .

When running in a Docker container, the PAPAYA_SOLR_ENDPOINT and PAPAYA_IIIF_IMAGE_ENDPOINT environment variables will need to be adjusted to refer to the correct hostname.

Copy the .env file set up earlier to docker.env, and make these changes:

PAPAYA_SOLR_ENDPOINT=http://host.docker.internal:8985/solr/fcrepo
PAPAYA_IIIF_IMAGE_ENDPOINT=http://host.docker.internal:8182/iiif/2
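
The same edit can be scripted. The sketch below is one hypothetical way to derive the docker.env contents from the .env text. It assumes simple NAME=value lines and that only the host part of the two URLs needs to change; to_docker_env is not part of Papaya.

```python
from urllib.parse import urlsplit, urlunsplit

DOCKER_HOST = 'host.docker.internal'
# The two variables that must point at the Docker host.
ADJUSTED = {'PAPAYA_SOLR_ENDPOINT', 'PAPAYA_IIIF_IMAGE_ENDPOINT'}


def to_docker_env(env_text: str) -> str:
    """Return env_text with the hostname of the adjusted URLs replaced."""
    lines = []
    for line in env_text.splitlines():
        name, sep, value = line.partition('=')
        if sep and name in ADJUSTED:
            parts = urlsplit(value)
            # Keep the original port, if any, and swap only the hostname.
            netloc = DOCKER_HOST if parts.port is None else f'{DOCKER_HOST}:{parts.port}'
            line = f'{name}={urlunsplit(parts._replace(netloc=netloc))}'
        lines.append(line)
    return '\n'.join(lines) + '\n'


sample = (
    'FLASK_DEBUG=1\n'
    'PAPAYA_SOLR_ENDPOINT=http://localhost:8985/solr/fcrepo\n'
    'PAPAYA_IIIF_IMAGE_ENDPOINT=http://localhost:8182/iiif/2\n'
)
print(to_docker_env(sample), end='')
```

Passing the contents of .env through to_docker_env and writing the result to docker.env would reproduce the two changes shown above while leaving every other line untouched.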

Run, using this new docker.env file:

docker run --rm -it -p 3001:5000 --env-file docker.env docker.lib.umd.edu/papaya

Name

This application is so-named because the phrase "Presentation API Application" could be abbreviated "PAPIA", which could be pronounced the same as "papaya", and because it is paired with the Cantaloupe IIIF image server in the UMD Libraries' IIIF services stack.

License

Apache-2.0

See the LICENSE file for license rights and limitations.
