Missing media type (MIME type) for BagIt #22

@paulmillar

A media type (also known as a MIME type) is a widely used system for labelling the format of data. There is a central database of Media/MIME types maintained by the Internet Assigned Numbers Authority (IANA). More information about Media/MIME types is available at the Media Type Wikipedia entry.

Currently, there is no media type for BagIt.

This lack of a media type can cause problems, particularly in situations where a file might (or might not) be a BagIt file. As a concrete example, DataCite is updating their metadata schema so that it supports accessing the files in a dataset. One possibility is to provide the data directly (e.g., as a zip file); another possibility is to describe how to fetch the data using an empty BagIt file (one with an empty /data directory and details on how to fetch the data via the fetch.txt file). The DataCite metadata schema supports recording the media type of the file; however, in both cases, the file would have the media type application/zip. A client may wish to download the file if it is a BagIt file (for example, to obtain metadata), but is currently unable to determine whether the linked zip file is a BagIt file.
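To illustrate the problem: with only application/zip to go on, a client has to download the archive and inspect it before it knows whether it is a bag. The following is a minimal sketch of such a check, assuming the bag is serialized as a single top-level directory containing a bagit.txt declaration; the function name `looks_like_bagit_zip` is hypothetical, and the check is a heuristic rather than a full validation.

```python
import zipfile


def looks_like_bagit_zip(zip_source):
    """Return True if the zip archive appears to contain a serialized bag.

    Heuristic (an assumption of this sketch): a file named bagit.txt sits
    directly inside a top-level directory of the archive.
    `zip_source` may be a path or a seekable binary file object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            parts = name.split("/")
            # e.g. "mybag/bagit.txt" -> ["mybag", "bagit.txt"]
            if len(parts) == 2 and parts[1] == "bagit.txt":
                return True
    return False
```

The point is that this test can only run after the download; a dedicated media type would let the client decide beforehand.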

Media type labels are somewhat sophisticated and include a few features that may prove useful for BagIt.

One feature of media types is the availability of suffixes. This allows a media type to describe both the file format and the underlying format; e.g., application/bagit+zip could describe a BagIt file that is based on the zip archive format. This allows clients that do not support BagIt but that do support zip (application/zip) archives to process the file; for example, to check the integrity of the files in the archive or to scan the file for viruses.
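As a rough sketch of how a client that does not know BagIt might use the suffix, the code below splits a media type into its parts and looks up a generic fallback type. The names `split_suffix`, `fallback_type`, and the `FALLBACKS` table are hypothetical and exist only for illustration; application/bagit+zip is the type proposed in this issue, not a registered one.

```python
# Hypothetical table mapping a structured syntax suffix to the generic
# media type a client could fall back to.
FALLBACKS = {
    "zip": "application/zip",
    "json": "application/json",
    "xml": "application/xml",
}


def split_suffix(media_type):
    """Split 'type/subtype+suffix' into (type, subtype, suffix or None)."""
    # Drop any parameters (";name=value") before looking at the type itself.
    essence = media_type.split(";", 1)[0].strip().lower()
    top, _, sub = essence.partition("/")
    base, plus, suffix = sub.rpartition("+")
    if not plus:
        return top, sub, None
    return top, base, suffix


def fallback_type(media_type):
    """Return the generic type implied by the suffix, or None if there is none."""
    _, _, suffix = split_suffix(media_type)
    return FALLBACKS.get(suffix)
```

With this, `fallback_type("application/bagit+zip")` yields `"application/zip"`, so a virus scanner or integrity checker could still treat the file as an ordinary zip archive.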

Another feature of media types is parameters. Parameters allow a media type to include metadata about the file. One common parameter is profile. This provides a flexible way to be more specific about the nature of the file without creating many new media types. There is already a profile language for BagIt: BagIt-profiles.

Putting these together, here is an example of my suggestion for a BagIt media type:

application/bagit+zip;profile=https://example.org/bagit/my-profile
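For illustration, here is a minimal sketch of how a client might pull the profile parameter out of such a label. The function `parse_media_type` is hypothetical and deliberately simple: it splits on `;`, so it does not handle quoted values that themselves contain a semicolon.

```python
def parse_media_type(value):
    """Split a media type label into (type, parameters).

    Simplified sketch: splits on ';' and on the first '=' of each
    parameter, and strips surrounding double quotes from values.
    """
    essence, *raw_params = value.split(";")
    params = {}
    for raw in raw_params:
        name, sep, val = raw.strip().partition("=")
        if not sep:
            continue  # ignore malformed parameters without '='
        params[name.strip().lower()] = val.strip().strip('"')
    return essence.strip().lower(), params


media_type, params = parse_media_type(
    "application/bagit+zip;profile=https://example.org/bagit/my-profile"
)
print(media_type)         # application/bagit+zip
print(params["profile"])  # https://example.org/bagit/my-profile
```

One detail worth settling in the discussion: the generic parameter grammar for media types restricts unquoted values to token characters, which exclude `/` and `:`, so in contexts such as an HTTP Content-Type header the profile URL would likely need to be written as a quoted string.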

I would advocate a discussion on what the BagIt media type should look like. Once a consensus is established, the corresponding media type should then be registered with IANA, so that it may be used to describe BagIt files.
