|
OAI-PMH “defines a mechanism for harvesting XML-formatted metadata from repositories”. The protocol mandates unqualified Dublin Core (DC) as its common metadata format. It also “supports the notion of multiple metadata sets, allowing communities to expose metadata in formats that are specific to their applications and domains”. Such additional formats might include domain-specific Dublin Core Application Profiles (DCAPs) or other XML formats, such as LOM or ODRL.
Non-XML data, e.g., MARC records, could either be conveyed using an XML translation (e.g., MARCXML) or could conceivably be wrapped in a CDATA section within the XML record itself. The latter approach, although messy, could also be used to harvest content as well as metadata over OAI-PMH, using content packaging standards such as METS, IMS Content Packaging and MPEG-21 DIDL.
The OAI-PMH specification defines two main actors:
- harvester - a client application that issues OAI-PMH requests. A harvester is operated by a service provider as a means of collecting metadata from repositories.
- repository - a network-accessible server, managed by a data provider to expose metadata to harvesters.
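To make the harvester role concrete, the following is a minimal sketch of how a client might assemble OAI-PMH request URLs. The verb names and the `metadataPrefix` and `identifier` arguments come from the OAI-PMH 2.0 specification; the base URL `https://repository.example.org/oai`, the item identifier, and the helper name `build_request_url` are assumptions made purely for illustration.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def build_request_url(base_url, verb, **arguments):
    """Return a GET URL for an OAI-PMH request (hypothetical helper)."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    params = {"verb": verb}
    # Skip arguments the caller left unset.
    params.update({k: v for k, v in arguments.items() if v is not None})
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint, used only for illustration.
BASE_URL = "https://repository.example.org/oai"

# Ask which metadata formats the repository can disseminate.
print(build_request_url(BASE_URL, "ListMetadataFormats"))

# Request one item's record in the mandatory unqualified Dublin Core format.
print(
    build_request_url(
        BASE_URL,
        "GetRecord",
        identifier="oai:repository.example.org:1234",
        metadataPrefix="oai_dc",
    )
)
```

The sketch only builds URLs; a working harvester would also issue the HTTP requests and handle error responses and partial result lists, which are omitted here.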
To allow various repository configurations, the protocol distinguishes between three distinct entities related to the metadata made accessible by repositories:
- resource - the object or “stuff” that metadata is “about”. The nature of a resource, whether it is physical or digital, or whether it is stored in the repository or is a constituent of another database, is outside the scope of the protocol.
- item - a constituent of a repository from which metadata about a resource can be disseminated. An item is conceptually a container that stores or dynamically generates metadata about a single resource in multiple formats.
- record - metadata in a specific metadata format. A record is returned as an XML-encoded byte stream in response to a protocol request to disseminate a specific metadata format from a constituent item.
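The relationship between an item and a record can be illustrated with a short sketch that parses a `GetRecord` response. The XML namespaces and element names follow the OAI-PMH 2.0 and `oai_dc` schemas; the sample response, its identifier, and the helper name `parse_record` are hypothetical and stand in for what a real repository would return.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by OAI-PMH 2.0 responses and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written sample response (an assumption, not output from a real repository).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2024-01-01T00:00:00Z</responseDate>
  <request verb="GetRecord" identifier="oai:repository.example.org:1234"
           metadataPrefix="oai_dc">https://repository.example.org/oai</request>
  <GetRecord>
    <record>
      <header>
        <identifier>oai:repository.example.org:1234</identifier>
        <datestamp>2023-06-15</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example resource</dc:title>
          <dc:creator>Example, Author</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>"""


def parse_record(xml_text):
    """Extract the item identifier, datestamp, and DC titles from a response."""
    root = ET.fromstring(xml_text)
    record = root.find(".//oai:record", NS)
    header = record.find("oai:header", NS)
    return {
        # The header identifies the item the record was disseminated from.
        "identifier": header.findtext("oai:identifier", namespaces=NS),
        "datestamp": header.findtext("oai:datestamp", namespaces=NS),
        # The metadata part holds the record in the requested format.
        "titles": [el.text for el in record.findall(".//dc:title", NS)],
    }


print(parse_record(SAMPLE_RESPONSE))
```

Requesting the same identifier with a different `metadataPrefix` would yield a different record for the same item, which is the distinction the definitions above draw.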
|