Open Archives Initiative Protocol for Metadata Harvesting
The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a protocol developed for harvesting metadata descriptions of records in an archive so that services can be built using metadata from many archives. An implementation of OAI-PMH must support representing metadata in Dublin Core, but may also support additional representations.
The protocol is usually just referred to as the OAI Protocol.
OAI-PMH uses XML. Version 2.0 of the protocol was released in 2002; the document was last updated in 2015. It has a Creative Commons license.
In the late 1990s, Herbert Van de Sompel was working with researchers and librarians at Los Alamos National Laboratory (US) and called a meeting to address interoperability problems among e-print servers and digital repositories. The meeting was held in Santa Fe, New Mexico, in October 1999.
A key development from the meeting was the definition of an interface that permitted e-print servers to expose metadata for the papers they held in a structured fashion, so that other repositories could identify and copy papers of interest. This interface/protocol was named the "Santa Fe Convention".
Several workshops were held in 2000 at the ACM Digital Libraries conference, at the 1st ACM/IEEE-CS Joint Conference on Digital Libraries, and elsewhere to share the ideas from the Santa Fe Convention.
The workshops revealed that the problems faced by the e-print community were shared by libraries, museums, journal publishers, and others who needed to share distributed resources. To address these needs, the Coalition for Networked Information and the Digital Library Federation provided funding to establish an Open Archives Initiative (OAI) secretariat managed by Herbert Van de Sompel and Carl Lagoze. The OAI held a meeting at Cornell University (Ithaca, New York) in September 2000 to improve the interface developed at the Santa Fe Convention.
The specifications were refined over e-mail.
OAI-PMH version 1.0 was introduced to the public in January 2001 at a workshop in Washington D.C. and at another in February in Berlin, Germany. Subsequent modifications to the XML standard by the W3C required minor modifications to OAI-PMH, resulting in version 1.1. The current version, 2.0, was released in June 2002. It contained several technical changes and enhancements and is not backward compatible.
Some commercial search engines use OAI-PMH to acquire more resources. Google initially included support for OAI-PMH when launching sitemaps, but decided to support only the standard XML Sitemaps format in May 2008.
In 2004, Yahoo! acquired content from OAIster (University of Michigan) that was obtained through metadata harvesting with OAI-PMH. Wikimedia uses an OAI-PMH repository to provide feeds of Wikipedia and related site updates for search engines and other bulk analysis/republishing endeavors.
Especially when thousands of files are harvested every day, OAI-PMH can reduce network traffic and other resource usage through incremental harvesting. The Mercury metadata search system uses OAI-PMH to index thousands of metadata records from the Global Change Master Directory (GCMD) every day.
OAI-PMH has later been applied to the sharing of scientific data.
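The saving from incremental harvesting can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the repository is an in-memory list rather than a real OAI-PMH endpoint, and `select_records`, `harvest`, and the `state` dictionary are hypothetical names, not part of any library. The one protocol detail relied on is that the `from` datestamp of a request is inclusive, so a harvester that resumes from its last seen datestamp may receive a few records again and has to skip them.

```python
def select_records(repository, from_date=None):
    """Mimic a repository answering a datestamp-restricted request.

    Dates are ISO 8601 strings (YYYY-MM-DD), which sort correctly as text.
    The lower bound is inclusive, as in OAI-PMH selective harvesting.
    """
    return [r for r in repository
            if from_date is None or r["datestamp"] >= from_date]


def harvest(repository, state):
    """Run one harvest; return (records transferred, records that are new)."""
    batch = select_records(repository, state.get("last_datestamp"))
    # Skip records already stored with the same datestamp.
    new = [r for r in batch
           if state["seen"].get(r["identifier"]) != r["datestamp"]]
    for r in new:
        state["seen"][r["identifier"]] = r["datestamp"]
    if batch:
        # Remember where to resume on the next run.
        state["last_datestamp"] = max(r["datestamp"] for r in batch)
    return len(batch), new


# Hypothetical repository contents.
repository = [
    {"identifier": "oai:example.org:a", "datestamp": "2021-01-01"},
    {"identifier": "oai:example.org:b", "datestamp": "2021-01-02"},
    {"identifier": "oai:example.org:c", "datestamp": "2021-01-03"},
]
state = {"last_datestamp": None, "seen": {}}

first_count, _ = harvest(repository, state)        # full harvest: 3 records
repository.append({"identifier": "oai:example.org:d",
                   "datestamp": "2021-01-05"})
second_count, added = harvest(repository, state)   # only 2 records transferred
```

In this sketch the second run transfers two records instead of four: the record sharing the resume datestamp is sent again and ignored, and only the newly added record is stored.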
OAI-PMH is based on a client–server architecture, in which "harvesters" request information on updated records from "repositories". Requests for data can be based on a datestamp range, and can be restricted to named sets defined by the provider. Data providers are required to provide XML metadata in Dublin Core format, and may also provide it in other XML formats.
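As a rough illustration of the harvester side, the sketch below builds a `ListRecords` request restricted by datestamp range and set, and reads Dublin Core fields out of a response. The base URL, the set name `physics`, the sample response, and the helper names `list_records_url` and `parse_records` are all invented for the example; the request parameters (`verb`, `metadataPrefix`, `from`, `until`, `set`) and the XML namespaces are those of OAI-PMH 2.0 and Dublin Core. No network access is performed.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces used by OAI-PMH 2.0 responses and Dublin Core metadata.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, from_date=None, until_date=None, set_spec=None):
    """Build a ListRecords request URL asking for Dublin Core (oai_dc)."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if from_date:
        params["from"] = from_date      # start of the datestamp range
    if until_date:
        params["until"] = until_date    # end of the datestamp range
    if set_spec:
        params["set"] = set_spec        # named set defined by the provider
    return base_url + "?" + urlencode(params)


# Hand-written sample of what a repository might return (hypothetical data).
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
        <datestamp>2002-06-14</datestamp>
        <setSpec>physics</setSpec>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example paper</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def parse_records(xml_text):
    """Extract identifier, datestamp and Dublin Core title from each record."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        header = rec.find("oai:header", NS)
        records.append({
            "identifier": header.findtext("oai:identifier", namespaces=NS),
            "datestamp": header.findtext("oai:datestamp", namespaces=NS),
            "title": rec.findtext(".//dc:title", namespaces=NS),
        })
    return records


url = list_records_url("https://repo.example.org/oai",
                       from_date="2021-01-01", set_spec="physics")
records = parse_records(SAMPLE_RESPONSE)
```

A real harvester would also need to fetch the URL over HTTP, follow resumption tokens for large result lists, and handle error responses, none of which is shown here.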
A number of software systems support OAI-PMH, including Fedora, EThOS from the British Library, GNU EPrints from the University of Southampton, Open Journal Systems from the Public Knowledge Project, HyperJournal from the University of Pisa, Digibib from Digibis, MyCoRe, Primo, DigiTool, Rosetta and MetaLib from Ex Libris, ArchivalWare from PTFS, DOOR from the eLab in Lugano, Switzerland, panFMP from PANGAEA (data library), SimpleDL from Roaring Development, and jOAI.
A number of large archives support the protocol, including arXiv and the CERN Document Server.