

Harvesters - national and international

Saved by Alma Swan
on April 26, 2009 at 4:08:43 pm

Map version


Global harvesters (other than Web search engines)




Developed at the University of Michigan (originally in collaboration with the University of Illinois). Harvests from over 1000 source collections and has almost 20 million records. Anticipates a number of enhancements, including Google-like search, browsing by subject, access to all potential duplicates (with an attempt to discover and remove duplicates during ingest) and automated clustering of subject metadata in the test search interfaces. Discloses deep web resources.
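
As an illustration of what discovering potential duplicates during ingest can involve, here is a minimal sketch. It is not this harvester's actual method: the matching key (normalised title plus creator), the function names and the sample records are all assumptions made for the example.

```python
import re
import unicodedata
from collections import defaultdict


def match_key(record):
    """Build a crude matching key from title and creator (an assumed heuristic)."""
    text = f"{record.get('title', '')} {record.get('creator', '')}"
    # Strip accents, lowercase, and drop punctuation so trivial variants collide.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return " ".join(text.split())


def group_potential_duplicates(records):
    """Return the groups of records (two or more) that share a matching key."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical records, as if harvested from two different repositories.
records = [
    {"id": "repo-a:1", "title": "Open Access: A Survey", "creator": "Müller, A."},
    {"id": "repo-b:7", "title": "Open access - a survey", "creator": "Muller, A"},
    {"id": "repo-b:9", "title": "Metadata Clustering", "creator": "Smith, J."},
]
duplicates = group_potential_duplicates(records)
```

Grouping rather than deleting keeps every potential duplicate accessible, so a later step (or a human) can decide which copy to show.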


BASE (Bielefeld Academic Search Engine)


Developed at the University of Bielefeld. Harvests from over 1000 source collections and has almost 16 million records. Anticipates a number of enhancements, including a browse function, provision of HTTP and SOAP interfaces and the inclusion of more source collections.


NDLTD (Networked Digital Library of Theses and Dissertations)


The Networked Digital Library of Theses and Dissertations (NDLTD) is an international organisation dedicated to promoting the adoption, creation, use, dissemination and preservation of electronic analogues to the traditional paper-based theses and dissertations.




Developed and hosted at the University of St. Gallen. Has around 13 million records and 6 million author names. Indexes full text as well as metadata. Uses lexical and statistical tools to analyse metadata and develop keywords. Uses ontologies to map records semantically to subject areas.


Regional harvesters




DRIVER


DRIVER is an EU-funded project aiming to facilitate the establishment of repositories in research-based institutions across Europe. It has established a set of guidelines (http://www.driver-repository.eu/DRIVER-Guidelines.html) offering a best practice tool and streamlining repository developments in Europe. Repositories register with DRIVER for harvesting: DRIVER currently provides a search service across around 170 repositories.


eIFL Portal


eIFL (Electronic Information for Libraries) supports and advocates for the wide availability of electronic resources for users of libraries in transitional and developing countries. It provides a portal through which the 100+ OA repositories in the member countries can be searched.  


DART Europe


Partnership of research libraries providing a European portal for electronic theses and dissertations. It is the European Working Group of NDLTD.


National harvesters

The number of national-level harvesting initiatives is growing. They are usually funded either by government or by the national library in each country. Some are still at project/pilot level and funding for a sustainable future is not secure in all cases. The main initiatives (there may well be more in development) are listed below and shown on the diagram.


ARROW (Australian Research Repositories Online to the World) Discovery Service


Developed by the National Library of Australia, ARROW harvests from 28 university repositories and a further 12 digital collections. 


CARL/ARBC Metadata Harvester


Canadian Association of Research Libraries. Harvests from 10 university repositories (2 of these are e-theses collections).


DRIVER Belgium


The Belgian harvester built using DRIVER guidelines. Harvests Belgian university repositories.


DDF (Danish National Research Database)


Government-funded and part of the Danish Electronic Library (DEFF). Harvests OA repositories where present; focused so far mainly on institutional CRISes (current research information systems).




Madrid region harvester for universities in the region, including UNED (National Distance Education University of Spain) and the CSIC (Spanish National Research Council) repository.


Intute Repository Search


JISC-funded project (until July 2009) developed by MIMAS and UKOLN. Harvests and searches across UK repositories. Developing additional semantic search capabilities including metadata clustering.




JISC-funded project (completed 2008) that produced a pilot search service working on 7 Scottish university repositories.


Irish National Research Platform


Launched mid-2008 as a feasibility project to harvest from Irish institutional repositories. Will also provide the base for 'research assessment, bibliometric analysis and benchmarking'.




JAIRO


Japanese national portal developed and maintained by the National Institute of Informatics (NII). Harvests from all Japanese institutional repositories and currently has more than 570,000 metadata records compliant with "junii2", the de facto standard used by Japanese repositories. junii2 was developed by the NII and is consistent with the Dublin Core Element Set: http://www.nii.ac.jp/irp/en/system/junii2_en_20090213.xls

In addition to JAIRO, some subject-based national harvesters are in service: DML-JP (http://dmljp.math.sci.hokudai.ac.jp/), 'Repository of Archaeological Reports', 'Education Subject Repository', 'Open access and bi-directional repository for medical science', etc.




Incorporates DAREnet (repositories of the Dutch universities and research organisations), Cream of Science and Promise of Science (theses) alongside the national research database NOD.




NORA


Government-funded, but its future is unclear as the government ministry involved has declined to continue funding NORA in 2009. Uses a national metadata standard (OAI-compliant) at present and aims to convert Norwegian national metadata to DRIVER standards. NORA is intended to be the single harvesting point for international services wishing to index Norwegian research.




Brazil's national service. Government-funded portal for Brazilian institutional repositories.


PLEIADI (Portale per la Letteratura scientifica Elettronica Italiana su Archivi aperti e Depositi Istituzionali)


The portal for Italian university repositories. Partner to PUMA in presenting a national search service for the OA literature.




PUMA


Harvester for the 24 institutes of the Italian National Research Council (CNR). Partner to PLEIADI.




The Spanish service harvesting from institutional repositories. It currently harvests from 25 repositories, of which 14 are e-theses collections.


Repositório Científico de Acesso Aberto de Portugal


Portuguese national harvester. Harvests from 12 Portuguese university repositories.




Swedish national harvester developed by the Universities of Uppsala and Gothenburg and the National Library of Sweden. Harvests from institutional publication databases (generally metadata-only records) and OA repositories where available. Provides metadata for harvesting by other services. The aim is to integrate it with the national bibliographic service LIBRIS (http://libris.kb.se). The requirements for this national service have been specified: the beta release is expected at the end of April 2009 and the final release in September 2009.
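
Harvesting of the kind described on this page is typically done over OAI-PMH. The sketch below assumes a repository with a standard OAI-PMH endpoint that serves Dublin Core (oai_dc) records; the base URL, the helper names and the sample response are hypothetical and do not describe any particular service listed here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and the Dublin Core Element Set.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request; a resumptionToken replaces the other arguments."""
    if resumption_token:
        query = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        query = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(query)}"


def parse_records(xml_text):
    """Extract the identifier and first dc:title from each record in a response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": rec.findtext(".//dc:title", namespaces=NS),
        })
    return records


# A hand-written sample response, standing in for what a repository might return.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A sample thesis</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()

url = list_records_url("https://repo.example.org/oai")
harvested = parse_records(SAMPLE_RESPONSE)
```

A real harvester would fetch `url`, parse the response, and repeat with the returned resumption token until the repository signals that the list is complete.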



Google has recently changed the way it indexes repositories: it has dropped support for OAI when indexing websites. It also appears to miss a considerable proportion of the hidden web. See:

Hagedorn K and Santelli J (2008) Google still not indexing hidden web URLs.  http://www.dlib.org/dlib/july08/hagedorn/07hagedorn.html [Google is 'missing' 55% of records in repositories]
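
A coverage figure of this kind can be estimated by comparing the record URLs a repository exposes with those a search engine returns. The sketch below is only an illustration of that arithmetic, not the method used in the study cited above; the function name and all URLs are invented for the example.

```python
def missing_share(repository_urls, indexed_urls):
    """Fraction of repository record URLs that are absent from the search index."""
    repository_urls = set(repository_urls)
    if not repository_urls:
        return 0.0
    missing = repository_urls - set(indexed_urls)
    return len(missing) / len(repository_urls)


# Hypothetical data: ten repository records, of which the search engine holds four.
repo = [f"https://repo.example.org/record/{n}" for n in range(1, 11)]
indexed = repo[:4]
share = missing_share(repo, indexed)
```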
