The ARDC Harvester comes with 4 basic harvest types.
Each harvest type is implemented as a Python class and extends the
The harvest types are located in the harvest_handlers directory, and each harvest type should be self-contained within its own file.
The simplest harvester serves as an example of how a harvest handler should behave. The
GETHarvester harvest handler shows the minimum that a harvest handler should have, namely the
def harvest(self) definition, the
property docstring that defines the harvest handler and the new class definition being a subclass of the
The subclass method may override the methods of the
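As a rough illustration of the structure described above, the sketch below defines a minimal handler with a JSON docstring and a harvest(self) method. The base class BaseHarvester, its fetch helper, the ExampleHarvester name, and the example URI are hypothetical stand-ins written so the block runs on its own; they are not the Harvester's actual base class or methods.

```python
class BaseHarvester:
    """Hypothetical stand-in for the real base class (an assumption of this sketch)."""

    def __init__(self, settings):
        # Settings for one data source, e.g. the values of the handler's params.
        self.settings = settings
        self.data = None

    def fetch(self):
        # Hypothetical helper: a real handler would retrieve remote content here.
        return "<records source='%s'/>" % self.settings.get("uri", "")

    def harvest(self):
        # Subclasses are expected to override this method.
        raise NotImplementedError


class ExampleHarvester(BaseHarvester):
    """
    {
        "id": "ExampleHarvester",
        "title": "Example Harvester",
        "description": "Illustrative handler that fetches a single page",
        "params": [
            {"name": "uri", "required": "true"}
        ]
    }
    """

    def harvest(self):
        # Override the base method with this harvest type's own behaviour.
        self.data = self.fetch()
        return self.data


handler = ExampleHarvester({"uri": "http://example.org/feed"})
result = handler.harvest()
```

Keeping the handler to a docstring plus one overridden method mirrors the "minimum required" shape described above; anything beyond that would depend on what the real base class provides.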
The docstring at the beginning of the class is used to provide the Registry with information about the harvest type. Although it was planned to use this information to determine and populate some of the fields in the Registry user interface, this is not yet fully implemented.
The docstring must take the form of a JSON object with the following properties:
id: The unique identifier for this harvest type.
title: The name of the harvest type as it appears in the “Harvest Method” dropdown in the “Harvester Settings” of the Registry.
description: A brief description of the harvest type, displayed to the right of the “Harvest Method” dropdown in the “Harvester Settings” of the Registry.
params: An array of the other parameters required to complete the specification of the harvest of a particular data source. For example, this might include a URI, or the type of harvester crosswalk to be used. Each array element is an object with these keys:
name: An identifier for this parameter.
required: one of the strings "true" or "false", indicating whether or not a Registry user must specify a value for this parameter.
Please see the implementation of the existing harvest types for examples.
This docstring is then parsed, and the ANDS Registry is notified so that it can populate the available harvester type field in the Data Source Settings page.
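A hedged sketch of how such a docstring could be read: the code below parses a class docstring with json.loads and checks the properties listed above. The class SampleHarvester, the function read_descriptor, the parameter names uri and crosswalk, and the validation rules are all assumptions made for illustration; the Harvester's own parsing code may differ.

```python
import json

# Top-level properties the docstring is expected to carry.
REQUIRED_KEYS = ("id", "title", "description", "params")


class SampleHarvester:
    """
    {
        "id": "SampleHarvester",
        "title": "Sample Harvester",
        "description": "Illustrative harvest type used only in this example",
        "params": [
            {"name": "uri", "required": "true"},
            {"name": "crosswalk", "required": "false"}
        ]
    }
    """


def read_descriptor(handler_class):
    """Parse the JSON docstring of a handler class and check its shape."""
    descriptor = json.loads(handler_class.__doc__)
    missing = [key for key in REQUIRED_KEYS if key not in descriptor]
    if missing:
        raise ValueError("missing properties: " + ", ".join(missing))
    for param in descriptor["params"]:
        # "required" is one of the strings "true" or "false", not a boolean.
        if param.get("required") not in ("true", "false"):
            raise ValueError("bad 'required' value for " + str(param.get("name")))
    return descriptor


descriptor = read_descriptor(SampleHarvester)
# Names of the parameters a Registry user would have to fill in.
required_params = [p["name"] for p in descriptor["params"] if p["required"] == "true"]
```

Because the docstring is plain JSON, a malformed descriptor surfaces as a json.JSONDecodeError or ValueError at parse time, which is one way to catch a badly formed harvest type early.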