The ARDC Harvester comes with four basic harvest types.

Each harvest type is implemented as a Python class that extends the Harvester class.

The harvest types are located in the harvest_handlers directory, and each harvest type should be self-contained within its own file.

The simplest harvester, GETHarvester, serves as an example of how a harvest handler should be structured:

from Harvester import *
class GETHarvester(Harvester):
    """
       {
            "id": "GETHarvester",
            "title": "GET Harvester",
            "description": "simple GET Harvester to fetch a single metadata document",
            "params": [
                {"name": "uri", "required": "true"},
                {"name": "xsl_file", "required": "false"}
            ]
      }
    """
    def harvest(self):
        self.getHarvestData()
        self.storeHarvestData()
        self.runCrossWalk()
        self.postHarvestData()
        self.finishHarvest()

The GETHarvester harvest handler shows the minimum a harvest handler needs: a class definition that subclasses the Harvester class, a class docstring that describes the harvest handler, and a harvest(self) method.

The subclass may override any of the methods of the Harvester class.
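As an illustration of overriding, the sketch below defines a handler that replaces one step and inherits the rest. It is not taken from the harvester source: ExampleHarvester is a hypothetical handler name, and the Harvester class here is a stand-in stub that only records which steps ran, so that the example runs on its own. In a real handler the base class would be imported from the Harvester module, as in the GETHarvester example.

```python
# Stand-in for the real Harvester base class (an assumption for this sketch).
# The real class performs the actual work; this stub only records the steps.
class Harvester:
    def __init__(self):
        self.calls = []

    def getHarvestData(self):
        self.calls.append("getHarvestData")

    def storeHarvestData(self):
        self.calls.append("storeHarvestData")

    def runCrossWalk(self):
        self.calls.append("runCrossWalk")

    def postHarvestData(self):
        self.calls.append("postHarvestData")

    def finishHarvest(self):
        self.calls.append("finishHarvest")


class ExampleHarvester(Harvester):  # hypothetical handler name
    """
    {
        "id": "ExampleHarvester",
        "title": "Example Harvester",
        "description": "illustrative handler that overrides one step",
        "params": [
            {"name": "uri", "required": "true"}
        ]
    }
    """

    def getHarvestData(self):
        # Overridden step: custom fetch logic would go here.
        self.calls.append("custom getHarvestData")

    def harvest(self):
        # Same step order as GETHarvester; only the first step is customised.
        self.getHarvestData()
        self.storeHarvestData()
        self.runCrossWalk()
        self.postHarvestData()
        self.finishHarvest()


handler = ExampleHarvester()
handler.harvest()
print(handler.calls)
```

Only getHarvestData is replaced here; the remaining steps fall through to the base class, which is the usual reason to subclass rather than copy an existing handler.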

The docstring at the beginning of the class is used to provide the Registry with information about the harvest type. Although it was planned to use this information to determine and populate some of the fields in the Registry user interface, this is not yet fully implemented.

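The docstring must take the form of a JSON object. In the GETHarvester example above, the properties are id, title, description and params, where params is a list of objects that each have a name and a required value.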

Please see the implementation of the existing harvest types for examples.

The docstring is then parsed, and the ANDS Registry is notified so that it can populate the available harvester type field on the Data Source Settings page.
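Because the docstring is plain JSON, it can be read back with the standard json module. The sketch below shows one way this might be done; it is not the harvester's own parsing code, read_handler_definition is a hypothetical helper, and the base class is omitted so that the example runs on its own.

```python
import json


class GETHarvester:  # Harvester base class omitted for this sketch
    """
       {
            "id": "GETHarvester",
            "title": "GET Harvester",
            "description": "simple GET Harvester to fetch a single metadata document",
            "params": [
                {"name": "uri", "required": "true"},
                {"name": "xsl_file", "required": "false"}
            ]
      }
    """


def read_handler_definition(handler_class):
    """Return the JSON object stored in a handler class docstring.

    Hypothetical helper: raises ValueError if the docstring is missing
    or is not valid JSON.
    """
    doc = handler_class.__doc__
    if not doc:
        raise ValueError(handler_class.__name__ + " has no docstring")
    # json.loads ignores the leading and trailing whitespace of the docstring.
    return json.loads(doc)


definition = read_handler_definition(GETHarvester)

# "required" is stored as the string "true" or "false", not as a JSON boolean.
required = [p["name"] for p in definition["params"] if p["required"] == "true"]
print(definition["id"], required)
```

A docstring that is not valid JSON makes json.loads raise json.JSONDecodeError, which is a subclass of ValueError, so running a check like this on a new handler is a quick way to catch a malformed definition.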