Add Schema.org metadata to your repository
For datasets from Research Data Australia Contributors to appear in Google Dataset Search with a direct link to the Contributor's own repository as the source, Schema.org markup must be implemented on each dataset landing page in that repository.
The following steps make your dataset records available to Google Dataset Search from your own repository (in addition to Research Data Australia):
- Use Google's Structured Data Testing Tool to see if there is any markup already on your dataset web pages.
- If there is no structured metadata, add Schema.org metadata using the "Dataset" class to every dataset landing page that you want indexed. Then use the Structured Data Testing Tool to check the markup for syntax errors.
- Include landing page URLs in a sitemap file to help Google find your dataset pages. Pages that are crawled (or re-crawled) go into the Google Dataset Search index and become searchable within a few days.
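The markup and sitemap steps above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names (dataset_jsonld, sitemap_xml), the example.org URLs and the sample record are hypothetical, and the properties shown (name, description, url, identifier) are an assumed minimal set. Check Google's guidelines for dataset providers for the properties it currently expects.

```python
import json
from xml.etree import ElementTree as ET


def dataset_jsonld(name, description, url, identifier=None):
    """Return a <script> tag holding Schema.org "Dataset" markup as JSON-LD.

    Hypothetical helper: the chosen properties are an assumed minimal set.
    """
    record = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
    }
    if identifier:
        record["identifier"] = identifier
    # Escape "</" so a value cannot close the <script> element early.
    body = json.dumps(record, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


def sitemap_xml(landing_page_urls):
    """Return a sitemap document listing the given dataset landing pages."""
    namespace = "http://www.sitemaps.org/schemas/sitemap/0.9"
    urlset = ET.Element("urlset", xmlns=namespace)
    for page_url in landing_page_urls:
        url_element = ET.SubElement(urlset, "url")
        ET.SubElement(url_element, "loc").text = page_url
    declaration = '<?xml version="1.0" encoding="UTF-8"?>\n'
    return declaration + ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical landing page in a hypothetical repository.
    page = "https://repository.example.org/datasets/rainfall-2018"
    print(dataset_jsonld(
        name="Daily rainfall observations 2018",
        description="Daily rainfall totals recorded at a set of weather "
                    "stations during 2018, provided as CSV files.",
        url=page,
        identifier="https://doi.org/10.0000/example",  # placeholder DOI
    ))
    print(sitemap_xml([page]))
```

The script tag would be embedded in the HTML of the corresponding landing page, and the sitemap file served from the repository's web root. JSON-LD is used here because it keeps the metadata separate from the page layout; Microdata or RDFa embedded in the HTML are alternative ways to express the same Schema.org markup.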
More information:
- Google AI Blog (26 September 2018), Building Google Dataset Search and Fostering an Open Data Ecosystem
- Google Developers page: guidelines for dataset providers
- DataCite Blog (12 December 2018), Google Dataset Search Webinar - everything you always wanted to know about Google Dataset Search, https://doi.org/10.5438/4sdj-hf49