RLA Minimum Metadata
Overview
The RLA Information Model describes the types of information, or data objects, that RLA captures and makes discoverable. This section recommends the minimum properties required for each data object.
The following Entity Relationship diagram represents the minimum required properties and relationships based on the RLA Information Model.
The recommended properties are selected based on two principles:
Principle 1: Minimise the number of properties, with the aim of simplifying the data contribution process.
Principle 2: Require only properties that either support a functional requirement of RLA or identify the entity across the RLA graph/data model.
Together, these metadata properties are the minimum needed to meet the RLA functional requirements.
Researcher
Property | Expected Type | Definition | Example |
---|---|---|---|
full name | text | Full name of a person. Note: a full name is acceptable if the name cannot be provided in the preferred structured form, i.e. given name and family name. | "Elizabeth Bromfield" |
given name | text | The given (first) name of a person | "Elizabeth" |
family name | text | The family (last) name of a person | "Bromfield" |
identifier | text; identifier type from a controlled list: ORCID, ScopusID, ResearcherID, LinkedIn Profile | A unique string that identifies a person, expressed as a combination of an identifier type and a value. | Scopus Author ID: "57193736494"; ResearcherID: "R-4037-2017"; LinkedIn profile: https://www.linkedin.com/in/mingfangwu/ |
Website/URL | text | A URL to the researcher's institutional webpage, providing access to their professional profile, publications, and contributions. | Prof James Bailey, Find an Expert, The University of Melbourne: https://findanexpert.unimelb.edu.au/profile/351-james-bailey |
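
As an illustration only, the Researcher properties above could be held in a small record type with a basic completeness check. The class names, field names, and the `problems` helper are hypothetical; the sketch assumes that an identifier plus either a structured name or a full name is the minimum to check, and it leaves the website optional.

```python
from dataclasses import dataclass
from typing import Optional

# Controlled list of identifier types, copied from the Researcher table.
RESEARCHER_ID_TYPES = {"ORCID", "ScopusID", "ResearcherID", "LinkedIn Profile"}


@dataclass
class Identifier:
    type: str   # expected to be one of RESEARCHER_ID_TYPES
    value: str  # e.g. "57193736494" for a ScopusID


@dataclass
class Researcher:
    identifier: Identifier
    given_name: Optional[str] = None
    family_name: Optional[str] = None
    full_name: Optional[str] = None   # fallback when the structured name is unavailable
    url: Optional[str] = None         # assumption: treated as optional in this sketch

    def problems(self) -> list:
        """Return human-readable problems; an empty list means the record passes."""
        found = []
        if self.identifier.type not in RESEARCHER_ID_TYPES:
            found.append(f"unknown identifier type: {self.identifier.type}")
        if not self.identifier.value.strip():
            found.append("identifier value is empty")
        has_structured_name = bool(self.given_name and self.family_name)
        if not has_structured_name and not self.full_name:
            found.append("provide given and family name, or a full name")
        return found
```

A record built from the table's examples, such as `Researcher(Identifier("ScopusID", "57193736494"), given_name="Elizabeth", family_name="Bromfield")`, yields no problems under these assumptions.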
Publication
Property | Expected Type | Definition | Example |
---|---|---|---|
Identifier | text; identifier type from a controlled list: DOI, ISBN, Website/URL, … | Unique string identifying a publication, expressed as a combination of an identifier type and a value. Note: RLA prefers DOI but accepts other identifiers as well. | DOI: "10.5334/dsj-2019-003"; ISBN: "978-0-9872143-5-5"; Website: "https://agu.confex.com/agu/fm19/meetingapp.cgi/Paper/565981" |
Title | text | The publication title | |
Abstract | text | A brief summary of the publication's content, outlining the main arguments, results, and conclusions | |
Publication year | number (YYYY) | The year the publication was published; important for understanding the context and currency of the research. | 2020 |
Publication type | text | A description of the kind of publication | See the DataCite controlled list of resource types for examples. |
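
The stated preference for DOI over other publication identifiers can be sketched as follows. This is a minimal illustration, not RLA's ingestion logic: the function names, the list of resolver prefixes, and the DOI shape pattern are assumptions, and the pattern is a common heuristic rather than a full validation.

```python
import re

# Assumed prefixes that contributors may attach to a DOI.
_PREFIXES = (
    "https://doi.org/", "http://doi.org/",
    "https://dx.doi.org/", "http://dx.doi.org/",
    "doi:",
)
# Heuristic shape of a DOI: "10.", a registrant code, a slash, then a suffix.
_DOI_SHAPE = re.compile(r"10\.\d{4,9}/\S+")


def normalise_doi(raw):
    """Return the bare, lower-cased DOI, or raise ValueError if it does not look like one."""
    doi = raw.strip().lower()  # DOIs are case-insensitive, so lower-casing is safe
    for prefix in _PREFIXES:
        if doi.startswith(prefix):
            doi = doi[len(prefix):].strip()
            break
    if not _DOI_SHAPE.fullmatch(doi):
        raise ValueError(f"not a DOI: {raw!r}")
    return doi


def preferred_identifier(identifiers):
    """Pick one (type, value) pair: a DOI if present, otherwise the first pair given."""
    for id_type, value in identifiers:
        if id_type.upper() == "DOI":
            return ("DOI", normalise_doi(value))
    return identifiers[0] if identifiers else None
```

Normalising before comparison means that `https://doi.org/10.5334/DSJ-2019-003` and `10.5334/dsj-2019-003` resolve to the same node key.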
Research Activities (Grants/Projects)
Property | Expected Type | Definition | Example |
---|---|---|---|
Identifier | text; identifier type from a controlled list | A unique string that identifies a research activity (grant or project), expressed as a combination of an identifier type and a value | DOI for grant: RAiD:10.26259/7fd9c002 |
Website/URL | text | A link to more detailed information about the grant or project, providing access to a landing page, findings, or reports. | Website/AwardURL: http://dataportal.arc.gov.au/NCGP/Web/Grant/Grant/LE0453614 |
Title | text | The name of the described research activity (e.g. title of a grant or project). | |
Summary/Abstract | text | A brief overview of the project, including goals, and expected outcomes. | |
Announcement Year | Number (YYYY) | The year the grant was announced, giving context to the project's timeline and funding cycle. | 2022 |
Funder Identifier | text | Uniquely identifies a funding entity, using one of several identifier types | ROR: 019wvm592; ABN: 52234063906; Crossref Funder ID: 501100000995; Wikidata ID: Q127990; GRID: grid.1001.0; ISNI: 0000 0001 2180 7477; Website: https://www.anu.edu.au |
Funder name | text | Name of the funding provider (to be supplied if no ROR, ABN or Crossref Funder ID identifier is provided) | Australian Research Council |
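
The conditional rule for the funder name can be expressed as a short check. The sketch below is one reading of the table, assuming the name becomes mandatory whenever none of ROR, ABN or Crossref Funder ID is supplied; the function and parameter names are hypothetical.

```python
# Identifier types that make a separate funder name unnecessary, per the table.
NAME_OPTIONAL_TYPES = {"ROR", "ABN", "Crossref Funder ID"}


def check_funder(identifier_type=None, identifier_value=None, name=None):
    """Return a list of problems for one funder entry; an empty list means it passes."""
    problems = []
    if identifier_type and not identifier_value:
        problems.append("identifier type given without a value")
    # Only a complete (type, value) pair counts as a supplied identifier.
    supplied_type = identifier_type if identifier_value else None
    if supplied_type not in NAME_OPTIONAL_TYPES and not name:
        problems.append("funder name is needed without a ROR, ABN or Crossref Funder ID")
    return problems
```

Under this reading, a funder given only as `Wikidata ID: Q127990` would still need a name, while one given as `ROR: 019wvm592` would not.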
Organisations
Property | Expected Type | Description | Example |
---|---|---|---|
Identifier | text; identifier type from a controlled list | Uniquely identifies an organisation, expressed as a combination of an identifier type and a value | ROR: 019wvm592; ABN: 52234063906; Crossref Funder ID: 501100000995; Wikidata ID: Q127990; GRID: grid.1001.0; ISNI: 0000 0001 2180 7477 |
Website/URL | text | Website of the organisation | website: https://www.anu.edu.au |
Name | text | Name of the organisation | Australian National University |
Location/Country | text | The country where the organisation is located; important for understanding the geographical and regulatory context. | Australia |
Location/City | text | The city where the organisation is based, offering more precise localisation and pointing to potential collaboration opportunities. | Canberra |
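
ABN values such as the one in the Identifier example can be sanity-checked before ingestion using the checksum published for ABNs: subtract 1 from the first digit, multiply the eleven digits by fixed weights, and require the sum to be divisible by 89. The function name below is hypothetical, and the check only confirms that a value is well formed, not that the ABN is currently registered.

```python
import re

# Weights applied to the eleven ABN digits, left to right.
ABN_WEIGHTS = (10, 1, 3, 5, 7, 9, 11, 13, 15, 17, 19)


def is_well_formed_abn(abn):
    """True if the value has eleven digits and passes the ABN mod-89 checksum."""
    compact = abn.replace(" ", "")
    if not re.fullmatch(r"[0-9]{11}", compact):
        return False
    digits = [int(ch) for ch in compact]
    digits[0] -= 1  # the checksum is defined on the first digit minus one
    total = sum(weight * digit for weight, digit in zip(ABN_WEIGHTS, digits))
    return total % 89 == 0
```

The example value from the table, `52234063906`, passes this check, with or without the customary spacing (`52 234 063 906`).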
Instruments
Property | Expected type | Description | Example |
---|---|---|---|
Identifier | text | DOI (or other resolvable identifiers) | |
Title | text | The name of the instrument | |
Description | text | Technical information about the instrument | |
Host Institution | text | An institution responsible for the management of the instrument | |
GeoLocation | text or Geolocation | Spatial region or named place where the instrument is hosted or where its data was gathered. | countries:Australia, or a structured Geolocation value |
Patents
Property | Expected Type | Description | Example |
---|---|---|---|
Application_number | text | The application number that uniquely identifies the application for an IP right | US 10229365 B2 |
Invention title | text | Title of the patent | Apparatus and method for quantum processing |
Status | from a controlled list: | The current status of the IP right or IP application | |
Inventor(s) Name | text | Inventor(s) of the patent | fullname:Fuechsle Martin or given name:Fuechsle |
Applicant(s) Name | text or ROR (for Organisation) | Applicant(s) of the patent | Univ Melbourne |
Relation Properties between RLA data objects
The RLA graph is a heterogeneous graph built from information from different sources. As a result, the relationships between nodes are diverse and may follow different typologies. For example, the relationship between researchers and projects can be classified as “contributor”, “participant”, or “investigator”. Where there is no widely accepted global taxonomy for relationship types, we recommend supporting the following:
RIF-CS Relationships Types: https://services.ands.org.au/documentation/rifcs/vocabs/vocabularies.html#Related_Information_Type
Crossref Relationships: https://data.crossref.org/schemas/relations.xsd
ORCID Affiliation Taxonomy: https://info.orcid.org/documentation/integration-guide/admin-guide-to-affiliations/
ORCID Education and Employment: https://support.orcid.org/hc/en-us/articles/360006973933-Add-an-education-or-qualification-to-your-ORCID-record
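One way to keep relationship labels from different vocabularies side by side is to store the vocabulary together with the label on each edge. The sketch below is illustrative only: the class name, the vocabulary keys, and the node keys are hypothetical, and the labels are not checked against the vocabularies listed above.

```python
from dataclasses import dataclass

# Shorthand keys for the four sources listed above (key names are assumptions).
VOCABULARIES = {"rif-cs", "crossref", "orcid-affiliation", "orcid-education-employment"}


@dataclass(frozen=True)
class Relation:
    source: str      # node key, e.g. "researcher:A"
    target: str      # node key, e.g. "project:P"
    relation: str    # label as contributed, e.g. "participant"
    vocabulary: str  # which vocabulary the label is taken from


def group_by_relation(edges):
    """Index edges by (vocabulary, relation) so differently labelled links stay distinct."""
    index = {}
    for edge in edges:
        if edge.vocabulary not in VOCABULARIES:
            raise ValueError(f"unsupported vocabulary: {edge.vocabulary}")
        index.setdefault((edge.vocabulary, edge.relation), []).append(edge)
    return index
```

Keeping the vocabulary on the edge avoids forcing "contributor", "participant" and "investigator" into a single taxonomy at ingestion time; any mapping between them can be applied later as a separate step.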
Persistent Identifiers
The role of persistent identifiers is essential to support interoperability and long-term data integrity in the RLA graph/data model. Specifically, the following persistent identifiers play an important role in RLA metadata.
PIDs for Grants and Projects: Allocating PIDs (Persistent Identifiers) to research grants and projects enables the identification and disambiguation of these entities across the RLA graph. This is particularly important when a project has participants from different universities. While there is no globally accepted PID for Grants and Projects, there are three main options for allocating PIDs to projects and grants.
Firstly, both Crossref and DataCite allow the minting of DOIs (Digital Object Identifiers) for grants.
Secondly, a Persistent URL (PURL) can be used to transform local identifiers into Persistent URLs.
Finally, RAiD (Research Activity Identifier Service) opens new opportunities to mint PIDs for research projects or activities.
ORCID for researchers: Allocating ORCID identifiers to researchers is crucial for disambiguating individual researchers across information ingested into the RLA from various universities. Furthermore, ORCID allows the RLA to connect researchers with a wealth of information from publishers and funding bodies. It is therefore highly recommended to include ORCID identifiers in the researcher information provided to RLA. If an ORCID identifier is missing from the contributed metadata, it can often be recovered by searching the ORCID API and filtering the candidates by related works already in the graph.
ROR or ABN for Organisations: Identifying the type, location, and domain of activity for organisations mentioned in the RLA graph is crucial for offering valuable insights into current or potential research collaborations. Internationally, http://ROR.org is a viable option for universities and research organisations, whereas the Australian Business Number (ABN) covers companies and all registered legal entities in Australia. Combining these PIDs offers adequate support for disambiguating organisations in the RLA graph. For new records provided to RLA, organisation names should be mapped to one of these PIDs.
Note: ROR has been omitted from the “Figure: RLA Optimum Metadata Nodes and Relationships” for simplicity. At the current stage of ROR development, most Australian organisations with a ROR already have a registered ABN.
DOI (or URL) for Publications and Datasets: Identifying publications and datasets with a DOI is highly recommended. However, for non-traditional research outputs where a DOI is not available, using a URL as an identifier can support disambiguation and facilitate the retrieval of complementary information from the webpage.
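Because ORCID identifiers carry a check character (ISO 7064 MOD 11-2), contributed values can be screened for typing errors before they are used for disambiguation. The sketch below is a minimal illustration with hypothetical function names; it only tests that an identifier is well formed, not that it is registered with ORCID or belongs to the named researcher.

```python
import re


def mod11_2_check_char(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for a string of digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_well_formed_orcid(orcid):
    """True if the value has 16 characters (hyphens and spaces ignored) and a valid check character."""
    compact = orcid.replace("-", "").replace(" ", "").upper()
    if not re.fullmatch(r"[0-9]{15}[0-9X]", compact):
        return False
    return mod11_2_check_char(compact[:15]) == compact[15]
```

For example, ORCID's documented sample iD `0000-0002-1825-0097` passes, while changing its last digit makes it fail. ISNI values use the same 16-character form, so the ISNI example in the Organisations table (`0000 0001 2180 7477`) passes the same check.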