Metadata Schema for the Persistent Identification of Scientific Measuring Instruments

The following table presents the metadata schema for the persistent identification of instruments mapped onto the DataCite Metadata Schema 4.4. Note that the current version of the DataCite schema was not designed to describe instruments. As a consequence, some definitions in the DataCite schema need to be stretched. For a few relevant instrument properties, there is no suitable place in the DataCite schema at all.

In this presentation, the DataCite schema is mostly taken as is, assuming that no adaptations are made to accommodate instruments. This approach has some shortcomings, however, so some amendments to the schema would facilitate its use for instruments and should be negotiated with DataCite.

ID | Property | Obligation | Occ | Definition | Allowed values, constraints, remarks | Suggested changes to DataCite schema
1 | Identifier | M | 1 | Unique string that identifies the instrument instance | DOI | None
1.a | identifierType | M | 1 | Type of the identifier | Controlled list of values: [1] DOI | None
2 | Creator | M | 1-n | The instrument's manufacturer(s) or developer. This may also be the owner for custom-built instruments | | None
2.1 | creatorName | M | 1 | Full name of the manufacturer | Free text | None
2.1.a | nameType | R | 0-1 | The type of name | Controlled list of values: [2] Organizational, Personal | None
2.2 | givenName | R | 0-1 | First name of the manufacturer, if applicable | Free text | None
2.3 | familyName | R | 0-1 | Last name of the manufacturer, if applicable | Free text | None
2.4 | nameIdentifier | R | 0-n | Unique identifier of the manufacturer | Free text, format is dependent upon scheme | None
2.4.a | nameIdentifierScheme | R | 1 | The name of the name identifier scheme | Free text, mandatory if nameIdentifier is used. Examples: ROR, ISNI, ORCID | None
2.4.b | schemeURI | O | 0-1 | The URI of the name identifier scheme | Examples: http://www.isni.org, https://orcid.org | None
2.5 | affiliation | O | 0-n | Organizational or institutional affiliation of the manufacturer | Free text [3] | None
3 | Title | M | 1 | Name by which the instrument instance is known | Free text | None
3.a | titleType | O | 0-1 | The type of Title | Controlled list of values: [4] AlternativeTitle, Subtitle, TranslatedTitle, Other | Add "Name" to controlled list of values
4 | Publisher | M | 1 | The name of the entity that holds, archives, publishes, prints, distributes, releases, issues, or produces the resource | Free text [5] | None
5 | PublicationYear | M | 1 | The year when the data was made publicly available | YYYY [6] | None
6 | Subject | R | 0-n | Subject, keyword, classification code, or key phrase describing the instrument | Free text [7] | None
6.a | subjectScheme | O | 0-1 | The name of the subject scheme or classification code or authority if one is used | Free text | None
6.b | schemeURI | O | 0-1 | The URI of the subject identifier scheme | | None
6.c | valueURI | O | 0-1 | The URI of the subject term | | None
7 | Contributor | M | 1-n | Institution(s) responsible for the management of the instrument. This may include the legal owner, the operator, or an institute providing access to the instrument. | [8] | None
7.a | contributorType | M | 1 | The type of contributor | Controlled list of values: HostingInstitution | None
7.1 | contributorName | M | 1 | Full name of the owner | Free text | None
7.1.a | nameType | R | 0-1 | The type of name | Controlled list of values: [9] Organizational, Personal | None
7.2 | givenName | R | 0-1 | First name of the owner, if applicable | Free text | None
7.3 | familyName | R | 0-1 | Last name of the owner, if applicable | Free text | None
7.4 | nameIdentifier | R | 0-n | Unique identifier of the owner | Free text, format is dependent upon scheme | None
7.4.a | nameIdentifierScheme | R | 1 | The name of the name identifier scheme | Free text, mandatory if nameIdentifier is used. Examples: ROR, ISNI, ORCID | None
7.4.b | schemeURI | O | 0-1 | The URI of the name identifier scheme | Examples: http://www.isni.org, https://orcid.org | None
7.5 | affiliation | O | 0-n | Organizational or institutional affiliation of the contributor | Free text [9] | None
8 | Date | R | 0-n | Dates relevant to the instrument | ISO 8601 [10] | None
8.a | dateType | R | 1 | The type of the date | Controlled list of values, see DataCite schema | None
8.b | dateInformation | O | 0-1 | Specific information about the date, if appropriate | Free text | None
10 | ResourceType | M | 1 | The type of the resource | Free text. Suggested values: Platform, Instrument, Sensor | None
10.a | resourceTypeGeneral | M | 1 | The general type of the resource | Controlled list of values: [11] Other | None
11 | AlternateIdentifier | R | 0-n | Identifiers other than the DOI pertaining to the same instrument instance. This should be used if the instrument has a serial number. Other possible uses include an owner's inventory number or an entry in some instrument database. | Free text, should be unique identifiers | None
11.a | alternateIdentifierType | R | 1 | Type of the identifier | Free text. Mandatory if AlternateIdentifier is used. Suggested values include: serialNumber, inventoryNumber | None
12 | RelatedIdentifier | R | 0-n | Identifiers of related resources | Free text, must be globally unique identifiers. | None
12.a | relatedIdentifierType | R | 1 | Type of the identifier | Controlled list of values, see DataCite schema | None
12.b | relationType | R | 1 | Description of the relationship | Controlled list of values, see DataCite schema [12] | Add a relationType for deployments, indicating "was used in"
12.c | relatedMetaDataScheme | O | 0-1 | The name of the related metadata scheme | Use only for HasMetadata | None
12.d | schemeURI | O | 0-1 | The URI of the related metadata scheme | Use only for HasMetadata | None
12.e | schemeType | O | 0-1 | The type of the related metadata scheme | Use only for HasMetadata | None
12.f | resourceTypeGeneral | O | 0-1 | The general type of the related resource | Controlled list of values, see DataCite schema; Other | Add "Instrument" to controlled list of values
17 | Description | R | 0-n | Technical description of the device and its capabilities | Free text | None
17.a | descriptionType | R | 1 | The type of the description | Controlled list of values: [13] Abstract, Methods, SeriesInformation, TableOfContents, TechnicalInfo, Other | None

Footnotes

[1] If the PID is registered with DataCite, it will necessarily be a DOI.
[2] The manufacturer of an instrument will most likely be an organization. In that case, nameType should be provided with a value of "Organizational".
[3] If the manufacturer is an organization, affiliation will be redundant with creatorName. It may nevertheless be useful to repeat that value in affiliation to facilitate organization searches.
[4] None of the specific values for titleType in the DataCite schema really fits an instrument name. The value "Other" will need to be used here.
[5] Publisher does not seem to fit instruments at all. But it is mandatory in the DataCite schema, so we cannot skip it. We need to negotiate with DataCite what to put here. Maybe the institution responsible for managing this DOI record and its metadata?
[6] The same problem applies to PublicationYear as to Publisher.
[7] Use Subject for the classification of the type of the instrument.
[8] Contributor with contributorType=HostingInstitution should be used for the owner of the instrument. Other contributor types as permitted by the DataCite schema are of course possible, but are not considered in this presentation. Note that Contributor is only recommended in the DataCite schema, but at least one owner (i.e. a Contributor with contributorType=HostingInstitution) should be considered mandatory for instruments.
[9] The same remarks as for the subproperties nameType and affiliation of Creator also apply to the corresponding subproperties of Contributor.
[10] Use Date with dateType=Available to indicate when the instrument was in operation, either with a single date to indicate when this instrument instance started operation, or with a date interval if this instrument instance has ceased to be in operation.
[11] None of the specific values for resourceTypeGeneral in the DataCite schema fits an instrument. This leaves "Other" as the only option.
[12] Use "HasPart" and "IsPartOf" in lieu of "HasComponent" and "IsComponentOf".
[13] Not all of the listed values for descriptionType make sense for an instrument description. "TechnicalInfo" should be used for a technical description.
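
To make the mapping more concrete, the following Python sketch builds one instrument record and checks it against the obligations listed in the table. It is an illustration under stated assumptions only: the dictionary layout (keys named after the table's properties) is hypothetical and is not the official DataCite XML or JSON serialization, and the DOI, names, and dates are invented placeholders. The interval notation with a slash follows the ISO 8601 convention for time intervals.

```python
# Hypothetical in-memory layout: keys mirror the property names in the table.
# This is NOT the official DataCite serialization; all values are placeholders.
MANDATORY = ("Identifier", "Creator", "Title", "Publisher",
             "PublicationYear", "Contributor", "ResourceType")


def missing_mandatory(record):
    """Return the mandatory top-level properties that are absent or empty."""
    return [name for name in MANDATORY if not record.get(name)]


def has_owner(record):
    """Footnote [8]: require at least one Contributor of type HostingInstitution."""
    return any(c.get("contributorType") == "HostingInstitution"
               for c in record.get("Contributor", []))


def operation_date(start, end=None):
    """Footnote [10]: dateType=Available, single start date or start/end interval."""
    value = start if end is None else f"{start}/{end}"
    return {"date": value, "dateType": "Available"}


example = {
    "Identifier": {"identifier": "10.0000/example-instrument",  # placeholder DOI
                   "identifierType": "DOI"},
    "Creator": [{"creatorName": "Example Instruments Ltd.",
                 "nameType": "Organizational"}],
    # Footnote [4]: no specific titleType fits, so "Other" is used.
    "Title": {"title": "Example spectrometer", "titleType": "Other"},
    # Footnotes [5] and [6]: what belongs here is an open question.
    "Publisher": "Example Research Institute",
    "PublicationYear": "2021",
    "Contributor": [{"contributorName": "Example Research Institute",
                     "contributorType": "HostingInstitution",
                     "nameType": "Organizational"}],
    "Date": [operation_date("2015-03-01", "2020-10-31")],
    # Footnote [11]: "Other" is the only usable resourceTypeGeneral.
    "ResourceType": {"resourceType": "Instrument",
                     "resourceTypeGeneral": "Other"},
    "AlternateIdentifier": [{"alternateIdentifier": "SN-0001",
                             "alternateIdentifierType": "serialNumber"}],
    # Footnote [13]: "TechnicalInfo" for a technical description.
    "Description": [{"description": "Placeholder technical description.",
                     "descriptionType": "TechnicalInfo"}],
}

if __name__ == "__main__":
    print("missing mandatory properties:", missing_mandatory(example))
    print("has owner:", has_owner(example))
```

A check such as `has_owner` reflects the stricter obligation proposed here for instruments; the DataCite schema itself only recommends Contributor.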

Notes and Issues

In the following, we collect some issues with the mapping of the instrument metadata schema onto DataCite as presented above, roughly ordered by increasing importance, from least concern to critical:

  • There is no LandingPage property in the DataCite schema. Nevertheless, in practice the URL of a landing page is registered with every DataCite DOI. As long as there actually is a landing page that the instrument PID resolves to, whether this is explicitly named in the schema is considered a mostly aesthetic question.

  • There is no suitable place for MeasuredVariable in the DataCite schema. On the other hand, honestly speaking, the concepts for representing this information in our general schema have not been very advanced either. Linking some external resource with RelatedIdentifier / relationType=HasMetadata, using some externally defined ontology, seems to be the most viable approach anyway.

  • It should be possible to tell from the PID and its metadata that this one pertains to an instrument and not to any other kind of resource. The only property in the DataCite schema suitable to hold this information is ResourceType and its subproperty resourceTypeGeneral. ResourceType is free text, which does not offer a reliable classification. The only usable value for resourceTypeGeneral is "Other". It would be desirable to add "Instrument" to the controlled list of values for resourceTypeGeneral.

  • It is not obvious that the name of the instrument would be in Title. This difficulty is aggravated by the fact that there is no suitable specific value for titleType for this purpose. It would be desirable to add "Name" to the controlled list of values for titleType. This could also be useful for resources other than instruments, if they have a well-known name.

  • It is not clear what to put into Publisher and PublicationYear for instruments.

  • It has been discussed in the group that there should be a way to relate an instrument to events, such as the deployment of an instrument in an expedition, using RelatedIdentifier. However, it is not clear which relationType in the DataCite schema would be suitable for such a "has been deployed in" or "was used in" relation.

  • The only suitable property to store a serial number is AlternateIdentifier. It has been argued in the group that for this approach to be useful, one would need a controlled list of values for alternateIdentifierType that includes an entry for "serialNumber", although there has not been a consensus on this. It has also been argued that such a controlled list of values would be impractical for some other use cases. This issue is also still unresolved in the general schema.

  • As mentioned above, some of the definitions in the DataCite schema need to be significantly stretched in order to accommodate the relevant metadata for instruments. It is not obvious what piece of information should be put where. It seems that some sort of dedicated handbook on how to correctly create instrument metadata using this schema will be needed. The existing general DataCite documentation will not be enough.

  • There is no suitable place to put the model name of the instrument, although this is considered a very important piece of information.

    It has been suggested to use AlternateIdentifier, but that does not fit: AlternateIdentifier is for alternate identifiers that pertain to the same individual instrument instance. A model name identifies a series of instruments having the same or similar specifications, but not an individual instrument.
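
Several of the issues above show up directly when a consumer tries to read such a record. The following Python sketch illustrates three of them: recognizing an instrument from ResourceType, finding a serial number in AlternateIdentifier, and locating external metadata (for example, for measured variables) via relationType=HasMetadata. As before, the dictionary layout and all helper names are hypothetical, the identifier values are placeholders, and nothing here is part of any DataCite API.

```python
# Hypothetical layout with keys named after the schema properties;
# not an official DataCite serialization.
SUGGESTED_TYPES = {"platform", "instrument", "sensor"}


def looks_like_instrument(record):
    """Heuristic only: resourceTypeGeneral is 'Other' and the free-text
    ResourceType matches one of the suggested values."""
    rt = record.get("ResourceType", {})
    return (rt.get("resourceTypeGeneral") == "Other"
            and rt.get("resourceType", "").strip().lower() in SUGGESTED_TYPES)


def serial_numbers(record):
    """Works only if producers agree on the value 'serialNumber',
    which is merely a suggested value, not a controlled one."""
    return [a["alternateIdentifier"]
            for a in record.get("AlternateIdentifier", [])
            if a.get("alternateIdentifierType") == "serialNumber"]


def external_metadata(record):
    """Related identifiers linked with relationType=HasMetadata."""
    return [r["relatedIdentifier"]
            for r in record.get("RelatedIdentifier", [])
            if r.get("relationType") == "HasMetadata"]


sensor = {
    "ResourceType": {"resourceType": "Sensor", "resourceTypeGeneral": "Other"},
    "AlternateIdentifier": [
        {"alternateIdentifier": "SN-0001",
         "alternateIdentifierType": "serialNumber"},
        {"alternateIdentifier": "INV-42",
         "alternateIdentifierType": "inventoryNumber"},
    ],
    "RelatedIdentifier": [
        {"relatedIdentifier": "https://example.org/variables/0001",  # placeholder
         "relatedIdentifierType": "URL",
         "relationType": "HasMetadata"},
    ],
}

# The same kind of device described with different free text is not recognized.
thermometer = {
    "ResourceType": {"resourceType": "Thermometer",
                     "resourceTypeGeneral": "Other"},
}

if __name__ == "__main__":
    print(looks_like_instrument(sensor), looks_like_instrument(thermometer))
    print(serial_numbers(sensor))
    print(external_metadata(sensor))
```

The second record shows why free text is not a reliable classification: a heuristic like `looks_like_instrument` depends on every producer choosing one of the suggested values, which a controlled "Instrument" value for resourceTypeGeneral would make unnecessary.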