Leigh Dodds edited this page May 20, 2013 · 1 revision

This page discusses the rationale behind the design of the vocabulary. For further background, read the summary of Related Work.

Background & Requirements

Broadly, our requirements are the ability to provide machine-readable metadata to help describe:

  • the licensing and/or rights associated with usage of a dataset
  • the means of attributing a dataset to support "marking" of derivative datasets and applications

Looking at the related work we can see that:

  • Dublin Core and WAIVER both provide support for referencing rights statements, including both licenses and waivers
  • ccRel provides support for describing the aspects of a license and also properties for attribution

While this provides some good ground work there are limitations:

  • There is no recognition of the need to license a database and its contents separately. Even outside of the EU, it is entirely possible that the content of a database is covered by a different license from its structure
  • There is no clear distinction between the description of a license and the usage of a license in a particular context, e.g. use of the Open Government License by a specific organisation that wishes to clearly record copyright notices and attribution requirements. ccRel doesn't provide terms to express the former and suggests that the latter are properties of the work.

For our purposes we want to allow data publishers to specify, as machine-readable metadata, the following:

  • A license or waiver that applies to the database/data being published
  • A license or waiver that applies to the content of a database, where it needs to be separately recorded
  • Copyright notices for a dataset
  • Attribution requirements
  • Pointers to additional material, e.g. notes on usage of a license, additional terms, etc.

Rights Statements

Borrowing from Dublin Core, this vocabulary introduces a "Rights Statement" as a resource that relates a Dataset to one or more Licenses.

The description of a License, such as the Open Government License, remains unchanged no matter how it is used or applied by an organisation. It is the Rights Statement that captures this customisation and description of copyright, attribution and other relevant relationships.

A Rights Statement might apply to an individual Dataset, but it could equally be applied to several Datasets, e.g. if an organisation has a common set of attribution requirements.
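
The relationship between Licenses, Rights Statements and Datasets can be sketched as a small data model. This is only an illustration of the shape described above, not the vocabulary itself: the class names, field names and `example.org` URIs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class License:
    """Description of a license or waiver; the same no matter who applies it."""
    uri: str
    title: str


@dataclass(frozen=True)
class RightsStatement:
    """Publisher-specific details layered on top of one or more Licenses."""
    data_license: Optional[License] = None      # applies to the database/data
    content_license: Optional[License] = None   # applies to the content, if recorded separately
    copyright_notice: Optional[str] = None
    attribution_name: Optional[str] = None
    attribution_url: Optional[str] = None
    see_also: Tuple[str, ...] = ()              # usage notes, additional terms, etc.


@dataclass
class Dataset:
    title: str
    homepage: str
    rights: Optional[RightsStatement] = None


# Illustrative values only; every URI here is a placeholder.
ogl = License(uri="https://example.org/licenses/ogl", title="Open Government License")

council_rights = RightsStatement(
    data_license=ogl,
    content_license=ogl,
    copyright_notice="Copyright Example Council",
    attribution_name="Example Council",
    attribution_url="https://example.org/about",
    see_also=("https://example.org/licensing-notes",),
)

# One Rights Statement shared by several Datasets.
roads = Dataset("Road Data", "https://example.org/data/roads", council_rights)
schools = Dataset("School Data", "https://example.org/data/schools", council_rights)
```

Keeping `License` free of publisher-specific fields mirrors the point that the license description does not change between uses; everything an organisation customises lives on the `RightsStatement`, which both datasets reference.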

Attribution and Citation

There are broadly two reasons why we might want to link back to the source of some data:

  • Attribution -- to give credit to the creator of a work
  • Citation -- to link to the source data or material, to support provenance and discovery

Attribution and Citation are community norms, but attribution is often also captured as a legal provision in a data license.

To support easy re-use, it is important that re-users can easily attribute or cite a work or a data publisher as required by a Rights Statement, License, or specific community norms. The exact form of an attribution might vary by community; cf. citation practices in scholarly communication.

To support attribution and citation we need to capture:

  • A link to the homepage of the Dataset (or the version being used)
  • A title for the Dataset
  • A URL for attribution purposes, which might be the above, a personal or organisation homepage, or a specific tracking URL as required
  • The name of the data publisher whose work is being credited. Stating this clearly ensures that re-users are able to identify the correct name of the publisher (which may need to be a legal entity)

The first two of these items can be derived from the Dataset description. The latter two ought to be part of the Rights Statement (note that the attribution name might be different from the data publisher, e.g. if a user is publishing data via a platform).
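
As a rough illustration of how a re-user might turn the four items listed above into an attribution, here is a minimal sketch. The `AttributionInfo` class, the `attribution_html` function and the "title by name" layout are assumptions made for this example; the exact form of an attribution will vary by community and by the requirements of the Rights Statement.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class AttributionInfo:
    dataset_title: str      # from the Dataset description
    dataset_homepage: str   # from the Dataset description
    attribution_name: str   # from the Rights Statement
    attribution_url: str    # from the Rights Statement


def attribution_html(info: AttributionInfo) -> str:
    """Render a simple 'title by name' HTML attribution (hypothetical layout)."""
    dataset_link = '<a href="{}">{}</a>'.format(
        escape(info.dataset_homepage, quote=True), escape(info.dataset_title)
    )
    credit_link = '<a href="{}">{}</a>'.format(
        escape(info.attribution_url, quote=True), escape(info.attribution_name)
    )
    return f"{dataset_link} by {credit_link}"


# Placeholder values for illustration only.
info = AttributionInfo(
    dataset_title="Road Data",
    dataset_homepage="https://example.org/data/roads",
    attribution_name="Example Council",
    attribution_url="https://example.org/about",
)
print(attribution_html(info))
```

Because the name and URL are recorded explicitly, the function never has to guess the publisher's correct name or the link to credit, which is the purpose of capturing them as metadata.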
