Began conformance section
Matt Marshall committed Sep 7, 2023
1 parent 5db8137 commit 3947468
Showing 4 changed files with 57 additions and 36 deletions.
12 changes: 12 additions & 0 deletions docs/hsds/conformance.md
@@ -0,0 +1,12 @@
Conformance and Profiles
=========================

The goal of HSDS is to provide a common model for Human Services data, so that it may be consumed and used consistently by humans and systems. Therefore, we set out these principles for conforming to the standard.

The nature of Human Services inevitably results in variations between different local or regional contexts. HSDS accounts for this by explicitly creating space for different **Profiles** of HSDS to respond to local needs. A publication may be conformant either to HSDS as specified in this reference, or to an HSDS Profile.


## Conformance


## Profiles
33 changes: 21 additions & 12 deletions docs/hsds/identifiers.md
Expand Up @@ -5,11 +5,11 @@ Consistent and good quality identifiers are important to ensure that HSDS data i

## `id` fields

Each object in HSDS has an `id` field. The `id` field is essential to HSDS as it allows the modelling of relationships between objects. This supports storing HSDS data in a relational database, so that they can dereference it for JSON publication. It also supports the Tabular Datapackage serialization of HSDS, where `id` is used as a primary key for identifying a record and as a foreign key to establish the relationship.
Each object in HSDS has an `id` property, which should be unique for each record of that type. For example, two different `service` objects will contain different values in the `service/id` property. The `id` field is essential to HSDS as it allows the modelling of relationships between objects. This supports storing HSDS data in a relational database, so that publishers can dereference it for JSON publication. It also supports the Tabular Datapackage serialization of HSDS, where `id` is used as a primary key for identifying a record and as a foreign key to establish the relationship.

HSDS stipulates that universally unique identifiers (UUIDs) must be used for all `id` fields. Details of the UUID scheme are available in [RFC 4122](https://datatracker.ietf.org/doc/html/rfc4122).
HSDS stipulates that the values in all `id` properties must conform to the *Universally Unique Identifier (UUID)* format. UUID is defined by [RFC 4122](https://datatracker.ietf.org/doc/html/rfc4122).

There exists a number of tools to support publishers generating UUIDs:
A number of tools exist to support publishers generating UUIDs:

* [Online UUID Generator](https://www.uuidgenerator.net/) (web)
* [uuidgen](https://www.man7.org/linux/man-pages/man1/uuidgen.1.html) is available in the repos of many GNU/Linux and BSD systems (CLI)
Expand All @@ -23,38 +23,47 @@ There exists a number of tools to support publishers generating UUIDs:

There are likely many more suitable for your needs.
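
As an illustration, UUIDs can also be generated and checked programmatically. The following is a minimal sketch using Python's standard `uuid` module; the helper names `new_id` and `is_valid_uuid` are hypothetical and not part of HSDS, and the choice of random (version 4) UUIDs is an assumption, since the text above only requires the UUID format.

```python
import uuid


def new_id() -> str:
    """Return a random (version 4) UUID as a lowercase, hyphenated string."""
    return str(uuid.uuid4())


def is_valid_uuid(value: str) -> bool:
    """Return True if value is a UUID written in the canonical hyphenated form."""
    try:
        # uuid.UUID also accepts braces, URN prefixes and unhyphenated hex,
        # so compare against the canonical rendering to reject those forms.
        return str(uuid.UUID(value)) == value.lower()
    except (ValueError, AttributeError, TypeError):
        return False


service_id = new_id()
print(service_id, is_valid_uuid(service_id))
```

A check like this could be run over every `id` value before publication; it confirms the format only, not that values are unique within a dataset.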

## Third-party or External Identifiers
HSDS data can also include information on third-party identifiers, which link HSDS data to records in other datasets or information systems, as well as the real world. For the sake of clarity, HSDS uses `identifier` in column names that refer to third-party identifiers.

## Third-party identifiers
HSDS data can also include information on third-party identifiers, which link objects to other records and the real world. For the sake of clarity, HSDS uses `identifier` in column names that refer to third-party identifiers.
### Location Identifiers

Locations in the real world are often identified by a range of different systems, schemes, or catalogues. There are often different schemes used within the same legal or geographic context. Systems exchanging data about services and locations may need to convert between schemes in order to process the data for analysis or other uses, e.g. rendering it on a map, or supporting filtering for locations near a system user.

In HSDS, the [Location](schema_reference.md#location) object contains properties which should be used to provide an identifier for the location:

* `location/external_identifier` is used to provide an external identifier for a location, drawn from a particular scheme. An example would be `5090701` which is drawn from the UPRN scheme in the UK.
* `location/external_identifier_type` is used to label the scheme from which the identifier is drawn, such as [UPRN](https://www.gov.uk/government/publications/open-standards-for-government/identifying-property-and-street-information) in the UK.

There are many identifier schemes available for location data. The most suitable one for your data will likely depend on what is considered standard for your target legal or geographic scope.
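
As a rough sketch, the two properties might appear together on a location record as follows. This is a fragment rather than a complete `location` object: the `name` value is invented for illustration, other properties are omitted, and the `"UPRN"` label simply reuses the example above. The identifier is kept as a string so that any leading zeros in a scheme's values would be preserved.

```python
import json
import uuid

# Illustrative fragment of a location record; not a complete HSDS object.
location = {
    "id": str(uuid.uuid4()),                 # HSDS `id` in UUID format
    "name": "Example Community Centre",      # hypothetical value
    "external_identifier": "5090701",        # identifier drawn from the scheme
    "external_identifier_type": "UPRN",      # label for the scheme
}

print(json.dumps(location, indent=2))
```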

### Organization Identifiers

It is important to reliably and consistently identify organizations for many different use-cases. For HSDS, this means that organizations can be identified properly within a single dataset, between different HSDS datasets, or in combination with other open datasets.

To support this, HSDS provides the [organization_identifier](schema_reference.md#organization-identifier) object to encapsulate third party identifier information about organizations. Each [organization](schema_reference.md#organization) object has an array of `organization_identifier`s because in the real world there is often a 1:many relationship between an organization and its identifiers; an organization may have different legal identifiers in different official registers, or different third parties may provide their own identification scheme for organizations.
To support this, HSDS provides the [organization_identifier](schema_reference.md#organization_identifier) object to encapsulate third party identifier information about organizations. Each [organization](schema_reference.md#organization) object has an array of `organization_identifier`s because in the real world there is often a 1:many relationship between an organization and its identifiers; an organization may have different legal identifiers in different official registers, or different third parties may provide their own identification scheme for organizations.

According to [org-id.guide](http://docs.org-id.guide/en/latest/terminology/), there are several different types of organization identifier:
According to [org-id.guide](http://docs.org-id.guide/en/latest/terminology/), there are several different types of organization identifier with varying degrees of canonicity:

* **Primary identifiers** are official, often legal, identifiers that unambiguously and directly identify a legal entity. Company registration numbers are usually primary identifiers, since a company usually cannot operate without a company number. Non-profits, charities, and other third-sector organizations usually have equivalent identification numbers.
* **Secondary identifiers** are official identifiers which are assigned to entities for a range of purposes. These may include a tax number or a VAT number (EU and UK), charitable status identifiers in contexts that do not have a primary register for non-profit entities, or even government procurement system identifiers.
* **Third party identifiers** are identifiers drawn from lists that are assembled and maintained independently of the organizations they're identifying. They often assign identifiers to known organizations, but they do not have legal status. The proprietary [D-U-N-S](https://en.wikipedia.org/wiki/Data_Universal_Numbering_System) register maintained by Dun & Bradstreet is an example of a third party identifier.
* **Local identifiers** are the internal system identifiers for organizations and entities within the context of a particular digital or information system, and cannot be expected to hold relevance outside of that system. An example of these would be an internal database identifier for an organization record in a software system or database.

Where possible, HSDS publishers should seek to collect legal or Primary identifiers for organizations, and publish these in their HSDS data. If these are not possible to collect for legal or practical reasons, then publishers should fall back down the list above. Where a publisher has collected multiple different organization identifiers for an organization, they should publish each of these to promote interoperability and data analysis across as many different datasets as possible.
Where possible, HSDS publishers should seek to collect legal or Primary identifiers for organizations and publish these in their HSDS data. If these are not possible to collect for legal or practical reasons, then publishers should fall back on Secondary identifiers if available. If these are not available then Third Party identifiers should be used. Local identifiers may be used when no other identifiers are available.

Where a publisher has collected multiple different organization identifiers for an organization, they should publish each of these to promote interoperability and data analysis across as many different datasets as possible. The exception to this is Local Identifiers, which should be omitted if better identifiers are available.
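
The fallback order described above could be sketched as follows. This is one possible reading of the guidance, not a normative algorithm: the category labels in `PRIORITY`, the tuple layout, and the function name `identifiers_to_publish` are all assumptions made for this example.

```python
# Hypothetical labels for the four org-id.guide categories, most canonical first.
PRIORITY = ["primary", "secondary", "third_party", "local"]


def identifiers_to_publish(collected):
    """Choose which collected identifiers to publish.

    `collected` is a list of (category, scheme, identifier) tuples.
    All non-local identifiers are returned, most canonical first.
    Local identifiers are returned only when nothing better was collected.
    """
    ranked = sorted(collected, key=lambda item: PRIORITY.index(item[0]))
    better = [item for item in ranked if item[0] != "local"]
    return better if better else ranked


collected = [
    ("local", "internal-db", "42"),        # hypothetical internal record id
    ("primary", "GB-COH", "09506232"),     # example from UK Companies House
]
print(identifiers_to_publish(collected))
```

With the input shown, only the `GB-COH` identifier would be published, because a better identifier than the local one is available.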

There are two parts to an organization identifier in HSDS:

1. A **register prefix** identifying the register from which the identifier is drawn. This is stored in the `organization_identifier/identifier_scheme` property. An example of this would be `GB-COH` for the [UK Companies House](http://org-id.guide/list/GB-COH).
2. The **organization id** identifying the organization, drawn from the above register. This is stored in the `organization_identifier/identifier` property. An id number drawn from `GB-COH` may look like `09506232`.

Where possible, publishers should try to draw from schemes represented on [org-id.guide](https://org-id.guide). If these are not available, publishers can [raise an issue](https://github.com/org-id/register/issues) on the org-id.guide Github repository.
Where possible, publishers should try to draw from schemes represented on [org-id.guide](https://org-id.guide). If these are not available, publishers can [raise an issue](https://github.com/org-id/register/issues) on the org-id.guide GitHub repository, and may also use the `organization_identifier/identifier_type` property to provide details of the scheme in a human-readable format.

There are other properties defined in `organization_identifier` which are necessary for HSDS' relational model between objects but may cause some confusion in this context:

* `organization_identifier/id` is the UUID for this specific `organization_identifier` object and is used for Tabular serializations.
* `organization_identifier/organization_id` is the UUID for the `organization` object which is associated with this organization identifier. It should match the `organization/id` property of an `organization` elsewhere in the dataset. It is used for Tabular serializations but is not required in JSON as the `organization_identifier` object should be [dereferenced](serialization.md#dereferencing).
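
To show how these properties relate, here is a minimal sketch of one organization and one of its identifiers, first as two linked tabular-style records and then as a single nested structure. The organization `name` is invented, and the array property name `organization_identifiers` is an assumption made for this example; check the schema reference for the exact property names.

```python
import uuid

organization = {
    "id": str(uuid.uuid4()),
    "name": "Example Organization",  # hypothetical value
}

# Tabular-style record: linked to its organization through `organization_id`.
organization_identifier = {
    "id": str(uuid.uuid4()),                    # UUID of this identifier record
    "organization_id": organization["id"],      # foreign key to organization/id
    "identifier_scheme": "GB-COH",              # register prefix
    "identifier": "09506232",                   # id drawn from that register
}

# Nested (dereferenced) form: the foreign key is no longer needed because
# the identifier sits inside the organization it belongs to.
nested_identifier = {
    key: value
    for key, value in organization_identifier.items()
    if key != "organization_id"
}
organization_json = {**organization, "organization_identifiers": [nested_identifier]}

print(organization_json)
```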


### Location Identifiers


47 changes: 23 additions & 24 deletions docs/hsds/schema_reference.md
Expand Up @@ -19,11 +19,10 @@ HSDS data is not hierarchical in the sense that it does not have a single top-l

Compiled schemas and example data, containing all HSDS objects, may be useful to publishers for a number of reasons. A number of compiled schema and example files are available from the [schema](https://github.com/openreferral/specification/tree/3.0/schema) and [examples](https://github.com/openreferral/specification/tree/3.0/examples) directories on the HSDS GitHub repository.

## Objects Reference
## Core Objects

### Core Objects

#### organization
### organization

`organization` is defined as:

Expand Down Expand Up @@ -60,7 +59,7 @@ Each `organization` object has the following fields:

::::

#### service
### service

`service` is defined as:

Expand Down Expand Up @@ -97,7 +96,7 @@ Each `service` object has the following fields:

::::

#### location
### location

`location` is defined as:

Expand Down Expand Up @@ -135,7 +134,7 @@ Each `location` object has the following fields:

::::

#### service_at_location
### service_at_location

`service_at_location` is defined as:

Expand Down Expand Up @@ -175,9 +174,9 @@ Each `service_at_location` object has the following fields:

::::

### Other Objects
## Other Objects

#### address
### address

`address` is defined as:

Expand Down Expand Up @@ -218,7 +217,7 @@ Each `address` object has the following fields:
::::


#### phone
### phone

`phone` is defined as:

Expand Down Expand Up @@ -259,7 +258,7 @@ Each `phone` object has the following fields:
::::


#### schedule
### schedule

`schedule` is defined as:

Expand Down Expand Up @@ -298,7 +297,7 @@ Each `schedule` object has the following fields:
::::


#### service_area
### service_area

`service_area` is defined as:

Expand Down Expand Up @@ -336,7 +335,7 @@ Each `service_area` object has the following fields:

::::

#### language
### language

`language` is defined as:

Expand Down Expand Up @@ -375,7 +374,7 @@ Each `language` object has the following fields:
::::


#### funding
### funding

`funding` is defined as:

Expand Down Expand Up @@ -412,7 +411,7 @@ Each `funding` object has the following fields:

::::

#### accessibility
### accessibility

`accessibility` is defined as:

Expand Down Expand Up @@ -450,7 +449,7 @@ Each `accessibility` object has the following fields:
::::


#### cost_option
### cost_option

`cost_option` is defined as:

Expand Down Expand Up @@ -488,7 +487,7 @@ Each `cost_option` object has the following fields:
::::


#### program
### program

`program` is defined as:

Expand Down Expand Up @@ -527,7 +526,7 @@ Each `program` object has the following fields:
::::


#### required_document
### required_document

`required_document` is defined as:

Expand Down Expand Up @@ -565,7 +564,7 @@ Each `required_document` object has the following fields:
::::


#### contact
### contact

`contact` is defined as:

Expand Down Expand Up @@ -603,7 +602,7 @@ Each `contact` object has the following fields:

::::

#### organization_identifier
### organization_identifier

`organization_identifier` is defined as:

Expand Down Expand Up @@ -641,7 +640,7 @@ Each `organization_identifier` object has the following fields:
::::


#### attribute
### attribute

`attribute` is defined as:

Expand Down Expand Up @@ -681,7 +680,7 @@ Each `attribute` object has the following fields:
::::


#### metadata
### metadata

`metadata` is defined as:

Expand Down Expand Up @@ -718,7 +717,7 @@ Each `metadata` object has the following fields:

::::

#### meta_table_description
### meta_table_description

`meta_table_description` is defined as:

Expand Down Expand Up @@ -747,7 +746,7 @@ Each `meta_table_description` object has the following fields:

::::

#### taxonomy
### taxonomy

`taxonomy` is defined as:

Expand Down Expand Up @@ -785,7 +784,7 @@ Each `taxonomy` object has the following fields:

::::

#### taxonomy_term
### taxonomy_term

`taxonomy_term` is defined as:

1 change: 1 addition & 0 deletions docs/index.md
Expand Up @@ -46,6 +46,7 @@ Contents:
hsds/api_reference
hsds/serialization
hsds/identifiers
hsds/conformance
hsds/variations_interoperability
hsds/changelog
