Skip to content

Commit

Permalink
fixup! RFC 116: store attachment data in content items
Browse files Browse the repository at this point in the history
  • Loading branch information
barrucadu committed Jan 24, 2020
1 parent 29cedf5 commit e0a3658
Showing 1 changed file with 149 additions and 72 deletions.
221 changes: 149 additions & 72 deletions rfc-116-store-attachment-data-in-content-items.md
Original file line number Diff line number Diff line change
Expand Up @@ -66,37 +66,61 @@ shouldn't conflate the two.

I propose we do this:

1. Split `asset_link` into two new types: `attachment_asset` and
1. Split `asset_link` into four new types: `base_attachment_asset`,
`attachment_asset`, `specialist_publisher_attachment_asset` and
`image_asset`, updating the schemas which use `asset_link`
accordingly.

**Migration concerns:** This can be done without breaking
compatibility, as it's an internal change to the schemas.
accordingly, with these fields:

- `base_attachment_asset`:
- `url` (required)
- `content_type` (required)
- `title`
- `attachment_asset` (extends `base_attachment_asset`):
- no extra fields at this time
- `specialist_publisher_attachment_asset` (extends `base_attachment_asset`):
- `content_id` (required)
- `created_at`
- `updated_at`
- `image_asset`:
- `url` (required)
- `content_type` (required)

**Migration concerns:** This would need to be done in three steps:
1. Update manuals-publisher to not set the `created_at` and `updated_at` for attachments.
2. Republish all manual sections.
3. Update the schemas.

1. Add new optional fields for extra metadata:

- Attachments:
- `base_attachment_asset`:
- `accessible`
- `alternative_format_contact_email`
- `attachment_type`
- `command_paper_number`
- `file_size`
- `filename`
- `hoc_paper_number`
- `id`
- `isbn`
- `file_size`
- `locale`
- `number_of_pages`
- `parliamentary_session`
- `preview_url`
- `unique_reference`
- `unnumbered_command_paper`
- `unnumbered_hoc_paper`
- Images:
- `attachment_asset`
- `id`
- `image_asset`
- `alt_text`
- `caption`
- `credit`
- `id`

**Migration concerns:** This can be done without breaking
compatibility, as all new fields are optional.

1. Add new attachment type for publication attachments:

`publication_attachment_asset` (extends `attachment_asset`) with fields:
- `attachment_type` (required)
- `command_paper_number`
- `hoc_paper_number`
- `isbn`
- `parliamentary_session`
- `unique_reference`
- `unnumbered_command_paper`
- `unnumbered_hoc_paper`

I've left out the following fields because they can be deduced from
others: `csv?`, `external?`, `external_url`, `file_extension`, `html?`,
Expand All @@ -113,96 +137,143 @@ I propose we do this:
`ordering`, `slug`.

**Migration concerns:** This can be done without breaking
compatibility, as it doesn't add any new mandatory fields.

1. Replace `content_id` in attachments with `id`.

**Migration concerns:** this would need to be done in five steps:
add `id`; update app to read from both fields but write to `id`;
republish docs; remove `content_id`; update apps to not use
`content_id`.
compatibility, as the new type will be unused for now.

1. Remove non-useful metadata fields:
1. Make `id` in `attachment_asset` mandatory.

- Attachments: `created_at` and `updated_at`

**Migration concerns:** this would need to be done in two steps:
update apps; remove fields.
**Migration concerns:** this would need to be done in three steps:
1. Update travel advice publisher and manuals publisher to set an arbitrary ID for attachments.
2. Republish all travel advice pages and manual sections.
3. Make the field mandatory.

1. Add a *mandatory* `details.attachments` field to all formats which
can have attachments.

**Migration concerns:** this would need to be done in four steps: add
*optional* field; update apps to write to it; republish docs; make
field mandatory.
**Migration concerns:** this would need to be done in four steps:
1. Add the field as optional.
2. Update publishing apps to write to it.
3. Republish all affected documents.
4. Make the field mandatory.

1. Add a *mandatory* `details.document_attachments` list to
publication formats, which is an ordered list of govspeak or HTML
strings (using the `#/definitions/multiple_content_types` schema)
rendering attachments from `details.attachments`.

1. Change the `details.documents` list to store govspeak hashes.
**Migration concerns:** this would need to be done in four steps:
1. Add the field as optional.
2. Update publishing apps (just Whitehall and Content Publisher) to write to it.
3. Republish all affected documents.
4. Make the field mandatory.

**Migration concerns:** this would need to be done in five steps:
change schema to allow both types; update frontend apps; update
publishing apps; republish docs; change schema to only allow new
type.
1. Remove the `details.documents` field.

1. Document the relationship between the `attachment_asset` type in
our content schemas and the data Govspeak attachments use. For
example, all fields required in one must be required in the other.
Any pairs of fields which do the same job in both should have the
same name. Implement any changes arising from this, for example,
more Whitehall-like attachment rendering in Govspeak.
**Migration concerns:** this would need to be done in four steps:
1. Update frontend apps to render from `details.document_attachments`.
2. Update publishing apps to not set `details.documents`.
3. Republish all affected documents.
4. Remove the field.

The final `attachment_asset_list` / `attachment_asset` / `image_asset`
schema will be:
The final schemas will be:

```
{
image_asset: {
type: "object",
additionalProperties: false,
required: [
"content_type",
"url",
],
properties: {
id: { type: "string", },
alt_text: { type: "string", },
caption: { type: "string", },
content_type: { type: "string", },
credit: { type: "string", },
url: { type: "string", format: "uri", },
},
},
attachment_asset: {
base_attachment_asset: {
type: "object",
additionalProperties: false,
required: [
"url",
"content_type",
"url",
],
properties: {
accessible: { type: "boolean", },
alternative_format_contact_email: { type: "string", },
attachment_type: { type: "string", enum: ["external", "file", "html"], },
command_paper_number: { type: "string", },
content_type: { type: "string", },
file_size: { type: "integer", },
filename: { type: "string", },
hoc_paper_number: { type: "string", },
id: { type: "string", },
isbn: { type: "string", },
locale: { "$ref": "#/definitions/locale", },
number_of_pages: { type: "integer", },
parliamentary_session: { type: "string", },
preview_url: { type: "string", format: "uri", },
title: { type: "string", },
unique_reference: { type: "string", },
unnumbered_command_paper: { type: "boolean", },
unnumbered_hoc_paper: { type: "boolean", },
url: { type: "string", format: "uri", },
},
},
attachment_asset_list: {
description: "An ordered list of attachment links",
type: "array",
items: {
"$ref": "#/definitions/attachment_asset",
attachment_asset: {
type: "object",
additionalProperties: false,
required: [
"content_type",
"id",
"url",
],
properties: {
allOf: [
{ "$ref": "#/definitions/base_attachment_asset/properties" },
{ id: { type: "string" } },
],
},
},
specialist_publisher_attachment_asset: {
type: "object",
additionalProperties: false,
required: [
"content_id",
"content_type",
"url",
],
properties: {
allOf: [
{ "$ref": "#/definitions/base_attachment_asset/properties" },
{
content_id: { "$ref": "#/definitions/guid", },
created_at: { format: "date-time", },
updated_at: { format: "date-time", },
},
],
},
},
publication_attachment_asset: {
type: "object",
additionalProperties: false,
required: [
"attachment_type",
"content_type",
"id",
"url",
],
properties: {
allOf: [
{ "$ref": "#/definitions/attachment_asset/properties" },
{
attachment_type: { type: "string", enum: ["external", "file", "html"], },
command_paper_number: { type: "string", },
hoc_paper_number: { type: "string", },
isbn: { type: "string", },
parliamentary_session: { type: "string", },
unique_reference: { type: "string", },
unnumbered_command_paper: { type: "boolean", },
unnumbered_hoc_paper: { type: "boolean", },
},
],
},
},
}
Expand All @@ -212,17 +283,16 @@ schema will be:

**Why two lists?**

There are really two different types of attachments in documents.
There are attachments which may be referenced in Govspeak but don't
otherwise appear on the document; in this case the attachment hashes
are more like metadata than actual document content. There are also
attachments which appear differently to the normal document text, like
in publications.
As mentioned earlier, there are really two different types of
attachments in documents. There are attachments which appear in the
body, and attachments which appear separately to the normal document
text, like in publications.

The `details.attachments` list would contain both types of
attachments, so there is one list with everything in. The
`details.documents` list would fulfil its current purpose: providing
the ordered list of attachments for publication-like document types.
`details.document_attachments` list fulfils its current purpose:
providing the ordered list of document-level attachments for
publication-like document types.

**Why split `asset_link`?**

Expand All @@ -231,6 +301,13 @@ attachments. However images have strictly less permissable metadata
than attachments, it doesn't make sense to give a page image an ISBN
for example. These should be separate types.

**Why have multiple attachment types?**

Different publishing apps have different allowable metadata, so it
makes sense to have different types constraining what a publishing app
is able to provide. However, we still want some unifying structure
behind the types, so we use inheritance.

**Why make the lists mandatory?**

Is there a difference between a missing list and a present, but empty,
Expand Down

0 comments on commit e0a3658

Please sign in to comment.