Support $erase (hard delete) for a specific historical version of a resources #2304

Closed
lmsurpre opened this issue Apr 30, 2021 · 2 comments
Labels
enhancement New feature or request showcase Used to Identify End-of-Sprint Demos

Comments

@lmsurpre
Member

lmsurpre commented Apr 30, 2021

Is your feature request related to a problem? Please describe.
We're working to introduce a new $erase operation (#2304) for doing a hard delete on a specific logical resource (all versions).

For select use cases, it would be nice to be able to hard delete a specific historical version of a resource while leaving the latest version of the resource intact.

There are two main use cases for this feature that I can think of:

  1. Saving room in the database. Each historical version of a resource is stored as a gzip-compressed JSON blob in the database. If you are using the server as the target of a dumb scheduled batch job that performs updates on each resource, this can start to add up over time. Allowing admin users to hard delete unneeded historical versions could reclaim that space.
  2. Removing resources due to data quality issues. For example, especially near the beginning of a project, integrators could churn through multiple versions of a resource before getting their ingestion pipeline "just right". If the customer does not want to expose these historical versions to the end user, but also does not want to "start fresh", then being able to selectively delete the historical versions could be useful.

Describe the solution you'd like
The $erase operation should take an optional argument so that you could invoke [base]/[ResourceType]/[id]/$erase with a version string that matches the version of the resource that you want to hard delete.
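To make the proposed invocation concrete, here is a minimal sketch of how a client might assemble such a request. The `build_erase_request` helper is hypothetical, and so is the body shape: the issue only asks for "a version string", so passing it as a `version` entry (typed here as `valueString`) in a FHIR Parameters resource is an assumption, not the final design. The sketch only builds the request; it does not send it.

```python
import json


def build_erase_request(base, resource_type, logical_id, version=None):
    """Return (method, url, body) for a hypothetical version-specific $erase call."""
    url = f"{base.rstrip('/')}/{resource_type}/{logical_id}/$erase"
    body = {"resourceType": "Parameters"}
    if version is not None:
        # Assumed parameter name and type; the issue does not fix either one.
        body["parameter"] = [{"name": "version", "valueString": str(version)}]
    return "POST", url, json.dumps(body)


# Example: ask to erase version 1 of Patient/1234 (base URL is a placeholder).
method, url, body = build_erase_request("https://example.org/fhir", "Patient", "1234", version=1)
```

Omitting `version` leaves the body without any parameters, which would correspond to erasing the whole logical resource.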

Any vread or resource history interaction that would read the erased version of the resource would need to return an appropriate error / warning.

If the user attempts to hard erase the latest version of a particular resource, I think we should reject that. That is, to delete the current version an end user would need to:

  1. Update the resource instance to a new version; then
  2. Delete the previous version
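The rules above can be sketched with a toy in-memory model. Everything here (`ResourceHistory`, its method names, and the exception types) is hypothetical and only illustrates the proposed behavior: erasing the latest version is rejected, and a vread of an erased version reports an error.

```python
class ResourceHistory:
    """Toy in-memory stand-in for one logical resource and its versions."""

    def __init__(self):
        self.versions = {}   # version number -> resource body
        self.erased = set()  # version numbers that were hard deleted
        self.latest = 0

    def update(self, body):
        """Store a new version and return its version number."""
        self.latest += 1
        self.versions[self.latest] = body
        return self.latest

    def erase_version(self, version):
        """Hard delete one historical version; the latest version is protected."""
        if version == self.latest:
            raise ValueError("cannot erase the latest version; update the resource first")
        if version not in self.versions:
            raise KeyError(version)
        del self.versions[version]
        self.erased.add(version)

    def vread(self, version):
        """Read a specific version, failing explicitly if it was erased."""
        if version in self.erased:
            raise LookupError(f"version {version} was erased")
        return self.versions[version]


history = ResourceHistory()
bad = history.update({"resourceType": "Patient", "active": False})
# To get rid of what is currently the latest version:
good = history.update({"resourceType": "Patient", "active": True})  # 1. update to a new version
history.erase_version(bad)                                          # 2. erase the previous version
```

A real server would map these failures to HTTP responses; which status codes are appropriate is left open here.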

Describe alternatives you've considered
To support the user passing a version string to $erase, depending on the final design of $erase, it's possible that we could support a pattern like this [base]/Patient/1234/_history/1/$erase, but I don't think that FHIR defines any resource version-specific operations like this.

To support erasure of the latest version of a resource, it would be possible to:

  1. set the previous version as the latest (if there is one)
  2. if there was one, then issue a reindex on it
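A rough sketch of that alternative, assuming versions are kept in a plain dict keyed by version number. The `erase_latest` helper and its return shape are hypothetical; the reindex itself is only signalled with a flag, since how it would be triggered depends on the server.

```python
def erase_latest(versions, latest):
    """Erase `latest` from `versions` (dict: version number -> resource body).

    Returns (new_latest, needs_reindex). new_latest is None when nothing remains.
    """
    if latest not in versions:
        raise KeyError(latest)
    del versions[latest]
    if not versions:
        # No previous version exists, so there is nothing to promote or reindex.
        return None, False
    new_latest = max(versions)  # 1. the previous version becomes the latest
    return new_latest, True     # 2. and it should be reindexed


versions = {1: {"active": True}, 2: {"active": True}, 3: {"active": False}}
new_latest, needs_reindex = erase_latest(versions, 3)
```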

Acceptance Criteria
1.
GIVEN [a precondition]
AND [another precondition]
WHEN [test step]
AND [test step]
THEN [verification step]
AND [verification step]

Additional context
Note: for Saving room in the database, my hope is that #2263 and #2284 will allow batch jobs to "stay dumb" (i.e. no state and no read-before-update) and still avoid producing lots of versions of the same resource with the same contents.

For Removing resources due to data quality issues, I think that most scenarios can be addressed by doing a hard delete of the whole resource, followed by a new ingestion run. One of the few cases I can think of where that might not be sufficient is if there are some appropriate and relevant historical versions of a resource, and then a "bad push". In this case, it would be nice to be able to update the resource back to a good state and then delete the historical version that had the bad data.
The only other way to address that particular scenario that I can think of would be to add support for posting history bundles (as described at https://www.hl7.org/fhir/http.html#other-bundles). If we had that, then a user could:

  1. read the history of the resource
  2. do a hard delete on the entire resource (all versions)
  3. edit the history bundle from step one to omit the bad version
  4. post the history bundle to the server to "restore" this resource with its proper history (including lastUpdated)
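Step 3 of the list above is the only part that happens client-side, and it could look roughly like this. The `omit_version` helper is hypothetical; it assumes the history Bundle has been parsed into a dict and that each entry carries its version in `resource.meta.versionId`. Reading the history, erasing the resource, and posting the edited bundle are not shown.

```python
import copy


def omit_version(history_bundle, bad_version_id):
    """Return a copy of a history Bundle without the entry for one versionId."""
    cleaned = copy.deepcopy(history_bundle)
    cleaned["entry"] = [
        entry
        for entry in cleaned.get("entry", [])
        # Entries without a resource (e.g. delete markers) are kept as they are.
        if entry.get("resource", {}).get("meta", {}).get("versionId") != bad_version_id
    ]
    return cleaned


bundle = {
    "resourceType": "Bundle",
    "type": "history",
    "entry": [
        {"resource": {"resourceType": "Patient", "id": "1234", "meta": {"versionId": "3"}}},
        {"resource": {"resourceType": "Patient", "id": "1234", "meta": {"versionId": "2"}}},
        {"resource": {"resourceType": "Patient", "id": "1234", "meta": {"versionId": "1"}}},
    ],
}
restored = omit_version(bundle, "2")  # drop the "bad push"
```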
@prb112 prb112 added the enhancement New feature or request label May 4, 2021
@prb112 prb112 self-assigned this May 7, 2021
@prb112 prb112 added this to the Sprint 2021-06 milestone May 7, 2021
@prb112
Contributor

prb112 commented May 7, 2021

I've added support for specific version erase following the supplied description.

prb112 added a commit that referenced this issue May 10, 2021
- Issues resolved are:
	- Add GDPR FHIR Operation Support - $erase #850
	- FHIROperationContext inconsistently available to Batch.java #2297
	- Support $erase (hard delete) for a specific historical version of a
resources #2304

- Remove the timeout and count parameters in lieu of the
version-specific erase
- Update the README with clarity on the parameters
	- logicalId, version, patient, reason
	- clarified the HTTP Method, Api Paths, Response Codes, and Responses
	- Add Acceptance Criteria to explain VERSION delete
- Update the test harnesses for Db2 and Postgres to use the simple Erase
approach
- Clean up the DAO, JdbcEraseTest to move away from Timeout
- Removed patient/reason so we don't reflect illegal content.
- Change function/stored proc to use resource_type, logical_id only
- Refactor erase.json to support only specific values.
- Revert test.properties to logging false
- Improve IT/UT test coverage

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>
@lmsurpre lmsurpre added the showcase Used to Identify End-of-Sprint Demos label May 11, 2021
@JohnTimm
Collaborator

Ran through the Postman collection that @prb112 provided to verify the feature.
