Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reindex resource selection update slows down as reindex progresses (postgres) #1818

Closed
punktilious opened this issue Dec 11, 2020 · 0 comments
Assignees
Labels
bug Something isn't working performance performance postgres
Milestone

Comments

@punktilious
Copy link
Collaborator

punktilious commented Dec 11, 2020

Describe the bug
The reindex operation uses a statement like the following to pick the next resource to reindex. This is intended to avoid any concurrency issues. Unfortunately, although the index on reindex_tstamp is DESC, the scan ends up being a forward scan and must hit more and more blocks in order to find the first resource matching the filter predicate:

UPDATE fhirdata.logical_resources    
   SET reindex_tstamp = '2020-12-10T00:00:00Z'     
WHERE logical_resource_id = (
	SELECT lr.logical_resource_id          
  	  FROM fhirdata.logical_resources lr        
	 WHERE lr.reindex_tstamp < '2020-12-10T00:00:00Z'
  ORDER BY lr.reindex_tstamp DESC    
       FOR UPDATE SKIP LOCKED LIMIT 1) 
RETURNING logical_resource_id, resource_type_id, logical_id, reindex_txid;

Removing the DESC on the order by reduces the number of shared buffers hit from 50,000 to 7, and reduces the execution time from 900ms to 0.018ms according to EXPLAIN (ANALYZE,BUFFERS):

UPDATE fhirdata.logical_resources    
   SET reindex_tstamp = '2020-12-10T00:00:00Z'     
WHERE logical_resource_id = (
	SELECT lr.logical_resource_id          
  	  FROM fhirdata.logical_resources lr        
	 WHERE lr.reindex_tstamp < '2020-12-10T00:00:00Z'
  ORDER BY lr.reindex_tstamp   
       FOR UPDATE SKIP LOCKED LIMIT 1) 
RETURNING logical_resource_id, resource_type_id, logical_id, reindex_txid;

Explain output for the SELECT part of the statement:

"Limit  (cost=0.57..1.65 rows=1 width=16) (actual time=0.038..0.045 rows=1 loops=1)"
"  Buffers: shared hit=7"
"  ->  Index Scan Backward using idx_logical_resources_rits on logical_resources lr  (cost=0.57..113322070.36 rows=104493008 width=16) (actual time=0.034..0.036 rows=1 loops=1)"
"        Index Cond: (reindex_tstamp < '2020-12-10 00:00:00'::timestamp without time zone)"
"        Buffers: shared hit=7"
"Planning Time: 0.127 ms"
"Execution Time: 0.081 ms"
@punktilious punktilious added bug Something isn't working performance performance postgres labels Dec 11, 2020
punktilious added a commit that referenced this issue Dec 11, 2020
…rogresses

Signed-off-by: Robin Arnold <robin.arnold23@ibm.com>
punktilious added a commit that referenced this issue Dec 11, 2020
issue #1818 reindex resource selection update slows down as reindex p…
@prb112 prb112 added this to the Sprint 22 milestone Dec 11, 2020
@prb112 prb112 closed this as completed Dec 11, 2020
JohnTimm added a commit that referenced this issue Dec 15, 2020
* Enable OpenLiberty override the fhiropenapi functionality #1760

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Issue #1780 - add compartment search/search parameter config section

Signed-off-by: Mike Schroeder <mschroed@us.ibm.com>

* issue #1789 refresh-tenants does not handle Db2 multi-tenant/multi-schema scenarios

Signed-off-by: Robin Arnold <robin.arnold23@ibm.com>

* Invalid SQL object name `--pool-size` exception when running Docker
`fhir-persistence-schema` on PostgreSQL #1797

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Misleading error messages when using command line parameters and not
`persistence.json` for Docker `fhir-persistence-schema` #1796

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Schema defined for `fhir` in `persistence.json` not used correctly when
running Docker `fhir-persistence-schema`
#1795

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* doc: update must gather to include open shift details

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* use the fhiruser password from helm (set by overrides from KeyProtect)

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>

* Fix the definition for List-identifier search

A workaround for https://jira.hl7.org/browse/FHIR-29937

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>

* #1802

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* #1802

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Fix up the Must Gather

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* IBM FHIR Server audit does not log bundles correctly. #1803

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* update documentation for the mustgather

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* fix: per code review changed the compare, and added an additional ignore
for the metadata endpoints

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Update fhir-server/src/main/java/com/ibm/fhir/server/util/RestAuditLogger.java

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

Co-authored-by: Lee Surprenant <lmsurpre@us.ibm.com>

* doc: compilation errors are now fixed

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* doc: compilation errors are now fixed

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Add tests case and fixes for #1795

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Add tests case and fixes for #1795

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Add tests case and fixes for #1795

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Add tests case and fixes for #1796

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Add tests case and fixes for #1797

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Add tests for cp and #1802

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* fix: #1808

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Add tests for plain and base64 encoded environment variables

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* fix per comments in code review and issue

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Prevent unauthorized reads before they happen

This prevents us from leaking information about what patient resources
exist / don't exist on the system, which was deemed as useful in case an
implementer uses sensitive data for the Resource.id of the Patient
resources (which still isn't advised).

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>

* Prevent unauthorized vread and history operations against Patient too

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>

* doc: adding code of conduct md file to root of repository

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* doc: update to clarify the experimental use of the ibm-fhir-schematool

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* doc: update to clarify the experimental use of the ibm-fhir-schematool

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* doc: adding code of conduct md file to root of repository

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* add communication method

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>

* Audit Refactoring #1542

- Refactor to separate Configuration, Mapping and AuditService
- Configuration to use EventStreams Binding by default
- Add a MapperFactory to split between cadf and auditevent types
	- AuditEventMapper - Wraps the AuditLogEntry into AuditEvent format
	- CADFMapper - Wraps the AuditLogEntry into CADF format
- Audit Service is now KafkaService, NoOpService separating the
Streaming framework from the mapping actions.
- Removed duplicate code for Event/EventStrams processing
- Add Test Coverage for Uncovered Classes

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Clean up spare printStackTrace

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* update the documentation for fhir-audit

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* updates per team code review

- Change NoOp to Nop Service
- Remove references to WHC/Disabled
- Changed and clarified package names
- Changed references to specific default places to generic/unknown
places (e.g. dallas to unknown)
- added examples for audit from config and audit from environment

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* issue #1818 reindex resource selection update slows down as reindex progresses

Signed-off-by: Robin Arnold <robin.arnold23@ibm.com>

* updates per team code review

- changed package names to remove logging.api and .model (reconciling it
to a flatter structure)

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* update docs

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* change formatting slightly on MapperFactory

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* fix: it tests were not running for drug formulary and c4bb

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* fix incorrect variable in minor startup log message

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>

* Convert Strings to Constants for Kafka Key-Value in Properties objects

- Per Code Review with Robin Arnold

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Code Review for FHIR Audit

- CadfEvent doesn't write out target
- Addresses Written Twice in CadfResource

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Added Test for FHIR Audit so it doesn't have multiple address blocks

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* update docs

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Removed auditLogService <extension url="http://ibm.com/fhir/extension/auditLogProperties">

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* fix: referenced license was incorrect

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

Co-authored-by: Paul Bastide <pbastide@us.ibm.com>
Co-authored-by: Mike Schroeder <mschroed@us.ibm.com>
Co-authored-by: Robin Arnold <robin.arnold23@ibm.com>
Co-authored-by: Lee Surprenant <lmsurpre@us.ibm.com>
Co-authored-by: Michael W Schroeder <66479070+michaelwschroeder@users.noreply.github.com>
JohnTimm added a commit that referenced this issue Dec 22, 2020
* Enable OpenLiberty override the fhiropenapi functionality #1760

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Issue #1780 - add compartment search/search parameter config section

Signed-off-by: Mike Schroeder <mschroed@us.ibm.com>

* issue #1789 refresh-tenants does not handle Db2 multi-tenant/multi-schema scenarios

Signed-off-by: Robin Arnold <robin.arnold23@ibm.com>

* Invalid SQL object name `--pool-size` exception when running Docker
`fhir-persistence-schema` on PostgreSQL #1797

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Misleading error messages when using command line parameters and not
`persistence.json` for Docker `fhir-persistence-schema` #1796

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Schema defined for `fhir` in `persistence.json` not used correctly when
running Docker `fhir-persistence-schema`
#1795

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* doc: update must gather to include open shift details

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* use the fhiruser password from helm (set by overrides from KeyProtect)

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>

* Fix the definition for List-identifier search

A workaround for https://jira.hl7.org/browse/FHIR-29937

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>

* #1802

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* #1802

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Fix up the Must Gather

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* IBM FHIR Server audit does not log bundles correctly. #1803

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* update documentation for the mustgather

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* fix: per code review changed the compare, and added an additional ignore
for the metadata endpoints

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Update fhir-server/src/main/java/com/ibm/fhir/server/util/RestAuditLogger.java

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

Co-authored-by: Lee Surprenant <lmsurpre@us.ibm.com>

* doc: compilation errors are now fixed

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* doc: compilation errors are now fixed

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Add tests case and fixes for #1795

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Add tests case and fixes for #1795

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Add tests case and fixes for #1795

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Add tests case and fixes for #1796

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Add tests case and fixes for #1797

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Add tests for cp and #1802

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* fix: #1808

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Add tests for plain and base64 encoded environment variables

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* fix per comments in code review and issue

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Prevent unauthorized reads before they happen

This prevents us from leaking information about what patient resources
exist / don't exist on the system, which was deemed as useful in case an
implementer uses sensitive data for the Resource.id of the Patient
resources (which still isn't advised).

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>

* Prevent unauthorized vread and history operations against Patient too

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>

* doc: adding code of conduct md file to root of repository

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* doc: update to clarify the experimental use of the ibm-fhir-schematool

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* doc: update to clarify the experimental use of the ibm-fhir-schematool

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* doc: adding code of conduct md file to root of repository

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* add communication method

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>

* Audit Refactoring #1542

- Refactor to separate Configuration, Mapping and AuditService
- Configuration to use EventStreams Binding by default
- Add a MapperFactory to split between cadf and auditevent types
	- AuditEventMapper - Wraps the AuditLogEntry into AuditEvent format
	- CADFMapper - Wraps the AuditLogEntry into CADF format
- Audit Service is now KafkaService, NoOpService separating the
Streaming framework from the mapping actions.
- Removed duplicate code for Event/EventStrams processing
- Add Test Coverage for Uncovered Classes

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Clean up spare printStackTrace

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* update the documentation for fhir-audit

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* updates per team code review

- Change NoOp to Nop Service
- Remove references to WHC/Disabled
- Changed and clarified package names
- Changed references to specific default places to generic/unknown
places (e.g. dallas to unknown)
- added examples for audit from config and audit from environment

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* issue #1818 reindex resource selection update slows down as reindex progresses

Signed-off-by: Robin Arnold <robin.arnold23@ibm.com>

* updates per team code review

- changed package names to remove logging.api and .model (reconciling it
to a flatter structure)

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* update docs

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* change formatting slightly on MapperFactory

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* fix: it tests were not running for drug formulary and c4bb

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* fix incorrect variable in minor startup log message

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>

* Convert Strings to Constants for Kafka Key-Value in Properties objects

- Per Code Review with Robin Arnold

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Update to 4.6.0-SNAPSHOT

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Code Review for FHIR Audit

- CadfEvent doesn't write out target
- Addresses Written Twice in CadfResource

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Added Test for FHIR Audit so it doesn't have multiple address blocks

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* update docs

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Removed auditLogService <extension url="http://ibm.com/fhir/extension/auditLogProperties">

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* fix: referenced license was incorrect

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* Package name consistency issue after Code Review of FHIR-Audit Config
Examples

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* issue #1831 - fix search parameter filtering issue

Signed-off-by: Mike Schroeder <mschroed@us.ibm.com>

* issue #1739 - Update _include and _revinclude parameter checking

Signed-off-by: Troy Biesterfeld <tbieste@us.ibm.com>

* Update .index.json

Warning about invalid entry when discovering HREX profile. 
An example snuck into the .index.json. 

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

* update changelog.md

Signed-off-by: Paul Bastide <pbastide@us.ibm.com>

Co-authored-by: Paul Bastide <pbastide@us.ibm.com>
Co-authored-by: Mike Schroeder <mschroed@us.ibm.com>
Co-authored-by: Robin Arnold <robin.arnold23@ibm.com>
Co-authored-by: Lee Surprenant <lmsurpre@us.ibm.com>
Co-authored-by: Michael W Schroeder <66479070+michaelwschroeder@users.noreply.github.com>
Co-authored-by: Troy Biesterfeld <tbieste@us.ibm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working performance performance postgres
Projects
None yet
Development

No branches or pull requests

2 participants