Me/dpc 4458 bulk pat submit update #2394

Open
wants to merge 15 commits into
base: main

Conversation

MEspositoE14s
Contributor

🎫 Ticket

https://jira.cms.gov/browse/DPC-4458

🛠 Changes

Bulk submits for both Patient and Practitioner resources have been upgraded:

  • In dpc-api, resource validation is done in parallel instead of sequentially.
  • In dpc-attribution, inserts and queries are both batched before being sent to the DB.
  • DB batch and query sizes are configurable in the application.yml files.
  • Customer-generated IDs for Patient and Practitioner resources are replaced.
  • Logging was added for Patient and Practitioner bundle size on bulk submission.
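
As a rough illustration of the first two changes, here is a minimal sketch of parallel validation followed by splitting the result into fixed-size batches. It is not the code in this PR: the class name, `validateInParallel`, `partition`, the string IDs, and the batch size of 1000 are all hypothetical stand-ins, and the real services presumably work on FHIR resources and write each batch to the DB.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class BulkSubmitSketch {

    // Hypothetical stand-in for resource validation: keeps the items that pass.
    // A parallel stream spreads the checks across the common fork-join pool,
    // and collecting from a List source preserves the original order.
    public static <T> List<T> validateInParallel(List<T> resources, Predicate<T> isValid) {
        return resources.parallelStream()
                .filter(isValid)
                .collect(Collectors.toList());
    }

    // Splits a list into consecutive batches of at most batchSize elements,
    // so each batch could be sent to the DB as one insert or one query.
    public static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Placeholder IDs instead of real Patient resources (assumption).
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            ids.add("patient-" + i);
        }
        List<String> valid = validateInParallel(ids, id -> !id.isEmpty());
        // 1000 is an arbitrary example value, not the configured batch size.
        List<List<String>> batches = partition(valid, 1000);
        System.out.println(valid.size() + " valid resources in " + batches.size() + " batches");
    }
}
```

In a sketch like this the batch size would come from configuration rather than a literal, which is the role the application.yml settings play in the list above.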

ℹ️ Context

We noticed that /Patient/$submit was failing every time it was called by a customer, but working when called by our smoke tests. It worked for our smoke tests because they only submit 100 patients at a time, but our customers were trying to submit more.

After some testing, we realized that /Patient/$submit usually times out somewhere between 3k and 4k patients. This new version can handle 100k in local tests. (For reference, only one of our customers has more than 100k patients, so all but that one should now be able to submit their entire patient roster.)

Note

The current timeout when dpc-api calls dpc-attribution is 20 seconds. That number wasn't chosen for any customer-specific reason; it was just the lowest number that worked with our smoke tests. If we decide to raise it, we should be able to handle 250k patients at a time, at least in local tests, before we start running into memory issues.

🧪 Validation

Ran locally and got the following results:

| Patients | Total Request Time |
|---|---|
| 1000 | 2 seconds |
| 5000 | 4 seconds |
| 10000 | 6 seconds |
| 50000 | 18 seconds |
| 100000 | 30 seconds |
| 250000 | 72 seconds (Timeout removed) |
| 500000 | Memory error (Timeout removed) |
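
For a quick read on these numbers, the sketch below converts each successful run into an approximate throughput. The class and method names are hypothetical, and the timings above are whole seconds, so the results are only rough. Throughput rises from about 500 patients/second at 1000 patients to about 3,500 patients/second at 250000, which may reflect fixed per-request overhead being amortized over larger submissions, though the PR does not measure that directly.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ThroughputSketch {

    // Approximate patients processed per second, rounded to the nearest whole number.
    public static long patientsPerSecond(long patients, long seconds) {
        return Math.round((double) patients / seconds);
    }

    public static void main(String[] args) {
        // Patient counts and total request times copied from the results above.
        Map<Long, Long> results = new LinkedHashMap<>();
        results.put(1_000L, 2L);
        results.put(5_000L, 4L);
        results.put(10_000L, 6L);
        results.put(50_000L, 18L);
        results.put(100_000L, 30L);
        results.put(250_000L, 72L);

        for (Map.Entry<Long, Long> run : results.entrySet()) {
            System.out.printf("%d patients: ~%d patients/second%n",
                    run.getKey(), patientsPerSecond(run.getKey(), run.getValue()));
        }
    }
}
```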

MEspositoE14s requested a review from a team on January 23, 2025, 20:13
@@ -625,6 +631,39 @@ public void testPatientPathAuthorization() throws GeneralSecurityException, IOEx
, "Expected auth error when export another org's patient's data");
}

@Test
void testBatchSubmit() throws GeneralSecurityException, IOException, URISyntaxException {
final int COUNT_TEST_PATIENTS = 500;
Contributor Author


If you want to test the API's performance locally without generating thousands of test patients and setting up Postman, set COUNT_TEST_PATIENTS here to the number of patients you want and run this test. It does everything for you and logs the total transaction time.
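
The general shape of such a test (generate N patients, submit them, log the elapsed time) can be sketched as below. This is only an illustration under stated assumptions, not the test added in this PR: `generatePatients`, `timeMillis`, and the in-memory `submit` consumer are hypothetical stand-ins for the real patient fixtures and the call to /Patient/$submit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class SubmitTimingSketch {

    // Builds `count` placeholder patient identifiers (stand-ins for real Patient resources).
    public static List<String> generatePatients(int count) {
        List<String> patients = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            patients.add("test-patient-" + i);
        }
        return patients;
    }

    // Runs the action once and returns the elapsed wall-clock time in milliseconds.
    public static long timeMillis(Runnable action) {
        long start = System.nanoTime();
        action.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        final int COUNT_TEST_PATIENTS = 500; // change this to try other sizes
        List<String> patients = generatePatients(COUNT_TEST_PATIENTS);

        // Hypothetical submit: collects the patients in memory instead of calling the API.
        List<String> received = new ArrayList<>();
        Consumer<List<String>> submit = received::addAll;

        long elapsed = timeMillis(() -> submit.accept(patients));
        System.out.println("Submitted " + received.size() + " patients in " + elapsed + " ms");
    }
}
```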

Contributor

jdettmannnava, Jan 23, 2025


Running from a local script, I can submit 220,000 patients successfully, but it fails at 300,000.

Contributor Author


That's pretty similar to what I'm getting, too. Outside of one particular customer that has ~900k patients, that should be enough for everyone else to submit their entire patient roster in one shot if they need to.

2 participants