bulkdata patient/group exports are slow #1198

Closed
albertwang-ibm opened this issue Jun 5, 2020 · 8 comments

Comments

@albertwang-ibm
Contributor

Currently, the code gets the patients first and then runs a compartment search for each patient, one by one, for the required resource type. This means that for each page of patients (e.g., 200), 200 search queries have to be made to the database to fetch the required resources for those patients.

We should consider enhancing the search API so that it can get compartment resources for multiple patients in one request. This would greatly improve the performance of patient/group exports.
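To make the query count concrete, here is a minimal sketch of the two access patterns. It is not the server's code: `CountingStore`, `search_compartment`, `export_one_by_one`, and `export_batched` are hypothetical names, and the "database" is an in-memory list that only counts how many queries are issued.

```python
# Hypothetical in-memory stand-in for the database: (patient_id, resource_id) rows.
ROWS = [(f"p{i}", f"eob-{i}-{j}") for i in range(200) for j in range(2)]


class CountingStore:
    """Counts how many search queries are issued against ROWS."""

    def __init__(self):
        self.queries = 0

    def search_compartment(self, patient_ids):
        # One call stands for one database query, however many ids are passed.
        self.queries += 1
        wanted = set(patient_ids)
        return [(pid, rid) for pid, rid in ROWS if pid in wanted]


def export_one_by_one(store, page):
    # Current behaviour as described above: one search per patient.
    results = []
    for pid in page:
        results.extend(store.search_compartment([pid]))
    return results


def export_batched(store, page):
    # Proposed behaviour: one search for the whole page of patients.
    return store.search_compartment(page)


page = [f"p{i}" for i in range(200)]
slow, fast = CountingStore(), CountingStore()
a = export_one_by_one(slow, page)
b = export_batched(fast, page)
print(slow.queries, fast.queries)
```

Both functions return the same rows for the page, but the first issues 200 queries and the second issues 1.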

@albertwang-ibm albertwang-ibm added enhancement New feature or request performance performance labels Jun 5, 2020
@prb112
Contributor

prb112 commented Jun 5, 2020

Could you use an _include? Something like this, though this one is for EOB: "_include", "ExplanationOfBenefit:patient"

@albertwang-ibm
Contributor Author

Not really, this is different: it gets the Patient resources for the found ExplanationOfBenefits whose subject is one of the patients.

@albertwang-ibm
Contributor Author

What we need is to get the ExplanationOfBenefits for all found patients.

@albertwang-ibm
Contributor Author

Maybe we can use _revinclude ...

@prb112
Contributor

prb112 commented Jun 5, 2020

Right. It was just an example; you can flip it around.
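For reference, the two directions discussed here can be written as FHIR search URLs. This is only an illustrative sketch: `search_url` is a hypothetical helper, and the relative URLs assume the standard FHIR `_include` / `_revinclude` search parameters rather than anything specific to this server.

```python
from urllib.parse import urlencode


def search_url(resource_type, params):
    # Hypothetical helper: builds a relative FHIR search URL.
    # safe=":" keeps the "Type:param" value readable instead of percent-encoding the colon.
    return f"{resource_type}?{urlencode(params, safe=':')}"


# Search ExplanationOfBenefits and include the Patients they reference.
include_url = search_url(
    "ExplanationOfBenefit", {"_include": "ExplanationOfBenefit:patient"}
)

# Flipped around: search Patients and include the ExplanationOfBenefits
# that reference them.
revinclude_url = search_url(
    "Patient", {"_revinclude": "ExplanationOfBenefit:patient"}
)

print(include_url)
print(revinclude_url)
```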

@albertwang-ibm
Contributor Author

albertwang-ibm commented Jun 5, 2020

Actually, I'm thinking that using _revinclude for the patient search could cause a huge join, which may also perform badly. For example, if we want to export all patients with all their ExplanationOfBenefits, this will cause a huge join on the DB side, which could cause the DB timeouts that we have seen before. Maybe we can do better by passing multiple IDs to the compartment search instead of one and using "in" instead of "=" in the SQL layer. This will hopefully improve performance and avoid the potential timeout issue.
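A minimal sketch of the "in" instead of "=" idea, using an in-memory SQLite database. The `eob` table, its columns, and the function names are assumptions made for illustration; they are not the server's real schema or SQL layer.

```python
import sqlite3

# Hypothetical, simplified schema: one row per ExplanationOfBenefit.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE eob (id TEXT PRIMARY KEY, patient_id TEXT)")
conn.executemany(
    "INSERT INTO eob VALUES (?, ?)",
    [("eob1", "p1"), ("eob2", "p1"), ("eob3", "p2"), ("eob4", "p3")],
)


def eobs_for_patient(conn, patient_id):
    # "=" form: one query per patient.
    cur = conn.execute(
        "SELECT id FROM eob WHERE patient_id = ? ORDER BY id", (patient_id,)
    )
    return [row[0] for row in cur]


def eobs_for_patients(conn, patient_ids):
    # "in" form: one query for a whole page of patient ids.
    if not patient_ids:
        return []
    placeholders = ", ".join("?" for _ in patient_ids)
    sql = (
        "SELECT patient_id, id FROM eob "
        f"WHERE patient_id IN ({placeholders}) "
        "ORDER BY patient_id, id"
    )
    return conn.execute(sql, list(patient_ids)).fetchall()


batched = eobs_for_patients(conn, ["p1", "p2"])
print(batched)  # [('p1', 'eob1'), ('p1', 'eob2'), ('p2', 'eob3')]
```

Databases limit the number of bound parameters per statement, so a real implementation would probably need to keep the page size under that limit or split the ID list into chunks.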

@albertwang-ibm
Contributor Author

albertwang-ibm commented Jun 5, 2020

And it seems _revinclude for patients may not work for group export, because we don't have something like \patient?groupid="xxxxx"

@albertwang-ibm
Contributor Author

Ideally, we could have an API like \patient?groupid="xxxxx", no matter whether the group is member-based or characteristics-based. Then we could use _revinclude based on it.

@albertwang-ibm albertwang-ibm added this to the Sprint 13 milestone Jun 9, 2020
@albertwang-ibm albertwang-ibm self-assigned this Jun 9, 2020
albertwang-ibm added a commit that referenced this issue Jun 9, 2020
Signed-off-by: Albert Wang <xuwang@us.ibm.com>
albertwang-ibm added a commit that referenced this issue Jun 10, 2020
Signed-off-by: Albert Wang <xuwang@us.ibm.com>
albertwang-ibm added a commit that referenced this issue Jun 10, 2020
issue #1198 performance enhancement for patient/group export