Improve performance of type map creation #376
Merged
Please see the individual commits for extensive details. I’m happy to separate this into multiple PRs if some of the work needs more vetting or discussion; each commit can stand on its own, but separating them will require resolving conflicts, as commits 2 and 3 both depend on commit 1.
This is the result of an investigation after encountering performance issues with type map initialization being unexpectedly slow (after implementing a more time-sensitive use case) when the number of rows queried from `pg_type` is quite large (~131k in production). The enclosed performance testing was done in an ideal (and faster) development environment with a similar schema and a slightly larger `pg_type` set (~189k). In our production environment we were seeing type map creation take 4,102±550 ms, and since we create two type maps, that time currently doubles. Needless to say, that is quite slow, and these changes aim to improve it.

Finally, I am curious whether it is possible to reduce the number of rows actually used to generate the coder maps, but I’m not certain whether all of them are truly needed. It appears that every table with an array column gets two rows (one `array_in`, one `record_in`); since we have many tables, this ends up summing to quite a lot.
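For anyone wanting to check how much of their `pg_type` volume comes from these per-table rows, a diagnostic query along these lines (a sketch, not part of this PR; it simply groups catalog rows by their input function) can break the counts down:

```sql
-- Count pg_type rows by their input function. Per-table composite types
-- show up under record_in, and their array companions under array_in;
-- large counts there indicate rows generated per table rather than
-- genuinely distinct scalar types.
SELECT typinput::regproc AS input_fn, count(*) AS n
FROM pg_type
GROUP BY 1
ORDER BY 2 DESC;
```

If `record_in` and `array_in` dominate the output, that supports the idea that most of the ~131k rows stem from table definitions rather than types the coder maps actually need.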