
Improve performance of type map creation #376

Merged Jul 29, 2021 (3 commits)

Commits on Apr 7, 2021

  1. Do not query for unused data in BasicTypeMapping

    Usage of the range data was previously commented-out but was removed
    entirely in 365008e.
    amarshall committed Apr 7, 2021
    ed95202
  2. Avoid materializing Result multiple times in type map

    Testing methodology:
    
    - 189k rows returned from pg_type query in `build_coder_maps`.
    - Timing calling `PG::BasicTypeMapForQueries.new` only, figures are
      average of 16 iterations.
    
                     runtime        objects        allocations
    Baseline :  2,027±350 ms          909 k            609 MiB
    Optimized:    874± 70 ms (-56%)   113 k (-88%)     113 MiB (-79%)
    
    Unfortunately, the SQL query itself cannot easily be removed from the
    measurement, so there is a fair bit of variance in the runtimes. The
    results still show a substantial improvement even in the worst case.
    With more time I would time the query separately and exclude it, but
    that requires a bit more hacking than I would prefer and, as said, the
    improvement stands either way.
    amarshall committed Apr 7, 2021
    4f38d28
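
The testing methodology in the commit above (average of 16 iterations, reporting runtime and object counts) could be reproduced with a small harness along these lines. This is only a sketch: `measure` and `mean_and_stddev` are hypothetical helper names, the ± figure is assumed to be a sample standard deviation (the commit message does not say), and allocated bytes are not measured here. The timed block is a placeholder workload; in a real run it would be the call to `PG::BasicTypeMapForQueries.new`.

```ruby
# Hypothetical helper: mean and sample standard deviation of a list of numbers.
def mean_and_stddev(samples)
  mean = samples.sum.to_f / samples.size
  return [mean, 0.0] if samples.size < 2

  variance = samples.sum { |s| (s - mean)**2 } / (samples.size - 1)
  [mean, Math.sqrt(variance)]
end

# Hypothetical helper: runs the block `iterations` times and collects
# wall-clock time (ms) and the number of objects allocated per run.
def measure(iterations:)
  times = []
  objects = []
  iterations.times do
    GC.start # start each run from a similar heap state
    objects_before = GC.stat(:total_allocated_objects)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC, :float_millisecond)
    yield
    finished = Process.clock_gettime(Process::CLOCK_MONOTONIC, :float_millisecond)
    times << (finished - started)
    objects << (GC.stat(:total_allocated_objects) - objects_before)
  end
  { runtime_ms: mean_and_stddev(times), objects: mean_and_stddev(objects) }
end

# Placeholder workload standing in for the type map construction.
stats = measure(iterations: 16) { Array.new(10_000) { |i| i.to_s } }
printf("runtime: %.1f±%.1f ms, objects: %.0f±%.0f\n",
       *stats[:runtime_ms], *stats[:objects])
```

Because the SQL query runs inside the timed block in this setup, query jitter ends up in the ± figure, which matches the variance caveat in the commit message.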

Commits on Apr 8, 2021

  1. Permit building coder maps ahead-of-time of TypeMap init

    Building coder maps can be quite expensive. Building them ahead of time
    allows them to be reused across multiple type map creations, e.g. when
    creating both BasicTypeMapForQueries and BasicTypeMapForResults the cost
    need only be paid once.
    
    Example usage:
    
        conn = PG.connect(…)
        coder_maps = BasicTypeRegistry.build_coder_maps(conn)
        tmq = BasicTypeMapForQueries.new(conn, coder_maps: coder_maps)
        tmr = BasicTypeMapForResults.new(conn, coder_maps: coder_maps)
    
    Testing methodology:
    
    - 189k rows returned from pg_type query in `build_coder_maps`.
    - Cached and injected results from query to remove query cost and
      jitter from baseline.
    - Timing calling `PG::BasicTypeMapForQueries.new` only, figures are
      average of 32 iterations.
    - “Cached” calls `build_coder_maps` ahead of time, outside the timing
      loop, and passes the result in.
    
              runtime
    Baseline:  374±16 ms
    Cached  :   <1± 0 ms (-99%)
    
    This confirms that, ignoring the SQL query and result materialization,
    the majority of the cost of creating a new type map lies within
    `build_coder_maps`. As such, for *n* type maps, the number of (sometimes
    expensive) `build_coder_maps` calls drops from O(n) to O(1).
    amarshall committed Apr 8, 2021
    4ac7211
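
The effect described in the commit above, paying for `build_coder_maps` once rather than once per type map, can be illustrated without a database. The sketch below uses stand-ins only: `FakeRegistry` and `FakeTypeMap` are hypothetical classes, not part of the pg gem, and the constructor's fallback behavior (building the maps itself when none are passed) is an assumption about how an optional `coder_maps:` keyword would typically work.

```ruby
# Hypothetical stand-in for the expensive step; counts how often it runs.
module FakeRegistry
  @builds = 0

  class << self
    attr_reader :builds

    def build_coder_maps(_conn)
      @builds += 1
      { encoders: { "int4" => :int_encoder },
        decoders: { "int4" => :int_decoder } }.freeze
    end
  end
end

# Hypothetical stand-in for a type map class taking an optional keyword.
class FakeTypeMap
  attr_reader :coder_maps

  # Assumption: reuse prebuilt maps when given, otherwise build them here.
  def initialize(conn, coder_maps: nil)
    @coder_maps = coder_maps || FakeRegistry.build_coder_maps(conn)
  end
end

conn = Object.new # placeholder; no real connection is needed for the sketch

# Without sharing: every type map triggers its own build.
FakeTypeMap.new(conn)
FakeTypeMap.new(conn)
uncached_builds = FakeRegistry.builds

# With sharing: one build up front, reused by both type maps.
shared = FakeRegistry.build_coder_maps(conn)
queries_map = FakeTypeMap.new(conn, coder_maps: shared)
results_map = FakeTypeMap.new(conn, coder_maps: shared)
cached_builds = FakeRegistry.builds - uncached_builds

puts "builds without sharing: #{uncached_builds}, with sharing: #{cached_builds}"
```

Freezing the shared maps is a design choice of this sketch: once one object is handed to several type maps, accidental mutation through one of them would affect all of them.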