Fix and improve Vamana and IVF PQ parameters #9

Merged 1 commit on Aug 14, 2024

jparismorgan

What

This PR:

  • Fixes Vamana, which was not passing l_search to query() in batch mode.
  • Keeps l_build constant for Vamana, because varying it does not improve benchmark performance, and varies only r_max_degree.
  • Parameterizes num_subspaces for IVF PQ, which improves benchmark performance.
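
The Vamana changes above can be sketched as follows. This is a minimal illustration and not the wrapper code from this PR: `StubVamanaIndex` and `VamanaWrapper` are hypothetical names, and the stub only records the keyword arguments it receives. It assumes the wrapper stores l_search through a `set_query_arguments` hook and that the batch path previously did not forward it.

```python
from typing import Any, Dict, List, Optional


class StubVamanaIndex:
    """Hypothetical stand-in for a Vamana index; records the kwargs it receives."""

    def __init__(self) -> None:
        self.last_kwargs: Dict[str, Any] = {}

    def query(self, queries: List[List[float]], k: int, **kwargs: Any) -> List[List[int]]:
        self.last_kwargs = dict(kwargs)
        # Return k dummy neighbor ids per query vector.
        return [list(range(k)) for _ in queries]


class VamanaWrapper:
    """Hypothetical benchmark wrapper: l_build is constant, r_max_degree is the build argument."""

    L_BUILD = 60  # the l_build value noted for tiledb-vamana-5 in the testing config

    def __init__(self, r_max_degree: int) -> None:
        self.l_build = self.L_BUILD
        self.r_max_degree = r_max_degree
        self.l_search: Optional[int] = None
        self.index = StubVamanaIndex()
        self.res: List[List[int]] = []

    def set_query_arguments(self, l_search: int) -> None:
        self.l_search = l_search

    def query(self, v: List[float], n: int) -> List[int]:
        return self.index.query([v], k=n, l_search=self.l_search)[0]

    def batch_query(self, X: List[List[float]], n: int) -> None:
        # The kind of fix described above: forward l_search on the batch path too.
        self.res = self.index.query(X, k=n, l_search=self.l_search)


wrapper = VamanaWrapper(r_max_degree=40)
wrapper.set_query_arguments(l_search=70)
wrapper.batch_query([[0.0] * 128, [1.0] * 128], n=10)
print(wrapper.index.last_kwargs)  # {'l_search': 70}
```

If the batch path omits the `l_search` keyword, every value in `query_args` produces the same search behavior, so the benchmark curve for batch mode would not reflect the parameter sweep.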

Testing

I used the testing config below to compare several variations. This PR checks in the two best: tiledb-vamana-5 and tiledb-ivf-pq-5.

sift-128-euclidean_10_euclidean-batch

Testing config

float:
  euclidean:
  - base_args: ['@metric']
    constructor: TileDBIVFFlat
    disabled: false
    docker_tag: ann-benchmarks-tiledb
    module: ann_benchmarks.algorithms.tiledb
    name: tiledb-ivf-flat
    run_groups:
      IVFFLAT:
        # n_list:
        args: [[512, 1024, 2048, 4096, 8192]]
        # n_probe:
        query_args: [[1, 5, 10, 50, 100, 200]]

  - base_args: ['@metric']
    constructor: TileDBFlat
    disabled: false
    docker_tag: ann-benchmarks-tiledb
    module: ann_benchmarks.algorithms.tiledb
    name: tiledb-flat
    run_groups:
      FLAT:
        args:
            placeholder: [0]

  - base_args: ['@metric']
    constructor: TileDBVamana
    disabled: false
    docker_tag: ann-benchmarks-tiledb
    module: ann_benchmarks.algorithms.tiledb
    name: tiledb-vamana
    run_groups:
      VAMANA:
        # l_build & r_max_degree:
        args: [[20, 40, 60]]
        # l_search:
        query_args: [[1, 5, 10, 30, 50, 70, 90, 110, 130]]

  - base_args: ['@metric']
    constructor: TileDBVamana
    disabled: false
    docker_tag: ann-benchmarks-tiledb
    module: ann_benchmarks.algorithms.tiledb
    name: tiledb-vamana-2
    run_groups:
      VAMANA:
        # l_build & r_max_degree:
        args: [[40, 60]]
        # l_search:
        query_args: [[1, 5, 10, 30, 50, 70, 90, 110, 130]]

  - base_args: ['@metric']
    constructor: TileDBVamana
    disabled: false
    docker_tag: ann-benchmarks-tiledb
    module: ann_benchmarks.algorithms.tiledb
    name: tiledb-vamana-3
    # super().__init__(
    #         index_type="VAMANA",
    #         metric=metric,
    #         l_build=60,
    #         r_max_degree=l_build_and_r_max_degree
    #     )
    run_groups:
      VAMANA:
        # r_max_degree:
        args: [[10, 15, 20, 25, 30, 35, 40]]
        # l_search:
        query_args: [[1, 5, 10, 30, 50, 70, 90, 110, 130]]

  - base_args: ['@metric']
    constructor: TileDBVamana
    disabled: false
    docker_tag: ann-benchmarks-tiledb
    module: ann_benchmarks.algorithms.tiledb
    name: tiledb-vamana-4
    # super().__init__(
    #         index_type="VAMANA",
    #         metric=metric,
    #         l_build=100,
    #         r_max_degree=l_build_and_r_max_degree
    #     )
    run_groups:
      VAMANA:
        # r_max_degree:
        args: [[10, 15, 20, 25, 30, 35, 40]]
        # l_search:
        query_args: [[1, 5, 10, 30, 50, 70, 90, 110, 130]]
  
  - base_args: ['@metric']
    constructor: TileDBVamana
    disabled: false
    docker_tag: ann-benchmarks-tiledb
    module: ann_benchmarks.algorithms.tiledb
    name: tiledb-vamana-5
    # super().__init__(
    #         index_type="VAMANA",
    #         metric=metric,
    #         l_build=60,
    #         r_max_degree=l_build_and_r_max_degree
    #     )
    # but with the l_search bug fixed!
    run_groups:
      VAMANA:
        # r_max_degree:
        args: [[10, 15, 20, 25, 30, 35, 40]]
        # l_search:
        query_args: [[1, 5, 10, 30, 50, 70, 90, 110, 130]]

  - base_args: ['@metric']
    constructor: TileDBVamana
    disabled: false
    docker_tag: ann-benchmarks-tiledb
    module: ann_benchmarks.algorithms.tiledb
    name: tiledb-vamana-6
    # super().__init__(
    #         index_type="VAMANA",
    #         metric=metric,
    #         l_build=50,
    #         r_max_degree=l_build_and_r_max_degree
    #     )
    # but with the l_search bug fixed!
    run_groups:
      VAMANA:
        # r_max_degree:
        args: [[10, 15, 20, 25, 30, 35, 40]]
        # l_search:
        query_args: [[1, 5, 10, 30, 50, 70, 90, 110, 130]]
  
  - base_args: ['@metric']
    constructor: TileDBVamana
    disabled: false
    docker_tag: ann-benchmarks-tiledb
    module: ann_benchmarks.algorithms.tiledb
    name: tiledb-vamana-7
    # super().__init__(
    #         index_type="VAMANA",
    #         metric=metric,
    #         l_build=75,
    #         r_max_degree=l_build_and_r_max_degree
    #     )
    # but with the l_search bug fixed!
    run_groups:
      VAMANA:
        # r_max_degree:
        args: [[10, 15, 20, 25, 30, 35, 40]]
        # l_search:
        query_args: [[1, 5, 10, 30, 50, 70, 90, 110, 130]]

  - base_args: ['@metric']
    constructor: TileDBVamana
    disabled: false
    docker_tag: ann-benchmarks-tiledb
    module: ann_benchmarks.algorithms.tiledb
    name: tiledb-vamana-8
    # super().__init__(
    #         index_type="VAMANA",
    #         metric=metric,
    #         l_build=60,
    #         r_max_degree=l_build_and_r_max_degree
    #     )
    # but with the l_search bug fixed!
    run_groups:
      VAMANA:
        # r_max_degree:
        args: [[20, 40, 60, 80]]
        # l_search:
        query_args: [[1, 5, 10, 30, 50, 70, 90, 110, 130]]
  
  - base_args: ['@metric']
    constructor: TileDBIVFPQ
    disabled: false
    docker_tag: ann-benchmarks-tiledb
    module: ann_benchmarks.algorithms.tiledb
    name: tiledb-ivf-pq
    run_groups:
      IVFPQ:
        # n_list:
        args: [[512, 1024, 2048, 4096, 8192]]
        # n_probe:
        query_args: [[1, 5, 10, 50, 100, 200]]

  - base_args: ['@metric']
    constructor: TileDBIVFPQ
    disabled: false
    docker_tag: ann-benchmarks-tiledb
    module: ann_benchmarks.algorithms.tiledb
    name: tiledb-ivf-pq-2
    # num_subspaces=dimensions/4
    run_groups:
      IVFPQ:
        # n_list:
        args: [[512, 1024, 2048, 4096, 8192]]
        # n_probe:
        query_args: [[1, 5, 10, 50, 100, 200]]
  
  - base_args: ['@metric']
    constructor: TileDBIVFPQ
    disabled: false
    docker_tag: ann-benchmarks-tiledb
    module: ann_benchmarks.algorithms.tiledb
    name: tiledb-ivf-pq-3
    # num_subspaces=dimensions
    run_groups:
      IVFPQ:
        # n_list:
        args: [[512, 1024, 2048, 4096, 8192]]
        # n_probe:
        query_args: [[1, 5, 10, 50, 100, 200]]

  - base_args: ['@metric']
    constructor: TileDBIVFPQ
    disabled: false
    docker_tag: ann-benchmarks-tiledb
    module: ann_benchmarks.algorithms.tiledb
    name: tiledb-ivf-pq-4
    # num_subspaces=dimensions/8
    run_groups:
      IVFPQ:
        # n_list:
        args: [[512, 1024, 2048, 4096, 8192]]
        # n_probe:
        query_args: [[1, 5, 10, 50, 100, 200]]

  - base_args: ['@metric']
    constructor: TileDBIVFPQ
    disabled: false
    docker_tag: ann-benchmarks-tiledb
    module: ann_benchmarks.algorithms.tiledb
    name: tiledb-ivf-pq-5
    run_groups:
      IVFPQ:
        args: [
          # n_list:
          [512, 1024, 2048, 4096, 8192],
          # num_subspaces divisor:
          [1, 2, 4, 8]
        ]
        # n_probe:
        query_args: [[1, 5, 10, 50, 100, 200]]
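
To make the size of the tiledb-ivf-pq-5 sweep concrete, here is a rough sketch of how its two `args` lists could expand. It assumes the lists are combined as a Cartesian product and that num_subspaces is the vector dimension divided by the divisor, as the comments on tiledb-ivf-pq-2 through tiledb-ivf-pq-4 suggest. The dimension of 128 is taken from the dataset name sift-128-euclidean, and the variable names are illustrative only.

```python
from itertools import product

DIMENSIONS = 128  # assumed from the dataset name sift-128-euclidean

n_list_values = [512, 1024, 2048, 4096, 8192]
divisors = [1, 2, 4, 8]
n_probe_values = [1, 5, 10, 50, 100, 200]

# One index build per (n_list, divisor) pair.
build_configs = [
    {"n_list": n_list, "num_subspaces": DIMENSIONS // divisor}
    for n_list, divisor in product(n_list_values, divisors)
]

print(len(build_configs))                                   # 20
print(sorted({c["num_subspaces"] for c in build_configs}))  # [16, 32, 64, 128]
print(len(build_configs) * len(n_probe_values))             # 120
```

Under these assumptions the entry builds 20 indexes and runs 120 build and query combinations, compared with 5 builds and 30 combinations for each of the earlier IVF PQ entries.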


@cainamisir cainamisir left a comment

Looks good 🚀! Thanks!

@jparismorgan jparismorgan merged commit b76d670 into main Aug 14, 2024
34 of 43 checks passed
@jparismorgan jparismorgan deleted the jparismorgan/ann-benchmarks branch August 14, 2024 16:24