Wrong amount of docs for test corpora when creating a track with create-track #1317

Closed
dliappis opened this issue Aug 31, 2021 · 0 comments · Fixed by #1318

As reported in https://discuss.elastic.co/t/why-rally-extract-1001-docs-in-test-mode/282836/3, when we create a track using the create-track subcommand and the included indices contain more than 1000 docs, the -1k corpora (used by --test-mode) contain 1001 docs instead of 1000.

Replication:

  1. Install Elasticsearch

    INST_ID=$(esrally install --quiet --distribution-version=7.14.0 --car=4gheap,basic-license --network-host=127.0.0.1 --http-port=9200 --node-name=es01 --master-nodes=es01 --seed-hosts=127.0.0.1 --runtime-jdk=bundled | jq --raw-output '.["installation-id"]')

  2. Start Elasticsearch

    esrally start --installation-id=$INST_ID --race-id=test1234 --runtime-jdk=bundled

  3. Load some data

    esrally race --pipeline=benchmark-only --target-hosts=127.0.0.1:9200 --track=geonames --challenge=append-no-conflicts-index-only --track-params="ingest_percentage:0.05" --on-error=abort --include-tasks="delete-index,create-index,check-cluster-health,index-append"

  4. Create a new track:

    $ esrally create-track --track=mynewtrack --indices=geonames --target-hosts=127.0.0.1:9200
    
    [INFO] Connected to Elasticsearch cluster [es01] version [7.14.0].
    
    Extracting documents for index [geonames] for test mode...    1001/1000 docs [100.1% done]
    Extracting documents for index [geonames]...                40000/40000 docs [100.0% done]
    
    [INFO] Track mynewtrack has been created. Run it with: esrally --track-path=/.../tracks/mynewtrack
    
  5. Observe the output above showing 1001/1000; this can also be verified using:

    $ wc -l tracks/mynewtrack/geonames-documents-1k.json
    1001 tracks/mynewtrack/geonames-documents-1k.json
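
The same check can be scripted. The following is a minimal sketch, assuming the corpus file holds one JSON document per line (which is what the `wc -l` check above relies on); the helper name `count_docs` is hypothetical and not part of Rally.

```python
from pathlib import Path


def count_docs(corpus_path):
    """Count documents in a corpus file, assuming one JSON document per line."""
    with Path(corpus_path).open("rt", encoding="utf-8") as f:
        # Ignore blank lines so a trailing newline does not inflate the count.
        return sum(1 for line in f if line.strip())


if __name__ == "__main__":
    # Path taken from the report above; adjust to your own track directory.
    path = Path("tracks/mynewtrack/geonames-documents-1k.json")
    if path.exists():
        print(f"{count_docs(path)} docs in {path}")
```

For a correctly generated test-mode corpus, the reported count should be 1000 rather than 1001.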
    

This is due to an off-by-one error in the following check:

    if n > total_docs:

I will raise a PR fixing this shortly.
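
To illustrate why a `>` comparison yields one extra document, here is a minimal sketch. It assumes the extraction loop counts documents with a zero-based `enumerate` index; the function names `extract_buggy` and `extract_fixed` are hypothetical and this is not Rally's actual code. Using `>=` is shown as one possible correction, not necessarily the one adopted in the fix.

```python
def extract_buggy(docs, total_docs):
    """Collect documents, stopping with a `>` check on a zero-based counter."""
    out = []
    for n, doc in enumerate(docs):
        # n starts at 0, so this check first fires at n == total_docs + 1,
        # after total_docs + 1 documents have already been collected.
        if n > total_docs:
            break
        out.append(doc)
    return out


def extract_fixed(docs, total_docs):
    """Same loop, but stop as soon as total_docs documents were collected."""
    out = []
    for n, doc in enumerate(docs):
        # With a zero-based counter, n == total_docs means that exactly
        # total_docs documents (indices 0..total_docs-1) are already in `out`.
        if n >= total_docs:
            break
        out.append(doc)
    return out


if __name__ == "__main__":
    source = range(40000)  # stand-in for the 40000 docs in the index
    print(len(extract_buggy(source, 1000)))  # 1001
    print(len(extract_fixed(source, 1000)))  # 1000
```

Under these assumptions the buggy variant reproduces the 1001/1000 count from the output above, while indices with fewer documents than the limit are unaffected by either comparison.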

@dliappis dliappis added the bug Something's wrong label Aug 31, 2021
@dliappis dliappis added this to the 2.3.0 milestone Aug 31, 2021
@dliappis dliappis self-assigned this Aug 31, 2021
dliappis added a commit to dliappis/rally that referenced this issue Aug 31, 2021
Fix number of docs in -1k file generated by the create-track
subcommand.

Closes elastic#1317
dliappis added a commit that referenced this issue Sep 1, 2021
This commit fixes a bug that caused 1001 docs to be generated in the test-mode-specific -1k corpus files
created by the create-track subcommand (due to an off-by-one error).

Closes #1317