[FEATURE] Add support for large free text workloads #43

Open
kotwanikunal opened this issue Apr 21, 2022 · 7 comments
Labels
enhancement New feature or request

Comments

@kotwanikunal
Member

Is your feature request related to a problem?

  • The current workloads do not include large free-text documents, which are representative of real-world scenarios
  • This issue was highlighted in [BUG] Indexing Performance Degraded in OpenSearch 1.3.+ OpenSearch#2916, where a large drop in indexing performance was not uncovered by the recommended workloads, either during the 1.3.0 launch or in recently run tests for 1.2.4 and 1.3.0 with the same configuration
  • We would like to identify such anomalies to further strengthen our coverage and impact analysis for releases and feature additions to the OpenSearch codebase

What solution would you like?

  • Add new workloads with large free text that are representative of customer datasets

What alternatives have you considered?

  • Running performance tests with multiple existing workloads which have even smaller documents than nyc_taxis

Do you have any additional context?

  • N/A
@anasalkouz
Member

@treddeni-amazon this is really important to support. Performance degradation issues like this one will erode our trust in the performance benchmark results.

@travisbenedict
Contributor

@kotwanikunal would the so workload fit this use case? It has freeform text fields - example doc.

@kotwanikunal
Member Author

> @kotwanikunal would the so workload fit this use case? It has freeform text fields - example doc.

Thanks for the update. I scheduled some tests over the weekend for so and http-logs. We should have the results by tomorrow and can then see whether these tests detect the performance drop mentioned above.

@kotwanikunal
Member Author

The results for so and http-logs on 1.2.4 and 1.3.0 are similar in nature to our original findings: 1.3.0 seems to perform better in general, so these workloads do not surface the regression.
We will need an additional workload with larger free text fields that can detect such performance drops.

image

@CEHENKLE
Member

CEHENKLE commented May 9, 2022

@opensearch-project/benchmark-core Hey folks -- What are your thoughts on this? Is this an improvement we can get added?

@IanHoang
Collaborator

@CEHENKLE @kotwanikunal Thanks for pointing this out. We will generate an additional workload with larger text fields as soon as possible. However, due to a high volume of tasks that we need to attend to, please expect delays.

@ankitkala
Member

ankitkala commented Jul 18, 2022

@CEHENKLE @kotwanikunal Can you also verify whether the regression mentioned above could have been caught with the heap set to 50% instead of the current 1 GB?
I'm asking because Elastic has relied entirely on these datasets to catch regressions. I have nothing against adding new workloads, but we should also improve our existing test setup.
We can also look into more thorough performance testing with different workloads and cluster configurations.


6 participants