Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimize source filtering in SourceFieldMapper #81970

Merged
merged 13 commits into from
Jan 12, 2022

Conversation

mushao999
Copy link
Contributor

@mushao999 mushao999 commented Dec 21, 2021

@nik9000 and @weizijun did a great job in introducing a memory efficient xcontent filtering approach and benchmarked its peformance (#77154 #81575). From their contribution, source filtering will be faster significantly.
This PR apply this approach into SourceFilterMapper where map filtering is used to include or exclude fields.
We did a benchmark using esrally, and the details list as follow:

  • Cluster: 3 nodes 12Core 24G(Heap size)
  • Rally version: 2.3.0, param:--track=nyc_taxis --challenge=append-no-conflicts-index-only --client-options="timeout:60,basic_auth_user:'elastic',basic_auth_password:'Admin@gal'" --track-params='{"bulk_size": 1000, "index_settings": {"index.number_of_replicas": "1", "index.number_of_shards": "6"}, "bulk_indexing_clients": 10}'
  • Modified nyc-taxis's index.json, add "exludes": ["surcharge", "pickup_location"]
  • Result:
    • ParseFilter indexing Mean Throughput 59029.2 docs/s, 10% more than that of MapFilter which is 54825.9docs/s
    • ParseFilter indexing 50th percentile latency 144.171ms almost the same with that of MapFilter which is 145.753ms
    • ParseFilter indexing 99th percentile latency 1106.4ms, 25% faster then that of MapFilter which is 1464.82ms

So we believe indexing operation with source filtering will be more memory efficient after this PR being applied.

@elasticsearchmachine elasticsearchmachine added v8.1.0 external-contributor Pull request authored by a developer outside the Elasticsearch team labels Dec 21, 2021
@mushao999 mushao999 changed the title optimize source filtering in SourceFieldMapper using xcontent filteri… optimize source filtering in SourceFieldMapper Dec 21, 2021
@mushao999
Copy link
Contributor Author

Rally Result of MapFilter

------------------------------------------------------
    _______             __   _____
   / ____(_)___  ____ _/ /  / ___/_________  ________
  / /_  / / __ \/ __ `/ /   \__ \/ ___/ __ \/ ___/ _ \
 / __/ / / / / / /_/ / /   ___/ / /__/ /_/ / /  /  __/
/_/   /_/_/ /_/\__,_/_/   /____/\___/\____/_/   \___/
------------------------------------------------------

|                                                         Metric |                 Task |       Value |   Unit |
|---------------------------------------------------------------:|---------------------:|------------:|-------:|
|                     Cumulative indexing time of primary shards |                      |     202.422 |    min |
|             Min cumulative indexing time across primary shards |                      |     202.422 |    min |
|          Median cumulative indexing time across primary shards |                      |     202.422 |    min |
|             Max cumulative indexing time across primary shards |                      |     202.422 |    min |
|            Cumulative indexing throttle time of primary shards |                      |           0 |    min |
|    Min cumulative indexing throttle time across primary shards |                      |           0 |    min |
| Median cumulative indexing throttle time across primary shards |                      |           0 |    min |
|    Max cumulative indexing throttle time across primary shards |                      |           0 |    min |
|                        Cumulative merge time of primary shards |                      |     109.501 |    min |
|                       Cumulative merge count of primary shards |                      |         388 |        |
|                Min cumulative merge time across primary shards |                      |     109.501 |    min |
|             Median cumulative merge time across primary shards |                      |     109.501 |    min |
|                Max cumulative merge time across primary shards |                      |     109.501 |    min |
|               Cumulative merge throttle time of primary shards |                      |     39.6005 |    min |
|       Min cumulative merge throttle time across primary shards |                      |     39.6005 |    min |
|    Median cumulative merge throttle time across primary shards |                      |     39.6005 |    min |
|       Max cumulative merge throttle time across primary shards |                      |     39.6005 |    min |
|                      Cumulative refresh time of primary shards |                      |     2.21285 |    min |
|                     Cumulative refresh count of primary shards |                      |         221 |        |
|              Min cumulative refresh time across primary shards |                      |     2.21285 |    min |
|           Median cumulative refresh time across primary shards |                      |     2.21285 |    min |
|              Max cumulative refresh time across primary shards |                      |     2.21285 |    min |
|                        Cumulative flush time of primary shards |                      |     11.7106 |    min |
|                       Cumulative flush count of primary shards |                      |         187 |        |
|                Min cumulative flush time across primary shards |                      |     11.7106 |    min |
|             Median cumulative flush time across primary shards |                      |     11.7106 |    min |
|                Max cumulative flush time across primary shards |                      |     11.7106 |    min |
|                                        Total Young Gen GC time |                      |         9.4 |      s |
|                                       Total Young Gen GC count |                      |         657 |        |
|                                          Total Old Gen GC time |                      |           0 |      s |
|                                         Total Old Gen GC count |                      |           0 |        |
|                                                     Store size |                      |      81.889 |     GB |
|                                                  Translog size |                      | 1.02445e-07 |     GB |
|                                         Heap used for segments |                      |           0 |     MB |
|                                       Heap used for doc values |                      |           0 |     MB |
|                                            Heap used for terms |                      |           0 |     MB |
|                                            Heap used for norms |                      |           0 |     MB |
|                                           Heap used for points |                      |           0 |     MB |
|                                    Heap used for stored fields |                      |           0 |     MB |
|                                                  Segment count |                      |          40 |        |
|                                       100th percentile latency |         create-index |     15.1375 |     ms |
|                                  100th percentile service time |         create-index |     15.1375 |     ms |
|                                                     error rate |         create-index |         100 |      % |
|                                       100th percentile latency | check-cluster-health |     30003.5 |     ms |
|                                  100th percentile service time | check-cluster-health |     30003.5 |     ms |
|                                                     error rate | check-cluster-health |         100 |      % |
|                                                 Min Throughput |                index |     52240.4 | docs/s |
|                                                Mean Throughput |                index |     54825.9 | docs/s |
|                                              Median Throughput |                index |     54327.3 | docs/s |
|                                                 Max Throughput |                index |     59199.1 | docs/s |
|                                        50th percentile latency |                index |     145.753 |     ms |
|                                        90th percentile latency |                index |     186.755 |     ms |
|                                        99th percentile latency |                index |     1464.82 |     ms |
|                                      99.9th percentile latency |                index |     3642.92 |     ms |
|                                     99.99th percentile latency |                index |     4719.31 |     ms |
|                                       100th percentile latency |                index |     10071.9 |     ms |
|                                   50th percentile service time |                index |     145.753 |     ms |
|                                   90th percentile service time |                index |     186.755 |     ms |
|                                   99th percentile service time |                index |     1464.82 |     ms |
|                                 99.9th percentile service time |                index |     3642.92 |     ms |
|                                99.99th percentile service time |                index |     4719.31 |     ms |
|                                  100th percentile service time |                index |     10071.9 |     ms |
|                                                     error rate |                index |           0 |      % |

Rally Result of ParserFilter

------------------------------------------------------
    _______             __   _____
   / ____(_)___  ____ _/ /  / ___/_________  ________
  / /_  / / __ \/ __ `/ /   \__ \/ ___/ __ \/ ___/ _ \
 / __/ / / / / / /_/ / /   ___/ / /__/ /_/ / /  /  __/
/_/   /_/_/ /_/\__,_/_/   /____/\___/\____/_/   \___/
------------------------------------------------------

|                                                         Metric |                 Task |       Value |   Unit |
|---------------------------------------------------------------:|---------------------:|------------:|-------:|
|                     Cumulative indexing time of primary shards |                      |     204.356 |    min |
|             Min cumulative indexing time across primary shards |                      |     204.356 |    min |
|          Median cumulative indexing time across primary shards |                      |     204.356 |    min |
|             Max cumulative indexing time across primary shards |                      |     204.356 |    min |
|            Cumulative indexing throttle time of primary shards |                      |           0 |    min |
|    Min cumulative indexing throttle time across primary shards |                      |           0 |    min |
| Median cumulative indexing throttle time across primary shards |                      |           0 |    min |
|    Max cumulative indexing throttle time across primary shards |                      |           0 |    min |
|                        Cumulative merge time of primary shards |                      |     106.225 |    min |
|                       Cumulative merge count of primary shards |                      |         397 |        |
|                Min cumulative merge time across primary shards |                      |     106.225 |    min |
|             Median cumulative merge time across primary shards |                      |     106.225 |    min |
|                Max cumulative merge time across primary shards |                      |     106.225 |    min |
|               Cumulative merge throttle time of primary shards |                      |     34.2447 |    min |
|       Min cumulative merge throttle time across primary shards |                      |     34.2447 |    min |
|    Median cumulative merge throttle time across primary shards |                      |     34.2447 |    min |
|       Max cumulative merge throttle time across primary shards |                      |     34.2447 |    min |
|                      Cumulative refresh time of primary shards |                      |     2.14308 |    min |
|                     Cumulative refresh count of primary shards |                      |         221 |        |
|              Min cumulative refresh time across primary shards |                      |     2.14308 |    min |
|           Median cumulative refresh time across primary shards |                      |     2.14308 |    min |
|              Max cumulative refresh time across primary shards |                      |     2.14308 |    min |
|                        Cumulative flush time of primary shards |                      |     11.0575 |    min |
|                       Cumulative flush count of primary shards |                      |         188 |        |
|                Min cumulative flush time across primary shards |                      |     11.0575 |    min |
|             Median cumulative flush time across primary shards |                      |     11.0575 |    min |
|                Max cumulative flush time across primary shards |                      |     11.0575 |    min |
|                                        Total Young Gen GC time |                      |       9.416 |      s |
|                                       Total Young Gen GC count |                      |         649 |        |
|                                          Total Old Gen GC time |                      |           0 |      s |
|                                         Total Old Gen GC count |                      |           0 |        |
|                                                     Store size |                      |     83.4966 |     GB |
|                                                  Translog size |                      | 1.02445e-07 |     GB |
|                                         Heap used for segments |                      |           0 |     MB |
|                                       Heap used for doc values |                      |           0 |     MB |

▽
|                                            Heap used for terms |                      |           0 |     MB |
|                                            Heap used for norms |                      |           0 |     MB |
|                                           Heap used for points |                      |           0 |     MB |
|                                    Heap used for stored fields |                      |           0 |     MB |
|                                                  Segment count |                      |          43 |        |
|                                       100th percentile latency |         create-index |     10.2571 |     ms |
|                                  100th percentile service time |         create-index |     10.2571 |     ms |
|                                                     error rate |         create-index |         100 |      % |
|                                       100th percentile latency | check-cluster-health |     30003.9 |     ms |
|                                  100th percentile service time | check-cluster-health |     30003.9 |     ms |
|                                                     error rate | check-cluster-health |         100 |      % |
|                                                 Min Throughput |                index |     56525.2 | docs/s |
|                                                Mean Throughput |                index |     58761.2 | docs/s |
|                                              Median Throughput |                index |     58217.2 | docs/s |
|                                                 Max Throughput |                index |       62062 | docs/s |
|                                        50th percentile latency |                index |     144.171 |     ms |
|                                        90th percentile latency |                index |     182.389 |     ms |
|                                        99th percentile latency |                index |      1106.4 |     ms |
|                                      99.9th percentile latency |                index |     2990.65 |     ms |
|                                     99.99th percentile latency |                index |      3797.9 |     ms |
|                                       100th percentile latency |                index |     4727.99 |     ms |
|                                   50th percentile service time |                index |     144.171 |     ms |
|                                   90th percentile service time |                index |     182.389 |     ms |
|                                   99th percentile service time |                index |      1106.4 |     ms |
|                                 99.9th percentile service time |                index |     2990.65 |     ms |
|                                99.99th percentile service time |                index |      3797.9 |     ms |
|                                  100th percentile service time |                index |     4727.99 |     ms |
|                                                     error rate |                index |           0 |      % |

@astefan astefan added the :Search Foundations/Mapping Index mappings, including merging and defining field types label Dec 22, 2021
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Dec 22, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

Copy link
Contributor

@romseygeek romseygeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great @mushao999, thanks for opening. I'd like @nik9000 to have a look as well before we merge it but LGTM in general. I left a couple of suggestions.

@nik9000
Copy link
Member

nik9000 commented Jan 5, 2022

I'd like @nik9000 to have a look as well before we merge it but LGTM in general. I left a couple of suggestions.

Ah! @romseygeek this is the one I'd hope you'd review. And you already did. Cool. I'll take a look too.

@nik9000
Copy link
Member

nik9000 commented Jan 5, 2022

I'm a little surprised there isn't less GC. I mean, there is a little less GC, but I'd hoped for better. I'll bet the GC from the rest of the indexing process is just swamping the gains.

@nik9000
Copy link
Member

nik9000 commented Jan 5, 2022

I wonder if it's worth moving your change into XContentMapValues.filter(includes, excludes) so we get it everywhere. Presumably all the places that filter bytes should use this method if they can.

@nik9000
Copy link
Member

nik9000 commented Jan 5, 2022

@elasticmachine test this please

@mushao999
Copy link
Contributor Author

I wonder if it's worth moving your change into XContentMapValues.filter(includes, excludes) so we get it everywhere. Presumably all the places that filter bytes should use this method if they can.

Applying the new filtering mechanism throughout all the filtering cases is a good idea. But I do not think XContentMapValues.filter(includes, excludes) is a good place, we will have to convert source to map before filtering which is costly and unnecessary for the new mechanism.

@romseygeek
Copy link
Contributor

we will have to convert source to map before filtering which is costly and unnecessary for the new mechanism.

This is a good point! Can we instead create a new interface XContentFieldFilter that extends CheckedBiFunction<BytesReference, XContentType, BytesReference, IOException>, and add a static method newFilter(String[] includes, String[] excludes) that does what you've added to SourceFieldMapper? And we can then use this in other places as well - from a first look we can use it in FetchSourcePhase and ShardGetService at least. If it looks as though converting the other locations will be too much for one PR then we can break things up a bit, but ShardGetService should be simple and I think it would be helpful to have this in two places so we can check the API makes sense generally.

@mushao999
Copy link
Contributor Author

mushao999 commented Jan 11, 2022

we will have to convert source to map before filtering which is costly and unnecessary for the new mechanism.

This is a good point! Can we instead create a new interface XContentFieldFilter that extends CheckedBiFunction<BytesReference, XContentType, BytesReference, IOException>, and add a static method newFilter(String[] includes, String[] excludes) that does what you've added to SourceFieldMapper? And we can then use this in other places as well - from a first look we can use it in FetchSourcePhase and ShardGetService at least. If it looks as though converting the other locations will be too much for one PR then we can break things up a bit, but ShardGetService should be simple and I think it would be helpful to have this in two places so we can check the API makes sense generally.

Thanks, this is a great suggestion. I'v added two commits.

  • First one introduced XContentFieldFilter and added newFieldFilter method in XContentHelper
  • Second one use the XContentFieldFIlter in ShardGetService.

By the way, I think XContentFieldFilter can be a member in FetchSourceContext which can avoid constructing filter for every docs, in a new PR maybe.

Copy link
Contributor

@romseygeek romseygeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a couple of comments but this looks good overall. Thanks for being patient with all of our suggestions!

@romseygeek
Copy link
Contributor

@elasticmachine test this please

@romseygeek
Copy link
Contributor

The build is failing on checkstyle - can you also run ./gradlew precommit and fix up any styling issues that it finds?

@mushao999
Copy link
Contributor Author

mushao999 commented Jan 11, 2022

@romseygeek Thanks for you suggestion. The newFilter method has been moved to XContentFieldFilter. Test and checkStyle problems have also been fixed. Please help to invoke the test again.

@romseygeek
Copy link
Contributor

@elasticmachine test this please

@mushao999
Copy link
Contributor Author

@romseygeek There are still two test failed , I will try to fix them tomorrow. Thanks again.

@mushao999
Copy link
Contributor Author

@romseygeek Sorry for the test failure. Tests failed due to null value of xContentType in XContentFieldFilter#apply. I've added the null value handling in this method and run tests locally for many times.

@romseygeek
Copy link
Contributor

@elasticmachine test this please

Copy link
Contributor

@romseygeek romseygeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for your patience on this

@romseygeek romseygeek merged commit 4560a0c into elastic:master Jan 12, 2022
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Jan 20, 2022
This reverts commit 4560a0c. It
unexpectedly caused elastic#82891.
elasticsearchmachine pushed a commit that referenced this pull request Jan 20, 2022
This reverts commit 4560a0c. It
unexpectedly caused #82891.
mushao999 added a commit to mushao999/elasticsearch that referenced this pull request Jan 26, 2022
This commit introduces XContentFieldFilter, which applies field includes/excludes to 
XContent without having to realise the xcontent itself as a java map.  SourceFieldMapper
and ShardGetService are cut over to use this class
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement external-contributor Pull request authored by a developer outside the Elasticsearch team :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Meta label for search team v8.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants