Using Elasticsearch 7.17.3, GMS generates a lot of log entries "[ignore_throttled] parameter is deprecated" #9745

githendrik · 2024-01-30T10:45:56Z

Describe the bug

I'm running Elasticsearch 7.17.3, as specified in the DataHub Helm charts. This works, but GMS produces a lot of warnings in the logs:

org.opensearch.client.RestClient:85 - request [POST http://elasticsearch:9200/datahubingestionsourceindex_v2/_search?typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true] returned 1 warnings: [299 Elasticsearch-7.17.3-5ad023604c8d7416c9eb6c0eadb62b14e766caff "[ignore_throttled] parameter is deprecated because frozen indices have been deprecated. Consider cold or frozen tiers in place of frozen indices."]

Apparently the recommended way to remove these warnings is to update the Elasticsearch / OpenSearch Java clients. I tried updating the OpenSearch client to the latest version (2.11.1), but it has breaking changes and DataHub no longer compiles.

It doesn't happen when running the DataHub Docker quickstart, because the ES version in the quickstart Docker Compose files is still 7.10.1.

To Reproduce
Steps to reproduce the behavior:

1. Set DATAHUB_SEARCH_TAG to 7.17.3
2. Start DataHub using Docker quickstart
3. Observe GMS logs

Expected behavior
No deprecation warnings in logs

Additional info
I'm more than happy to contribute a fix. However, I'm missing some context on the implementation and, as stated, the client upgrade unfortunately wasn't "zero-touch".
If someone could give me some pointers, I'd be happy to try to find a fix.

github-actions · 2024-03-01T01:49:48Z

This issue is stale because it has been open for 30 days with no activity. If you believe this is still an issue on the latest DataHub release please leave a comment with the version that you tested it with. If this is a question/discussion please head to https://slack.datahubproject.io. For feature requests please use https://feature-requests.datahubproject.io

Masterchen09 · 2024-03-13T11:42:24Z

In the Elasticsearch repository there is a pull request that prevents the ignore_throttled parameter from being added to the search request when it is set to its default value of true (elastic/elasticsearch#84827). However, DataHub uses the OpenSearch client (see here), and unfortunately that client has no such logic to keep the parameter out of the search request (see here). I am not even sure whether Elasticsearch 7.17.3 can be used with the OpenSearch client, because the OpenSearch client seems to be guaranteed compatible only up to Elasticsearch 7.10.2 (see here). It also seems that frozen tiers are not deprecated in OpenSearch...? Maybe the Helm chart should offer an OpenSearch deployment instead of Elasticsearch if the OpenSearch client is used?
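To illustrate the behaviour described above, here is a minimal sketch of "only send the parameter when it differs from its default". The class `SearchParamsSketch` and the method `buildParams` are hypothetical names made up for this example; this is not the code from elastic/elasticsearch#84827 or from the OpenSearch client, only the general idea under the assumption that the default is true.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not taken from the
// Elasticsearch or OpenSearch client sources.
class SearchParamsSketch {

    // Assumed default of ignore_throttled, as stated in the comment above.
    static final boolean DEFAULT_IGNORE_THROTTLED = true;

    // Builds the query-string parameters for a search request.
    // ignore_throttled is only added when it deviates from the default,
    // so a server that deprecated the parameter has nothing to warn about.
    static Map<String, String> buildParams(boolean ignoreUnavailable, boolean ignoreThrottled) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("ignore_unavailable", Boolean.toString(ignoreUnavailable));
        if (ignoreThrottled != DEFAULT_IGNORE_THROTTLED) {
            params.put("ignore_throttled", Boolean.toString(ignoreThrottled));
        }
        return params;
    }

    public static void main(String[] args) {
        // Default value: the parameter is omitted.
        System.out.println(buildParams(false, true));
        // prints {ignore_unavailable=false}

        // Non-default value: the parameter is sent explicitly.
        System.out.println(buildParams(false, false));
        // prints {ignore_unavailable=false, ignore_throttled=false}
    }
}
```

A client that always serializes the parameter, as in the request URL quoted in the issue (`ignore_throttled=true`), will trigger the deprecation warning on every search even though the value is just the default.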

Nonetheless, a quick, though perhaps not elegant, solution would be to filter out the corresponding log message in the logback.xml files here:

datahub/metadata-jobs/mae-consumer-job/src/main/resources/logback.xml (lines 8 to 10 in b0163c4)
datahub/metadata-jobs/mce-consumer-job/src/main/resources/logback.xml (lines 8 to 10 in b0163c4)
datahub/metadata-service/war/src/main/resources/logback.xml (lines 11 to 13 in b0163c4)
datahub/datahub-upgrade/src/main/resources/logback.xml (lines 11 to 13 in b0163c4)

Each of these files currently contains the same filter:

    <filter class="com.linkedin.metadata.utils.log.LogMessageFilter">
        <excluded>scanned from multiple locations</excluded>
    </filter>

Adding a second <excluded> entry for the deprecation message would suppress it:

    <filter class="com.linkedin.metadata.utils.log.LogMessageFilter">
        <excluded>scanned from multiple locations</excluded>
        <excluded>[ignore_throttled] parameter is deprecated because frozen indices have been deprecated</excluded>
    </filter>
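The implementation of `LogMessageFilter` is not shown in this thread, so the following is only a sketch of how such an exclusion list could behave, assuming each `<excluded>` entry is matched as a plain substring of the log message. `ExclusionSketch` and `shouldDrop` are hypothetical names and do not depend on Logback.

```java
import java.util.List;

// Hypothetical stand-in for the filter's matching logic; the real
// LogMessageFilter may work differently (this is an assumption).
class ExclusionSketch {

    // The two <excluded> entries from the filter configuration above.
    static final List<String> EXCLUDED = List.of(
        "scanned from multiple locations",
        "[ignore_throttled] parameter is deprecated because frozen indices have been deprecated");

    // Returns true when the message contains any excluded phrase.
    // String.contains does a literal match, so the square brackets
    // in "[ignore_throttled]" need no escaping.
    static boolean shouldDrop(String message) {
        if (message == null) {
            return false;
        }
        for (String phrase : EXCLUDED) {
            if (message.contains(phrase)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String warning = "returned 1 warnings: [299 Elasticsearch-7.17.3 \"[ignore_throttled] parameter is "
            + "deprecated because frozen indices have been deprecated. Consider cold or frozen tiers "
            + "in place of frozen indices.\"]";
        System.out.println(shouldDrop(warning));           // prints true
        System.out.println(shouldDrop("Search finished")); // prints false
    }
}
```

Under this assumption the excluded phrase only has to cover a stable part of the message, which is why the entry above stops before "Consider cold or frozen tiers".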

github-actions · 2024-04-14T02:16:35Z

This issue is stale because it has been open for 30 days with no activity. If you believe this is still an issue on the latest DataHub release please leave a comment with the version that you tested it with. If this is a question/discussion please head to https://slack.datahubproject.io. For feature requests please use https://feature-requests.datahubproject.io

github-actions · 2024-05-15T01:50:49Z

This issue was closed because it has been inactive for 30 days since being marked as stale.

githendrik added the bug label Jan 30, 2024

github-actions bot added the stale label Mar 1, 2024

Masterchen09 mentioned this issue Mar 13, 2024: fix: exclude Elasticsearch ignore_throttled warnings from log #10042 (merged)

github-actions bot removed the stale label Mar 14, 2024

github-actions bot added the stale label Apr 14, 2024

github-actions bot closed this as not planned May 15, 2024

henning-gerhardt mentioned this issue Feb 17, 2025: A lot of WARN messages from ElasticSearch usage kitodo/kitodo-production#6424 (open)
