stats and extended_stats aggregation unexpectedly fails with "Cannot format stat [max] with format ..." #113811

Closed
gmalkas opened this issue Sep 30, 2024 · 3 comments · Fixed by #113846
Assignees
Labels
:Analytics/Aggregations Aggregations >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)

Comments


gmalkas commented Sep 30, 2024

Elasticsearch Version

8.15.2

Installed Plugins

No response

Java Version

bundled

OS Version

Linux 5.15.102-1-pve #1 SMP PVE 5.15.102-1 (2023-03-14T13:48Z) x86_64 GNU/Linux

Problem Description

We recently upgraded from 8.12.2 to 8.15.2 and started to receive errors in a query with stats and extended_stats aggregations:

"reason": {
  "type": "illegal_argument_exception",
  "reason": "Cannot format stat [max] with format [DocValueFormat.DateTime(format[epoch_millis] locale[], Z, MILLISECONDS)]",
  "caused_by": {
    "type": "date_time_exception",
    "reason": "Field EpochMillis cannot be printed as the value -9223372036854775808 cannot be negative according to the SignStyle"
  }
}

Yet, a simple max aggregation on the same dataset returns the expected result:

    "aggregations": {
        "recorded_at": {
            "value": 1726341900000.0,
            "value_as_string": "1726341900000"
        }
    }

And a min aggregation shows there is no negative timestamp in this dataset:

"aggregations": {
  "recorded_at": {
    "value": 1726331520000.0,
    "value_as_string": "1726331520000"
  }
}

Here is the relevant part of the index mappings:

{
  "signals": {
    "mappings": {
      "dynamic": "strict",
      "properties": {
        "recorded_at": {
          "type": "date",
          "format": "epoch_millis"
        }
      }
    }
  }
}

I think it might have been introduced by #107678 in 8.14.0.

Steps to Reproduce

$ curl -XPUT -H'content-type: application/json' localhost:9200/myindex -d '{"mappings": { "dynamic": "strict", "properties": { "recorded_at": { "type": "date", "format": "epoch_millis" } } }, "settings": { "number_of_replicas": 0, "number_of_shards": 1 }}'
{"acknowledged":true,"shards_acknowledged":true,"index":"myindex"}
$ curl -XPOST -H'content-type: application/json' localhost:9200/myindex/_bulk --data-binary @/tmp/bulkdata.json
$ curl -XPOST -H'content-type: application/json' localhost:9200/myindex/_search -d '{"size": 0, "aggregations": {"recorded_at": {"stats": {"field": "recorded_at"}}}}'
{"took":0,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":174,"relation":"eq"},"max_score":null,"hits":[]},"aggregations":{"recorded_at":{"count":174,"min":1.72633152E12,"max":1.7263419E12,"avg":1.72633671E12,"sum":3.0038258754E14,"min_as_string":"1726331520000","max_as_string":"1726341900000","avg_as_string":"1726336710000","sum_as_string":"300382587540000"}}}

As you can see, I was not able to reproduce the error with a simplified dataset extracted from production that contains only the recorded_at field. The production dataset that is experiencing the issue is 22 GB, and I haven't yet found a smaller dataset that reproduces the error and that I could share here.

The reason I thought it might be a bug is that the same query used to work before we upgraded from 8.12. It's possible there is an issue within our dataset and the query used to ignore it before validation was introduced in 8.14.0, but I wasn't able to find any negative timestamp in the entire dataset, using a min aggregation with no filter. The error message with the large negative value makes me think it might be a bug.
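
A quick check of the numbers in this report, as a sketch in Python (not part of the original reproduction). The value in the error equals -2^63, the smallest signed 64-bit integer, so it reads more like a placeholder for "no value" than like a timestamp stored in the index. That interpretation is an assumption, not something established at this point in the thread; the names `INT64_MIN` and `to_utc` are illustrative.

```python
from datetime import datetime, timezone

# Value quoted in the date_time_exception in the error above.
reported = -9223372036854775808

# Smallest value a signed 64-bit integer can hold.
INT64_MIN = -(2 ** 63)


def to_utc(epoch_millis: int) -> datetime:
    """Convert an epoch_millis timestamp to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_millis / 1000, tz=timezone.utc)


# min and max returned by the standalone min and max aggregations above.
observed_min = 1726331520000
observed_max = 1726341900000

is_sentinel = reported == INT64_MIN
print("reported value is the 64-bit minimum:", is_sentinel)
print("observed min:", to_utc(observed_min).isoformat())
print("observed max:", to_utc(observed_max).isoformat())
```

Both observed timestamps are positive and fall on the same day in September 2024, which is consistent with the min aggregation finding no negative values in the data.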

I am not sure how I could troubleshoot this further to pinpoint the data causing the problem.

Thank you!

Logs (if relevant)

No response

@gmalkas gmalkas added >bug needs:triage Requires assignment of a team area label labels Sep 30, 2024
Author

gmalkas commented Sep 30, 2024

I was able to reproduce the bug on an empty index. I think my earlier attempt did not fail because I hadn't properly restarted Elasticsearch after upgrading on my test machine:

$ sudo /usr/share/elasticsearch/bin/elasticsearch --version 
Version: 8.15.2, Build: deb/98adf7bf6bb69b66ab95b761c9e5aadb0bb059a3/2024-09-19T10:06:03.564235954Z, JVM: 22.0.1
$ curl -XPUT -H'content-type: application/json' localhost:9200/myindex -d '{"mappings": { "dynamic": "strict", "properties": { "recorded_at": { "type": "date", "format": "epoch_millis" } } }, "settings": { "number_of_replicas": 0, "number_of_shards": 1 }}'
{"acknowledged":true,"shards_acknowledged":true,"index":"myindex"}
$ curl -XPOST -H'content-type: application/json' localhost:9200/myindex/_search -d '{"size": 0, "aggregations": {"recorded_at": {"stats": {"field": "recorded_at"}}}}'
{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Cannot format stat [max] with format [DocValueFormat.DateTime(format[epoch_millis] locale[], Z, MILLISECONDS)]"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"myindex","node":"C4_0XtBXSRy0LB-20AK3zQ","reason":{"type":"illegal_argument_exception","reason":"Cannot format stat [max] with format [DocValueFormat.DateTime(format[epoch_millis] locale[], Z, MILLISECONDS)]","caused_by":{"type":"date_time_exception","reason":"Field EpochMillis cannot be printed as the value -9223372036854775808 cannot be negative according to the SignStyle"}}}],"caused_by":{"type":"illegal_argument_exception","reason":"Cannot format stat [max] with format [DocValueFormat.DateTime(format[epoch_millis] locale[], Z, MILLISECONDS)]","caused_by":{"type":"illegal_argument_exception","reason":"Cannot format stat [max] with format [DocValueFormat.DateTime(format[epoch_millis] locale[], Z, MILLISECONDS)]","caused_by":{"type":"date_time_exception","reason":"Field EpochMillis cannot be printed as the value -9223372036854775808 cannot be negative according to the SignStyle"}}}},"status":400}
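
The error body above repeats the same message several levels deep through `caused_by`. As a reading aid, here is a minimal Python sketch that walks that chain down to the innermost cause. The dictionary is a trimmed, hand-copied version of the response (the `root_cause` and `failed_shards` arrays are omitted), and `cause_chain` is an illustrative helper, not part of any Elasticsearch client.

```python
# Message shared by both illegal_argument_exception levels in the response.
FORMAT_ERROR = (
    "Cannot format stat [max] with format "
    "[DocValueFormat.DateTime(format[epoch_millis] locale[], Z, MILLISECONDS)]"
)

# Trimmed copy of the response: only the top-level error and its caused_by chain.
error_response = {
    "error": {
        "type": "search_phase_execution_exception",
        "reason": "all shards failed",
        "caused_by": {
            "type": "illegal_argument_exception",
            "reason": FORMAT_ERROR,
            "caused_by": {
                "type": "illegal_argument_exception",
                "reason": FORMAT_ERROR,
                "caused_by": {
                    "type": "date_time_exception",
                    "reason": "Field EpochMillis cannot be printed as the value "
                    "-9223372036854775808 cannot be negative according to the SignStyle",
                },
            },
        },
    },
    "status": 400,
}


def cause_chain(error: dict) -> list:
    """Follow nested caused_by objects; return (type, reason) pairs, outermost first."""
    chain = []
    node = error
    while node is not None:
        chain.append((node.get("type", ""), node.get("reason", "")))
        node = node.get("caused_by")
    return chain


chain = cause_chain(error_response["error"])
for depth, (kind, reason) in enumerate(chain):
    print("  " * depth + f"{kind}: {reason}")

# The last entry is the innermost cause.
root_type, root_reason = chain[-1]
```

The innermost entry is the `date_time_exception`, which is the part of the message that mentions the negative value.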

Hope this helps.

Thank you.

@iverase iverase added :Analytics/Aggregations Aggregations and removed needs:triage Requires assignment of a team area label labels Oct 1, 2024
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Oct 1, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

Contributor

iverase commented Oct 1, 2024

Thank you for reporting. It is indeed a bug introduced by that PR. I will open a PR to fix it.


3 participants