-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure conflicted fields can be searchable and/or aggregatable #13070
Ensure conflicted fields can be searchable and/or aggregatable #13070
Conversation
Nice @chrisronline! Is this something we can backport to 5.5 as well? I'm worried that we'll see more issues like this as 5.5 gains greater adoption |
@alexfrancoeur I don't see a reason why not, but I'd defer to @epixa's judgement |
Yeah, we'll want to fix this in 5.5 as well |
@@ -67,8 +67,8 @@ export function readFieldCapsResponse(fieldCapsResponse) { | |||
return { | |||
name: fieldName, | |||
type: 'conflict', | |||
searchable: false, | |||
aggregatable: false, | |||
searchable: types.reduce((acc, esType) => (acc || capsByType[esType].searchable), false), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it'd be cleaner if this used Array#some()
rather than reduce.
types.some(type => !!capsByType[type].searchable)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed!
@spalger Great suggestion. Updated |
Do we know which aggregations supported conflict fields in 5.x? I'm trying to do an aggregations on a field that is long in one index and integer in another, since the field is conflict types I'm not able to select it in the histogram field chooser. |
Per a discussion with @spalger, we've decided to make some changes here. Instead of trying multiple types as a conflict in the field caps response, we are now resolving all multiple types into kibana types and if they all resolve into the same kibana type, we're no longer considering that a conflict. For example, Because this will solve the immediate use case, we can revert all the UI changes since those fields will not be reported as conflicted. In the case where there are still conflicted fields (where the types simply are different), the behavior will still exist where users are unable to select that field for filtering or aggregating. ++ to @spalger for the suggestion! |
@chrisronline Can you elaborate on why this is the better approach to simply leaning on field_caps for these values? |
@@ -63,27 +64,45 @@ export function readFieldCapsResponse(fieldCapsResponse) { | |||
const capsByType = capsByNameThenType[fieldName]; | |||
const types = Object.keys(capsByType); | |||
|
|||
// If there are multiple types but they all resolve to the same kibana type | |||
// ignore the conflict and carry on (my wayward son) | |||
if (types.length > 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No reason to check the length of types here, just get the uniqueKibanaTypes
and check that.
@epixa I wouldn't say this is better, but it is better replicating the behavior of 5.x |
if (uniqueKibanaTypes.length > 1) { | ||
// If a single type is marked as searchable or aggregatable, all the types are searcuable or aggregatable | ||
const conflictIsSearchable = types.some(type => { | ||
return !!capsByType[type].searchable || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic would probably make a nice little helper function that could be reused below too. Maybe something like hasCapability(capsByType[type], 'searchable')
?
Based on the test data listed in the description, consider the responses from field_stats and field_caps:
The
With field_caps, it doesn't resolve them to a single type but returns both types. However, if we converted both Per @spalger, we want to achieve as much of 5.x behavior as possible and this smaller code change will still get us there. |
Thanks for the summary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Can you merge in the latest master to get CI running on a working commit? |
a4eb414
to
c11457e
Compare
@epixa Sure! Just rebased and CI is building now |
Can we get a test for these changes? This seems like a pretty critical behavior for how we handle fields in Kibana, so a regression here will break a lot of stuff (as evidence by our regression here breaking a lot of stuff). |
@epixa Definitely. I somehow missed the existing test file when looking the first time, but I updated it to include tests for the newly added logic. |
c0063f0
to
7048fce
Compare
@spalger Added the functional test. Lemme know if that looks right to you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple minor things, once they're fixed I say merge this sucker!
] | ||
const dateField = resp.body.fields.find(f => f.type === 'date'); | ||
const successField = resp.body.fields.find(f => f.type === 'conflict'); | ||
expect(dateField).to.eql({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests intentionally test the entire response body. If it changes at all these tests should fail. Mind not filtering but asserting that the entire body matches?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Updated
"settings": { | ||
"index": { | ||
"number_of_shards": "1", | ||
"number_of_shards": "5", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No reason for 5 shards, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No reason, no. I didn't manually change that so I'm not sure what happened, but reverted back to 1
@spalger Updated. Ready for another pass! |
LGTM |
* Ensure that conflict fields can be searchable and/or aggregatable in the UI * Use `some` instead of `reduce` * Revert UI changes * Attempt to convert multiple ES types to kibana types, and if they all resolve to the same kibana type, there is no conflict * Add comma back * Cleaner code * Add tests * Update failing test to handle searchable and aggregatable properly * Add functional test to ensure similar ES types are properly merged * Update tests * Revert shard size
* Ensure that conflict fields can be searchable and/or aggregatable in the UI * Use `some` instead of `reduce` * Revert UI changes * Attempt to convert multiple ES types to kibana types, and if they all resolve to the same kibana type, there is no conflict * Add comma back * Cleaner code * Add tests * Update failing test to handle searchable and aggregatable properly * Add functional test to ensure similar ES types are properly merged * Update tests * Revert shard size
* Ensure that conflict fields can be searchable and/or aggregatable in the UI * Use `some` instead of `reduce` * Revert UI changes * Attempt to convert multiple ES types to kibana types, and if they all resolve to the same kibana type, there is no conflict * Add comma back * Cleaner code * Add tests * Update failing test to handle searchable and aggregatable properly * Add functional test to ensure similar ES types are properly merged * Update tests * Revert shard size
* Ensure that conflict fields can be searchable and/or aggregatable in the UI * Use `some` instead of `reduce` * Revert UI changes * Attempt to convert multiple ES types to kibana types, and if they all resolve to the same kibana type, there is no conflict * Add comma back * Cleaner code * Add tests * Update failing test to handle searchable and aggregatable properly * Add functional test to ensure similar ES types are properly merged * Update tests * Revert shard size
This regression breaks a lot of our visualizations as well. Spending [more] hours to reindex is looking to be a lost cause, as it's a very slow process. Very happy to hear that the string->text/keyword problem is worked in newer versions of Kibana. Any indication when 5.5.2 might be released as a package? The snapshot release link is 404, sadly. |
@excalq We're working on shoring up/QAing the 5.5.2 release now, so hopefully it'll go out in the next week or two. |
I have met this issue, my elastic version is 5.5.1. What version fixed this issue? |
@KeithTt 5.5.2 |
@epixa, @spalger, @chrisronline - @AlonaNadler and I are chatting about the impending ECS changes and there is concern that this will cause a lot more of these index mapping conflicts. It appears that this PR didn't actually fix the problem, since we have the open issues referenced above. Is this correct? The issue still exists? Disclaimer - I did not read through all the comments, but judging by the last two, seemed like it should be fixed. Thanks for any extra insight! |
Looking at the testing description, it looks like it is focused on concrete data type differences (i.e., text vs. keyword, keyword vs. integer, integer vs. long, etc..)? What about the scenario if one index has "host" as a keyword field vs. another index where "host" represents as an object (i.e. host.X, etc..)? |
@stacey-gammon The underlying issue here is that Elasticsearch (by design) does not support searching across types without causing shard errors. There might be exceptions to that that I'm not familiar with, but on the whole that's how it is. This particular PR was removing an artificial limitation in Kibana that disallowed even attempting to search/aggregate on a field that we identified as having conflicts, so the end result would be that you can attempt to aggregate on the field anyway and if ES returns shard failures, we'll show them as errors. For large data sets, the shard filter phase in Elasticsearch will kick in and should eliminate the shard errors in the most common cases because it will limit the amount of shards that actually get hit to serve the request, so it probably won't touch older indices that have different mappings. The downside to this approach is that when you do search across mapping boundaries, you get shard errors. Small data sets might be under the 128 shard default threshold for the shard filter phase, they might need to reindex. At this scale, reindexing is realistic, so I'm personally pretty comfortable with the ES team's recommendation here. So the big issue is on larger data sets when searching across mapping boundaries, and I'm not sure how we could fix that reliably. So far every time we've attempted to add conveniences in Kibana to diverge from Elasticsearch's behaviors in order to improve the search experience, we've shot ourselves in the foot and made things worse. The behavior this PR undoes is such an example where we tried to make the UX nicer for folks and ended up hurting people. |
It seems like the PR is not complete; specifically, it doesn't appear to handle the concrete vs. object type scenario. Take this Elasticsearch example: Say there are 2 instances: By the way, the above example is not arbitrary, this is one possibility of Beat's breaking schema change in 6.3+:
If an aggregation is requested against test-* and "host" field, ES will still return aggregation results without errors:
Except that it will only include the aggregation results obtained from In Kibana, when it detects a difference in the field type across the 2 indices, it will simply block "host" from being used in any aggregations. Even though field caps indicate that the "host" field for
It seems like for consistency purposes, Kibana can relax the conflict checking here and also allow "host" field to be aggregatable. This way, Kibana can just let ES decide what aggregation results will actually be returned if the users' date range cover both forms of the indices v.s., today where it completely prevents users from being able to use the host field that is present in older indices. |
@ppf2 Can you open a new issue so the apps team can review? This issue is ancient history. |
Resolves #12728
Summary
This PR ensures that fields that have type conflicts will no longer be assumed as non aggregatable or non searchable within Kibana.
With this change, there is a possibility of shard failures in ES. Due to a recent change in ES, users with more than 128 shards will not see a shard failure error because of the type conflict. However, users with less than 128 shards will need to update their mappings and reindex their data to prevent related shard failures.
Testing
text
andkeyword
orlong
andinteger
)Note: If you already have an environment setup with conflicted fields, simply refresh the field list for the kibana index pattern instead.
Testing Screenshots
My environment is less than 128 shards so I will see a shard failure:
![screen shot 2017-07-24 at 2 38 40 pm](https://user-images.githubusercontent.com/56682/28538895-cfb74818-707d-11e7-8d87-67d215609459.png)
Sample Data