-
Notifications
You must be signed in to change notification settings - Fork 24.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Facet value of boolean mapped fields should be "true" or "false" #2462
Comments
Hi, Any idea when this will be included in the master branch? Thanks! |
@kroepke This makes sense. We will get this in after the lucene4 refactorings. |
@martijnvg ok, great. Until then we will build our own version with that patch then. |
A workaround for this problem is using the following script in the term fact. {
"query": {
"match_all": {}
},
"facets": {
"bla": {
"terms": {
"term": true,
"size": 10,
"script": "doc['foo'].value == 'T' ? true : false"
}
}
}
} Results in: {
"facets": {
"bla": {
"_type": "terms",
"missing": 0,
"total": 1,
"other": 0,
"terms": [
{
"term": "true",
"count": 1
}
]
}
} |
@kroepke Just interested, how does using |
@karmi We haven't done any benchmarks yet, we hope to get this deployed the coming week. |
@karmi I've just done a very unscientific test, using the queries above. |
@kroepke Thanks for the numbers! That is still quite a good performance, considering it has to scan all the matching documents and aggregate the counts. |
this seems to be open for a long time... I will look into that soonish |
@s1monw |
our plan is to look into this for the 1.0 release. Your solution is definitely a possible way for us to support it. |
The following request illustrates the new behavior:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 1.0,
"hits" : [ {
"_index" : "boolfacet",
"_type" : "auto",
"_id" : "1",
"_score" : 1.0, "_source" : {"foo":true}
} ]
},
"facets" : {
"bar" : {
"_type" : "terms",
"missing" : 0,
"total" : 1,
"other" : 0,
"terms" : [ {
"term" : "true",
"count" : 1
} ]
}
}
} Without the flag it's back to 'T' and 'F', also of course if it's set to 'false'. |
hey there, I just briefly looked at the pull request and I wonder if we can make this a little less intrusive. At the end of the day what counts here is what kind of representation we have to "True" and "False" so we could just load this through FieldData adding a feature next to the FieldDataFilter. I'd call it a FieldDataTransformer that you can use to transform `"T" to "True"' when field data is loaded then your facets would just naturally come back as a full boolean string? |
I'm not sure I can follow you there :) From what I can see you still need to have the field mapper to do that for you, since that's what it does, no? I'm asking because this problem affects every field that stores values in a different representation, e.g. IPs. For those you get back the int based representation, which I actually find worse than the boolean case... |
Hiya @kroepke Been talking to @s1monw about this and two features seem to make sense: FieldData Transformer:You would be able to specify a
The benefit here is that there would be a one-time cost of conversion (just when loading the fielddata) and the same values would be exposed for facets, scripting and sorting. Facet Value Transformer:A script which is applied just before the serialization of the facet responses into JSON which could, eg:
These would be more efficient than the current facet scripts because they would only be run on the return values, not on every value. However, you would still have the raw longs loaded into memory which are more useful for calculations (eg easier to calc netmasks on a long rather than on a string IPv4 address) I believe that the value transformers are planned for the new facets that we are planning for 1.0 (@uri?) |
The idea of having to specify a script for executing these mappings feel awkward, imho. I think that everybody using elastic search will eventually run into this issue, and the script solution seems more like a work around rather than an "option" of something that should be transparent in the first place. I could see this feature as a generic transformer(which i like) for the facets, but for elastic search builtin types(boolean, ip..) I think this should be done without the need of defining anything(at most a "flag", saying you want it to be done, and even that is just because of backward compatibility). If internally, this "automatic" transformation for the builtin types is done using the script implementation you just described, cool. but having everybody defining its own script for mappings that most likely everybody will do, feels wrong. that's my 2 cents anyway... @clintongormley maybe the wrong uri :D @uboness |
(heh, too many nicks in too many places) Yes, I understand what you mean about the core types in ES. I think Perhaps for these two types, this should be the default return value in the new aggregations rewrite. We don't have to maintain backward compatibility here because this is completely new functionality. The old facets will continue to work as before. For any other "label" transformations, the only generically useful way of doing them is via a script, as everybody will have their own particular requirements which would result in an explosion of options. |
@clintongormley Agreed. This way makes complete sense and I like the script approach as a generic way of doing this. Looking forward to it. |
@clintongormley In fact this applies to dates as well, if you use terms facets (even if you might argue that terms facets on dates could be done with the date histogram facet instead). |
@kroepke As I understand it, there are also plans for converting datetimes into strings, by specifying formats (rather than scripts). However, for (eg) charting libraries, it is usually more efficient to hand them millis-since-the-epoch instead of date strings, so the default return value will probably remain as longs. |
Any news on this? |
we'll work on this one post 1.0 |
We deprecated facets already in |
Any updates on which ES release will have this ? |
@aaneja will probably be 2.0 |
When performing the following index and search operations I would expect the facet values of boolean fields to be "true" or "false" and not "T" and "F", since "T" and "F" are implementation detail of the BooleanFieldMapper.
To reproduce (using any version I tested, including current master):
Note the "term": "T" line in the facets.
A preliminary patch follows shortly.
The text was updated successfully, but these errors were encountered: