Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

_uid sorting #79

Closed
dclemenzi opened this issue Jul 7, 2018 · 4 comments
Closed

_uid sorting #79

dclemenzi opened this issue Jul 7, 2018 · 4 comments
Labels

Comments

@dclemenzi
Copy link
Contributor

Having issues with line 195 in ElasticFeatureSource appending _uid sorting when another SortBy property is specified.

        if (isSort(query) && nativeQueryBuilder.equals(ElasticConstants.MATCH_ALL)) {
            searchRequest.addSort("_uid", naturalSortOrder);
        }

Which creates this curl command:

curl -k --cert /etc/pki/java/temp.pem  -iX POST 'https://node26:9202/coalesce-linkages/Linkage/_search' -d '{"size":1,"query":{"bool":{"must":[{"term":{"coalesceentity.objectkey":"847b178b-6abb-4b1d-a2db-979bdeb00968"}},{"term":{"coalescelinkage.entity2name":"SpiderQuery"}}]}},"from":0,"_source":["coalesceentity.objectkey","coalescelinkage.entity2key"],"sort":[{"coalesceentity.lastmodified":{"order":"asc"}},{"_uid":{"order":"asc"}}]}'

On a (364 Million) database this is causing the following exception:

Caused by: org.elasticsearch.client.ResponseException: method [POST], host [https://node26:9202], URI [/coalesce-linkages/Linkage/_search], status line [HTTP/1.1 500 Internal Server Error]
{"error":{"root_cause":[{"type":"circuit_breaking_exception","reason":"[fielddata] Data too large, data for [_uid] would be [2942561521/2.7gb], which is larger than the limit of [2556061286/2.3gb] …

Removing the _uid sort resolves the issue and returns the expected results. Looking through the code I figured out that we could bypass this logic by adding the following hint onto my query:

Hints hints = new Hints();
hints.put(Hints.VIRTUAL_TABLE_PARAMETERS, Collections.singletonMap("q", Collections.EMPTY_MAP.toString()));

Is this the correct way? Is there a better way? Am I loosing anything by replacing the nativeQueryBuilder?

@sjudeng
Copy link
Contributor

sjudeng commented Jul 7, 2018

I think it's okay as a workaround and you won't lose anything. The default MATCH_ALL nativeQueryBuilder will result in the query builder derived from your OGC filter being used anyway (see here).

But it seems like a bug on the plugin side. Looking at it again I think it's there just to support GeoTools NATURAL/REVERSE_ORDER (specifically these tests). Given this it seems that really where it's currently checking isSort it should be checking "isFidSort", with the implementation being to check specifically whether sort is SortBy.NATURAL_ORDER or SortBy.REVERSE_ORDER.

@sjudeng sjudeng added the bug label Jul 7, 2018
@dclemenzi
Copy link
Contributor Author

Appreciate the quick response.

Based on the code it does not look like it derives from the query? The boolean check is using getNativeQueryBuilder() not getQueryBuilder().

Still not sure why when sorting on _uid this would cause the query to fail in the first place. This particular query does not return a significant number of results which would imply that the sorting is being applied before the filter? But this behavior is not consistent with sorting on other fields which work. This is probably more of an ElasticSearch issue or an issue with how we have it configured and outside the scope of this plugin.

FYI the _uid and _type fields are deprecated in version 6.0.0 and are being removed in version 7.0.0 so this will probably need to change when updating this plugin to the latest version. Tried changing it to _id but that resulted in this error:

{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Fielddata is not supported on field [_id] of type [_id]"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"coalesce-linkages","node":"yR7p1MvzSFuB4yOHb-Wzxg","reason":{"type":"illegal_argument_exception","reason":"Fielddata is not supported on field [_id] of type [_id]"}}]},"status":400}

We are running version 5.4 so maybe this would work in 6+?

@dclemenzi
Copy link
Contributor Author

So my workaround for this issue still works but now that I am using the hints for aggregations and highlighting I would prefer to pass-through the hints from the client without making any modifications. Is there any reason why this check is using the nativeQueryBuilder instead of the queryBuilder? Switching this resolves the issue and it looks like its the only thing nativeQueryBuilder is used for. Setting the "q" hint it modifies the nativeQueryBuilder in FilterToElastic which is why the workaround works.

@sjudeng
Copy link
Contributor

sjudeng commented Nov 28, 2018

It's not obvious to me why it's nativeQueryBuilder instead of queryBuilder. Do tests still pass if you make the switch?

It seems like the better solution might be to remove the _uid block here and instead add the logic in as a condition here, something like below. I can try to test this out unless you get to it first.

...
} else if (sort == SortBy.NATURAL_ORDER || sort == SortBy.REVERSE_ORDER) {
    searchRequest.addSort("_uid", naturalSortOrder);
} ...

dclemenzi added a commit to dclemenzi/elasticgeo that referenced this issue Nov 29, 2018
…etermine whether or not to append sorting on _uid
dclemenzi added a commit to dclemenzi/elasticgeo that referenced this issue Nov 30, 2018
# The first commit's message is:

ngageoint#79 Logic was not using the query created by the filter to determine whether or not to append sorting on _uid

# This is the 2nd commit message:

switch variable
dclemenzi added a commit to dclemenzi/elasticgeo that referenced this issue Nov 30, 2018
…etermine whether or not to append sorting on _uid
sjudeng added a commit that referenced this issue Dec 3, 2018
dclemenzi added a commit to dclemenzi/elasticgeo that referenced this issue Dec 5, 2018
…etermine whether or not to append sorting on _uid
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants