-
Notifications
You must be signed in to change notification settings - Fork 294
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migration to Elasticsearch 5 #254
Conversation
- PhotonQueryBuilder now uses QueryBuilder instead of FilterBuilder (API change in 2.0). However, note that some methods are now deprecated and will be removed in ES 3.0. - Updated elasticsearch-wordending-tokenfilter-filter to ES 2.2 - Groovy module installation now required for local node. - Test gateway type is not NONE anymore (None throws an error). - _id.path has been removed in 2.0 - name.similarity, collector.similarity in mappings.json was throwing errors. Apparently, they well not properly defined (no behavior). A similarity is now only defined for collector.properties.default.similarity.
- updated FilterBuilder -> QueryBuilder (new API) # Conflicts: # src/main/java/de/komoot/photon/App.java
- better index clean up in TagFilterQueryBuilderSearchTest
… ES 2.2.0 including the language specific, ngram-relevant fields 'e.g. en, it' | added jarhell fixes from karussell in issue komoot#217, truncating the mock stuff which is only necessary in maven test scope. Tests are running, so this is safe | Added flag to ignore java 7 'broken javaDoc means compile error' behaviour | remove the index after each test case in order to be able to run the test suite (this was left at issue komoot#217). This comprises modifications in ESBaseTester and TagFilterQueryBuilderSearchTest
…rrors with ES 5.4.1. in order I can work. Changed nothing else so code diffs are possible afterwards.
…removed wordending
…s now requiered by ES in order to run it embedded | Excluding jna because it yields to a bad version conflict with the jvm-own native interface.
…s used for testing
Maybe komoot @ChristophLing could indicate if that have any interest in further improvements or if a fork or a move of the repo to new maintainers would be preferred. |
@simonpoole this is not necessary IMO as @lonvia has access. @lonvia provides some help (and we the server) in the background to let @reuschling test this PR for bigger data sets, if this works we'll try this for the world wide export. Maybe we should better communicate this activity though. |
…y against the distance to the given location in the location bias request. This would be distanceSort=true, which is also the default. 'false' gives the former behaviour.
@@ -42,13 +53,25 @@ public PhotonRequestFactory(Set<String> supportedLanguages) { | |||
} catch (Exception nfe) { | |||
//ignore | |||
} | |||
Boolean locationDistanceSort = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest to switch to 'false' as default. While I like the distance sort (=stronger bias) when zoomed to greater zoom levels, the former results are closer to my expectations when zoomed out. E.g. for Stuttgart with a location bias of (51;9) I would expect to see Stuttgart in the results, rather than 'Stuttgarter Straße' in Frankfurt or Kassel. Perhaps providing a zoom level or weighting factor for the bias would be an option for a later PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmh, probably we should argue then how it was before (I think it was kind of broken ie it should be indeed false) because if you argue on expectation I can give you a counter argument like when you search for "bahnhof" I would expect the train station within that city.
Perhaps providing a zoom level or weighting factor for the bias would be an option for a later PR.
Related to #198 ... probably we could also provide a bounding box filter.
Please @reuschling correct me if I'm wrong but I think even without reverse geocoding we need the geo index to provide the location bias. |
@karussell yes indeed, geocoding is a must for searching and sorting against geo coordinates, even to provide the photon standard functionalities. @lonvia I removed the unnecessary jar, thanks for finding this. @hbruch, @karussell I would also go for the bounding box or a radius, to keep the sorting criteria well defined. We should also discuss this in a different issue. |
Perform first queries with zero fuzziness | Fix throws declaration | Adapt tests to perform zero fuzziness query
@hbruch provided a nice solution for the 'Steigstr Stuttgart' issue. I merged it, thanks for this again. |
Thanks! I've deployed the recent change to the test server and ran the benchmark #263 - now the counts for every subtests are identical to the current master. (Haven't looked into the details though) |
The response times are worse than before except for France: poland 80sec, USA 54s, Germany 60s, France1 31s, France2 350s But the single runs are sometimes different in itself - so I'll have to make the measurement more robust as a first step |
sorry for the bubble, was a measurement problem: Germany 18s, USA 17s, poland 6sec, France1 9s, France2 100s |
@lonvia do you agree or disagree regarding mergeability of this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Next time please make sure you only address one issue per PR.
Thanks @reuschling, @lonvia, @hbruch and finally the initiator @masda70! |
btw: @reuschling if you connect your git commit email to github, the commits will be linked to your account and you'll also appear as contributor in the stats. |
Is the radius parameter public? how would an api call look like? |
Please ask questions on the mailing list |
This is the current result for Elasticsearch 5 migration, as asked for at #249 (from myself).
In this pull request, everything from #219 is included plus some commits for bugfixing (tests are running now at least on my machine, plus the jarhell fixes from #217. This is commit 538ee4d.
Further, I implemented the migration to ES 5. The tests are running, I also have a running server with results that looks subjectively good as a first impression (further testing needed of course). I just reindexed the old ES index into a new one for testing - currently I have no Nominatim server for testing the import code. Regarding the Nominatim import, there was also a regarding commit these days - I didn't rebased it yet for simplicity during the merge.
I am looking forwad to feedback and comments :)