DetectLanguage.scala: class LanguageIdentifier in package language is deprecated #286
Comments
@borislin if you have time, do you want to take this one on? It should be an easy one.
@ruebot Sure, I'll work on this.
Update: Current code to fix this issue: https://github.com/archivesunleashed/aut/tree/refactor-detect-language I can't test my code right now because of a lot of dependency issues/errors in the Maven log: build.log After discussing with @ruebot, it turns out that this is more complicated than we thought, and we need more time to sort out this dependency hell before pushing a PR for this issue.
I just pushed a fix for the dependency errors. They were caused by a conflict between versions of Guava: Hadoop 2.6.5 brings in Guava 11, while tika-langdetect requires a more modern version (tika-langdetect 1.19.1 calls for Guava 17.0). I created a version of tika-langdetect that shades Guava, basically following what is described here. I pushed my changes. The build is still failing, but now it's because two tests fail.
I haven't looked into this yet. This shading solution is obviously not ideal, but it might do in the short term, since we should be using the updated Tika. The long-term solution would be to upgrade Hadoop and our other dependencies.
I remember going down this rabbit hole, and had set up a bunch of exclusions on the Guava dependencies. Maybe it would be worth going down that path again? That said, the transitive dependencies on this project are not fun to sort out!
Started digging into the test failures. I suspect Tika is returning more with this version, and we need to look at that more closely. But maybe we should update our implementation too? I hadn't noticed this example in the API documentation before.
Boris was never able to build it, and he ran out of time to finish it before he left, which explains why it never got that far.
Follow-on to #285
I believe we need to update DetectLanguage to use this method.
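For discussion, here is a minimal sketch of what the updated implementation might look like. It assumes the replacement for the deprecated `LanguageIdentifier` is `OptimaizeLangDetector` from tika-langdetect (the thread does not say which detector the linked method belongs to), and the object name `DetectLanguageSketch` is hypothetical. It also assumes tika-langdetect is on the classpath with the Guava conflict resolved.

```scala
import org.apache.tika.langdetect.OptimaizeLangDetector

// Hypothetical name; a sketch, not the code on the refactor-detect-language branch.
object DetectLanguageSketch {
  // Loading the language models is expensive, so load them once, lazily,
  // the first time a non-empty string needs to be classified.
  private lazy val detector = new OptimaizeLangDetector().loadModels()

  /** Returns the detected language code for `input`, or "" for empty input. */
  def apply(input: String): String = {
    if (input.isEmpty) {
      ""
    } else {
      // detect(text) resets the detector and then feeds it the text, so the
      // shared instance is stateful; guard it against concurrent callers.
      detector.synchronized {
        detector.detect(input).getLanguage
      }
    }
  }
}
```

Sharing one detector avoids reloading the models on every call, at the cost of the lock. Creating a detector per call would avoid the lock but is likely much slower, so that trade-off would need measuring before settling on either approach.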