Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

slash / is escaped with two slashes instead of one, parse failure #660

Closed
maxgalbu opened this issue Aug 6, 2014 · 17 comments · Fixed by #668
Closed

slash / is escaped with two slashes instead of one, parse failure #660

maxgalbu opened this issue Aug 6, 2014 · 17 comments · Fixed by #668

Comments

@maxgalbu
Copy link
Contributor

maxgalbu commented Aug 6, 2014

I'm using:

  • elasticsearch 1.3.1
  • elastica (master branch)

My fix is to escape with one \ instead of \

$result = str_replace('/', '\\/', $result);

I had a look at previous issues but i can't see why it was escaped two times.

here's the log

[2014-08-06 10:16:25,587][DEBUG][action.search.type       ] [Stanley Stewart] [fai][3], node[3obbmGZQS-S8hZ9twBLROg], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest@50021d3]
org.elasticsearch.search.SearchParseException: [fai][3]: from[-1],size[-1]: Parse Failure [Failed to parse source [{"query":{"query_string":{"query":"gioved\u00ec 3\\\\/7 a","default_operator":"OR"}},"filter":{"or":[{"term":{"tipo":"pagina"}},{"term":{"tipo":"evento"}}]},"highlight":{"pre_tags":["<b>"],"post_tags":["</b>"],"fields":{"nome":{"fragment_size":300,"number_of_fragments":1},"nome_ita":{"fragment_size":300,"number_of_fragments":1},"nome_eng":{"fragment_size":300,"number_of_fragments":1},"testo_ita":{"fragment_size":300,"number_of_fragments":1},"testo_eng":{"fragment_size":300,"number_of_fragments":1}}}}]]
        at org.elasticsearch.search.SearchService.parseSource(SearchService.java:664)
        at org.elasticsearch.search.SearchService.createContext(SearchService.java:515)
        at org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:487)
        at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:256)
        at org.elasticsearch.search.action.SearchServiceTransportAction$5.call(SearchServiceTransportAction.java:206)
        at org.elasticsearch.search.action.SearchServiceTransportAction$5.call(SearchServiceTransportAction.java:203)
        at org.elasticsearch.search.action.SearchServiceTransportAction$23.run(SearchServiceTransportAction.java:517)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.index.query.QueryParsingException: [fai] Failed to parse query [giovedì 3\\/7 a]
        at org.elasticsearch.index.query.QueryStringQueryParser.parse(QueryStringQueryParser.java:240)
        at org.elasticsearch.index.query.QueryParseContext.parseInnerQuery(QueryParseContext.java:238)
        at org.elasticsearch.index.query.IndexQueryParserService.innerParse(IndexQueryParserService.java:342)
        at org.elasticsearch.index.query.IndexQueryParserService.parse(IndexQueryParserService.java:268)
        at org.elasticsearch.index.query.IndexQueryParserService.parse(IndexQueryParserService.java:263)
        at org.elasticsearch.search.query.QueryParseElement.parse(QueryParseElement.java:33)
        at org.elasticsearch.search.SearchService.parseSource(SearchService.java:648)
        ... 9 more
Caused by: org.apache.lucene.queryparser.classic.ParseException: Cannot parse 'giovedì 3\\/7 a': Lexical error at line 1, column 16.  Encountered: <EOF> after : "/7 a"
        at org.apache.lucene.queryparser.classic.QueryParserBase.parse(QueryParserBase.java:130)
        at org.apache.lucene.queryparser.classic.MapperQueryParser.parse(MapperQueryParser.java:882)
        at org.elasticsearch.index.query.QueryStringQueryParser.parse(QueryStringQueryParser.java:223)
        ... 15 more
Caused by: org.apache.lucene.queryparser.classic.TokenMgrError: Lexical error at line 1, column 16.  Encountered: <EOF> after : "/7 a"
        at org.apache.lucene.queryparser.classic.QueryParserTokenManager.getNextToken(QueryParserTokenManager.java:1133)
        at org.apache.lucene.queryparser.classic.QueryParser.jj_scan_token(QueryParser.java:599)
        at org.apache.lucene.queryparser.classic.QueryParser.jj_3R_2(QueryParser.java:482)
        at org.apache.lucene.queryparser.classic.QueryParser.jj_3_1(QueryParser.java:489)
        at org.apache.lucene.queryparser.classic.QueryParser.jj_2_1(QueryParser.java:475)
        at org.apache.lucene.queryparser.classic.QueryParser.Clause(QueryParser.java:226)
        at org.apache.lucene.queryparser.classic.QueryParser.Query(QueryParser.java:212)
        at org.apache.lucene.queryparser.classic.QueryParser.TopLevelQuery(QueryParser.java:170)
        at org.apache.lucene.queryparser.classic.QueryParserBase.parse(QueryParserBase.java:120)
        ... 17 more
@ruflin
Copy link
Owner

ruflin commented Aug 6, 2014

Here is the issues with pull request when we changed it: #589

@ruflin
Copy link
Owner

ruflin commented Aug 6, 2014

@maxgalbu Can you take a look at this to see why this breaks again (or not)?

@maxgalbu
Copy link
Contributor Author

maxgalbu commented Aug 6, 2014

IMO the best way would be to add an actual test of escapeTerm in a query with an elasticsearch daemon. In the tests folder I only see a test for the output of the function, but I could be wrong.

If that's not the case, can you point me to a SomethingTest.php file where I can add an actual query? I can try to edit it and make a pull request

@ruflin
Copy link
Owner

ruflin commented Aug 6, 2014

Unfortunately it seems that you are right. Can you create a test with an actual query and open a pull request?

@maxgalbu
Copy link
Contributor Author

maxgalbu commented Aug 8, 2014

That's what I was saying man :)

I'll see what test file I can edit

@ruflin
Copy link
Owner

ruflin commented Aug 8, 2014

👍

@maxgalbu
Copy link
Contributor Author

The travis-ci test confirms the bug on php 5.3 but not on php 5.4... WTF?

https://travis-ci.org/maxgalbu/Elastica/builds/32777811

@ruflin
Copy link
Owner

ruflin commented Aug 17, 2014

This was the first thing that came to my mind but I think it is not related. http://php.net/manual/en/security.magicquotes.php

PHP 5.3 is EOL ;-)

@maxgalbu
Copy link
Contributor Author

magicquotes should not be related, it shouldn't escape chars on a literal string.

Removing the double quote makes it work with php 5.3 but not > 5.3:

https://travis-ci.org/maxgalbu/Elastica/builds/32806687

So, would you accept a patch that makes Elastica work with php 5.3? It would be a single if that escapes with a single or double \

@ruflin
Copy link
Owner

ruflin commented Aug 18, 2014

As I'm quite sure still a lot of users will keep using PHP 5.3 I will accept a patch as long as it doesn't break PHP 5.4 and following.

@kostiklv
Copy link
Contributor

I faced a similar issue and here is what I found:

  1. JSON class is using JSON_UNESCAPED_SLASHES which is supported ONLY in 5.4. It means that / are NOT being escaped in 5.4, but still escaped in 5.3
  2. Util::escapeTerm has a special treatment for / character - it's being escaped with two backslashes, instead of a single backslash (compared to other reserved characters).

This 2 things combined produce unexpected results, especially on different PHP versions:
If you have a query with term "foo/bar" which you escape with Util::escapeTerm:

  • On PHP 5.3 it will end up as foo\\\\\/bar (5 backslashes - 2 for each backslash added by escapeTerm plus 1 added by json_encode for forward slash
  • On PHP 5.4 it will end up as foo\\\\/bar (4 backslashes - 2 for each backslash added by escapeTerm and 0 for forward slash, thanks to JSON_UNESCAPED_SLASHES

Here you can see this problem in action: http://3v4l.org/YYd4F

I propose to fix it by:
A. Removing JSON_UNESCAPED_SLASHES - it was introduced in #560, based on suggestions from #559, however #559 didn't mention JSON_UNESCAPED_SLASHES. So, I fail to see why JSON_UNESCAPED_SLASHES is needed at all. Note, that official ES PHP Client doesn't use it.
B. Remove weird treatment of forward slash escaping in escapeTerm, ES documentation doesn't make a difference between forward slash and other reserved characters escaping.

If you apply the above fixes, it will work the same in both PHP 5.3 and above: http://3v4l.org/5gE31

P.S. We're still using Elastica 1.1, where the whole escaping is complicated even more by this weird maniuplation

@ruflin
Copy link
Owner

ruflin commented Aug 20, 2014

@kostiklv Thank your for the detailed analysis. I agree with all points you mentioned above. Can you open a pull request based on it?

@maxgalbu
Copy link
Contributor Author

Good analysis, well done :)

If you open a pull request, can you copy my test that i added in my fork:

maxgalbu@1f5c107

...to make sure the escape is tested against elasticsearch?

Or if you want i can add your changes to my fork and do a pull request myself.

@kostiklv
Copy link
Contributor

Ok, if you agree on the proposed changes, I'll try to create a PR later today or tomorrow. Thanks!

@ruflin
Copy link
Owner

ruflin commented Aug 20, 2014

Please coordinate with each other which is going to do which part. In any case, I'm looking forward to the pull request.

@kostiklv
Copy link
Contributor

I've submitted the PR - #668, took several retries to make Travis happy, but now it's green

@maxgalbu, I've pulled your commit with test into my PR. Thanks!

@Tobion
Copy link
Collaborator

Tobion commented Jan 14, 2015

@kostiklv good job

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants