Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

get result as document #960

Closed
yehosef opened this issue Oct 18, 2015 · 11 comments
Closed

get result as document #960

yehosef opened this issue Oct 18, 2015 · 11 comments

Comments

@yehosef
Copy link

yehosef commented Oct 18, 2015

I would like to query for some documents, make changes to them, and then save them again. Normally when getting a single doc by primary key, I get an elastica\document and make the changes via that.

I would also like to use this approach when retrieving a Elastica\ResultSet or Result. I would like to either retrieve this as a Document or convert it to one (eg, including the version info). Is this possible? And if not, could there be a way to do it?

@ruflin
Copy link
Owner

ruflin commented Oct 19, 2015

Currently this is not directly supported by Elastica. It was requested several times in the past but nobody implemented it so far. It shouldn't be too hard to implement it on top of ResultSet or Result. It would required to create document objects based on the retrieved values.

@yehosef
Copy link
Author

yehosef commented Oct 19, 2015

ok - .. you can leave open for a little and I'll see if I can do it. I
have a working implementation. The issue is that probably it needs to
ensure that the fields are not specified or else it will give partial
documents and saving them would not give the desired results.

On Mon, Oct 19, 2015 at 9:48 AM, Nicolas Ruflin notifications@github.com
wrote:

Currently this is not directly supported by Elastica. It was requested
several times in the past but nobody implemented it so far. It shouldn't be
too hard to implement it on top of ResultSet or Result. It would required
to create document objects based on the retrieved values.


Reply to this email directly or view it on GitHub
#960 (comment).

@ruflin
Copy link
Owner

ruflin commented Oct 19, 2015

👍 Let me know if you need some help.

@yehosef
Copy link
Author

yehosef commented Oct 19, 2015

Thanks! I'm building an ORM like structure using Elastica - this is the part in my constructor where I handle Elastic\Result

 elseif ($data instanceof Elastica\Result)
    {
        /** @var Elastica\Result $data */

        $doc = new Elastica\Document($data->getId(), $data->getData());

        $version = $data->getVersion();
        $version && $doc->setVersion($version);

        $this->doc = $doc;
        $this->doc->setDocAsUpsert(true);
    }

a few things I'm not sure about:

  1. do I need the version check, or will a Result always have a version?
  2. I'm not sure if I need the DocAsUpsert (I not even really sure what this does.. :)
  3. is there anything else I need to be passing into the new Document from the Result?

Like I said, I assume there should be some $data->getFields() check or something - if I specified the fields, I assume it would fail because I might not have all the data.

Any other ideas based on your experience with ES?

ps - thank you so much for Elastica - it's really amazing!

@ruflin
Copy link
Owner

ruflin commented Oct 20, 2015

  • As far as I know, version is only returned if version is set to true in the request. So you need to build in a check
  • DocAsUpsert means the values in the document are used to "partially" update a document. Assuming you get "all" data as a response, you don't need this. There are some additional cases with script, but I think this would complicate the code too much (keep it simple and cover 80% of the use cases).
  • You should also add index and type to the document

The line $version && $doc->setVersion($version); is quite hard to read. I would split it up to an if clause.

Good point about the fields. I think the creation of the documents should be triggered by a separate function as not all cases will support this. In this case if no fields exist an exception can be thrown. Which cases it work should be part of the function doc header. Again, lets make baby step. Have a function that works in the standard cases but throw an exception if not all data we need exists.

I think important will be to add some integration tests to see if it works as expected.

@yehosef
Copy link
Author

yehosef commented Oct 20, 2015

Thanks for the comments.

I don't set a version in the doc, will it do a check? If not, if the version isn't set in the request then the doc will not have it and it'll save fine anyway.
About the inline if - I find the style easier to read, but I know others don't so I'll change.

You say

In this case if no fields exist an exception can be thrown
This should be if the $data->getFields() is set, no? If I specified the fields in the request, then I can't know without comparing the fields (which is messier and I don't think an important starting point).

About adding the index/type. When I normally get back a document from a primary key lookup, do I also have the index and type? I'm not familiar enough with the interplay between documents and index/types.

What do you mean that the creation of the doc should be in a separate function? I'm assuming there will be a specific factory function in Elastica (maybe Elastica\Document). I don't think in the main class I would put in the constructor - I just did it my class because it's an ORM. Perhaps something like Elastica\Document::createFromResult. There could also be a createFromResultSet as a helper to do the looping.

What do you think about these directions?

@ruflin
Copy link
Owner

ruflin commented Oct 20, 2015

  • About inline and not: You are right, this is very subject. Main reason is that so far it is never used (as far as I know) in the library.
  • For every _hit in results you have index and type. See also here: https://www.elastic.co/guide/en/elasticsearch/reference/1.7/_the_search_api.html
  • I was thinking of ResultSet->getDocuments() and/or Result->getDocument(). The outcome would be the same, but the function would be more where it's needed (I assume). So ->getDocuments() would iterate over results and use the getDocument() function.

@ruflin
Copy link
Owner

ruflin commented Dec 31, 2015

@yehosef Any updates here?

@yehosef
Copy link
Author

yehosef commented Dec 31, 2015

sorry - I haven't worked on this any more. I think the approach with the ResultSet->getDocuments() and/or Result->getDocument() is great.

@ruflin
Copy link
Owner

ruflin commented Jan 4, 2016

@yehosef Do you have any plans to continue work on it?

@intech
Copy link
Contributor

intech commented Jan 13, 2016

A good idea. I'm thinking about the same and found this issue. Are needed methods ResultSet->getDocuments() and Result->getDocument()

@intech intech mentioned this issue Jan 13, 2016
ruflin added a commit that referenced this issue Jan 17, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants