Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[5.2] Allows developers to traverse their query result via a cursor #13030

Merged
merged 9 commits into from
May 24, 2016
Merged

[5.2] Allows developers to traverse their query result via a cursor #13030

merged 9 commits into from
May 24, 2016

Conversation

kharysharpe
Copy link
Contributor

Ran into an issue with attempting to use chunk wherein the underlining database would change and therefore the data would no longer be as initially requested. It is also generated a query for every chuck it returns.

Traditionally in php we would do something like below, but I found no Eloquent equivalent.

    while ($row = $stmt->fetch(PDO::FETCH_NUM)) {
      $data = $row[0] . "\t" . $row[1] . "\t" . $row[2] . "\n";
      print $data;
    }

  • Efficiently process thousands of records, uses memory efficiently.
  • Hits the database once with single query with local and global scopes.
  • Allows you to use a cursor to process the result set.

Usage:

foreach(Flight::where('foo', 'bar')->cursor() as $model) {

    //Do something complex with each $model

};

Update: Using generator instead of callback
Update: Using Eloquent model and changed name to cursor

Khary Sharpe added 3 commits April 5, 2016 19:39
Efficiently process thousands of records, uses least amount of memory,
hits the database once with single query, allows you to use a ‘cursor’
to process the result set via a callback/closure.
@GrahamCampbell GrahamCampbell changed the title Allows developers to traverse their query result via a cursor [5.2] Allows developers to traverse their query result via a cursor Apr 6, 2016
@ConnorVG
Copy link
Contributor

ConnorVG commented Apr 6, 2016

If this is to go ahead, I propose we name the function either scroll or caret as this is a cursor design (or the obvious cursor or pointer).

Fetch is too generic (and already in-use).

Khary Sharpe added 3 commits April 6, 2016 07:46
Set Fetch mode
Switched to using generators
@kharysharpe
Copy link
Contributor Author

Changed the name to traverse() as it more intuitively describes what is being done.

@GrahamCampbell
Copy link
Member

Please squash to one commit.

Also, it doesn't matter if StyleCI is failing. We have it set to auto-merge the fixes whenever we merge PRs. Please see our contribution guide.

It does matter that travis is failing though.

// developer take care of everything within the callback, which allows us to
// keep the memory low for spinning through large result sets for working.

$continue = (yield $row);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to assign here? Should this not just be:

if ($row === false) {
    return;
}

yield $row;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First thought was allowing for a response to be sent back, but for this use case you are correct it is not needed.

Efficiently process thousands of records, uses least amount of memory,
hits the database once with single query, allows you to use a ‘cursor’
to traverse the result set.
@barryvdh
Copy link
Contributor

barryvdh commented Apr 6, 2016

@GrahamCampbell You don't have to squash your commits in a PR, you can just squash them while merging. See https://github.com/blog/2141-squash-your-commits

@vlakoff
Copy link
Contributor

vlakoff commented Apr 6, 2016

Haven't thorougly reviewed the code, but if this allows for more efficient queries on large datasets, big +1 on the idea. There should be tests for such crucial and low-level operations.

@patsplat
Copy link

patsplat commented Apr 7, 2016

Don't like adding new api for something that seems like it should be written the right way.

Rather than adding new api to safely iterate through a large db result set, is it possible to change the base db collection classes to fetch one row at at time?

@patsplat
Copy link

patsplat commented Apr 7, 2016

if there is to be new api recommend the name "cursor()" as in:

foreach( Model::where('foo', 'bar')->cursor() as $model) {

}

Would align with ->get() and ->first() as "another method that transitions a query into results"

@kharysharpe
Copy link
Contributor Author

Now returns Eloquent model

@phroggyy
Copy link
Contributor

As I can see, the only drawback is that it's impossible to eager load results. That said, I think this is a feature that should be added, and a small note made about the limitations.

@tomschlick
Copy link
Contributor

👍 being able to query via a cursor would be very helpful for large data dumps/aggregation jobs

@jbrooksuk
Copy link
Member

Cursor is the right name for this, please don't change it.

@asantibanez
Copy link
Contributor

Could it be called "iterate"? Or something similar to " iterator"?

//Hydrate and yield an Eloquent Model
$model = $this->model->newFromBuilder($row);

yield $model;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally, the first php yield keyword in laravel! 😀👍

@tomschlick
Copy link
Contributor

As @jbrooksuk said, please keep it cursor.

@GrahamCampbell
Copy link
Member

Could it be called "iterate"? Or something similar to " iterator"?

No.

@dark-panda
Copy link

One addition I'd recommend is adding a call to PDOStatement::closeCursor() somewhere in there. According to the docs at http://php.net/manual/en/pdostatement.closecursor.php...

"To ensure that the application will work with all database drivers, the author inserts a call to PDOStatement::closeCursor() on $stmt before executing the $otherStmt PDOStatement object."

@JosephSilber
Copy link
Member

How would eager loading work with this?

@phroggyy
Copy link
Contributor

@JosephSilber it wouldn't as far as I can see, as you can't retrieve any
primary key early. I think it should still be implemented, but with a note
in the docs
On Thu, 14 Apr 2016 at 21:53, Joseph Silber notifications@github.com
wrote:

How would eager loading work with this?


You are receiving this because you commented.
Reply to this email directly or view it on GitHub
#13030 (comment)

@JosephSilber
Copy link
Member

JosephSilber commented Apr 14, 2016

It kinda breaks the way Eloquent works, so I think this should only exist on the base query builder.

If you really need Eloquent objects, you can hydrate them yourself.

foreach (User::where('foo', 'bar')->toBase()->cursor() as $row) {
    $user = (new User)->newFromBuilder($row);

    // Do something complex with each $user
}

This would require a matching method on the Eloquent builder that throws an error if used directly.


BTW, generate or traverse make way more sense than cursor.

@phroggyy
Copy link
Contributor

Agreed, it should be on the QB I guess
On Thu, 14 Apr 2016 at 22:03, Joseph Silber notifications@github.com
wrote:

It kinda breaks the way Eloquent works, so I think this should only exist
on the base query builder.

If you really need Eloquent objects, you can hydrate them yourself.


You are receiving this because you commented.
Reply to this email directly or view it on GitHub
#13030 (comment)

@BrandonShar
Copy link
Contributor

I like it as part of eloquent; it seems to me that it being part of eloquent takes care of the most common use. I feel like @JosephSilber 's example would be the majority of use cases.

@kharysharpe
Copy link
Contributor Author

kharysharpe commented May 4, 2016

Thanks for the feedback and comments.

What changes, if any, are needed for this to be accepted?

@taylorotwell
Copy link
Member

taylorotwell commented May 24, 2016

Just pulling this down and running DB::cursor('select * from users') throws an exception for me:

SQLSTATE[HY000]: General error: fetch mode requires the classname argument (SQL: select * from users)

Is this supposed to be working? If I can't even do that how do I know the rest of this was even tested manually?

@kharysharpe
Copy link
Contributor Author

Use case was for eloquent only.

foreach(Flight::where('foo', 'bar')->cursor() as $model)  ...

I will take a look at that and fix it.

@taylorotwell
Copy link
Member

taylorotwell commented May 24, 2016

It doesn't work like that either... Same error. Did you actually test that this was working on your machine?

@taylorotwell
Copy link
Member

I'm fixing it all. Don't worry about it.

@taylorotwell taylorotwell merged commit cca72f7 into laravel:5.2 May 24, 2016
@taylorotwell
Copy link
Member

Fixed all the problems with this. DB::cursor('sql') now returns a generator itself. The query and Eloquent versions use that. The Eloquent version returns another generator.

@GrahamCampbell
Copy link
Member

:D

@kharysharpe
Copy link
Contributor Author

kharysharpe commented May 24, 2016

Hmmm....that's weird thought I had renamed those already.
Thanks for fixing.
Looking forward to this being included in Laravel.

Edit:
I see what happened - parent is a2a1fb9 and not cca72f7

@kharysharpe
Copy link
Contributor Author

@dark-panda mentioned using PDOStatement::closeCursor() - #13030 (comment)

Should/Could this be included as well?

@taylorotwell
Copy link
Member

I’m not sure. Maybe.

On May 24, 2016, at 11:27 AM, kharysharpe notifications@github.com wrote:

@dark-panda https://github.com/dark-panda mentioned using PDOStatement::closeCursor() - #13030 (comment) #13030 (comment)
Should/Could this be included as well?


You are receiving this because you modified the open/close state.
Reply to this email directly or view it on GitHub #13030 (comment)

@kharysharpe
Copy link
Contributor Author

Thought about doing something like this:

            $continue = (yield $model);

            if ($continue === false) {
                 //PDOStatement::closeCursor();
                return;
            }

But this breaks HHVM tests, as it is not supported.

@vlakoff
Copy link
Contributor

vlakoff commented May 26, 2016

Is there documentation for this?

@taylorotwell
Copy link
Member

Doesn’t look like it has been added yet.

On May 26, 2016, at 1:47 PM, vlakoff notifications@github.com wrote:

Is there documentation for this?


You are receiving this because you modified the open/close state.
Reply to this email directly or view it on GitHub #13030 (comment)

@vlakoff
Copy link
Contributor

vlakoff commented May 26, 2016

pinging @kharysharpe, otherwise no doubt there will be little hands to take care of this.

@taylorotwell
Copy link
Member

I keep a list of all things needing documentation so I can do it for 5.3 release :)

On May 26, 2016, at 1:49 PM, vlakoff notifications@github.com wrote:

pinging @kharysharpe https://github.com/kharysharpe, otherwise no doubt there will be little hands to take care of this.


You are receiving this because you modified the open/close state.
Reply to this email directly or view it on GitHub #13030 (comment)

@kharysharpe
Copy link
Contributor Author

@vlakoff I will make an attempt at documentation tonight.

@vlakoff
Copy link
Contributor

vlakoff commented Jun 17, 2016

Just to bring it a bit more visibility, I've added a code comment concerning fetch mode.

@vlakoff
Copy link
Contributor

vlakoff commented Jun 21, 2016

Refs #14052, which fixes support for PDO::FETCH_CLASS mode in cursor().

@nerdo
Copy link

nerdo commented Jul 22, 2016

+1, but generate() makes more sense to me than cursor(), as it more clearly describes what is happening under the hood at the PHP level and maintains a level of abstraction with the concept of a true DB cursor, which this isn't AFAIK.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.