Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

return the ids from a bulk insert [postgres only] #104

Closed
wants to merge 11 commits into from

Conversation

robmathews
Copy link
Contributor

A common request is to return the ids from the newly created object. This patch does that. It's postgres only, but at least it doesn't break for non-postgres databases - it just doesn't set the id.

The api is unchanged. Instead, for every model that was saved, it sets the :id and marks the record clean, so that you can keep using it if you want.

Seems like one could pretty easily do a BFS on the changed associations and bulk-import the next level down in the tree in one sql statement as well .. at least for postgres.

Let me know if this is any use to you; works fine for us, and that's all I need.

@zdennis
Copy link
Owner

zdennis commented Jul 3, 2013

@robmathews, thanks for the PR. I didn't know PostgreSQL had such a cool feature! This has me thinking of how to get this to work with MySQL and SQLite.

@robmathews
Copy link
Contributor Author

SQLite is easy, as you can be certain that there are no parallelism
concerns - just append ";select
last_insert_rowid()"http://www.sqlite.org/lang_corefunc.html#last_insert_rowid-
and now you know that the first id is that value - number of rows
inserted.

For MySQL you can append ";SELECT LAST_INSERT_ID()" to the end of the
query. You'd have to assume that every id value increased monotonically,
which may or may be be true - I don't know much about MySQL. Actually, one
of the answers herehttp://stackoverflow.com/questions/7333524/how-can-i-insert-many-rows-into-a-mysql-table-and-get-ids-backindicates
that for some versions of MySQL, that is a valid assumption.

SET @countTotal = (SELECT COUNT(*) FROM nGrams);

For other versions, this would work:

enter a transaction. Now, you wont be able to see any rows from any other
inserts

  • BEGIN TRANSACTION -- this would probably be done automatically by
    ActiveRecord
  • SET @start_id = (select max_id(my_table))
  • INSERT INTO my_table(....) VALUES (val1, val2...), (val3, val4...),....
  • select id from my_table where id >= start_id

Easy enough, actually.

@robmathews
Copy link
Contributor Author

lol, ignore that random paste in the middle of the answer ;)

On Thu, Jul 4, 2013 at 11:17 AM, Robert Mathews <
rob@justsoftwareconsulting.com> wrote:

SQLite is easy, as you can be certain that there are no parallelism
concerns - just append ";select last_insert_rowid()"http://www.sqlite.org/lang_corefunc.html#last_insert_rowid\- and now you know that the first id is that value - number of rows
inserted.

For MySQL you can append ";SELECT LAST_INSERT_ID()" to the end of the
query. You'd have to assume that every id value increased monotonically,
which may or may be be true - I don't know much about MySQL. Actually, one
of the answers herehttp://stackoverflow.com/questions/7333524/how-can-i-insert-many-rows-into-a-mysql-table-and-get-ids-backindicates that for some versions of MySQL, that is a valid assumption.

SET @countTotal = (SELECT COUNT(*) FROM nGrams);

For other versions, this would work:

enter a transaction. Now, you wont be able to see any rows from any other
inserts

  • BEGIN TRANSACTION -- this would probably be done automatically by
    ActiveRecord
  • SET @start_id = (select max_id(my_table))
  • INSERT INTO my_table(....) VALUES (val1, val2...), (val3,
    val4...),....
  • select id from my_table where id >= start_id

Easy enough, actually.

@robmathews
Copy link
Contributor Author

poke

@chtrinh
Copy link

chtrinh commented Aug 14, 2013

👍 to get this merged in

- rename a few variables (obj => model, val => parent) for example
- set the foreign key on a sub-collection when the parent is imported
- tests for importing 3 levels of objects
- remove extra recursion in add_objects
- new model for testing 3 levels of objects: chapter
@flyerhzm
Copy link

flyerhzm commented Sep 6, 2013

+1

@robmathews
Copy link
Contributor Author

You can try using this fork in the meantime:
https://github.com/GoodMeasuresLLC/activerecord-import

It efficiently bulk loads multiple levels of objects, ie, not just the top
level, but any subobjects underneath.

On Fri, Sep 6, 2013 at 4:57 AM, Richard Huang notifications@github.comwrote:

+1


Reply to this email directly or view it on GitHubhttps://github.com//pull/104#issuecomment-23927311
.

@flyerhzm
Copy link

flyerhzm commented Sep 6, 2013

@robmathews already used your fork, works perfect, thanks for your contrib

@workmaster2n
Copy link

@robmathews @zdennis pointed me to this PR. Looks pretty sweet. Couple questions: Where does your code handle the multilevel import? Or is there an example I can mimic somewhere? And why did GoodMeasuresLLC@6364198#diff-8b155a870c20b622968f2d2480ea2379R63 change to select_values instead of insert?

@robmathews
Copy link
Contributor Author

of course it does multi level. really, the kids these days....

Typos courtesy of my iPhone

On Jan 29, 2014, at 7:38 PM, Tyler DeWitt notifications@github.com wrote:

@robmathews @zdennis pointed me to this PR. Looks pretty sweet. Couple questions: Where does your code handle the multilevel import? Or is there an example I can mimic somewhere? And why did GoodMeasuresLLC/activerecord-import@6364198#diff-8b155a870c20b622968f2d2480ea2379R63 change to select_values instead of insert?


Reply to this email directly or view it on GitHub.

@robmathews
Copy link
Contributor Author

goodmeasures and me are the same organization. that is, i wrote it and had john @ goodmeasures finish debugging it. so i would tend to trust the goodmeasures version more.

Typos courtesy of my iPhone

On Jan 29, 2014, at 8:21 PM, Rob Mathews robertvmathews@gmail.com wrote:

of course it does multi level. really, the kids these days....

Typos courtesy of my iPhone

On Jan 29, 2014, at 7:38 PM, Tyler DeWitt notifications@github.com wrote:

@robmathews @zdennis pointed me to this PR. Looks pretty sweet. Couple questions: Where does your code handle the multilevel import? Or is there an example I can mimic somewhere? And why did GoodMeasuresLLC/activerecord-import@6364198#diff-8b155a870c20b622968f2d2480ea2379R63 change to select_values instead of insert?


Reply to this email directly or view it on GitHub.

@johnnaegle
Copy link
Contributor

For an example, there is this test:

https://github.com/GoodMeasuresLLC/activerecord-import/blob/master/test/import_test.rb#L350

Thats a multi-level test Topic => Books => Chapters in a single import: Topic.import new_topics. As far as why that changed, I'm not entirely sure at this point.

@workmaster2n
Copy link

@robmathews Sorry for the confusion on the multilevel import. I was curious how you implemented that feature and missed the commit that added the nested capability. I see it now. Nifty tricks you do there.

@johnnaegle Thanks for the link. That's exactly what I needed.

To any future searchers: As far as select_values vs inserts, they come from ActiceRecord::ConnecitonAdapters::DatabaseStatments http://apidock.com/rails/v4.0.2/ActiveRecord/ConnectionAdapters/DatabaseStatements/

I'm not exactly sure how that module is included in the ActiveRecord::Import module, but select_values is a function that converts the values returned from a sql call to an array. insert just returns the last id of the affected table. I don't know how insert was returning all the ids, but it must have been working. The switch to select_values appears to make more sense (per the documented source).

tl;dr: select_values is a more appropriate function

@azisaka
Copy link

azisaka commented Feb 17, 2014

@robmathews I just ran into this error undefined method proxy' for ActiveRecord::Associations::HasManyAssociation:0x007f93a93480c0`

The relevant backtrace

/Users/azisaka/.rbenv/versions/2.1.0/lib/ruby/gems/2.1.0/bundler/gems/activerecord-import-88422a5f47ed/lib/activerecord-import/import.rb:197:in `each'
/Users/azisaka/.rbenv/versions/2.1.0/lib/ruby/gems/2.1.0/bundler/gems/activerecord-import-88422a5f47ed/lib/activerecord-import/import.rb:197:in `add_objects'
/Users/azisaka/.rbenv/versions/2.1.0/lib/ruby/gems/2.1.0/bundler/gems/activerecord-import-88422a5f47ed/lib/activerecord-import/import.rb:181:in `block in import'
/Users/azisaka/.rbenv/versions/2.1.0/lib/ruby/gems/2.1.0/bundler/gems/activerecord-import-88422a5f47ed/lib/activerecord-import/import.rb:181:in `each'
/Users/azisaka/.rbenv/versions/2.1.0/lib/ruby/gems/2.1.0/bundler/gems/activerecord-import-88422a5f47ed/lib/activerecord-import/import.rb:181:in `import'

What is that proxy thing? Can you help me here? 😄
I'm running on Rails 4 and using your master branch.

@robmathews
Copy link
Contributor Author

try the goodmeasures llc fork

On Mon, Feb 17, 2014 at 3:11 AM, Bruno Azisaka notifications@github.comwrote:

@robmathews https://github.com/robmathews I just ran into this error undefined
method `proxy' for
ActiveRecord::Associations::HasManyAssociation:0x007f93a93480c0

The relevant backtrace

/Users/azisaka/.rbenv/versions/2.1.0/lib/ruby/gems/2.1.0/bundler/gems/activerecord-import-88422a5f47ed/lib/activerecord-import/import.rb:197:in each' /Users/azisaka/.rbenv/versions/2.1.0/lib/ruby/gems/2.1.0/bundler/gems/activerecord-import-88422a5f47ed/lib/activerecord-import/import.rb:197:inadd_objects'
/Users/azisaka/.rbenv/versions/2.1.0/lib/ruby/gems/2.1.0/bundler/gems/activerecord-import-88422a5f47ed/lib/activerecord-import/import.rb:181:in block in import' /Users/azisaka/.rbenv/versions/2.1.0/lib/ruby/gems/2.1.0/bundler/gems/activerecord-import-88422a5f47ed/lib/activerecord-import/import.rb:181:ineach'
/Users/azisaka/.rbenv/versions/2.1.0/lib/ruby/gems/2.1.0/bundler/gems/activerecord-import-88422a5f47ed/lib/activerecord-import/import.rb:181:in `import'

What is that proxy thing? Can you help me here? [image: 😄]

Reply to this email directly or view it on GitHubhttps://github.com//pull/104#issuecomment-35235936
.

@azisaka
Copy link

azisaka commented Feb 17, 2014

@robmathews That's what I did. The fork is not working.
On Feb 17, 2014 10:13 AM, "rob mathews" notifications@github.com wrote:

try the goodmeasures llc fork

On Mon, Feb 17, 2014 at 3:11 AM, Bruno Azisaka <notifications@github.com

wrote:

@robmathews https://github.com/robmathews I just ran into this error
undefined
method `proxy' for
ActiveRecord::Associations::HasManyAssociation:0x007f93a93480c0

The relevant backtrace

/Users/azisaka/.rbenv/versions/2.1.0/lib/ruby/gems/2.1.0/bundler/gems/activerecord-import-88422a5f47ed/lib/activerecord-import/import.rb:197:in
`each'

/Users/azisaka/.rbenv/versions/2.1.0/lib/ruby/gems/2.1.0/bundler/gems/activerecord-import-88422a5f47ed/lib/activerecord-import/import.rb:197:in
`add_objects'

/Users/azisaka/.rbenv/versions/2.1.0/lib/ruby/gems/2.1.0/bundler/gems/activerecord-import-88422a5f47ed/lib/activerecord-import/import.rb:181:in
`block in import'

/Users/azisaka/.rbenv/versions/2.1.0/lib/ruby/gems/2.1.0/bundler/gems/activerecord-import-88422a5f47ed/lib/activerecord-import/import.rb:181:in
`each'

/Users/azisaka/.rbenv/versions/2.1.0/lib/ruby/gems/2.1.0/bundler/gems/activerecord-import-88422a5f47ed/lib/activerecord-import/import.rb:181:in
`import'

What is that proxy thing? Can you help me here? [image: 😄]

Reply to this email directly or view it on GitHub<
#104 (comment)

.


Reply to this email directly or view it on GitHubhttps://github.com//pull/104#issuecomment-35255470
.

@zdennis
Copy link
Owner

zdennis commented Feb 17, 2014

If someone can get this PR merge-able and add a test for postgres I'll merge this in.

@robmathews
Copy link
Contributor Author

Ok Bruno. I'm on vacation this week, so I can't take a look at it until
next week. No, the error isn't familar. Maybe you can post a sample project
or failing test case in the gem?

On Mon, Feb 17, 2014 at 9:59 AM, Zach Dennis notifications@github.comwrote:

If someone can get this PR merge-able and add a test for postgres I'll
merge this in.

Reply to this email directly or view it on GitHubhttps://github.com//pull/104#issuecomment-35288766
.

@bogdanRada
Copy link

@robmathews I tried using your fork https://github.com/GoodMeasuresLLC/activerecord-import but
there is in file lib/activerecord-import/import.rb line 184
the code : clazz=Module.const_get(class_name) and this is unused and for me is causing some issues. Can you please remove that line? Thanks

@robmathews
Copy link
Contributor Author

done - thanks

On Wed, Apr 9, 2014 at 7:39 AM, bogdan notifications@github.com wrote:

@robmathews https://github.com/robmathews I tried using your fork
https://github.com/GoodMeasuresLLC/activerecord-import but
there is in file lib/activerecord-import/import.rb line 184
the code : clazz=Module.const_get(class_name) and this is unused and for
me is causing some issues. Can you please remove that line? Thanks


Reply to this email directly or view it on GitHubhttps://github.com//pull/104#issuecomment-39954092
.


models = args.first # the import argument parsing is too tangled for me ... I only want to prove the concept of saving recursively
result = import_helper(models, options)
# now, for all the dirty associations, collect them into a new set of models, then recurse.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RE #146, This is where autosave associations also get imported.

@bogdanRada
Copy link

@zdennis I have just one question.
If i have a model class that has a attribute but this attribute is not a column in the model's table, is it possible to validate that class ? because i tried to run the import with option {:validate=> true} but that attribute is not being validated at all, in fact it loses his value, and becomes nil, so the validation fails every time.

Problem i think is that you're using the column_names of that model to collect the his attributes, but this doesn't contain also the virtual ones. I am not sure if this could be fixed, but will really be nice if we could validate also this attributes.
But i guess this would be solved with this PR #118.

@alainmeier
Copy link

This is an excellent PR, will this be merged?

@zdennis
Copy link
Owner

zdennis commented Jan 29, 2015

@alainmeier, it will become much more likely if it's become merge-able and stays green.

@johnnaegle
Copy link
Contributor

I've rebased the commits onto the latest master and got the tests working under mysql, postgresql and sqlite3. The branch is here: https://github.com/GoodMeasuresLLC/activerecord-import/tree/set_ids_in_bulk_import. I will open a PR from that branch when I think it is ready, but some early feedback would be helpful.

@johnnaegle
Copy link
Contributor

See #178

@koheisg
Copy link

koheisg commented Jan 22, 2016

Why don't you close? it seems to be resolved for me.

@zdennis
Copy link
Owner

zdennis commented Jan 22, 2016

Thanks for following up @koheisg, closing now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.