Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Importing an array of hashes ignores columns not in the first hash #507

Closed
ahmad-sanad opened this issue Mar 9, 2018 · 7 comments
Closed
Labels

Comments

@ahmad-sanad
Copy link

Title says it all. Say we have table Foo with columns id, bar, baz, and want to import the following:

arr = [
  { bar: 'abc' },
  { baz: 'xyz' },
  { bar: '123', baz: '456' }
]

Foo.import arr

This creates the rows:

+----+------+------+
| id | bar  | baz  |
+----+------+------+
| 1  | abc  | null |
+----+------+------+
| 2  | null | null |
+----+------+------+
| 3  | 123  | null |
+----+------+------+

instead of the expected:

+----+------+------+
| id | bar  | baz  |
+----+------+------+
| 1  | abc  | null |
+----+------+------+
| 2  | null | xyz  |
+----+------+------+
| 3  | 123  | 456  |
+----+------+------+

It's possible to get around this by using ActiveRecord objects, for example:

arr.map! { |args| Foo.new(args) }
Foo.import arr

but this is still a rather pernicious bug.

@yinghaochan
Copy link

points for use of pernicious

@jkowens
Copy link
Collaborator

jkowens commented Mar 9, 2018

Are you using a version older than 0.21.0? I believe this issue was addressed in that release.

@ahmad-sanad
Copy link
Author

Thanks for the response, looks like I'm off by a version. Can this be documented somewhere? Since this use case isn't supported (it raises an exception), I feel like it's a worthy addition to the docs. Maybe here? I realize that isn't what the example is doing, but it doesn't explicitly say that it isn't possible either.

Also, looking through the discussion of how the decision of raising an exception came to be, I'm not sure I understand the concerns with setting the values to column defaults. That is, after all, the default behavior for ActiveRecord (or any insert query where you don't specify the value of a column for that matter) and I think a valid expectation for this gem is that it would do the same. It would nice to have that as an option at least.

I'm happy to create PRs for any of my suggestions if we are in agreement.

@jkowens
Copy link
Collaborator

jkowens commented Mar 10, 2018

Absolutely, this should definitely be documented at that wiki page. For now, an example of how to group group hashes by keys as suggested would be nice as well:

One suggestion that doesn't require a change: The caller could group their hashes by keys and then run multiple import calls, one for each set of keys/columns. This is slightly for work for the caller but puts the caller in the situation where it's clear what their code is intending to do and that the data is expected to not conform to a homogenous set of keys/columns.

See: #465 (comment)

I'll have to give more thought to your suggestion of using the column default. One concern I possibly have is that when using on_duplicate_key_update someone might unintentionally overwrite existing column values.

@jkowens jkowens added the docs label Mar 15, 2018
@jkowens jkowens mentioned this issue Oct 3, 2018
10 tasks
@dillonwelch
Copy link
Contributor

To make sure I'm understanding correctly - this is a bug that is solved in the latest version and you still want it documented in the README?

@ahmad-sanad
Copy link
Author

This bug was addressed in 0.21.0 by raising an exception instead of silently doing the counter-intuitive behavior I described. My proposal was to document the fact that this use case isn't supported, that it will raise an exception, and that the work-around I talked about should be used instead.

dillonwelch added a commit to dillonwelch/activerecord-import that referenced this issue Nov 2, 2018
@9mm
Copy link

9mm commented Feb 8, 2024

@jkowens is there any possibility of just allowing a default value? (nil)

Part of the entire reason of using this gem is so I DONT have to instantiate tons of unecessary objects, I'm bulk inserting 100's of millions of records. It would be cool to just have it default to nil if its missing so the expected value is null like in OPs issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants