-
-
Notifications
You must be signed in to change notification settings - Fork 828
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use "text" type mysql field when exporting any id-type field #12598
Conversation
…onverted into text of arbitrary length
(Standard links)
|
@adixon for now the test fails relate - https://test.civicrm.org/job/CiviCRM-Core-PR/21897/testReport/junit/(root)/CRM_Export_BAO_ExportTest__testGetSQLColumns/testGetSQLColumns_with_data_set__0/ but I agree that performance testing is the key here - we should make sure we get some exports done of ~60000+ rows & comparitive times on it |
So there's one special case - the primary civicrm id really is exported as an integer number, and it's used as an index on the table, so we need to give it a varchar type. I just updated my PR. |
Once I've got the rest of the cleanup on export merged to an rc I'll test this + export in general on some larger exports & get some stats |
As as aside @adixon the more I dig into the export the less sure I am that the temp table is used in a way that it improves performance. We actually iterate through every row & write it to the temp table & then pull it out & write to csv. I think the way the merge household & merge address was written is reliant on that process but at least in the case of merge household I'm sure that it could be done at least as efficiently without the temp table |
Is there anything I can do to help get this pull requested merged? |
$lookUp = ['prefix_id', 'suffix_id']; | ||
// special case: civicirm_primary_id exports the contact id and is an index field | ||
if ($fieldName == 'civicrm_primary_id') { | ||
return "$fieldName varchar(16)"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would we use varchar
to store an integer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this change will need to be tested on servers with significant data sets for performance - we have a good theory as to what will work well but it needs some concerted testing. It's also possible we can & should eliminate the temp table per my comment above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@colemanw : in an export, in general, a civi field stored as an integer might actually end up getting a long string value (e.g. the gender field, which is what triggerred this PR).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok but in the case of civicrm_primary_id
we know the temp table is storing an integer, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, so you're asking why varchar(16) for the civicrm_primary_id instead of an integer. The short answer is, I'm not sure, but we're generating the export table and perhaps there's a reason they should all be text fields so we can process them into a csv file more easily? The existing code is converting everything to text fields as well, so that issue is kind of outside the scope of the PR (though it's a reasonable question).
I'm closing this out because it's stale - I actually think the answer, after a lot of digging in that code, is not to write the translated fields to a temp table. We only actually need the temp table when merging same addresses |
Are you proposing an alternative fix? This is a bug that a client experienced and the patch does resolve it (and also cleans up a few other past hacks), so if you're suggesting an alternative way of resolving it then a few more details might be reasonable to request? |
The code has changed quite a bit so the patch doesn't apply anymore - but basically I think we need to review what we are writing to the temp table and why - there really isn't much reason for the temp table in most cases as we iterate through the entire php array anyway it doesn't help with performance and we could just write straight to the csv and skip the temp table. The one blocker to skipping the temp table is checking how we'd handle merge to address |
Use text type field when exporting any id-type field that might get converted into text of arbitrary length
Overview
Exports of fields that contain "id" type values may contain text of arbitrary length (e.g. values from option groups).
Before
Add a gender type "I am not sure and neither are you" (i.e. anything more than 16 characters). You will be unable to export contacts if you include gender in your export fields: you'll get the unhelpful "DB Error Unknown error" fatal error.
After
Export now works.
Technical Details
Comments
Testing the effects of the change in a large export would be worthwhile, i.e the performance and disk-usage implications.