Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dev/core#663 - Use InnoDB engine for extended log tables #14256

Merged
merged 2 commits into from
Jun 28, 2019

Conversation

pfigel
Copy link
Contributor

@pfigel pfigel commented May 15, 2019

Overview

CiviCRM's optional detailed logging feature creates a log table to mirror every table in the civicrm_% namespace. To preserve disk space, log tables use the ARCHIVE engine by default. ARCHIVE has a number of limitations, most importantly:

  • No index support
  • No support for UPDATE or DELETE
  • No crash-safety or transaction support

This has lead to a number of performance, reliability and durability issues in the past. It is possible to use different storage engines by installing extensions, but CiviCRM should offer sane and safe defaults.

Before

Detailed log tables use the ARCHIVE engine if available.

After

Detailed log tables on new installations use the InnoDB engine by default. Log tables created at some point after this version was installed also use the InnoDB engine.

Existing log tables will continue to use the ARCHIVE table.

Site administrators may convert existing log tables using the System.updatelogtables API with forceEngineMigration set.

Technical Details

Some tests were added to cover the new behaviour. Migration code will not trigger if an engine was set by the hook, so ARCHIVE could still be used that way.

Comments

An update for the sysadmin guide was prepared in civicrm/civicrm-sysadmin-guide#167, including notes on migration and a way to restore the old default (should anyone want to do that). It assumes we're targeting 5.16 for this.

@civibot
Copy link

civibot bot commented May 15, 2019

(Standard links)

@civibot civibot bot added the master label May 15, 2019
Copy link
Member

@xurizaemon xurizaemon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see in other parts of this PR that similar issues with potentially confusing naming exist (eg createLogTableFor() would be createDataLoggingTableFor() in my perfectly coloured bikeshed), so please consider my comments only in the scope of new additions I guess?

@@ -302,7 +302,7 @@ public function fixSchemaDifferences($enableLogging = FALSE) {
* and also implements the engine change defined by the hook (i.e. INNODB).
*
* Note changing engine & adding hook-defined indexes, but not changing back
* to ARCHIVE if engine has not been deliberately set (by hook) and not dropping
* to INNODB if engine has not been deliberately set (by hook) and not dropping
* indexes. Sysadmin will need to manually intervene to revert to defaults.
*/
public function updateLogTableSchema() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I suggest that here and in the naming of the API call System.updatelogtables, we are explicit about what this actually does? Eg convertDataLoggingTablesFromArchiveToInnoDB() is explicit, where "updateLogTables" could mean many things.

  • In future there will be other updates, and a function named like this will occupy a namingspace that also fits other functions.
  • Who wouldn't run a function called "update"? Sounds reasonable, but as @Stoob points out, this process has an impact on a functioning site.
  • CiviCRM has a pattern of recycling terminology in multiple spaces (I count six different meanings for "log"), I think it's helpful for both DX and UX if we refer to things using clear terminology.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment above was written thinking you'd added the function in this PR - on second view this need not block your proposed changes (but please consider it where adding new methods etc).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the current naming is confusing - I've spent time in this code in the past, but I still get System.updatelogtables mixed up with System.createmissinglogtables all the time.

What makes it a bit tricky is that, in addition to changing the engine schema, the function/API will potentially also do the following things:

  • Change the column definition of log_conn_id
  • Create any missing indexes (which may have been added via hook_civicrm_alterLogTables)

I'm okay with renaming the existing API if we can come up with a less confusing name for the API that encapsulates all of that.

@pfigel pfigel force-pushed the innodb-log-tables branch 3 times, most recently from 6bdff20 to adf428a Compare May 16, 2019 14:30
@eileenmcnaughton
Copy link
Contributor

@pfigel I feel like we would normally chunk things like this over a couple of releases with one release just changing the default behaviour for new installs / new enable loggings & the next chunk addressing existing sites

@pfigel pfigel force-pushed the innodb-log-tables branch from adf428a to 7b56be8 Compare June 8, 2019 09:51
pfigel added a commit to greenpeace-cee/civicrm-sysadmin-guide that referenced this pull request Jun 8, 2019
This updates the detailed logging documentation to reflect the changes
being made in core in civicrm/civicrm-core#14256.

Additionally, it adds some notes on log table migration and adds a
reference to a new extension allowing for more flexible configuration
of log tables.
@pfigel pfigel force-pushed the innodb-log-tables branch from 7b56be8 to 9eb81a0 Compare June 8, 2019 10:19
This changes the default storage engine used for log tables to
InnoDB in order to provide safer defaults. Existing log tables are
not upgraded unless explicitly requested using the new forceEngineMigration
System.updatelogtables API parameter. Hook overrides are respected.
@pfigel pfigel force-pushed the innodb-log-tables branch from 9eb81a0 to 204aa6f Compare June 8, 2019 12:04
@pfigel pfigel marked this pull request as ready for review June 8, 2019 12:04
@pfigel pfigel changed the title [WIP] dev/core#663 - Use InnoDB engine for extended log tables dev/core#663 - Use InnoDB engine for extended log tables Jun 8, 2019
@pfigel
Copy link
Contributor Author

pfigel commented Jun 8, 2019

@eileenmcnaughton I've extracted the system check/upgrade prompt for a future PR and changed System.updatelogtables to only trigger migration if a new forceEngineMigration parameter is set.

This is now ready for review. I've opened a corresponding doc update in civicrm/civicrm-sysadmin-guide#167.

@eileenmcnaughton
Copy link
Contributor

@Stoob ping

@eileenmcnaughton
Copy link
Contributor

This is good from a code point of view - I made minor comments - @Stoob is going to try to r-run it & if that is good we can merge

This adds API defaults for the updateChangedEngineConfig and
forceEngineMigration parameters of System.updatelogtables and removes
some fallback code that's unnecessary now.
@pfigel
Copy link
Contributor Author

pfigel commented Jun 14, 2019

@eileenmcnaughton @Stoob I added API defaults for forceEngineMigration (and updateChangedEngineConfig, to keep it consistent).

@mattwire
Copy link
Contributor

@pfigel If we're already using https://github.com/eileenmcnaughton/nz.co.fuzion.innodbtriggers and have already converted our log tables to innodb then applying this patch and disabling that extension should be fine?

@pfigel
Copy link
Contributor Author

pfigel commented Jun 28, 2019

@mattwire generally yes, although this change does not enable InnoDB compression (ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4), which nz.co.fuzion.innodbtriggers does, so it's not a drop-in replacement.

The behaviour would be like this:

  1. For new installations (or installations that have never had logging enabled): InnoDB is used
  2. For existing installations without nz.co.fuzion.innodbtriggers or similar: ARCHIVE is used for existing log tables, InnoDB for any new ones. Migration to InnoDB may be forced via forceEngineMigration.
  3. For existing installations with nz.co.fuzion.innodbtriggers or similar: The engine/config set by the extension wins.
  4. For existing installations that have previously used nz.co.fuzion.innodbtriggers or similar: Existing log tables will remain as-is, new ones will not use the engine config ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4 (i.e. will not be compressed).

New tables are generally any tables in the civicrm_% namespace that are added by core or an extensions, or a new custom field set created at some point after this change.

@seamuslee001
Copy link
Contributor

I have tested this and confirmed that without this change log tables are created as Archive, with this change they are created with INNODB, i have also confirmed that this change by default doesn't change the log tables themselves without the forceEngineMigration parameter and confirmed when that is set the tables are correctly converted to INNODB. This looks good to me I'm going to merge based on my Testing and EIleen's comment

@seamuslee001 seamuslee001 merged commit 64338ae into civicrm:master Jun 28, 2019
@eileenmcnaughton
Copy link
Contributor

Yay - Archive is on the way out!

@Stoob
Copy link
Contributor

Stoob commented Jun 29, 2019

My test site successfully implemented nz.co.fuzion.innodbtriggers on about 150k records. I'm eager to give this patch a try out too. Thanks for this! I can rebuild my dev site if needed @eileenmcnaughton

@eileenmcnaughton
Copy link
Contributor

@Stoob so I think this uses INNODB by default - but I don't think it adds any indexes by default - which you'd likely want for 150k records - so nz.co.fuzion.innodbtriggers may still make sense for you

@Stoob
Copy link
Contributor

Stoob commented Jun 29, 2019

Ok, thanks for the clarification @eileenmcnaughton. So the patch makes the table engine conversion, but does not add indexes. The extension adds indexes. For large installs, the extension adds value.

@eileenmcnaughton
Copy link
Contributor

@Stoob yes that is my understanding

pfigel added a commit to greenpeace-cee/civicrm-sysadmin-guide that referenced this pull request Aug 10, 2019
This updates the detailed logging documentation to reflect the changes
made in civicrm/civicrm-core#14256.

Additionally, it adds some notes on log table migration and adds a
reference to a new extension allowing for more flexible configuration
of log tables.
pfigel added a commit to greenpeace-cee/civicrm-sysadmin-guide that referenced this pull request Aug 11, 2019
This updates the detailed logging documentation to reflect the changes
made in civicrm/civicrm-core#14256.

Additionally, it adds some notes on log table migration and adds a
reference to a new extension allowing for more flexible configuration
of log tables.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants