Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(dev/core#174) CRM_Utils_Cache - Always use a prefix. Standardize delimiter #12331

Merged
merged 4 commits into from
Jun 21, 2018

Conversation

totten
Copy link
Member

@totten totten commented Jun 18, 2018

"Prefixes" are a way to have one cache-server (e.g. one instance of redis or memcached)
which stores several different data-sets. CRM_Utils_Cache uses prefixes in a couple ways:

  • (1) General site prefix (controlled via civicrm.settings.php)
    • (1a) If you have a single-site deployment, then the general prefix is blank.
    • (1b) If you have a multi-site deployment, then each site should use a different prefix (mysite_1, mysite_2, etc).
  • (2) Within a given deployment, prefixes may indicate different logical data-sets.
    • (2a) Civi::cache() or Civi::cache('default') or CRM_Utils_Cache::singleton() are the default data-set.
    • (2b) CRM_Utils_Cache::create() can instantiate new, special-purpose
      data-sets. For example, this is used for Civi::cache('js_strings').

This patch addresses two issues:

  • (Functional bug) Flushing the 'default' cache would flush all other caches because the 'default' cache didn't have a distinctive prefix. (Note: Bug may not manifest on all cache drivers.)
  • (Aesthetic) The full cache paths don't look consistent because they don't have a standard dlimiter.

To fully understand, it helps to see example cache keys produced in a few
configurations before and after the patch.

See also: https://lab.civicrm.org/dev/core/issues/174

Before

Deployment Type Logical Cache Combined Cache Prefix Example Cache Item (foobar)
1a,2a Single-site default (empty-string) foobar
1a,2b Single-site js_strings _js_strings _js_stringsfoobar
1b,2a Multi-site default mysite_1_ mysite_1_foobar
1b,2b Multi-site js_strings mysite_1_js_strings mysite_1_js_stringsfoobar
  • If you have a single-site deployment and try to flush default, you inadvertently flush js_strings because everything matches the empty-string prefix.
  • If you have a multi-site deployment and try to flush default, you inadvertently flush js_strings because the prefix overlaps.
  • The three parts of the key (deployment ID, logical cache, and cache item) are not necessarily separated.

After

Deployment Type Logical Cache Combined Cache Prefix Example Cache Item (foobar)
1a,2a Single-site default /default/ /default/foobar
1a,2b Single-site js_strings /js_strings/ /js_strings/foobar
1b,2a Multi-site default mysite_1/default/ mysite_1/default/foobar
1b,2b Multi-site js_strings mysite_1/js_strings/ mysite_1/js_strings/foobar
  • If you have a single-site deployment and try to flush default, you only flush default because the prefixes are distinct.
  • If you have a multi-site deployment and try to flush default, you only flush default because the prefixes are distinct.
  • The three parts of the key (deployment ID, logical cache, and cache item) are always separated by /.

Comments

When developing this patch, I found it helpful to:

  • Enable Redis driver
  • Open redis-cli and view the list of cache items with keys *.

@civibot
Copy link

civibot bot commented Jun 18, 2018

(Standard links)

@eileenmcnaughton eileenmcnaughton added sig:code maintenance readability testability sig:extension support Supports being able to integrate extensions with CiviCRM labels Jun 18, 2018
@totten
Copy link
Member Author

totten commented Jun 18, 2018

Trying to think about how one would review this... and I think I may have found a problem. (Though it took a bit to spot.)

Generally, the question is whether there's any case in which you flush the default caches and intend to also flush the OOP-style/non-default caches. We can make a list of all the caches in the system, e.g.

https://lab.civicrm.org/dev/core/issues/174#note_5812

and, analytically, we can say default/singleton style cases are pretty distinct from the other cases. For comparison/contrast... suppose you add a CustomField (html_type=Select). This creates an OptionGroup. When you flush, you probably intend to flush the caches for both CustomField and OptionGroup at the same time... and that'll work as expected because both are in default/singleton cache. However, it has no bearing on the list of JS Strings or Cxn HTTP cache records. There's no cross-over between those data-sets. Reviewing the lists, it's hard to imagine any cross-over scenarios...

Except for one -- the general use-case of a system-flush (aka System.flush aka /civicrm/clearmenu aka cv flush). For a system-flush, we do want to clear every cache. And, in fairness, there usually isn't a big harm in clearing all the caches at the same time. The issue which brought me to this change was memory-backed storage for sessions. In that case, we want to use the same storage-driver, but the data's lifecycle should be different.

Perhaps a good way to visualize is this:

screen shot 2018-06-18 at 10 40 59 am

  • A system-flush should clear all the yellow-gold bubbles (caches).
  • If someone cleared a gold bubble (default), you wouldn't expect it to clear another gold bubble (JS Strings). There might be a small performance penalty there, but... then again... there's no real harm because they are legitimately caches which will be rebuilt on-demand.
  • Flushing the yellow bubble (caches) and green bubble (sessions/form-state) are two different propositions:
    • Sometimes, clearing them both is crazy-talk. For example, if an admin is designing custom-data for an upcoming survey, it'll flush the default cache several times (whenever they add/edit a CustomGroup/CustomField/OptionGroup/OptionValue). But you wouldn't want to break the active flow in public-facing contribution pages everytime the admin twiddles the upcoming survey.
    • Sometimes, clearing them both is reasonable -- for example, when upgrading to a new version of the software, form-state retained in "Sessions" may become rather stale/buggy. So you do want to clear both.

Anyway, I'm gonna think a bit more today about this...

@totten
Copy link
Member Author

totten commented Jun 18, 2018

So I think there are a couple ways to get the desired impact

  • Surgical update: Update this PR. Find any places that currently flush default/singleton. Update them to also flush the OOP caches. This yields the same behavior as today, and (in the future) it allows session-management to be in its own space. (Note: Some of that flushing is and would be redundant, but it can be mechanically safe and equivalent to current behavior.)
  • Split caches and sessions to different code: Close this PR and leave the status quo. When it comes to moving sessions, do it in a different way that bypasses CRM_Utils_Cache. That might mean we wind up with separate drivers like CRM/Utils/Cache/Redis.php and CRM/Core/Session/Redis.php.
  • Wrangle all caches into an inspectable/tweakable framework; then reassess: Right now, when you flush anything, it's hard to know what will happen... because the cache logic is a yummy plate of pasta. This doesn't exactly solve, but the aim would be to make the system more transparent so that a solution feels easier.

@totten
Copy link
Member Author

totten commented Jun 18, 2018

Pushed up a surgical fix.

Note: if testing on Redis, there's a concurrent fix in #12330 which is also needed.

@eileenmcnaughton
Copy link
Contributor

fail likely related "/home/jenkins/buildkit/app/config/drupal-clean/install.sh: line 40: 22106 Segmentation fault drush -y en civicrm toolbar locale garland"

@eileenmcnaughton
Copy link
Contributor

test this please

@totten
Copy link
Member Author

totten commented Jun 18, 2018

Yup, that's why we have CI. :) In my Redis-based env, it worked -- but in the default SQL-backed config, it gets stuck in infinite recursion of the form


PHP  36. CRM_Core_BAO_Cache::deleteGroup() /Users/totten/bknix/build/dmaster/sites/all/modules/civicrm/CRM/Utils/Cache/SqlGroup.php:129
PHP  37. CRM_Utils_System::flushCache() /Users/totten/bknix/build/dmaster/sites/all/modules/civicrm/CRM/Core/BAO/Cache.php:211
PHP  38. CRM_Utils_Cache_SqlGroup->flush() /Users/totten/bknix/build/dmaster/sites/all/modules/civicrm/CRM/Utils/System.php:1422
PHP  39. CRM_Core_BAO_Cache::deleteGroup() /Users/totten/bknix/build/dmaster/sites/all/modules/civicrm/CRM/Utils/Cache/SqlGroup.php:129
PHP  40. CRM_Utils_System::flushCache() /Users/totten/bknix/build/dmaster/sites/all/modules/civicrm/CRM/Core/BAO/Cache.php:211
PHP  41. CRM_Utils_Cache_SqlGroup->flush() /Users/totten/bknix/build/dmaster/sites/all/modules/civicrm/CRM/Utils/System.php:1422
PHP  42. CRM_Core_BAO_Cache::deleteGroup() /Users/totten/bknix/build/dmaster/sites/all/modules/civicrm/CRM/Utils/Cache/SqlGroup.php:129

Investigating...

@totten
Copy link
Member Author

totten commented Jun 18, 2018

OK, I think that is fixed by another patch that was in my queue. Was going to file a separate PR, but guess it's easier to tack on here.

…imiter.

"Prefixes" are a way to have one cache-server (e.g. one instance of redis or memcached)
which stores several different data-sets. `CRM_Utils_Cache` uses prefixes in a couple ways:

* (1) General site prefix (controlled via `civicrm.settings.php`)
    * (1a) If you have a single-site deployment, then the general prefix is blank.
    * (1b) If you have a multi-site deployment, then each site should use a different prefix (`mysite_1`, `mysite_2`, etc).
* (2) Within a given deployment, prefixes may indicate different logical data-sets.
    * (2a) `Civi::cache()` or `Civi::cache('default')` or `CRM_Utils_Cache::singleton()` are the `default` data-set.
    * (2b) `CRM_Utils_Cache::create()` can instantiate new, special-purpose
      data-sets. For example, this is used for `Civi::cache('js_strings')`.

This patch addresses two issues:

 * (Functional) Flushing the 'default' cache would likely flush all other caches
   because the 'default' cache didn't have a distinctive prefix. (This was observed Redis. Theoretically,
   the bug would apply to some-but-not-all cache backends.)
 * (Aesthetic) The full cache paths don't look consistent because they don't have a standard dlimiter.

To fully understand, it helps to see example cache keys produced in a few
configurations before and after the patch.

See also: https://lab.civicrm.org/dev/core/issues/174

Before
-----------------------------

| |Deployment Type|Logical Cache  |Combined Cache Prefix |Example Cache Item (`foobar`)|
|-|-|---------------|-------|-----------------|
|1a,2a|Single-site|`default`    |(empty-string)|`foobar`|
|1a,2b|Single-site| `js_strings` |`_js_strings`|`_js_stringsfoobar`|
|1b,2a|Multi-site |`default`    |`mysite_1_`|`mysite_1_foobar`|
|1b,2b|Multi-site |`js_strings` |`mysite_1_js_strings`|`mysite_1_js_stringsfoobar`|

* If you have a single-site deployment and try to flush `default`, you
  inadvertently flush `js_strings` because everything matches the empty-string prefix.
* If you have a multi-site deployment and try to flush `default`, you
  inadvertently flush `js_strings` because the prefix overlaps.
* The three parts of the key (deployment ID, logical cache, and cache item) are not necessarily separated.

After
-----------------------------

| |Deployment Type|Logical Cache  |Combined Cache Prefix |Example Cache Item (`foobar`)|
|-|-|---------------|-------|-----------------|
|1a,2a|Single-site|`default`    |`/default/`|`/default/foobar`|
|1a,2b|Single-site|`js_strings` |`/js_strings/`|`/js_strings/foobar`|
|1b,2a|Multi-site |`default`    |`mysite_1/default/`|`mysite_1/default/foobar`|
|1b,2b|Multi-site |`js_strings` |`mysite_1/js_strings/`|`mysite_1/js_strings/foobar`|

* If you have a single-site deployment and try to flush `default`, you
  only flush `default` because the prefixes are distinct.
* If you have a multi-site deployment and try to flush `default`, you
  only flush `default` because the prefixes are distinct.
* The three parts of the key (deployment ID, logical cache, and cache item) are always separated by `/`.

Comments
--------

When developing this patch, I found it helpful to:

* Enable Redis driver
* Open `redis-cli` and view the list of cache items with `keys *`.
…ring fixing prefixes

The preceding update to `CRM_Utils_Cache` meant that `CRM_Utils_Cache::singleton()->flush()`
(aka `Civi::cache()->flush()`) would flush only the *default* cache. This revision
ensures that a general system-flush still hits the same caches.

However, we can now define *other* caches which *won't* be hit by system-flush.
The `deleteGroup(...$clearAll...)` option is heavy-handed and leads to
call-paths that hard to grok.

In the current design of CRM_Core_BAO_Cache (with multi-tier caching), one
does need to clear these things.  But let's not clear everything under the
sun...
@totten
Copy link
Member Author

totten commented Jun 20, 2018

There were some reported test failures. I patched/rebased to address them.

$cache->flush();
CRM_Utils_Cache::singleton()->flush();
if (Civi\Core\Container::isContainerBooted()) {
Civi::cache('settings')->flush();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So @totten what strikes me here is what if my extension uses a different key - it will not now be flushed & I have 'no say' in that...

Copy link
Member Author

@totten totten Jun 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(1) In a hypothetical sense, I can see how that concern comes up. CRM_Utils_Cache::singleton()->flush(); previously flushed the big yellow circle, and now it just flushes the smaller default gold circle. But in a practical sense, I don't think the change has the effect you're concerned about.

Why not?

  • In universe, I could only find two extensions (be.chiro.civi.idcache and uk.co.compucorp.civicrm.pivotreport) that touch the default instance (Civi::cache() or CRM_Utils_Cache::singleton()). Neither of them does any flushing. They just read/write a couple keys.
  • Most developers and admins work with higher-level flush operations -- like CRM_Core_Invoke::rebuildMenuAndCaches(). At least, this is what shows up most in grepping, and it matches anecdotal experience.
  • Consider https://gist.github.com/totten/0f0a0d5ca03e8e4ff5b7b6260e3ff36c -- if your feeling is "I want to flush things", then most folks go for one of the big LISTs (like rebuildMenuAndCaches()) because they're too nervous about digging into the specific lifecycle of each cache. In the big picture, CRM_Utils_Cache::singleton()->flush() is just a BASIC. (This PR/issue is saying that the BASIC flush was slightly broader than one would expect -- but we're preserving the functionality of the bigger LISTs.)
  • There isn't much selective pressure that would force people to be aware of CRM_Utils_Cache::singleton()->flush(). On a vanilla/default deployment, you don't have memcache/redis/whatever. Instead, default is implemented by CRM_Utils_Cache_ArrayCache -- which means the cache flushes as soon as the page-request ends. You'd rarely think to flush a cache that automatically flushes for you in a typical page-request.

(2) The use of the word "key" suggests some confusion. Let's consider two ways in which two hypothetical extensions might be working with cache services:

/* 1 -- Using the default cache, and setting some keys inside there */
/* 1a */ Civi::cache()->set('multisite_mydata', 123);
/* 1b */ Civi::cache()->set('extended_report_otherdata', 456);
/* 1c */ Civi::cache()->flush();

/* 2 -- Using a custom cache object, and each is a separate namespace for keys */
/* 2a */ Civi::cache('default')->set('multisite_mydata', 123);
/* 2b */ Civi::cache('extended_report')->set('otherdata', 456);
/* 2c */ Civi::cache('default')->flush();
/* 2d */ Civi::cache('extended_report')->flush();

There are two cases in universe which looks like example 1. That example works the same as before.

The PR would hypothetically affect something like example 2 (i.e. because 2c would flush data from 2b) on systems using memory-backed cache, but I don't think anyone does example 2. It's just not well known; there are no tutorials or documentation explaining it, and I can't find any evidence in universe of code even vaguely like it. The issue really should be limited to 3-ish cache-services in civicrm-core.

Civi::cache('settings')->flush();
Civi::cache('js_strings')->flush();
Civi::cache('community_messages')->flush();
CRM_Extension_System::singleton()->getCache()->flush();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or is that totally covered here ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the extension concern discussed above is addressed more by saying, "Extensions don't use this thing".

The code there does mention extensions (CRM_Extension_System::singleton()->getCache()->flush();), but this is metadata used by the extension-system tracking info about extensions (specifically the data structure which says "the source code for foo is located in sites/default/files/civicrm/ext/foo). The line is probably redundant or irrelevant in most usage, but there's arguably an edge-case where it's needed to maintain the behavior of "flush the big yellow circle".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@totten I was more thinking about whether I might want to use cache prefixes in an extension - although I guess I can agree that is not something in use now so it could be punted

@eileenmcnaughton
Copy link
Contributor

I feel like this has been investigated pretty thoroughly & am merging. I think the test suite will be the most rigourous tester of all this!

@eileenmcnaughton eileenmcnaughton merged commit 74a8b12 into civicrm:master Jun 21, 2018
@totten totten deleted the master-default-prefix branch August 1, 2018 18:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
master sig:code maintenance readability testability sig:extension support Supports being able to integrate extensions with CiviCRM
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants