(WIP) (dev/core#217) Allow replacement of PrevNextCache implementation for search screens #12377
…xtCache::fillWithSql()
…xtCache::fillWithArray()
…t->fillWithSql(...)`
…s of contact-search

The `getPositions()` function is used by both contact-search and dedupe-merge use-cases.

```
CRM/Contact/Form/Merge.php: $pos = CRM_Core_BAO_PrevNextCache::getPositions($cacheKey, $this->_cid, $this->_oid, $this->_mergeId, $join, $where, $flip);
CRM/Contact/Form/Merge.php: $pos = CRM_Core_BAO_PrevNextCache::getPositions($cacheKey, NULL, NULL, $this->_mergeId, $join, $where);
CRM/Contact/Page/View.php: $pos = CRM_Core_BAO_PrevNextCache::getPositions("civicrm search $qfKey", $this->_contactId, $this->_contactId);
```

Our aim in developing `CRM_Core_PrevNextCache_Interface` is to allow contact-search to swap a MySQL backend with a Redis backend -- and dedupe-merge should continue as-is (whether or not Redis is available). This basically means:

* Contact-search switches to using `Civi::service('prevnext')->getPositions()`
* Dedupe-merge continues using `CRM_Core_BAO_PrevNextCache::getPositions()`

Note that the `Interface::getPositions()` is simpler than the BAO's variant. This is good because:

* Contact-search doesn't need as many parameters.
* Dedupe-merge still needs all the parameters.
* Adding all parameters would make it hard to implement on other backends. (This is especially true of SQL-style options `$join` and `$where`.)
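To make the narrower contract concrete, here is a minimal in-memory sketch. The class name `InMemoryPrevNextSketch`, the `fill()` helper, and the shape of the returned array are assumptions for illustration and are not the real `CRM_Core_PrevNextCache_Interface`. The only point is that a cache key plus one contact ID is enough for contact-search, with no `$join` or `$where`.

```php
<?php
// Hypothetical, simplified stand-in for a prev-next backend (not CiviCRM's class).
final class InMemoryPrevNextSketch {

  /** @var array Ordered contact IDs, indexed by cache key. */
  private $lists = [];

  // Store the ordered result list for one search screen.
  public function fill(string $cacheKey, array $contactIds): void {
    $this->lists[$cacheKey] = array_values($contactIds);
  }

  // Locate a contact within the cached list and report its neighbours.
  // Note the absence of SQL-style options such as $join and $where.
  public function getPositions(string $cacheKey, int $contactId): array {
    $ids = $this->lists[$cacheKey] ?? [];
    $idx = array_search($contactId, $ids, TRUE);
    if ($idx === FALSE) {
      return ['found' => FALSE];
    }
    return [
      'found' => TRUE,
      'prev' => $ids[$idx - 1] ?? NULL,
      'next' => $ids[$idx + 1] ?? NULL,
      'first' => $ids[0],
      'last' => $ids[count($ids) - 1],
    ];
  }

}
```

A backend that keeps its lists in memory can answer this without any SQL, which is what makes the simpler signature portable.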
…of contact-search

The `deleteItem()` function is used by both contact-search and dedupe-merge use-cases. We can classify several of these just based on the files:

* Contact-search use-cases. (These should be updated to use the interface.)
    * `CRM/Campaign/Selector/Search.php: CRM_Core_BAO_PrevNextCache::deleteItem(NULL, $cacheKey, 'civicrm_contact');`
    * `CRM/Contact/Form/Search.php: CRM_Core_BAO_PrevNextCache::deleteItem(NULL, $cacheKey);`
    * `CRM/Contact/Selector.php: CRM_Core_BAO_PrevNextCache::deleteItem(NULL, $cacheKey, 'civicrm_contact');`
* Dedupe-merge use-cases. (These should continue using the BAO.)
    * `CRM/Contact/Form/DedupeRules.php: CRM_Core_BAO_PrevNextCache::deleteItem(NULL, $cacheKey);`
    * `CRM/Contact/Page/DedupeFind.php: CRM_Core_BAO_PrevNextCache::deleteItem(NULL, CRM_Dedupe_Merger::getMergeCacheKeyString($rgid, $gid, $criteria))`
    * `CRM/Dedupe/Merger.php: CRM_Core_BAO_PrevNextCache::deleteItem(NULL, "{$cacheKeyString}_stats");`

Additionally, there are two oddballs which are harder to categorize.

* `CRM_Contact_BAO_Contact_Utils::clearContactCaches($isEmptyPrevNextTable = FALSE)` deletes *all* prev-next cache-records (`CRM_Core_BAO_PrevNextCache::deleteItem();`). It only does so in one scenario (as part of `CRM/Contact/Import/Form/Preview.php`), which has this explanatory comment: "Clear all caches, forcing any searches to recheck the ACLs or group membership as the import may have changed it."
* `CRM_Contact_BAO_Contact::deleteContact(...)` deletes any prev-next cache-records which reference a specific contact (`CRM_Core_BAO_PrevNextCache::deleteItem($id)`). I suppose this provides a low-grade form of referential integrity.

Part of me thinks those should be re-considered (e.g. to use a hook/event -- and reduce the coupling between `Contact` and `PrevNext` subsystems). However, for purposes of dev/core#217, it seems OK to send `deleteItem(...)` to both BAO (SQL-only) and service (SQL-or-memory) variants.
This function is used one time -- when you run a search, select some contacts, and perform a task (like "Delete contacts"), the `CRM/Contact/Form/Task.php` displays a table with the names of the selected contacts.

This patch makes the logic portable -- so that it can work regardless of whether selections are stored in MySQL or Redis.

Before
------

* The contacts are selected `FROM civicrm_prevnext_cache WHERE cacheKey LIKE {$key}% AND cacheKey NOT LIKE {$key}_alphabet%`.
* The contact names come from `civicrm_prevnext_cache.data`, which has been pre-filled with either `civicrm_contact.sort_name` (for most contact searches) or `civicrm_contact.display_name` (for campaign respondent searches).

After
-----

* The selections are chosen with `FROM civicrm_prevnext_cache WHERE cacheKey LIKE {$key}%`.
* The contact names are loaded directly from `civicrm_contact.sort_name`.

Comments
--------

* The use of wildcards with `cacheKey` seems like a code-smell and suggests a somewhat deeper problem in how `cacheKey` is understood. However, for our purposes, it shouldn't matter.
* The before and after queries are very similar in how they use `cacheKey`... and slightly different. (Ugh.) Is the new one better or worse? Well, look at how `CRM_Contact_Form_Task` uses `getSelection()` (for finding contact IDs, which feed into the task logic) and `getSelectedContacts()` (for finding contact names, which are displayed to user). If the subtle difference in `cacheKey` filtering matters, then our UX is buggy because we're showing the user one contact-list (based on old `getSelectedContacts()`) and we're executing on a different contact-list (based on the old `getSelection()`).
* Is the old technique for getting names (querying `civicrm_prevnext_cache.data`) better than the new technique (querying `civicrm_contact.sort_name`)? I haven't benchmarked, but I'm skeptical. Both techniques transfer the full `O(n)` list from MySQL to PHP. In typical usage, the size of `n` is limited by what an admin is willing to click through in the UI (which is probably a few hundred IDs). The contact table is indexed by ID. Maybe... if you manually check several thousand records, it might make a small difference. But if you're clicking that many, then other things are also getting more expensive (like the actual task). In this case, it feels like an unnecessary optimization.
… contact-search

The `getCount()` function is used by both contact-search and dedupe-merge use-cases.

* Contact-search
    * `CRM/Contact/Selector.php: $countRow = CRM_Core_BAO_PrevNextCache::getCount($cacheKey, NULL, "entity_table = 'civicrm_contact'");`
* Dedupe-merge
    * `CRM/Contact/Page/AJAX.php: $iTotal = CRM_Core_BAO_PrevNextCache::getCount($cacheKeyString, $join, $whereClause, '=', $queryParams);`
    * `CRM/Contact/Page/DedupeMerge.php: $total = CRM_Core_BAO_PrevNextCache::getCount($cacheKeyString, NULL, ($onlyProcessSelected ? "pn.is_selected = 1" : NULL));`

Our aim in developing `CRM_Core_PrevNextCache_Interface` is to allow contact-search to swap a MySQL backend with a Redis backend -- and dedupe-merge should continue as-is (whether or not Redis is available). This basically means:

* Contact-search switches to using `Civi::service('prevnext')->getCount()`
* Dedupe-merge continues using `CRM_Core_BAO_PrevNextCache::getCount()`

Note that the `Interface::getCount()` is simpler than the BAO's variant. This is good because:

* Contact-search doesn't need as many parameters.
* Dedupe-merge still needs all the parameters.
* Adding all parameters would make it hard to implement on other backends. (This is especially true of SQL-style options `$join` and `$where`.)
The contract feels quirky -- e.g.

* Who would guess that `markSelection()` defaults to `$action == 'unselect'`?
* What's the point of accepting `$entity_table` if it's never used?

Fortunately, this function is only called from `CRM_Contact_Page_AJAX::selectUnselectContacts`, so it's fairly easy to audit and see that:

* The `$action` is always passed in -- it never relies on the default value.
* The `$entity_table` is never specified explicitly -- it always relies on the default value.
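For contrast, a less surprising contract might look like the sketch below. It is purely illustrative: the function name `markSelectionSketch` and its array-in/array-out shape are hypothetical and do not reflect the actual `markSelection()` signature. It shows a required, validated `$action` and no unused `$entity_table` parameter.

```php
<?php
// Hypothetical sketch: $action is required and validated, so no caller
// can silently fall back to 'unselect'.
function markSelectionSketch(array $selectedIds, string $action, array $contactIds): array {
  $contactIds = array_map('intval', $contactIds);
  switch ($action) {
    case 'select':
      // Add the IDs, dropping duplicates.
      return array_values(array_unique(array_merge($selectedIds, $contactIds)));

    case 'unselect':
      // Remove the IDs from the current selection.
      return array_values(array_diff($selectedIds, $contactIds));

    default:
      throw new InvalidArgumentException("Unknown action: $action");
  }
}
```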
1. Improve docblock formatting
2. The `$entity_table` is never passed in. You can see this by grepping universe for `getSelection`.
…ing filters

Suppose you run a search ("Find Contact", "Advanced Search", "Custom Search", etc). The result screen includes:

1. Standard pagination (Previous/Next; First/Last; Jump-To)
2. Numerical option for page-size
3. Sortable columns
4. An alphabetical filter
5. Checkboxes

As you work with these options, the content of the `civicrm_prevnext_cache` table may change. This patch does not substantively change what's in that cache, but makes use of the column `cacheKey` simpler and more consistent.

Both Before and After (Unchanged)
---------------------------------

* The form's qfKey identifies the current screen/filters/cache.
* If you navigate to the next/previous page (`#1`) or adjust the page-size (`#2`), the content in `civicrm_prevnext_cache` remains the same (for the given qfKey).
* If you change the sort column (`#3`) or alphabetic filter (`#4`), the content in `civicrm_prevnext_cache` is deleted and repopulated (for the given qfKey).
* If you toggle a checkbox, the `civicrm_prevnext_cache.is_selected` property updates accordingly. These selections are retained when changing pages (`#1`/`#2`), but they're reset if you use sort or alphabet options (`#3`/`#4`).

Before
------

* The content of `civicrm_prevnext_cache.cacheKey` takes one of two forms, depending on whether you're using an alphabetic filter (`#4`).
    * `civicrm search {qfKey}` (typical)
    * `civicrm search {qfKey}_alphabet` (less common)
* The queries which read or delete the query-cache use a prefix+wildcard, i.e. `WHERE cacheKey LIKE 'civicrm search {qfKey}%'`.

After
-----

* The content of `civicrm_prevnext_cache.cacheKey` takes only one form
    * `civicrm search {qfKey}`
* The queries which read or delete the query-cache use an exact match, i.e. `WHERE cacheKey = 'civicrm search {qfKey}'`.
* The text `_alphabet` does not appear in the PHP source folders (CRM, Civi, bin, api, extern, tests).

Comments
--------

In theory, one can imagine that it's desirable to keep the cached results for each of the sorted/filtered variants of the query. That might allow the user to quickly switch among different sortings and different alphabetic-filters, or it might allow some kind of clever management of the selections.

But this is not so. As we see (both before and after), the substance of the cache is deleted whenever the user changes `#3`/`#4`. In reality, one user browsing a search screen corresponds to exactly one query-cache. As near as I can tell, the old code made the names change for no real reason at all.

To observe the behavior empirically, I would twiddle the UI widgets and concurrently inspect the content of the cache tables:

```
mysql> select group_name, path, FROM_BASE64(data), expired_date from civicrm_cache where path like 'civicrm search%'; select 'Total records in' as label, cacheKey, count(*), min(id), max(id) from civicrm_prevnext_cache group by cacheKey union select 'Selected records in ', cacheKey, count(*), min(id), max(id) from civicrm_prevnext_cache where is_selected=1 group by cacheKey;
+------------------------------+------------------------------------------------------+--------------------------------------------------------------+--------------+
| group_name                   | path                                                 | FROM_BASE64(data)                                            | expired_date |
+------------------------------+------------------------------------------------------+--------------------------------------------------------------+--------------+
| CiviCRM Search PrevNextCache | civicrm search a8ed1e2039241c41457a88f65aa8a8ee_7845 | s:52:"civicrm search a8ed1e2039241c41457a88f65aa8a8ee_7845"; | NULL         |
+------------------------------+------------------------------------------------------+--------------------------------------------------------------+--------------+
1 row in set (0.00 sec)

+----------------------+------------------------------------------------------+----------+---------+---------+
| label                | cacheKey                                             | count(*) | min(id) | max(id) |
+----------------------+------------------------------------------------------+----------+---------+---------+
| Total records in     | civicrm search a8ed1e2039241c41457a88f65aa8a8ee_7845 |        6 |     787 |     792 |
| Selected records in  | civicrm search a8ed1e2039241c41457a88f65aa8a8ee_7845 |        1 |     789 |     789 |
+----------------------+------------------------------------------------------+----------+---------+---------+
2 rows in set (0.01 sec)
```
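The difference between prefix+wildcard and exact matching on `cacheKey` can be sketched in plain PHP. This is only an approximation under stated assumptions: `LIKE '{key}%'` is modelled as a starts-with test (ignoring collation and the `_` wildcard), and the helper names and sample keys are hypothetical.

```php
<?php
// Approximates: WHERE cacheKey LIKE '{key}%'
function cacheKeysMatchingPrefix(array $cacheKeys, string $key): array {
  return array_values(array_filter($cacheKeys, function ($candidate) use ($key) {
    return strncmp($candidate, $key, strlen($key)) === 0;
  }));
}

// Approximates: WHERE cacheKey = '{key}'
function cacheKeysMatchingExactly(array $cacheKeys, string $key): array {
  return array_values(array_filter($cacheKeys, function ($candidate) use ($key) {
    return $candidate === $key;
  }));
}

// Made-up sample keys: the plain form, the old "_alphabet" form, and an
// unrelated key that merely shares the same leading characters.
$sampleKeys = [
  'civicrm search abc_7845',
  'civicrm search abc_7845_alphabet',
  'civicrm search abc_78450',
];
$byPrefix = cacheKeysMatchingPrefix($sampleKeys, 'civicrm search abc_7845');
$byExact = cacheKeysMatchingExactly($sampleKeys, 'civicrm search abc_7845');
```

With these sample keys the prefix test picks up all three entries, including the unrelated one, while the exact test picks up only the first. Once `_alphabet` keys are no longer written, the exact match is sufficient.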
…y_id2 from service

The PrevNext service/interface aims to be a replaceable component for use in search-caching (but not deduping). The interface that we produced from refactoring includes several references to `entity_table` and `entity_id2` -- these values are part of the SQL table, and they're needed for dedupe, but they don't seem to convey anything meaningful for search-caching. Including these fields makes the interface more complicated -- which will make it hard to implement other variants.

The general gist of this commit is that we no longer fill those two columns, and we no longer read them. In a couple of functions, we split the new OOP implementation (`CRM_Core_PrevNextCache_Sql`) from the traditional static BAO implementation (`CRM_Core_BAO_PrevNextCache`) so that we can omit these fields.
…ctions

Before
------

* Every time you create an instance of `CRM_Utils_Cache_Redis`, it makes a new instance of the driver class `Redis`.

After
-----

* Every time you create an instance of `CRM_Utils_Cache_Redis`, it uses a shared instance of the driver class `Redis` (assuming the host/port are the same).
* If you're writing a driver somewhere else (e.g. the upcoming `CRM_Core_PrevNextCache_Redis`), you can use the shared instance of `Redis`.
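The sharing idea can be sketched as a small pool keyed by host and port. This is not the code from the patch: the class `SharedConnectionPoolSketch` and its factory callback are hypothetical, and the sketch uses a plain object in place of a real `Redis` driver so that it runs without the extension.

```php
<?php
// Hypothetical sketch of "one driver instance per host/port".
final class SharedConnectionPoolSketch {

  /** @var array Driver instances, indexed by "host:port". */
  private static $pool = [];

  // Return the shared instance for this host/port, creating it on first use.
  // In real use, $factory might build a phpredis `Redis` object and call
  // connect(); that wiring is an assumption and is left to the caller.
  public static function get(string $host, int $port, callable $factory) {
    $id = $host . ':' . $port;
    if (!isset(self::$pool[$id])) {
      self::$pool[$id] = $factory($host, $port);
    }
    return self::$pool[$id];
  }

}
```

Two callers asking for the same host and port receive the same object, so a cache driver and a prev-next driver could reuse one connection.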
…mory-backed or SQL-backed)

Before
------

The `prevnext` service is always an instance of `CRM_Core_PrevNextCache_Sql`.

After
-----

If you define `CIVICRM_DB_CACHE_CLASS`, and if there's a corresponding service `prevnext.driver.{mydriver}`, then it will use that service. Otherwise, it will use `prevnext.driver.sql` (`CRM_Core_PrevNextCache_Sql`).
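The fallback rule reads roughly like the sketch below. The function name `choosePrevNextDriverSketch` is hypothetical, and lowercasing the configured class name to form the `{mydriver}` suffix is an assumption; the commit message does not say how the name is derived.

```php
<?php
// Hypothetical sketch of the selection rule. $cacheClass stands for the value
// of CIVICRM_DB_CACHE_CLASS (or NULL if undefined); $registeredServices is the
// list of service IDs known to the container.
function choosePrevNextDriverSketch(?string $cacheClass, array $registeredServices): string {
  if ($cacheClass !== NULL && $cacheClass !== '') {
    // Assumption: the driver suffix is the lowercased class setting.
    $candidate = 'prevnext.driver.' . strtolower($cacheClass);
    if (in_array($candidate, $registeredServices, TRUE)) {
      return $candidate;
    }
  }
  // No setting, or no matching driver: fall back to the SQL implementation.
  return 'prevnext.driver.sql';
}
```

A caller could pass `defined('CIVICRM_DB_CACHE_CLASS') ? CIVICRM_DB_CACHE_CLASS : NULL` as the first argument.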
…k to its old place.

The general context of this code is roughly as follows: "We've already filled up the prevnext cache with a bunch of contact-IDs. The user wants to view a page of 50 contacts. The prevnext cache has all the contact-IDs, but we want to look up full information about the specific 50 contacts on this page."

The `CRM_Contact_BAO_Query` is good for looking up the "full information" about some contacts, but it doesn't know how to filter based on the prevnext cache. One needs a way to add that bit of information.

Before
------

* The contract for `CRM_Core_PrevNextCache_Interface::fetch()` was extremely nuanced -- it basically required you to call the Query BAO and hack its SQL output.
* `CRM_Core_PrevNextCache_Sql::fetch()` (and its equivalent predecessor `CRM_Contact_BAO_Query::getCachedContacts()`) achieved this by splicing some extra clauses into the SQL query -- specifically, a `JOIN` on the `civicrm_prevnext_cache` and some `GROUP BY`/`ORDER BY`/`LIMIT`.
* `CRM_Core_PrevNextCache_Redis::fetch()` achieved this with two steps -- first, reading the contact-list from the cache; and then splicing an extra `WHERE` into the SQL query.
* There is no unit test coverage for `::fetch()`.

After
-----

* The contract for `CRM_Core_PrevNextCache_Interface::fetch()` is much simpler -- you don't need to know anything about the query BAO. You just return one page worth of IDs from the cache.
* The SQL manipulation has moved back to its original place in `CRM_Contact_BAO_Query::getCachedContacts()`. However, we do the same manipulation regardless of cache-backend -- i.e. we always splice in the `WHERE` (and not the `JOIN`).
* There is unit test coverage for `::fetch()`.
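The two-step flow described under "After" can be sketched as follows. Both helper names are hypothetical, the cached list is a plain array rather than a real backend, and the `contact_a` table alias is an assumption about the surrounding query.

```php
<?php
// Step 1 (the backend's job): return one page worth of IDs from the cache.
function fetchPageOfIdsSketch(array $cachedIds, int $offset, int $rowCount): array {
  return array_slice($cachedIds, $offset, $rowCount);
}

// Step 2 (the query layer's job): build a WHERE fragment limiting the contact
// query to that page. The `contact_a` alias is assumed for illustration.
function buildContactIdClauseSketch(array $ids, string $alias = 'contact_a'): string {
  if (empty($ids)) {
    // An empty page should match nothing.
    return '(0 = 1)';
  }
  // Cast to integers so the fragment contains only numeric literals.
  $ids = array_map('intval', $ids);
  return sprintf('%s.id IN (%s)', $alias, implode(',', $ids));
}
```

Because step 1 only returns IDs, the same step 2 works whether the list came from MySQL or from Redis. Note that an `IN (...)` filter does not by itself preserve the cached ordering, so the caller still has to order the rows.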
Note: With the overall patch-set, I did a lot I mention this because we're now breaking it up into smaller chunks for review purposes. It may be useful in reviewing the cherry-picked pieces to know that the contact-search works in the overall patch-set. However, I have not done such careful
The new name is prettier and matches the names in `CRM_Core_PrevNextCache_{Interface,Redis}`.
@totten closing as this is just a tracking branch now I think