Alter the query used to render the yearly totals on the contact contr… · civicrm/civicrm-core@1ec168e

Commit

Alter the query used to render the yearly totals on the contact contr…

…ibution tab to make a better index choice.

This PR restructures the  annual query to use HAVING clause for contribution_status_id in order to
avoid the (edge case) situation where it uses a combined index resulting in a slow query. When a contribution is entered
the screen loaded on-close is the contribution tab so when this is slow the data entry process is much slowed.

Mmsql index merge optimisation is a default enabled setting which can choose to use 'merged indexes' rather than
just one. In this query the best index is contact_id because even when a contact has a relatively large number of
contributions they are still trivial compared to the overall DB. By contrast contribution_status_id is not a good
choice as it has fairly low variability and when used in combination with contact_id adds very little as trade off for the time
it takes to construct a merge index

What we are seeing is that the index choice on this query is
index_contribution_status,received_date,FK_civicrm_contribution_contact_id,contact_id_contribution_status | FK_civicrm_contribution_contact_id,index_contribution_status

which results in a query that takes around 6 seconds on a contact with 18k contributions in a DB of around 30 million contributions.

If we restructure the query so that it looks like

```
     SELECT COUNT(*) as count,
             SUM(total_amount) as amount,
             AVG(total_amount) as average,
             currency
      FROM civicrm_contribution  b

      WHERE b.contact_id IN (72) AND b.receive_date >= 20180701 AND b.receive_date <  20190701
      GROUP BY currency, contribution_status_id
      HAVING contribution_status_id = 1;
```

the selected index is  FK_civicrm_contribution_contact_id and the query takes .05 seconds.

This restructure achieves much the same as using an index hint but as there is some index naming inconsistency
'in the wild' changing the query structure like this makes more sense.

 Note that I ALSO tested this
1) on the same database on a contact with a negligible number of contributions
2) on a much smaller database (~400k contributions) on a contact with 122 contributions.

In both cases it made no difference to which key was chosen as the merge index appears to only be chosen by mysql
when more than a certain number of on contributions adhere to a single contact.

In the latter case the query time for both versions was recorded as 0.00 so I couldn't see any difference.

In the former case the query change sped up the query from 0.02 to 0.00 so there is some additional efficiency
in the new query beyond the index choice

Links on index merge optimisation
https://www.percona.com/blog/2012/12/14/the-optimization-that-often-isnt-index-merge-intersection/
http://gobitcan.com/blog/2014-12-03-what-to-do-when-mysql-ignores-your-index
http://mysqlopt.blogspot.com/2013/05/mysql-slow-query-example-of-index-merge.html
https://www.percona.com/blog/2012/12/14/the-optimization-that-often-isnt-index-merge-intersection/
https://stackoverflow.com/questions/16283472/why-is-mysql-showing-index-merge-on-this-query
https://mariadb.com/kb/en/library/fair-choice-between-range-and-index_merge-optimizations/
https://dba.stackexchange.com/questions/178286/mysql-on-rds-avoiding-index-merge
https://mariadb.com/kb/en/library/index_merge-sort_intersection/
https://stackoverflow.com/questions/47943755/mysql-optimizing-subquery-with-index-merge
https://dev.mysql.com/doc/refman/8.0/en/index-merge-optimization.html

Change-Id: I223a3388f59f2608434b36ebe5a887318abae96a

Loading branch information

eileenmcnaughton committed Jan 29, 2019

1 parent 5366609 commit 1ec168e

CRM/Contribute/BAO/Contribution.php

-Original file line number
+Diff line change
@@ Expand Up / @@ -5584,10 +5584,10 @@ public static function getAnnualQuery($contactIDs) { @@
         $whereClauses = [
           'contact_id' => 'IN (' . $contactIDs . ')',
-          'contribution_status_id' => '= ' . (int) CRM_Core_PseudoConstant::getKey('CRM_Contribute_BAO_Contribution', 'contribution_status_id', 'Completed'),
           'is_test' => ' = 0',
           'receive_date' => ['>=' . $startDate, '<  ' . $endDate],
         ];
+        $havingClause = 'contribution_status_id = ' . (int) CRM_Core_PseudoConstant::getKey('CRM_Contribute_BAO_Contribution', 'contribution_status_id', 'Completed');
         CRM_Financial_BAO_FinancialType::addACLClausesToWhereClauses($whereClauses);
         $clauses = [];
@@ Expand All / @@ -5603,7 +5603,8 @@ public static function getAnnualQuery($contactIDs) { @@
                  currency
           FROM civicrm_contribution b
           WHERE " . $whereClauseString . "
-          GROUP BY currency
+          GROUP BY currency, contribution_status_id
+          HAVING $havingClause
           ";
         return $query;
       }
@@ Expand Down @@

0 comments on commit `1ec168e`

Please sign in to comment.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Commit

There are no files selected for viewing

0 comments on commit `1ec168e`

Commit

There are no files selected for viewing

0 comments on commit 1ec168e

0 comments on commit `1ec168e`