Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extract query to generate annual totals and add test #12810

Merged
merged 1 commit into from
Oct 11, 2018

Conversation

eileenmcnaughton
Copy link
Contributor

Overview

This PR extracts the function that generates the annual totals & adds a test.

screenshot 2018-09-12 17 30 28

Before

No test

After

Test exists both for with & without financial type ACLS. One unnecessary clause is removed from non-ACL sites

Technical Details

If the PR introduces noteworthy technical changes, please describe them here. Provide code snippets if necessary

Comments

This is primarily a preliminary cleanup. We are seeing contribution tabs on our contact with the most contributions (18k) take 6 seconds on this query. From my tests the reason is a combination of

  1. unnecessary filter on financial_type_id (given no financial acls are enabled)
  2. unecessary join on civicrm_line_item (given no financial acls are enabled)
  3. poor index selection between contact_id & status_id - explain shows
|    1 | SIMPLE      | b     | index_merge | index_contribution_status,FK_civicrm_contribution_contact_id | FK_civicrm_contribution_contact_id,index_contribution_status | 4,5     | NULL | 17649 | Using intersect(FK_civicrm_contribution_contact_id,index_contribution_status); Using where |
``

I'm going to follow up with a PR that makes more performance change - this is primarily preliminary to do the refactor first.

@eileenmcnaughton
Copy link
Contributor Author

eileenmcnaughton commented Sep 12, 2018

Performance discussion - the below should not prevent this being merged - it is about the next steps

@monishdeb @lcdservices @seamuslee001 @pradpnayak @davejenx I'm interested to see someone else's performance testing results - I realise our Db is a bit of an anomaly & it's likely no-one else has a single contact with 18k donations or however many million contributions we have. However, even confirmation that changing the query to something that looks like the last one in the comment below 'does no harm' would be good.

Note that the last one above is what would happen for sites WITH financial acls enabled. There are 2 significant changes to the query

  1. contribution status id clause moved to an outer query - this was the one that made most of the difference as it started making better key choices. This query is not used for vast numbers of contacts - in fact I believe it's only ever one contact - so honing in on the contact first is good.
  2. changing the line item filter to being in a subquery - this means I can restructure it to either use the (recommended) selectWhereClause hook or use a function that follows the same format with a goal to move to the selectWhereClause hook down the track.

As an aside I am not fully understanding the line item subquery - ie. the use of NOT IN - see the very first query above

Current query

   SELECT count(*) as count,
             sum(total_amount) as amount,
             avg(total_amount) as average,
             currency
      FROM civicrm_contribution b
      LEFT JOIN civicrm_line_item i ON i.contribution_id = b.id AND i.entity_table = 'civicrm_contribution'  AND i.financial_type_id NOT IN (9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,26)
      WHERE b.contact_id IN ( 72 ) AND b.contribution_status_id = 1 AND b.is_test = 0 AND b.receive_date >= 20180101 AND b.receive_date <  20190101 AND b.contribution_status_id = 1 AND  b.financial_type_id IN (9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,26)
      GROUP BY currency;

6 seconds 20

Without unnecessary ACL clauses

     SELECT count(*) as count,
             sum(total_amount) as amount,
             avg(total_amount) as average,
             currency
      FROM civicrm_contribution b

      WHERE b.contact_id IN ( 72 ) AND b.contribution_status_id = 1 AND b.is_test = 0 AND b.receive_date >= 20180101 AND b.receive_date <  20190101 AND b.contribution_status_id = 1      GROUP BY currency;

6 seconds 20

Restructured for performance

      SELECT count(*) as count,
             sum(total_amount) as amount,
             avg(total_amount) as average,
             currency
      FROM 
      (SELECT total_amount, currency, contribution_status_id, receive_date FROM
      civicrm_contribution b

      WHERE b.contact_id IN ( 72 ) AND b.is_test = 0 
      ) as conts
       WHERE conts.receive_date >= 20180101 AND conts.receive_date <  20190101
           GROUP BY currency, contribution_status_id;

0.06 seconds

Restructured for performance (with extra Having)

    SELECT count(*) as count,
             sum(total_amount) as amount,
             avg(total_amount) as average,
             currency
      FROM 
      (SELECT total_amount, currency, contribution_status_id, receive_date FROM
      civicrm_contribution b

      WHERE b.contact_id IN ( 72 ) AND b.is_test = 0 
      ) as conts
       WHERE conts.receive_date >= 20180101 AND conts.receive_date <  20190101
           GROUP BY currency, contribution_status_id
           HAVING contribution_status_id = 1;

0.05 seconds

Restructured for performance (with extra Having) & financial types back

    SELECT count(*) as count,
             sum(total_amount) as amount,
             avg(total_amount) as average,
             currency
      FROM 
      (SELECT total_amount, currency, contribution_status_id, receive_date 
      FROM
      civicrm_contribution b

      WHERE b.contact_id IN ( 72 ) AND b.is_test = 0 AND b.id IN (SELECT c.id FROM civicrm_contribution c INNER JOIN civicrm_line_item li ON c.id = li.entity_id  AND li.entity_table = 'civicrm_contribution' AND li.financial_type_id IN (9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,26)
) AND b.financial_type_id IN (9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,26)
      ) as conts
       WHERE conts.receive_date >= 20180101 AND conts.receive_date <  20190101
           GROUP BY currency, contribution_status_id
           HAVING contribution_status_id = 1;

.07 seconds

Proposed final query Restructured for performance (with extra Having) & financial types back for ACL enabled sites

SELECT SQL_NO_CACHE count(*) as count,
            sum(total_amount) as amount,
          avg(total_amount) as average,
            currency
     FROM 
     (SELECT total_amount, currency, contribution_status_id, receive_date 
    FROM
     civicrm_contribution b

 WHERE b.contact_id IN ( 72 ) AND b.is_test = 0 AND b.id IN (SELECT c.id FROM civicrm_contribution c INNER JOIN civicrm_line_item li ON c.id = li.entity_id  AND li.entity_table = 'civicrm_contribution' AND li.financial_type_id IN (9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,26)
)
 ) as conts
 WHERE conts.receive_date >= 20180101 AND conts.receive_date <  20190101
 GROUP BY currency, contribution_status_id
HAVING contribution_status_id = 1;

0.09 seconds

@JoeMurray
Copy link
Contributor

@monishdeb we have an extension that provides financial acls for reports. It was created after we factored out this functionality from core due to performance reasons. We should a) ensure that there is a message displayed when enabling financial acl in core that the extension is required (I'm not sure that we did this) and b) see if some of the performance refactoring here might be useful in the extension.

@eileenmcnaughton
Copy link
Contributor Author

test this please

Copy link
Contributor

@JohnFF JohnFF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor feedback around standards and clarity. Otherwise looks good.

'b.is_test = 0',
'b.receive_date >= ' . $startDate,
'b.receive_date < ' . $endDate,
'b.contribution_status_id = 1',
Copy link
Contributor

@JohnFF JohnFF Oct 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need b.contribution_status_id = 1 in there twice (I appreciate it is important).

Far more important, it should not be 1. 1 isn't clear. I don't know what 1 maps to - a Constant or PseudoConstant would be better here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

$monthDay = $config->fiscalYearStart['M'] . $config->fiscalYearStart['d'];
}
else {
$monthDay = '0101';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment like "First of January" at the end of the 0101 line would help.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

*
* @return string
*/
public static function getAnnualQuery($contactIDs) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some detail in the comment would help. What does this function do? I know it returns a query, but for what? etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

$config = CRM_Core_Config::singleton();
$currentMonth = date('m');
$currentDay = date('d');
if ((int ) $config->fiscalYearStart['M'] > $currentMonth ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be (int).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

$currentDay = date('d');
if ((int ) $config->fiscalYearStart['M'] > $currentMonth ||
((int ) $config->fiscalYearStart['M'] == $currentMonth &&
(int ) $config->fiscalYearStart['d'] > $currentDay
Copy link
Contributor

@JohnFF JohnFF Oct 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please bracket this - I know the order of execution but it can be confusing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the line breaks which I think makes it clear

$endDate = "$nextYear$monthDay";
$financialTypes = [];
CRM_Financial_BAO_FinancialType::getAvailableFinancialTypes($financialTypes);
$liWhere = " AND i.financial_type_id IN (0)";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the significance of 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment / todo added

CRM_Financial_BAO_FinancialType::buildPermissionedClause($whereClauses, NULL, 'b');
$query = "
SELECT count(*) as count,
sum(total_amount) as amount,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

COUNT, SUM, AVG AS should probably be all in caps (I think is the convention)

$newFiscalYearStart['M'] = '0' . $newFiscalYearStart['M'];
}
if ($newFiscalYearStart['d'] < 10) {
$newFiscalYearStart['d'] = '0' . $newFiscalYearStart['d'];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably this is to convert 1 into 01. I'm 50/50 on whether or a comment should say that's what this is doing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment / todo added

public function testAnnualQueryWithFinancialACLsDisabled() {
$sql = CRM_Contribute_BAO_Contribution::getAnnualQuery([1, 2, 3]);
$this->assertContains('sum(total_amount) as amount,', $sql);
$this->assertContains('WHERE b.contact_id IN ( 1,2,3 )', $sql);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be IN (1, 2, 3)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -1316,68 +1316,11 @@ public static function sortName($id) {
* @return array
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing as we're in the headspace - please could you add a helpful comment into annual function's header? What is annual? What does it do?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@seamuslee001
Copy link
Contributor

@eileenmcnaughton test fail looks related (noting i think it might just be a correction to the test - whitespace issue

@JoeMurray Joe there is this

$preUpgradeMessage .= '<br />' . ts('CiviCRM will in the future require the extension %1 for CiviCRM Reports to work correctly with the Financial Type ACLs. The extension can be downloaded <a href="%2">here</a>', array(
which may be something like what your thinking off, note that that is only shown on upgrade or maybe its a different extension all together

@JoeMurray
Copy link
Contributor

Yeah, that would be a message along the lines I described. Thx.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants