Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dev/core#2142 Adds performance improvement when browsing the report logs #18851

Merged
merged 1 commit into from
Nov 6, 2020

Conversation

VangelisP
Copy link
Contributor

Overview

This issue relates to a discussion we had in CiviCRM Dev channel where it was being identified that when we got advanced logging and a contact triggered batch updates/changes, the detailed report most of the time does not work due to extremely long delays. More info on issue 2142

Before

If the records associated are too many, the report will never finish in a timely manner and produce a timeout.

After

  • Report renders properly again by:
  • Showing a list of first x records
  • Optionally, rendering a pager for review

Technical Details

Adds LIMIT & OFFSET to the query inside function getAllChangesForConnection to support limit & offset of the rendering rows when displaying in report CRM_Logging_ReportDetail

Comments

All comments are on https://lab.civicrm.org/dev/core/-/issues/2142

@civibot
Copy link

civibot bot commented Oct 23, 2020

(Standard links)

@civibot civibot bot added the master label Oct 23, 2020
@demeritcowboy
Copy link
Contributor

I did some r-run and seems good but the pager seems to show the wrong counts even though the table has all the rows. But it might be something local - not sure yet.

@VangelisP
Copy link
Contributor Author

I did some r-run and seems good but the pager seems to show the wrong counts even though the table has all the rows. But it might be something local - not sure yet.

What you see is indeed correct although not properly clarified: While the maximum number of rows per page is 50, those rows refer to the contacts that a change has been detected and not their actual changes. You do get 50 contacts per page but their changes can be more (eg for one contact you might have name + address while for another just the name change). The 'ReportDetail' report renders all the changes for those 50 contacts.

Here is an example that diff returns for 1 row (brings back 2 arrays) :

[
    [action] => Update
    [id] => 38797
    [field] => addressee_display
    [from] => Beverly (Creamer) Burris
    [to] => Ms Beverly A (Creamer) Burris
    [table] => civicrm_contact
    [log_date] => 2019-11-06 23:00:03
    [log_conn_id] => 5dc342615a99d
]

[
    [action] => Update
    [id] => 38797
    [field] => modified_date
    [from] => 2019-05-02 08:34:07
    [to] => 2019-11-06 23:00:03
    [table] => civicrm_contact
    [log_date] => 2019-11-06 23:00:03
    [log_conn_id] => 5dc342615a99d
]

Frankly, unless there's a layout change on how the report renders so that we can fit 50 rows of actual contacts within a page along with their modifications, i don't see any other way around this. Any ideas?

@VangelisP
Copy link
Contributor Author

I've just added a condition where if we're rendering results in the overlay window (mouseover on a contact's changelog) , the pager should not be displayed.

@sunilpawar
Copy link
Contributor

@VangelisP only issue i could see is as mentioned by @demeritcowboy that, number of rows showing on page is more than what is shown on pagination.

Is there way we can track and use pagination on number of modified field per connection instead of updated entities ?

@VangelisP
Copy link
Contributor Author

Thanks @sunilpawar for the feedback.

I'm going to try to restructure (hopefully a little bit) the report datasource so that we can actually apply the limit to the proper source.
I'll be frank, I have no plan yet but I'll think of something while working on it.

@VangelisP
Copy link
Contributor Author

@sunilpawar & @demeritcowboy after some thought, I am now pushing a new commit that deals with this, the multiple changes per contact (when we apply a contact limit).

What it does:

I am grouping the returned changes per contact ID. If a contact appears to be having more than 1 change, I am concatenating the values of the arrays using a <br/> string so that all changes can be shown on the same line. This will gives us back the consistency of 50 rows per page.

Here's an example before:
image

Here's an example after:
image

I'm also removing the 'modified date' value that makes no sense (at least to me!). Lastly, I'm taking care the case when displaying values through the hovering function but that's a minor one.

Let me know what you think please!

@demeritcowboy
Copy link
Contributor

Looks good. Thanks for the improvement. The commits will probably need to be squashed but I have no more comments.

@VangelisP
Copy link
Contributor Author

Looks good. Thanks for the improvement. The commits will probably need to be squashed but I have no more comments.

Thanks for the valuable feedback! I will update the GitLab issue with the findings/final solution and also I will squash the commits once the tests are done.

@VangelisP
Copy link
Contributor Author

@eileenmcnaughton this PR is ready for merge, if that's ok with you!

@sunilpawar
Copy link
Contributor

@VangelisP field grouping looks ok, Another problem i noticed (not related to this task).

For testing i changed Row limit to 5 to get pagination, i could able to see pagination on report.

I got SQL error when i click on Next Link of Pagination.

-- 1
SELECT DISTINCT lt.id FROM `phpunit_civicrm_1`.`log_civicrm_batch` lt

WHERE lt.log_conn_id = '5f9fd7cb6fbd5'
    
    AND contact_id = 202 [nativecode=1054 ** Unknown column 'contact_id' in 'where clause']
-- 2
SELECT DISTINCT lt.id FROM `phpunit_civicrm_1`.`log_civicrm_mailing_abtest` lt

WHERE lt.log_conn_id = '5f9fd7cb6fbd5'
    
    AND contact_id = 202 [nativecode=1054 ** Unknown column 'contact_id' in 'where clause']

I think we need to add following in CRM/Logging/Schema.php and other tables in Differ.php file

diff --git a/CRM/Logging/Differ.php b/CRM/Logging/Differ.php
index 863322bb2a..2107a4ff22 100644
--- a/CRM/Logging/Differ.php
+++ b/CRM/Logging/Differ.php
@@ -87,6 +87,47 @@ class CRM_Logging_Differ {
         case 'civicrm_relationship':
           $contactIdClause = "AND (contact_id_a = %3 OR contact_id_b = %3)";
           break;
+        case 'civicrm_batch':
+          $contactIdClause = "AND (created_id = %3 OR modified_id = %3)";
+          break;
+
+        case 'civicrm_group_organization':
+          $contactIdClause = "AND (organization_id = %3)";
+          break;
+
+        case 'civicrm_dedupe_exception':
+          $contactIdClause = "AND (contact_id1 = %3 OR contact_id2 = %3)";
+          break;
+
+        case 'civicrm_campaign':
+        case 'civicrm_survey':
+          $contactIdClause = "AND (created_id = %3 OR last_modified_id = %3)";
+          break;
+
+        case 'civicrm_event_carts':
+          $contactIdClause = "AND (user_id = %3)";
+          break;
+
+        case 'civicrm_membership_type':
+          $contactIdClause = "AND (member_of_contact_id = %3)";
+          break;
+
+        case 'civicrm_custom_group':
+        case 'civicrm_file':
+        case 'civicrm_tag':
+        case 'civicrm_print_label':
+        case 'civicrm_group':
+        case 'civicrm_contribution_page':
+        case 'civicrm_payment_token':
+        case 'civicrm_report_instance':
+        case 'civicrm_uf_group':
+        case 'civicrm_event':
+          $contactIdClause = "AND (created_id = %3)";
+          break;
+
+        case 'civicrm_mailing':
+          $contactIdClause = "AND (created_id = %3 OR scheduled_id = %3 OR approver_id = %3)";
+          break;
 
         case 'civicrm_activity':
           $activityContacts = CRM_Activity_BAO_ActivityContact::buildOptions('record_type_id', 'validate');
diff --git a/CRM/Logging/Schema.php b/CRM/Logging/Schema.php
index 7adf83beef..1e166a96fb 100644
--- a/CRM/Logging/Schema.php
+++ b/CRM/Logging/Schema.php
@@ -147,6 +147,10 @@ AND    TABLE_NAME LIKE 'civicrm_%'
     // dev/core#1762 Don't log subscription_history
     $this->tables = preg_grep('/^civicrm_subscription_history/', $this->tables, PREG_GREP_INVERT);
 
+    // mailing test
+    $this->tables = preg_grep('/^civicrm_mailing_abtest/', $this->tables, PREG_GREP_INVERT);
+
+
     // do not log civicrm_mailing_recipients table, CRM-16193
     $this->tables = array_diff($this->tables, ['civicrm_mailing_recipients']);
     $this->logTableSpec = array_fill_keys($this->tables, []);

@eileenmcnaughton , @demeritcowboy do you know any reason for not having these tables in Differ.php file ?

@demeritcowboy
Copy link
Contributor

demeritcowboy commented Nov 5, 2020

As a general note I have noticed that the Contact Logging Report doesn't always work well for things that aren't Contacts. There may be several things it needs to fully support those other objects. When I looked about a year ago it seemed like it might take up a fair amount of time so I stopped. Do you want to post that diff as a separate PR?

@demeritcowboy
Copy link
Contributor

D'oh typo. I meant "doesn't always work well...".

@eileenmcnaughton
Copy link
Contributor

Merging based on @demeritcowboy's review.

@sunilpawar I'm a bit confused as to "these tables in Differ.php file" - is that 'is there are problem with my proposed patch'?

In general we should be able to use metadata rather than a hard-coded list to get the contact id links

Something along the lines of CRM_Core_DAO::getReferencesToContactTable();

@eileenmcnaughton eileenmcnaughton merged commit 658ddf1 into civicrm:master Nov 6, 2020
@VangelisP VangelisP deleted the dev_issue#2142 branch November 23, 2020 11:41
@mlutfy mlutfy changed the title Adds performance improvement when browsing the report logs dev/core#2142 Adds performance improvement when browsing the report logs Dec 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants