Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dev/core#192 - Search builder fails for != smart group filter #12344

Merged
merged 3 commits into from
Dec 17, 2018

Conversation

jitendrapurohit
Copy link
Contributor

@jitendrapurohit jitendrapurohit commented Jun 20, 2018

Overview

Fix smart group in search builder with != operator.

Before

Smart Group search on Search Builder result incorrect rows with != operator

image

After

Correct results are displayed.

Comments

Added unit test.


Gitlab - https://lab.civicrm.org/dev/core/issues/192

@civibot
Copy link

civibot bot commented Jun 20, 2018

(Standard links)

@monishdeb
Copy link
Member

@jitendrapurohit I encountered an issue, with this patch. Let's say Contact A and B is in regular group RG AND contact B and C is smart group SG. Then when I do a search with 'Group != SG' then it returns all the 4 contacts. Shouldn't we expect only Contact A in the result list?

On debugging I found that at this query it excludes the smart group ID and eventually didn't build any where clause for the chosen smart group.

But the intent of this patch is to make = and != operator works like IN/NOT IN respectively for smart group, but the use-case described above bothers me to as it still includes the contact in the result list, which is in SG

@jitendrapurohit
Copy link
Contributor Author

jitendrapurohit commented Jun 22, 2018

Thanks @monishdeb, there seems to be multiple issues related to smart and regular group search. Not sure, but they seem to stem from work done on #9506 and #8645. Listing some of the issues that I encountered while fixing the above comment -

  • Group -> IS NULL will list contact C eventhough it has smart group SG associated with it.
  • Group -> NOT IN -> SG will lead to DB error if only one smart group is present in the database.
  • Group -> NOT NULL won't list Contact C in the results.
  • If database has multiple smart groups, searching for an SG will limit the resulting rows to itself. For eg If an SG2 is created and a search is made like Group -> != -> SG will only list SG2 contacts. Regular groups contacts are simply ignored. It used to be shown in 4.6 version.

I've updated the PR with the fixes for the above use-cases.

@monishdeb
Copy link
Member

monishdeb commented Jun 25, 2018

Considering A, B, C and D are 4 contacts, where Contact A and B is in regular group RG AND contact B and C is smart group SG, C and D are in second SG2. Here's my test cases after applying the patches:

  1. Group = IS NULL : Return all contacts which is not added in any group - correct
  2. Group = SG: Return B and C - correct
  3. Group != SG: Return A,B,C,D - is it expected? or as per the fix if a contact is ONLY present in SG then it should be excluded from the list BUT if the contact is present is more then 1 smart grouo that it is not? The underlying where clause is:
// here 6 is the group ID of SG
`civicrm_group_contact-6`.group_id != 6 OR  ( `civicrm_group_contact_cache_6`.group_id NOT IN ("6") )
  1. Group NOT IN (SG) : Return A,B,C,D produce same result as of case3. But there is difference in where clause fragment:
civicrm_group_contact-6`.group_id NOT IN ( 6 ) OR  ( `civicrm_group_contact_cache_6`.group_id NOT IN ("6") )

Here we should exclude the OR operator which is expensive in terms of performance and both the sub-clause is same.
5. Group = IS NOT NULL : Return only those contacts which are added in Group.

@jitendrapurohit so overall :

  1. Can you fix the where clause as reported in 4rth case.
  2. Test failure seems to be related, can you fix those 2?
  3. Can you extend your added UT to check with Contacts other then substring check for whereClause?

@jitendrapurohit jitendrapurohit changed the title dev/core#192 - Search builder fails for != smart group filter wip - dev/core#192 - Search builder fails for != smart group filter Jun 25, 2018
@jitendrapurohit jitendrapurohit force-pushed the core-192 branch 2 times, most recently from 2b231e0 to 60185cd Compare June 28, 2018 10:39
@jitendrapurohit
Copy link
Contributor Author

jitendrapurohit commented Jun 28, 2018

Updated the PR and handled the NOT operator clauses in the same way as we do for civicrm_group_contact table. Tested all the above scenarios and looks fine on my local. Also updated the unit test. Can you verify if it works for you as well. Thanks @monishdeb

@jitendrapurohit jitendrapurohit changed the title wip - dev/core#192 - Search builder fails for != smart group filter dev/core#192 - Search builder fails for != smart group filter Jun 28, 2018
@monishdeb
Copy link
Member

@jitendrapurohit it seems that the 1 test failure is related to this patch. Can you fix that?

@jitendrapurohit
Copy link
Contributor Author

jitendrapurohit commented Jun 28, 2018

weird, the failing test is the one added in this PR and it works fine on my local. Anyway, I've pushed a commit which should fix the same on jenkin too 👍

@eileenmcnaughton
Copy link
Contributor

test this please

@eileenmcnaughton
Copy link
Contributor

(re-running tests as 22 days ago)

@eileenmcnaughton
Copy link
Contributor

@monishdeb looks like this needs a final ok from you

@monishdeb
Copy link
Member

monishdeb commented Jul 20, 2018

@eileenmcnaughton I am not sure in this case, as for smart group, it adds a subquery when != or NOT IN operator is used, which will cause a performance issue. As per the patch when I do a search on criteria Group != Smart Group (id = 10), the where clause for groups built as:

...
WHERE (  (  ( `civicrm_group_contact-10`.group_id != 10 AND `civicrm_group_contact_cache_10`.group_id IS NULL OR  ( `civicrm_group_contact_cache_10`.contact_id NOT IN (SELECT contact_id FROM civicrm_group_contact_cache cgcc WHERE cgcc.group_id IN ( 10 ) ) )  )  )  ) 

The use of OR operator and subquery will cause a performance hit on large site. As I recall, I had a discussion with you earlier to have separate LEFT JOINS for smart and regular groups respectively and in this way we can resolved the performance issue for groups. So should we implement this change in core under this issue?

@eileenmcnaughton
Copy link
Contributor

Hmm - yeah - we should get some performance testing for sure! @seamuslee001 has a good site for testing group search performance on

@jackrabbithanna
Copy link
Contributor

I've tried this out on a client site which we had found the accuracy problem while testing a 4.6 -> 5.6 upgrade. Upgraded to 5.7 now and applied this patch, and it actually get's the right results and doesn't WSOD. Not exactly sure out this site compares to Seamus's site...but we got not quite 100K contacts, several hundred smart groups, smart groups based on activities, with millions of activities....and logging enabled...At the very least, this is better than it was with 5.6 and no patch!

@seamuslee001
Copy link
Contributor

@jackrabbithanna we have somewhere about 700K maybe of contacts about 1K of groups (we are nuts) but that being said @eileenmcnaughton i would recommend merging based on @jackrabbithanna 's testing

@eileenmcnaughton
Copy link
Contributor

@seamuslee001 Any chance you can give it a quick spin to double check?

Looking at the query I think the OR might not be the problem OR that gave all those WSODs in 4.6 - in that it is in the WHERE not the join and it's 2 fields on the same table.

@jackrabbithanna
Copy link
Contributor

jackrabbithanna commented Nov 13, 2018

I was testing with 4 criteria, using 3 there where contact = group, and one where contact != group..

All groups were smart groups. It was a little slow, takes about 220s to return the search results....so by better than not working, yeah, but pretty slow...I'll take really slow over nothing at all any day though.

All the groups use criteria looking for if a contact has an activity of a certain type, between certain activity_date_times , with certain activity status...

MySQL 5.5 / PHP 5.6 instance, plenty of memory, no memcache, no varnish, no redis...

@eileenmcnaughton eileenmcnaughton added the merge ready PR will be merged after a few days if there are no objections label Nov 13, 2018
@eileenmcnaughton
Copy link
Contributor

OK - I've added merge-ready - I'd LIKE @seamuslee001 to give it a spin BUT if I look at these criteria

  1. it has a unit test
  2. it is broken / never worked & this unbreaks it / makes it work
  3. it has been performance tested

Then I think there is a good case to merge. I'll just recheck string escaping now

@eileenmcnaughton eileenmcnaughton removed the merge ready PR will be merged after a few days if there are no objections label Nov 13, 2018
@eileenmcnaughton
Copy link
Contributor

actually let's just check that escaping some more - I think it could be 'better'

@eileenmcnaughton
Copy link
Contributor

@jitendrapurohit the test works :-)

@monishdeb
Copy link
Member

Jenkins test this please

@jitendrapurohit
Copy link
Contributor Author

jenkins, retest this please

$groupClause[] = " ( " . $this->addGroupContactCache($smartGroupIDs, NULL, "contact_a", $op) . " ) ";
$groupContactCacheClause = '';
if (count($smartGroupIDs) || empty($value)) {
$isNullOp = (strpos($op, 'NULL') !== FALSE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we not just doing

 ($op == 'IS NULL') 
  • we don't support people playing fast & loose with the $op var do we?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I see this variable is not required in this function after the latest changes. Pushed the commit to remove this line completely. Thanks for identifying.

if (!empty($groupContactCacheClause)) {
if ($isNotOp) {
$groupIds = implode(',', (array) $smartGroupIDs);
$gcTable = "`civicrm_group_contact-{$groupIds}`";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jitendrapurohit don't we still have appended input here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are still some places in civi where id is appended to the group table and was little sceptic about removing it here. But it looks like they are not dependent on changes made in this file so i've just removed it now. will wait for the test build to run clean this time too.

@jitendrapurohit
Copy link
Contributor Author

test this please

@jitendrapurohit
Copy link
Contributor Author

With the last change in removing id from civicrm_group_contact- - the test that fails is -

$group = civicrm_api3('Contact', 'get', [
    'sequential' => 1,
    'return' => ["group"],
    'group' => array(group_title => 1),
  ]);

i.e, searching for contact with group title should retrieve all the group for the contacts which was done with the additional alias added to the query.

The latest update to this PR adds uniqid to the table name in the from clause which fixes it and retrieves correct number of groups in the result.

@jitendrapurohit
Copy link
Contributor Author

jenkins, retest this please

@eileenmcnaughton
Copy link
Contributor

woot tests are passing !!

@jackrabbithanna do you want to give this a last spin & then we can merge if all good

@jackrabbithanna
Copy link
Contributor

Sorry, I just noticed you pinged me on this.....I'll test this over the weekend and let you know....

@jackrabbithanna
Copy link
Contributor

Tried this out on one of our dev servers of a site that uses groups heavily, seems accurate, and faster....we'll be testing this more as the client and us use it.

@eileenmcnaughton
Copy link
Contributor

@jackrabbithanna thanks - I'm going to merge this now since it has been tested by you more than once now & also has unit tests. It will be another 6 weeks before it goes out live - during which time you'll be testing further - so please let us know any more updates from your testing

@eileenmcnaughton eileenmcnaughton merged commit 4ac1772 into civicrm:master Dec 17, 2018
eileenmcnaughton added a commit to eileenmcnaughton/civicrm-core that referenced this pull request Feb 14, 2019
We seem to be checking if the cache needs a rebuild and IF SO doing 2 actions
1) rebuilding it
2) permitting access to contacts related to it

When really it should do only 1) above - this patch changes as such

This seems to be whackamole off civicrm#12344
going too far one way & civicrm#13448
fixing but adding this new variant
eileenmcnaughton added a commit to eileenmcnaughton/civicrm-core that referenced this pull request Feb 14, 2019
We seem to be checking if the cache needs a rebuild and IF SO doing 2 actions
1) rebuilding it
2) permitting access to contacts related to it

When really it should do only 1) above - this patch changes as such

This seems to be whackamole off civicrm#12344
going too far one way & civicrm#13448
fixing but adding this new variant
eileenmcnaughton added a commit to eileenmcnaughton/civicrm-core that referenced this pull request Feb 14, 2019
We seem to be checking if the cache needs a rebuild and IF SO doing 2 actions
1) rebuilding it
2) permitting access to contacts related to it

When really it should do only 1) above - this patch changes as such

This seems to be whackamole off civicrm#12344
going too far one way & civicrm#13448
fixing but adding this new variant
@eileenmcnaughton
Copy link
Contributor

This turned out tricksy - we also got #13662 - the underlying issue is poor test coverage of a fairly tricky bit of functionality. This change DID improve that & @seamuslee001 has promised to add follow up tests but hoping you can check the new PR @jitendrapurohit

@jitendrapurohit jitendrapurohit deleted the core-192 branch February 22, 2019 04:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants