Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Full-text search for Activities in Advanced Search form. #7833

Closed
wants to merge 3 commits into from

Conversation

quoma
Copy link

@quoma quoma commented Feb 21, 2016

Implemented the functionality to search activities by a text query and operator ("Contains", "Starts with", etc.) in Advanced Search form.

Search is performed only upon columns of civicrm_activity table, regardless of custom fields (which is the next step).
@civicrm-builder
Copy link

Can one of the admins verify this patch?

@quoma
Copy link
Author

quoma commented Feb 21, 2016

@civicrm-builder, it needs a code review

@colemanw
Copy link
Member

jenkins add to whitelist

@colemanw
Copy link
Member

@quoma - civicrm-builder (aka jenkins) is a bot ;)

@agh1
Copy link
Contributor

agh1 commented May 12, 2016

Jenkins, retest this please

@agh1
Copy link
Contributor

agh1 commented May 12, 2016

@quoma I'm going through a number of open PRs that may be considered for 4.7.8, and to be considered, this will need the tests to be passing. It looks like you've just got style problems such as lacking empty lines between the breaks within your switch, extra spaces, missing spaces, and so forth. Please read the checkstyle warnings (use the Details tab to see the actual issues and line numbers).

If you can commit your fixes in the next couple of days and the tests pass, we'll be able to consider it for merging this month.

@JoeMurray
Copy link
Contributor

Heh @quoma thanks for all your contributions. As Release Manager this month, I'm trying to recruit people to help pare down the backlog of almost 100 PRs, some going back to last summer. I'm wondering if you would be able to help QA another PR if I got someone to QA this PR?

@JoeMurray
Copy link
Contributor

@quoma could you create an issue in JIRA indicating what you intended this PR to do it would be easier . Thanks.for it to be QA'd.

@JoeMurray
Copy link
Contributor

@anuditverma would you be interested in QA'ing this PR?

@JoeMurray
Copy link
Contributor

@anuditverma, as you haven't responded, I'm going to have @bsilvern take a look at this PR. Let me know if you would have time to work on reviewing something next week in the release candidate.

@bsilvern
Copy link
Contributor

QA Review: This new feature creates in the Activities section of the Advanced Search page a mode selection drop-down with nine search modes and a search string text box, providing comprehensive text search capabilities for activities. I focused my testing only on activity types Phone calls, Emails, and Tell a friend (out of around 70 defined activity types) for 27 tests in total, in addition to additional random poking around.

Generally, this PR performs as expected, but I noted the following issues for which there is room for improvement:

  1. A Help popup should be added to explain:
    ● Which fields are searched in each activity type. Those seem to include subject and details, but not location. I did not attempt to test other fields than those three.
    ● Any known quirks or limitations (see below).
    ● A reminder of the RegEx characters which have special meaning and therefore need to be escaped might be useful.
  2. Searching with modes "Does not contain", "Does not start with" and "Does not end with": NULL text fields are not matched.
    Not sure if this is a bug or a feature, but it was not what I expected and should be explained in the Help.
  3. Searching with "Contains all the words" mode: Matches if all words appear anywhere within the searchable fields of an activity (not necessarily with all search word in a single field). Not unexpected, but not obvious. Should be mentioned in the Help.
  4. HTML fields, e.g. Details field of Email or Phone call activity, cause unexpected behavior in the following search modes:
    ● Starts with
    ● Does not start with
    ● Ends with
    ● Does Not End with
    ● RegEx expressions that involve ^ (beginning of string) or $ (end of string).
    This is due to the implicit <p> and </p> that surrounds each paragraph.
    I found that ^.{3}start_of_text did match in my test cases, but I was unable to explicitly match the HTML characters. Using ^<p>start_of_text, for example, resulted in no matches and ^&lt;p&gt;start_of_text was displayed in the search box after the search. I also tried ^\<p\>start_of_text, also receiving no match, and resulted in ^\&lt;p\&gt;start_of_text displayed in the search box.
    Similarly, I found that end_of_text.{6}$ matches, presumably due to the implicit </p>\r\n at the end of the string.
    Matching would similarly be expected to fail due to any other HTML tags embedded within the otherwise matching text.
    To add a layer of confusion to this, angle brackets in HTML are stored inconsistently in civicrm_activity. Assume CKeditor shows the text value in SOURCE mode as <p>Hello <strong>Bob</strong></p>.
    This is stored in civicrm_activity as <p>Hello &lt;strong&gt;Bob&lt;/strong&gt;</p>
    Note that while <p> is stored as <p>, <strong> is stored as &lt;strong&gt;. Argh!
  5. Other problematic characters: Apostrophes in HTML fields are not matched by an apostrophe in the search box as they are stored as &#39;. I did not test other non-alphanumeric characters. Matching on special characters could possibly be improved by a transliterating operation on the search string prior to the query, but this is complicated by the inconsistent encoding methods used in each field. In any case, any special characters which behave unexpectedly should be documented in the Help.
  6. "OR" mode: This PR did not function as I expected in OR mode, but then neither did version 4.7.8 without this PR. For example, with OR mode selected I searched activities for Subject=Bob, Has Followup Activity=No, Activity Status=Cancelled. I got 0 results returned which is not the expected result as there were activities which met some of the criteria. The search performed seems to be an AND of the criteria, despite the message provided:
    Activities without Followup Activities ...OR...
    Subject Like %Bob% ...OR...
    Activity Status In Cancelled
    Searching activities in OR mode therefore appears to be broken, and this prevented testing this PR in that mode.
    Submitted CRM-19010.

In conclusion, the primary issue with this PR is that searches on HTML text fields may return unexpected results due to the HTML tags and encoding of certain characters. Possible solutions include:
● Strip each field of tags and fix character encoding on the fly during the search. Probably a significant performance hit.
● Strip HTML tags and fix character encoding when saving activity details. I'm not sure what the implications of this might be. Does this field need to contain HTML?
● Add a details_text field to the table and a trigger to ensure that it stays sync'd with the details field. Would increase the DB size while providing little benefit to most users.
● Leave it as-is, but well-document the quirks of HTML searches in the Help. Most users of course will never read the Help, even when they don't get the expected results. Some may not even realize the results are wrong.

Note that these same issues currently exist in Full Text Search or Search Builder when performing a search on activity details.

@JoeMurray
Copy link
Contributor

@bsilvern could you clarify if all of the concerns you have regarding functionality are ones that are already present with Full Text Search and activity Search Builder results? If so, it might be useful to create a separate issue to modify the functionality for all of them together so that there are consistent results between different approaches.

@quoma would you be able to address the issues identified by @bsilvern in time for next month's merge? We're starting to identify potential PRs to merge in July, and it would be great to see this one included.

@bsilvern
Copy link
Contributor

@JoeMurray: Yes, HTML fields do present a problem when using any of the search methods.

Issue Description Advanced Search (PR-7833) Search Builder Full Text Search
1 Help Needed Yes, refers to Civi book Yes
2 "Does not contain", "Does not start with" and "Does not end with": Match Null fields No N/A N/A
3 "All the words": Matches if words appear in any searched field in any combination. Yes N/A N/A
4a RegEx: ^ and $: Match only with wildcards due to implicit HTML tags, e.g. <p> Yes Yes N/A
4b RegEx: Match fails due to HTML formatting tags, e.g. <strong> Yes Yes N/A
4c Like: Matches only with wildcards due to implicit HTML tags, e.g. <p> N/A Yes N/A
4d Like: Match fails due to HTML formatting tags, e.g. <strong> N/A Yes N/A
4e Other modes: Match fails due to HTML formatting tags, e.g. <strong> Yes (using "Contains" operator) Yes (using "=" operator) Yes
5 Apostrophes in HTML fields: Match fails Yes Yes Yes
6 OR mode performs as expected No Yes N/A

@JoeMurray
Copy link
Contributor

@colemanw As the person responsible for UX of CiviCRM in general, I'm wondering if you can comment on where we should go next on dealing with searches of html fields, based on @bsilvern's great summary here. It doesn't appear that @quoma is still engaged, despite the very useful functionality the PR provides.

@eileenmcnaughton
Copy link
Contributor

I think if it's not working, & not being worked on we should open a JIRA, link to this & note that there is code & discussion here & close the PR.

JIRA is the place to track work that is not on the verge of completion

@litespeedmarc
Copy link
Contributor

litespeedmarc commented Sep 26, 2016

Echo @eileenmcnaughton's comments. If we're not going to follow-up on items, then it's time to close this (there seem to be quite a few "almost complete" PRs waiting to get merged).

Created https://issues.civicrm.org/jira/browse/CRM-19416 for the original feature request, which is linked to the outstanding issue mentioned above, (https://issues.civicrm.org/jira/browse/CRM-19010)

Can an admin please close this?

@colemanw colemanw closed this Sep 30, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants