CRM-21753 add json validation for incoming criteria #11666

eileenmcnaughton · 2018-02-14T06:33:39Z

This is an adjunct to #11658 & intended to clarify discussion of the potential security aspect of processing input from the url. I will add the decision to that PR

the dedupe code has a mechanism to handle a nuanced array of criteria.

This PR is purely to discuss the passing of that criteria through the UI, with
a proposed validation rule.

The criteria in the url might look like

criteria={"contact" : {"first_name" :{"IN" : ["Peter", "Paul", "Mary"]]]}

This is passed to the contact api to limit the contacts to look for matches
for, prior to applying permissions (which is done when actually finding the
matches.

My specific concern is that the criteria is passed back to the template and
used in constructed urls - so I am looking to validate it on the way in
for xss in this PR.

CRM-21753: Support 'criteria' in url on batch merge form

eileenmcnaughton · 2018-02-14T10:22:07Z

test this please

monishdeb · 2018-02-22T09:08:38Z

@eileenmcnaughton I have tested and extended UT to include Json type check. Is there anything else that need to addressed before merging this PR?

eileenmcnaughton · 2018-02-22T20:19:18Z

Note from my point of view but I wanted to get some confirmation from security team that they agree in principle (@xurizaemon @totten @mlutfy @seamuslee001 ) - longer usage is in the linked PR but this is really covering it

totten · 2018-02-28T00:59:50Z

From a security perspective, the mere act of parsing JSON should be fairly safe. It's more akin to parse_str() or explode() than unserialize().

Note:

unserialize() is special because it accepts class names. Parsing JSON and XML and producing an array-tree (or stdClass-tree) is comparatively safe/simple.
I think there might once have been a security bug in php.net's implementation of json_decode(), but that would be a PHP platform issue and would be addressed by apt-get upgrade php-{foo}. Most PHP apps have public-facing end-points that parse JSON, and we already have several end-points which assume that php.net's json_decode() does its job safely.

The main security concerns of passing through JSON are subjective/contextual -- e.g. it would be a mistake for someone to say:

/* Line 1 */ $criteria = CRM_Utils_Request::retrieve('criteria', 'Json', $this, FALSE);
/* Line 2 */ civicrm_api3('Contact', 'get', $criteria)

The problem there isn't really with JSON parsing in line 1. Getting $criteria from parse_str() or $_GET or $_POST or SimpleXML would be just as problematic. The problem is that line 2 will pass anything to civicrm_api3() -- a dataflow like that requires more thoughtful validation or mapping, e.g. breaking down the keys/values that are allowed:

/* Line 1 */ $criteria = CRM_Utils_Request::retrieve('criteria', 'Json', $this, FALSE);
/* Line 2 */ civicrm_api3('Contact', 'get', array_intersect_key($criteria, ['first_name'=>1, 'last_name'=>1, 'contact_id'=>1]))

eileenmcnaughton · 2018-02-28T01:15:58Z

Hmm - what is the issue with 'passing anything' to the api? I guess the main concern would be to block any chaining. Ie. any criteria that alter the get in this context is fine (& permissions are applied later in the dupe calculation). So I feel like this is mergeable but in the larger PR (#11658) I should ensure any chaining fields are culled before calling

totten · 2018-02-28T03:54:11Z

...the main concern would be to block any chaining...

Right.

I'd steer towards whitelisting fields/notations rather blacklisting fields/notations because it's hard to anticipate what notations might be supported in future revisions of the API framework.

Tangentially, I don't know how much of a system you need for that use-case. The array_intersect_key() sounds promising+simple, but there's also a more sophisticated helper for validating an API call: Civi\API\WhitelistRule.

$apiRequest = \Civi\API\Request::create($entity, $action, $params);
$rule = new \Civi\API\WhitelistRule(/* see docblock */);
echo $rule->matches($apiRequest) ? "Good to go" : "No go";

So I feel like this is mergeable but in the larger PR (#11658) I should ensure any chaining fields are culled before calling

Makes sense.

eileenmcnaughton · 2018-02-28T06:50:17Z

Hmm - I specifically don't want to be whitelisting because I think there is a wide array of criteria you might want to use. But I think I can block anything that starts with 'api' & I'd be happy to block anything with a '.' in it - that would block any joins. I could also block any params that pass in arrays where the key is not an accepted sql operator. That covers what I can think of that would involve chaining. Being the contact api you could still get away with doing a search for contacts with contributions I think but if not then we can cross that when the demand arises

totten · 2018-03-02T06:41:41Z

All the changes in CRM/Utils/* seem OK to merge.

All those heuristics in your last comment sound plausible, but I wanted to check... in this case (CRM/Contact/Page/DedupeFind.php), do they give adequate protection? We can trace the path down, e.g.

CRM_Contact_Page_DedupeFind parses$criteria...
And passes it to CRM_Dedupe_Merger::batchMerge() ...
Then ::getDuplicatePairs() ...
And eventually CRM_Core_BAO_PrevNextCache::refillCache() ...
Which would indeed call civicrm_api3('Contact','get',...$criteria...).

To do any of this, one needs to get on the page CRM_Contact_Page_DedupeFind, which requires the permission merge duplicate contacts. How much information should be available to someone with merge duplicate contacts?

You might think merge duplicate contacts means "merge duplicate contacts anywhere in the database". (Why go this way? Because the user needs full access to the records to make good decisions about deduping/merging.) In this case, we'd handle $criteria rather permissively. (The $criteria should be validated to avoid API-chaining, but filtering with API-joining might be useful.)
You might think merge duplicate contacts means "merge duplicate contacts that are otherwise readable&writable based on my level of authorization". (Why? Because the user works for subdivision in a large org, and it wouldn't be practical to require all deduping be done by the head office.) In this case, we'd handle $criteria more restrictively. (The Contact.get query should be permission-checked, and the $criteria shouldn't allow you to identify entities or contacts that you can't see otherwise.)

I can't really say if interpretation #1 or #2 would have more appeal among admins, but... (please check my logic here, because I haven't looked at this code as much as you).... based on skimming the pre-existing code... it feels likely the existing behavior is more akin to #1? When refillCache() calls civicrm_api3(), it already returns a list of all contact IDs by default (with $criteria==array()). That sounds permissive -- passing $criteria is a way to voluntarily narrow a pool for which you already have visibility. In which case, adding lots of weird criteria is fine.

eileenmcnaughton · 2018-03-02T07:19:02Z

#2 is the way it is currrently - ie. AFTER the api call a list of mergeable contacts is generated & ACLs are applied to both sides of this list (ie. contact_a & contact_b). Adding 'check_permissions' to the api call would not add more permissioning but I am inclined to do it because

a) it would be clearer to someone reading the code and
b) it might be more performant to reduce the list earlier rather than later.

Regarding preventing chaining - the main risk would be that someone redirected someone to a url that has a chained call in it. I'm inclined to add an api option 'no_chaining' - it seems easier at the calling end to add that parameter & have it interpreted in the api than to code up the heuristics in some helper function that the calling function calls before calling the api.

eileenmcnaughton · 2018-03-23T06:03:17Z

@totten I've updated the filter at the point where the api call is actioned per discussion. Can we merge this now (it won't actually do anything without the larger PR but with this merged I can get the review of the larger PR to focus on the functionality & the code that really relates to that issue)

Per civicrm#11658 the dedupe code has a mechanism to handle a nuanced array of criteria. This PR is purely to support the passing of that json through the url criteria with a proposed validation rule. The criteria in the url might look like criteria={"contact" : {"first_name" :{"IN" : ["Peter", "Paul", "Mary"]]]} This is passed to the contact api to limit the contacts to look for matches for (before actually finding the matches). In discussion the biggest risk seemed to be the possibility of chaining so I have added a filter on the criteria & set check_permissions (acls are applied in the next step but having it on the api call is clearer & may reduce matches for the next step).

eileenmcnaughton · 2018-04-04T08:00:34Z

@monishdeb discussion on security channel has been from @seanmadsen

"Hmmm. My initial gut read is that that's probably pretty safe
If I get some time later this week, I'll see if I can find a way to break it. "

the concerns raised have been about the existing XSS String validation - in this context that is ADDITIONAL security - ie the key security is

api filtering - the only way the variables are used is by passing them to the api - which does it's own filtering AFTER filtering off any params that are not specifically in getfields
the output url has the criteria string appended, but passed through htmlentities to escape " & < & >
Also note it is json decoded & re-encoded

I think that we should merge & if @seanmadsen can find time to try to break the rc we can fix it in the rc since this has kicked a lot of cans without an actual objection

monishdeb · 2018-04-04T09:09:08Z

Ok tested it again on my local and also it got UT now. Also we got feedback from @totten I am going to merge this PR. If there is any issue/security concern like already discussed, the PR fix would be submitted against rc.

(CiviCRM Review Template WORD-1.1)

(r-explain) Pass
(r-test) Pass
(r-code) Pass
(r-doc) Pass
(r-maint) Pass
(r-run) Pass (confirmed on my local)
(r-user) Pass
(r-tech) Pass

civicrm-builder added the master label Feb 14, 2018

eileenmcnaughton force-pushed the validate_json branch from 185846c to 3cb1096 Compare March 23, 2018 06:00

eileenmcnaughton changed the title ~~[discussion] CRM-21753 proposal for json validation~~ CRM-21753 add json validation for incoming criteria Mar 23, 2018

eileenmcnaughton force-pushed the validate_json branch from 3cb1096 to 8825143 Compare March 27, 2018 07:01

monishdeb merged commit 33e74ac into civicrm:master Apr 4, 2018

eileenmcnaughton deleted the validate_json branch April 4, 2018 09:48

monishdeb mentioned this pull request Apr 16, 2018

CRM-21753 add pass-through support for criteria in urls on dedupe pages #11658

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CRM-21753 add json validation for incoming criteria #11666

CRM-21753 add json validation for incoming criteria #11666

eileenmcnaughton commented Feb 14, 2018 •

edited by civicrm-builder

Loading

eileenmcnaughton commented Feb 14, 2018

monishdeb commented Feb 22, 2018

eileenmcnaughton commented Feb 22, 2018

totten commented Feb 28, 2018

eileenmcnaughton commented Feb 28, 2018

totten commented Feb 28, 2018

eileenmcnaughton commented Feb 28, 2018

totten commented Mar 2, 2018 •

edited

Loading

eileenmcnaughton commented Mar 2, 2018

eileenmcnaughton commented Mar 23, 2018

eileenmcnaughton commented Apr 4, 2018 •

edited

Loading

monishdeb commented Apr 4, 2018 •

edited

Loading

CRM-21753 add json validation for incoming criteria #11666

CRM-21753 add json validation for incoming criteria #11666

Conversation

eileenmcnaughton commented Feb 14, 2018 • edited by civicrm-builder Loading

eileenmcnaughton commented Feb 14, 2018

monishdeb commented Feb 22, 2018

eileenmcnaughton commented Feb 22, 2018

totten commented Feb 28, 2018

eileenmcnaughton commented Feb 28, 2018

totten commented Feb 28, 2018

eileenmcnaughton commented Feb 28, 2018

totten commented Mar 2, 2018 • edited Loading

eileenmcnaughton commented Mar 2, 2018

eileenmcnaughton commented Mar 23, 2018

eileenmcnaughton commented Apr 4, 2018 • edited Loading

monishdeb commented Apr 4, 2018 • edited Loading

eileenmcnaughton commented Feb 14, 2018 •

edited by civicrm-builder

Loading

totten commented Mar 2, 2018 •

edited

Loading

eileenmcnaughton commented Apr 4, 2018 •

edited

Loading

monishdeb commented Apr 4, 2018 •

edited

Loading