Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CRM-21753 add json validation for incoming criteria #11666

Merged
merged 1 commit into from
Apr 4, 2018

Conversation

eileenmcnaughton
Copy link
Contributor

@eileenmcnaughton eileenmcnaughton commented Feb 14, 2018

This is an adjunct to #11658 & intended to clarify discussion of the potential security aspect of processing input from the url. I will add the decision to that PR

the dedupe code has a mechanism to handle a nuanced array of criteria.

This PR is purely to discuss the passing of that criteria through the UI, with
a proposed validation rule.

The criteria in the url might look like

criteria={"contact" : {"first_name" :{"IN" : ["Peter", "Paul", "Mary"]]]}

This is passed to the contact api to limit the contacts to look for matches
for, prior to applying permissions (which is done when actually finding the
matches.

My specific concern is that the criteria is passed back to the template and
used in constructed urls - so I am looking to validate it on the way in
for xss in this PR.


@eileenmcnaughton
Copy link
Contributor Author

test this please

@monishdeb
Copy link
Member

@eileenmcnaughton I have tested and extended UT to include Json type check. Is there anything else that need to addressed before merging this PR?

@eileenmcnaughton
Copy link
Contributor Author

Note from my point of view but I wanted to get some confirmation from security team that they agree in principle (@xurizaemon @totten @mlutfy @seamuslee001 ) - longer usage is in the linked PR but this is really covering it

@totten
Copy link
Member

totten commented Feb 28, 2018

From a security perspective, the mere act of parsing JSON should be fairly safe. It's more akin to parse_str() or explode() than unserialize().

Note:

  • unserialize() is special because it accepts class names. Parsing JSON and XML and producing an array-tree (or stdClass-tree) is comparatively safe/simple.
  • I think there might once have been a security bug in php.net's implementation of json_decode(), but that would be a PHP platform issue and would be addressed by apt-get upgrade php-{foo}. Most PHP apps have public-facing end-points that parse JSON, and we already have several end-points which assume that php.net's json_decode() does its job safely.

The main security concerns of passing through JSON are subjective/contextual -- e.g. it would be a mistake for someone to say:

/* Line 1 */ $criteria = CRM_Utils_Request::retrieve('criteria', 'Json', $this, FALSE);
/* Line 2 */ civicrm_api3('Contact', 'get', $criteria)

The problem there isn't really with JSON parsing in line 1. Getting $criteria from parse_str() or $_GET or $_POST or SimpleXML would be just as problematic. The problem is that line 2 will pass anything to civicrm_api3() -- a dataflow like that requires more thoughtful validation or mapping, e.g. breaking down the keys/values that are allowed:

/* Line 1 */ $criteria = CRM_Utils_Request::retrieve('criteria', 'Json', $this, FALSE);
/* Line 2 */ civicrm_api3('Contact', 'get', array_intersect_key($criteria, ['first_name'=>1, 'last_name'=>1, 'contact_id'=>1]))

@eileenmcnaughton
Copy link
Contributor Author

Hmm - what is the issue with 'passing anything' to the api? I guess the main concern would be to block any chaining. Ie. any criteria that alter the get in this context is fine (& permissions are applied later in the dupe calculation). So I feel like this is mergeable but in the larger PR (#11658) I should ensure any chaining fields are culled before calling

@totten
Copy link
Member

totten commented Feb 28, 2018

...the main concern would be to block any chaining...

Right.

I'd steer towards whitelisting fields/notations rather blacklisting fields/notations because it's hard to anticipate what notations might be supported in future revisions of the API framework.

Tangentially, I don't know how much of a system you need for that use-case. The array_intersect_key() sounds promising+simple, but there's also a more sophisticated helper for validating an API call: Civi\API\WhitelistRule.

$apiRequest = \Civi\API\Request::create($entity, $action, $params);
$rule = new \Civi\API\WhitelistRule(/* see docblock */);
echo $rule->matches($apiRequest) ? "Good to go" : "No go";

So I feel like this is mergeable but in the larger PR (#11658) I should ensure any chaining fields are culled before calling

Makes sense.

@eileenmcnaughton
Copy link
Contributor Author

Hmm - I specifically don't want to be whitelisting because I think there is a wide array of criteria you might want to use. But I think I can block anything that starts with 'api' & I'd be happy to block anything with a '.' in it - that would block any joins. I could also block any params that pass in arrays where the key is not an accepted sql operator. That covers what I can think of that would involve chaining. Being the contact api you could still get away with doing a search for contacts with contributions I think but if not then we can cross that when the demand arises

@totten
Copy link
Member

totten commented Mar 2, 2018

All the changes in CRM/Utils/* seem OK to merge.

All those heuristics in your last comment sound plausible, but I wanted to check... in this case (CRM/Contact/Page/DedupeFind.php), do they give adequate protection? We can trace the path down, e.g.

  • CRM_Contact_Page_DedupeFind parses$criteria...
  • And passes it to CRM_Dedupe_Merger::batchMerge() ...
  • Then ::getDuplicatePairs() ...
  • And eventually CRM_Core_BAO_PrevNextCache::refillCache() ...
  • Which would indeed call civicrm_api3('Contact','get',...$criteria...).

To do any of this, one needs to get on the page CRM_Contact_Page_DedupeFind, which requires the permission merge duplicate contacts. How much information should be available to someone with merge duplicate contacts?

  1. You might think merge duplicate contacts means "merge duplicate contacts anywhere in the database". (Why go this way? Because the user needs full access to the records to make good decisions about deduping/merging.) In this case, we'd handle $criteria rather permissively. (The $criteria should be validated to avoid API-chaining, but filtering with API-joining might be useful.)

  2. You might think merge duplicate contacts means "merge duplicate contacts that are otherwise readable&writable based on my level of authorization". (Why? Because the user works for subdivision in a large org, and it wouldn't be practical to require all deduping be done by the head office.) In this case, we'd handle $criteria more restrictively. (The Contact.get query should be permission-checked, and the $criteria shouldn't allow you to identify entities or contacts that you can't see otherwise.)

I can't really say if interpretation #1 or #2 would have more appeal among admins, but... (please check my logic here, because I haven't looked at this code as much as you).... based on skimming the pre-existing code... it feels likely the existing behavior is more akin to #1? When refillCache() calls civicrm_api3(), it already returns a list of all contact IDs by default (with $criteria==array()). That sounds permissive -- passing $criteria is a way to voluntarily narrow a pool for which you already have visibility. In which case, adding lots of weird criteria is fine.

@eileenmcnaughton
Copy link
Contributor Author

#2 is the way it is currrently - ie. AFTER the api call a list of mergeable contacts is generated & ACLs are applied to both sides of this list (ie. contact_a & contact_b). Adding 'check_permissions' to the api call would not add more permissioning but I am inclined to do it because

a) it would be clearer to someone reading the code and
b) it might be more performant to reduce the list earlier rather than later.

Regarding preventing chaining - the main risk would be that someone redirected someone to a url that has a chained call in it. I'm inclined to add an api option 'no_chaining' - it seems easier at the calling end to add that parameter & have it interpreted in the api than to code up the heuristics in some helper function that the calling function calls before calling the api.

@eileenmcnaughton eileenmcnaughton changed the title [discussion] CRM-21753 proposal for json validation CRM-21753 add json validation for incoming criteria Mar 23, 2018
@eileenmcnaughton
Copy link
Contributor Author

@totten I've updated the filter at the point where the api call is actioned per discussion. Can we merge this now (it won't actually do anything without the larger PR but with this merged I can get the review of the larger PR to focus on the functionality & the code that really relates to that issue)

Per civicrm#11658

the dedupe code has a mechanism to handle a nuanced array of criteria.

This PR is purely to support the passing of that json through the url criteria  with
a proposed validation rule.

The criteria in the url might look like

criteria={"contact" : {"first_name" :{"IN" : ["Peter", "Paul", "Mary"]]]}

This is passed to the contact api to limit the contacts to look for matches
for (before actually finding the matches). In discussion the biggest risk seemed to
be the possibility of chaining so I have added a filter on the
criteria & set check_permissions (acls are applied in the next step but
having it on the api call is clearer & may reduce matches for the next step).
@eileenmcnaughton
Copy link
Contributor Author

eileenmcnaughton commented Apr 4, 2018

@monishdeb discussion on security channel has been from @seanmadsen

"Hmmm. My initial gut read is that that's probably pretty safe
If I get some time later this week, I'll see if I can find a way to break it. "

the concerns raised have been about the existing XSS String validation - in this context that is ADDITIONAL security - ie the key security is

  1. api filtering - the only way the variables are used is by passing them to the api - which does it's own filtering AFTER filtering off any params that are not specifically in getfields
  2. the output url has the criteria string appended, but passed through htmlentities to escape " & < & >
  3. Also note it is json decoded & re-encoded

I think that we should merge & if @seanmadsen can find time to try to break the rc we can fix it in the rc since this has kicked a lot of cans without an actual objection

@monishdeb
Copy link
Member

monishdeb commented Apr 4, 2018

Ok tested it again on my local and also it got UT now. Also we got feedback from @totten I am going to merge this PR. If there is any issue/security concern like already discussed, the PR fix would be submitted against rc.

(CiviCRM Review Template WORD-1.1)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants