Skip to content
This repository has been archived by the owner on Aug 19, 2022. It is now read-only.

Disallow reserved chars on id fields #529

Merged
merged 18 commits into from
May 11, 2018

Conversation

cianfoley-nearform
Copy link
Contributor

@cianfoley-nearform cianfoley-nearform commented May 3, 2018

I've enforced all ids to take the format [A-Za-z0-9-]+ enforced in both database (to enforce for imports) and interface

It was problematic with ltree column as that only allows [A-Za-z0-9_]+

Our autogenerated ids returned back uuids with - and so that lead to inconsistencies between entities
(teams autogenerted ids were replacing - with _ before insertion)

Rather than having inconsistencies i've enforced all ids to one format, and before insertion it translates - to _ for ltree paths, and when returning paths to interface translates back to - (done in mapping)

The reason I did not allow both characters on input is because ids could clash if we convert all - to _ and also allow _ on input

This makes for nice looking urls and protects from sql injection (a lot of sql injection tests were nullified with these input constraints)

Copy link
Contributor

@ShogunPanda ShogunPanda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Contributor

@dberesford dberesford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The concern I have about this is that we're restricting what id's people can use in Udaru, where these id's map to ids of external systems. For example, user ids that come from an identity provider which are then used in Udaru and provide the mapping back to the real user. We shouldn't be enforcing that these id's are now possibly invalid, as then you can't use udaru.
@mihaidma can you check if that would be the case on your project for example?

@mihaidma
Copy link
Contributor

mihaidma commented May 4, 2018

Regarding the ids that i saw used in solutions: string ids, Auth0 has ids in the form auth0|5ad9b4708036f90f3cf2929f, Okta has string ids, there are also UUIDs in the form 9c5cbc3a-f7e9-4400-7f90-5d47e9a27a8e

@mihaidma
Copy link
Contributor

mihaidma commented May 4, 2018

@cianfoley-nearform so i spot one issue with the Auth0 ids that contain a |
I would go with the minimal needed restrictions on characters

@cianfoley-nearform
Copy link
Contributor Author

Fair points guys, but we restrict team IDs already to uuid with _ instead of dashes (nothing else can go in there because of the way paths work.

If we have a restriction on team ids why not on other entities? should they not be consistent?

@cianfoley-nearform
Copy link
Contributor Author

See david's comment on best practice here: #523

Here's the link https://www.owasp.org/index.php/Input_Validation_Cheat_Sheet#Whitelisting_vs_blacklisting

based on this we should look to see what characters we should allow... i've often seen {} around uuids, and I in oath are 2 candidates, bear in mind that we won't be able to use these on team_ids

So, the question is do we we allow a free for all (minus url reserved characters that would affect the route), or do we whitelist and then have inconsistency between teams and other entities?

If there is a link to id in an external system, the metadata field could be used for the external id, or the external system could maintain a mapping... we currently don't store metadata on policies however, which I restricted in this PR

@mihaidma
Copy link
Contributor

mihaidma commented May 7, 2018

I would go with accepting all chars excepting the ones that are restricted by html or ltree but would not expand the limitations just for consistency reasons:

  • on teams limit to ltree allowed characters
  • on others limit to characters with no html meaning

@cianfoley-nearform
Copy link
Contributor Author

cianfoley-nearform commented May 8, 2018

ok if we go with whitelisting, which is the correct approach, we need to define what is allowed, it could be a long list, unless we stick with the RFCs unreserved characters

From RFC: http://www.ietf.org/rfc/rfc3986.txt

2.3.  Unreserved Characters

   Characters that are allowed in a URI but do not have a reserved
   purpose are called unreserved.  These include uppercase and lowercase
   letters, decimal digits, hyphen, period, underscore, and tilde.

      unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"

except on team we only allow _

this does not allow for '|', google does recommend having pipe

do we want to allow others that require encoding?

* update of swagger json

* latest swagger-ui-dist

* swagger-gen removed
* policy assignment must use objects

* label update for swagger docs
Copy link
Contributor

@dberesford dberesford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@coveralls
Copy link

coveralls commented May 10, 2018

Coverage Status

Coverage increased (+0.2%) to 93.174% when pulling 4f4b033 on disallow-reserved-chars-on-id-fields into 28414a3 on master.

@dberesford dberesford merged commit 5533272 into master May 11, 2018
@dberesford dberesford deleted the disallow-reserved-chars-on-id-fields branch May 11, 2018 08:37
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants