Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Detector for Softlayer tokens #254

Merged
merged 10 commits into from
Oct 24, 2019

Conversation

killuazhu
Copy link
Contributor

Add Detector for Softlayer tokens, it supports looking for 2 factors: username and password. It also supports verification.

justineyster and others added 9 commits October 18, 2019 09:12
* SoftLayerDetector

Detects Softlayer API Keys

Supports git-defenders/detect-secrets-discuss#120

* Lines were too long

* Address PR comments 1

* Address PR comments 2
* Strengthen SoftLayer regex by including "password" keywords

Noticed the legacy crawler searched for the words `password`, `pwd`, `pass` in addition to `key`. Adapted our solution to follow.

* Var name update

* Add additional assign operator from crawler, too
* get_username function

* Adapt regex

* Add username regex

* Complete 'get_username()' function

* Softlayer verify function

* First attempt at tests

* Finished writing tests

* Address @edwarj2 PR comments

* Address PR comments 1

* Address PR comments 2

* Address PR comments 3

* Valid if status_code=200
* feat: use assign regex in cloudant

* feat: use assign regex in db2

* feat: use assign regex in gh

* feat: use assign regex in iam

* feat: use assign regex in sl

* address comments

* address comments

* address comments
@killuazhu killuazhu force-pushed the kyle-contribute-softlayer-filtered branch from 0378f90 to 8b8b9d7 Compare October 18, 2019 15:23
@KevinHock KevinHock self-requested a review October 23, 2019 17:52
Copy link
Collaborator

@KevinHock KevinHock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late review 👍

Comment on lines +272 to +278
begin = r'(?:(?<=\W)|(?<=^))'
opt_quote = r'(?:"|\'|)'
opt_open_square_bracket = r'(?:\[|)'
opt_close_square_bracket = r'(?:\]|)'
opt_dash_undrscr = r'(?:_|-|)'
opt_space = r'(?: *)'
assignment = r'(?:=|:|:=|=>| +|::)'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++, great use of non-capturing groups! (https://stackoverflow.com/questions/3512471/what-is-a-non-capturing-group-in-regular-expressions)
I should refactor some of our existing code to use those

@staticmethod
def assign_regex_generator(prefix_regex, password_keyword_regex, password_regex):
"""Generate assignment regex
It read 3 input parameters, each stands for regex. The return regex would look for
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
It read 3 input parameters, each stands for regex. The return regex would look for
It reads 3 input parameters, each stands for regex. The return regex would look for

@@ -260,6 +260,39 @@ class FooDetector(RegexBasedDetector):
def denylist(self):
raise NotImplementedError

@staticmethod
def assign_regex_generator(prefix_regex, password_keyword_regex, password_regex):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super nit: Maybe a more genericized naming could be keyword_regex(/secret_keyword_regex) and secret_regex, for cases like e.g. password_regex=username vs. secret_regex=username.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed password_keyword_regex -> secret_keyword_regex, password_regex -> secret_regex. I have left out the prefix_regex. Let me know if that's what you mean.

def find_username(content):
# opt means optional
username_keyword = r'(?:username|id|user|userid|user-id|user-name|' + \
r'name|user_id|user_name|uname)'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
r'name|user_id|user_name|uname)'
r'(?:'
r'username|id|user|userid|user-id|user-name|'
r'name|user_id|user_name|uname'
r')'
)

Copy link
Collaborator

@KevinHock KevinHock Oct 23, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, GitHub suggestions merged my line 46 and 47 suggestions :/ Mainly meant to say I prefer parenthesis over \, feel free to ignore though. Oh wait never mind, it is below.


def find_username(content):
# opt means optional
username_keyword = r'(?:username|id|user|userid|user-id|user-name|' + \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
username_keyword = r'(?:username|id|user|userid|user-id|user-name|' + \
username_keyword = (

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Nit: Just like to do () over \, other people don't, feel free to ignore.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using () looks good. I will update.

return VerifiedResult.VERIFIED_TRUE
else:
return VerifiedResult.VERIFIED_FALSE
except Exception:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can make this more specific, like a tuple of specific exceptions the requests.get call can throw, or just requests.exceptions.RequestException. (Up to you if you want to try retrying on e.g. requests.exceptions.Timeout)

To improve readability, since other lines won't raise exceptions, maybe we can only wrap the requests.get call with try/except.

For posterity, this made me think about if we want to do this for other plugins like Slack Stripe etc. 🤔 , I think I'm okay with, on the off-chance the Slack API is down when someone runs detect-secrets, that they get an exception and try running d-s later. (or having the user knowingly choose --no-verify, if the API is down). Just so there are not verifiability false-negatives in very rare cases.
On the other hand, we may have horrible luck if one hard-coded endpoint in d-s changes and always causes a requests exception, I'm kind of okay with taking that risk as it seems quite unlikely things wouldn't just 404 in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

call can throw, or just requests.exceptions.RequestException.

Good point.

(Up to you if you want to try retrying on e.g. requests.exceptions.Timeout)

I will leave retry out.

For posterity, this made me think about if we want to do this for other plugins like Slack Stripe etc

Wrapping some exception would be necessary, otherwise with current code, if Slack, Stripe or AWS verification endpoint is down or unreachable for whatever reason, we will throw exception instead of gracefully report unverified. As you said, it can force user to use --no-verify to keep using d-s.

@@ -9,3 +9,4 @@ pyyaml
tox>=3.8
tox-pip-extensions
unidiff
responses
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe put after pyyaml to keep the requirements alphabetical.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adjusted.

),
),
)
def test_find_username(self, content, expected_output):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just in case you find that this type of secret is used in function calls, the following commit may help you add detection for that. (Code still in review -- c6037d9)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestions. @KevinHock. Unlike AWS tokens which has a specific format, the Softlayer username here is almost free format. It might be hard to match them in function calls. I will prefer to leave them out for now.

@killuazhu
Copy link
Contributor Author

@KevinHock all comments addressed.

@KevinHock KevinHock merged commit 133ad79 into Yelp:master Oct 24, 2019
killuazhu pushed a commit to IBM/detect-secrets that referenced this pull request May 28, 2020
killuazhu pushed a commit to IBM/detect-secrets that referenced this pull request Jul 9, 2020
killuazhu pushed a commit to IBM/detect-secrets that referenced this pull request Sep 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants