-
Notifications
You must be signed in to change notification settings - Fork 482
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New keywords in the denylist #430
Conversation
Hi @domanchi, what do you think about that? Thank you! |
From my new testing system, I found a good chunk more results with this change (2 unique results -> 11 unique results). From these additional results, I got only two false positives:
Strangely, it was able to find
Pretty good so far -- I wonder whether we should/can do anything about these FPs? They seem like regex issues, rather than candidates for additional filtering. |
Hi @domanchi, I tested your lines and I get only the real secrets: I also include the test file that I used to check your comment: test.txt. Can you check if this test file is correct and can you test again if you get the same results? Thank you very much in advance. I can understand the detection of these false positives because the presence of keywords like Thank you, we are waiting your feedback about this tests! |
Try without the $ cat false-positives
data = JSON.stringify {service_key: pagerDutyServiceApiKey, event_type: "trigger", description: description}
$this->privateKey = new PrivateKey($keys['privatekey']); |
Okey I will test it, thank you! |
Ohh! I think I know what's up. You might have the
|
Ohh you are right, the I'm not sure about how to apply a filter to exclude this results because I think that this cases are very specific of some languages, but in the other cases won't make sense. The first idea that i had was filter the lines with In the If you have any better idea, please tell us. If not, can you give us some feedback about this? Thank you very much! |
I think this is partially due to the introduction of spaces in a previous PR. After all, if we ignore the spaces, then a case could be made that this is behaving "as expected", since: service_key: pagerDutyServiceApiKey and $this->privateKey = new would have triggered the KeywordDetector (as I understand it). In this perspective, this issue lay dormant, mainly because our keyword list was short. Perhaps we should invert our "default" logic: given that we want to support spaces by default in our "secret values", we should require quotes by default (and have other filetypes map to no quotes required, like |
I was thinking in the spaces problem and in these false positives, and I think that there is not easy solution for the But in the case of the If you think that it is not a valid solution, we can implement a filter to detect the secrets that includes a comma following by a white space ( Please, share your opinion about this proposal @domanchi @syn-4ck . |
After looking at the experiment results, I think that this inversion makes a lot of sense. It reduced the total number of KeywordDetector unique results by 147 (out of 2631, ~5.5%), and almost all of the results that were eliminated were false positives. Some examples include: scss: .#{$fa-css-prefix}-user-secret:before { content: $fa-var-user-secret; } coffee: API_KEY = process.env.API_KEY python: def get_or_create_user(self, username, request=None, password=None): rst: .. _Bitbucket app passwords: https://bitbucket.org/account/admin/app-passwords ipynb: "os.environ['AWS_SECRET_ACCESS_KEY'] = creds.secret_key" php: final SecretKeySpec secret_key = new SecretKeySpec(asciiCs.encode(secret).array(), "HmacSHA256"); jsx: const { ResetPasswordPage: componentState } = state.components; Situations where it missed includes:
This is what I propose:
If you think about it, I think it makes sense. Many programming languages make a distinction between strings and variables, and we only want to flag the former. |
Hi @domanchi, we discussed your proposal and it makes sense for us. We can implement it today, do you prefer an independent Pull Request for that? In this way, this Pull Request can be merged and the change in the default regex of the KeywordDetector can be tested and modified in the new one. Please, tell me your preference to contributing with this change. Thank you for your great idea! |
Sure, that sounds good. |
Co-authored-by: detect-secrets-updater <detect-secrets-updater@ibm.com>
This Pull Request includes new items in the
DENYLIST
variable of theKeywordDetector
plugin. Moreover, It also removes the following words:aws_secret_access_key
: it will be detected due to the keywordsecret
. Theaws_
section will be captured as a prefix and_access_key
as a suffix.secrete
: it will be detected due to the keywordsecret
and the charactere
will be captured as a suffix.All the tests has been passed and I think the performance is not affected by this changes.