YAML files don't support inline whitelisting #50

Merged 4 commits on Jul 13, 2018

Conversation

LouisTrezzini
Contributor

No description provided.

@domanchi
Contributor

domanchi commented Jul 5, 2018

Does this only fail with YAML files? Or does it fail on other file types too?

@LouisTrezzini
Contributor Author

It works fine with Python files.

@domanchi
Contributor

domanchi commented Jul 5, 2018

@LouisTrezzini, please open an issue to report this bug, or complete this pull request to fix it. Pull requests should be used for code that is meant to be merged into the master branch.

@domanchi changed the title from "High entropy string should be whitelist-able" to "YAML files don't support inline whitelisting" on Jul 5, 2018
@@ -38,17 +38,17 @@ def analyze_string(self, string, line_num, filename): # pragma: no cover

NOTE: line_num and filename are used for PotentialSecret creation only.
"""
-        pass
+        raise NotImplementedError
Contributor

This isn't really needed, because abc.abstractmethod will prevent the class from even being instantiated.
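
For context, a minimal sketch of the behavior this comment relies on. It assumes the base class uses abc.ABCMeta as its metaclass; BasePlugin, CompletePlugin, and can_instantiate are hypothetical names for illustration, not the project's own classes.

```python
import abc


class BasePlugin(abc.ABC):
    """Hypothetical stand-in for the plugin base class under review."""

    @abc.abstractmethod
    def analyze_string(self, string, line_num, filename):
        """Subclasses must override this method."""


class CompletePlugin(BasePlugin):
    """Hypothetical subclass that provides the required override."""

    def analyze_string(self, string, line_num, filename):
        return {}


def can_instantiate(cls):
    """Return True if cls() succeeds, False if it raises TypeError."""
    try:
        cls()
    except TypeError:
        return False
    return True


# The abstract method is not overridden, so instantiation is refused.
base_ok = can_instantiate(BasePlugin)
# The override is present, so instantiation succeeds.
complete_ok = can_instantiate(CompletePlugin)
```

One caveat: a subclass can still reach the abstract body through super(), so raising NotImplementedError there only changes behavior in that case.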

filename,
),
)
if not item['__line__'] in ignored_lines:
Contributor

Use `and` instead, to avoid hadouken code?

http://i.imgur.com/BtjZedW.jpg

Contributor Author

If you look more closely, the continue statement is shared

but I agree it's not very elegant

Contributor

Ah, gotcha. ++
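
To illustrate the point about the shared continue, here is a rough sketch. The helper names (collect_nested, collect_with_and) and the sample items are made up for illustration, and the real traversal in the pull request is likely more involved. It only shows that merging the two conditions with `and` changes which items fall through to the code after the check.

```python
def collect_nested(items, ignored_lines):
    """Nested form: every leaf item skips the fall-through code."""
    kept, fall_through = [], []
    for item in items:
        if '__line__' in item:
            if item['__line__'] not in ignored_lines:
                kept.append(item['__value__'])
            continue  # shared by ignored and non-ignored leaves
        fall_through.append(item)
    return kept, fall_through


def collect_with_and(items, ignored_lines):
    """Merged form: ignored leaves no longer hit the continue."""
    kept, fall_through = [], []
    for item in items:
        if '__line__' in item and item['__line__'] not in ignored_lines:
            kept.append(item['__value__'])
            continue
        fall_through.append(item)  # ignored leaves end up here as well
    return kept, fall_through


# Hypothetical data: two leaves (line 2 is whitelisted) and one container.
items = [
    {'__line__': 1, '__value__': 'token-a'},
    {'__line__': 2, '__value__': 'token-b'},
    {'nested': {}},
]
nested_result = collect_nested(items, {2})
and_result = collect_with_and(items, {2})
```

With the nested form only the container falls through; with the merged form the whitelisted leaf falls through too, which is why the two are not interchangeable here.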

-data = YamlLineInjector(file).json()
+parser = YamlFileParser(file)
+data = parser.json()
+ignored_lines = parser.get_ignored_lines()
Contributor

Is there any reason to do this all at once, rather than scanning as needed?

e.g.

if '__line__' in item and not WHITELIST_REGEX.search(item['__value__']):
    pass

Contributor Author

the current behavior is doing what you suggest:

if '__line__' in item:
    potential_secrets.update(
        self.analyze_string(
            item['__value__'],
            item['__line__'],
            filename,
        ),
    )

But __value__ holds only the parsed value, not the full line, so the inline comment is lost. PyYAML drops comments during parsing, so we can't rely on the parsed data to find the comment.

I decided to scan the file once to identify all ignored lines, and later do a simple O(1) `line in ignored_lines` check.
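
A minimal sketch of that single-pass scan, written as a standalone function rather than the parser method from the pull request. The whitelist pattern below is an assumption and may differ from the project's actual WHITELIST_REGEX, and the sketch also assumes line numbers are 1-based.

```python
import io
import re

# Assumed pattern for the inline whitelist marker; the project's real
# WHITELIST_REGEX may be different.
WHITELIST_REGEX = re.compile(r'#\s*pragma:\s*whitelist[ -]secret')


def get_ignored_lines(file):
    """Return the set of line numbers that carry the whitelist comment.

    The YAML parser discards comments, so the raw text is scanned
    separately from the YAML parsing. Line numbers are assumed to be
    1-based here.
    """
    ignored_lines = set()
    for line_number, line in enumerate(file, start=1):
        if WHITELIST_REGEX.search(line):
            ignored_lines.add(line_number)
    return ignored_lines


# Made-up YAML content: only the first line is whitelisted.
yaml_text = (
    "api_key: c3VwZXIgbG9uZyBzdHJpbmc=  # pragma: whitelist secret\n"
    "password: aGlnaCBlbnRyb3B5IHZhbHVl\n"
)
ignored_lines = get_ignored_lines(io.StringIO(yaml_text))
```

Returning a set keeps the later membership check at O(1) on average, which matches the reasoning above.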

Contributor

Ahh. Right. Good point.

Can you explain this in the docstring for get_ignored_lines? Specifically, that the YAML parser drops comments, so we need to scan the file separately from the YAML parsing.

Contributor

@domanchi domanchi left a comment

fix'n'ship!

@LouisTrezzini
Contributor Author

LouisTrezzini commented Jul 13, 2018

I pushed the changes you asked for, feel free to merge

@domanchi merged commit 11b8768 into Yelp:master on Jul 13, 2018