Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add org permissions fetcher #53

Merged
merged 1 commit into from
Mar 2, 2021

Conversation

mic67mel
Copy link
Contributor

@mic67mel mic67mel commented Feb 10, 2021

What

Add a fetcher that will gather forks list for each repo of a Github organization.

Why

This fetcher is needed to control access to source code.

How

  • fetch_org_permissions.py
  • README.md

Test

N/A

Context

@mic67mel mic67mel changed the title Forks fetcher Add forks fetcher Feb 10, 2021
@@ -0,0 +1,61 @@
# -*- mode:python; coding:utf-8 -*-
# Copyright (c) 2020 IBM Corp. All rights reserved.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copyright should be 2021.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, thanks

Copy link
Contributor

@alfinkel alfinkel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pausing... I think #53 (comment) needs resolution before we proceed.

CHANGES.md Outdated
Comment on lines 1 to 3
# [0.10.0](https://github.com/ComplianceAsCode/auditree-arboretum/releases/tag/v0.10.0)

- [ADDED] Added Github org forks
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Rebase and change this to 0.11.0.
  • Add punctuation.
Suggested change
# [0.10.0](https://github.com/ComplianceAsCode/auditree-arboretum/releases/tag/v0.10.0)
- [ADDED] Added Github org forks
# [0.11.0](https://github.com/ComplianceAsCode/auditree-arboretum/releases/tag/v0.11.0)
- [ADDED] GitHub org forks fetcher added to `Permissions`.

@@ -14,4 +14,4 @@
# limitations under the License.
"""Arboretum - Checking your compliance & security posture, continuously."""

__version__ = '0.9.0'
__version__ = '0.10.0'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0.11.0 now

@@ -95,6 +95,78 @@ day.
from arboretum.permissions.fetchers.github.fetch_org_collaborators import GithubOrgCollaboratorsFetcher
```

### Organization Integrity (Repository Forks)

* Class: [GithubRepoForksFetcher][gh-org-fetcher]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't the gh-org-fetcher. You should provide a proper link and reference it here correctly.

Also, it seems like it would make sense to rename the class as GithubOrgForksFetcher? Suggest renaming the class and changing the link ref for gh-org-fetcher to gh-org-collaborators-fetcher and adding a gh-org-forks-fetcher link reference to button it all up.


* Class: [GithubRepoForksFetcher][gh-org-fetcher]
* Purpose: Writes the repository forks in Github organizations to the evidence locker. This fetcher class is only meant for use with Github or Github Enterprise organizations.
* Behavior: For each Github organization specified, an evidence file is stored in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Behavior: For each Github organization specified, an evidence file is stored in
* Behavior: For each Github organization specified, an evidence file is stored in

Comment on lines 107 to 118
* `org.permissions.org_integrity.orgs`
* Required
* List of dictionaries each containing Github organization retrieval configuration.
* `url`
* Required
* String in the form of `"https://github.com/my-org"` or `"https://github.<company>.com/my-org"`.
* Use to define the Github organization url to use.
* `repos`
* Optional
* List of strings in the form of `["my-repo", "my-other-repo"]`.
* Defaults to all repositories in the organization.
* Use if looking to filter collaborator evidence to a subset of repositories in the organization otherwise do not include.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By doing this you're tying the collaborators fetcher and the forks fetcher together. If someone chooses to run both of these fetchers in their environment then they're forced to run them for the exact same set of orgs and repos. Is that the desired behavior?

@mic67mel mic67mel changed the title Add forks fetcher Add org permissions fetcher Feb 17, 2021
Copy link
Contributor

@alfinkel alfinkel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some cosmetic refactoring needed. Feel free to squash commits down to one as part of the next review cycle. I think we're almost GTG here.

@@ -16,14 +16,12 @@ how to include the fetchers and checks from this library in your downstream proj

## Fetchers

### Organization Integrity (Repository Collaborators)
### Organization Integrity (Repository Collaborators and Forks)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
### Organization Integrity (Repository Collaborators and Forks)
### Organization Integrity Permissions

Comment on lines 23 to 24
* Behavior: For each Github organization specified an evidence file per collaborator type (affiliation) and an evidence file with the list of repository forks are stored in
the locker. The default is to retrieve all collaborators by affiliation and all forks from all repositories in each specified Github organization. TTL is set to 1 day.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Behavior: For each Github organization specified an evidence file per collaborator type (affiliation) and an evidence file with the list of repository forks are stored in
the locker. The default is to retrieve all collaborators by affiliation and all forks from all repositories in each specified Github organization. TTL is set to 1 day.
* Behavior: For each Github organization specified, Github collaborator and Github fork evidence files per collaborator type (affiliation) are stored in
the locker containing details for the specified repositories in the organization. The default is to
retrieve all collaborators and all forks by affiliation from all repositories in each specified Github organization. TTL is set to 1 day.

@@ -36,7 +34,7 @@ day.
* Optional
* List of strings in the form of `["my-repo", "my-other-repo"]`.
* Defaults to all repositories in the organization.
* Use if looking to filter collaborator evidence to a subset of repositories in the organization otherwise do not include.
* Use if looking to filter collaborator and fork evidence to a subset of repositories in the organization otherwise do not include.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Use if looking to filter collaborator and fork evidence to a subset of repositories in the organization otherwise do not include.
* Use if looking to filter permissions evidence to a subset of repositories in the organization otherwise do not include.

[repository-permissions]: https://docs.github.com/en/free-pro-team@latest/github/setting-up-and-managing-organizations-and-teams/repository-permission-levels-for-an-organization
[org-collaborators-check]: https://github.com/ComplianceAsCode/auditree-arboretum/blob/main/arboretum/permissions/checks/test_org_collaborators.py
[fetch-org-permissions]: https://github.com/ComplianceAsCode/auditree-arboretum/blob/main/arboretum/permissions/fetchers/github/fetch_org_permissions.py
[repository-permissions]: https://docs.github.com/en/free-pro-team@latest/github/setting-up-and-managing-organizations-and-teams/repository-permission-levels-for-an-organization[org-collaborators-check]: https://github.com/ComplianceAsCode/auditree-arboretum/blob/main/arboretum/permissions/checks/test_org_collaborators.py
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

repository-permissions doesn't seem right here. I think you're breaking the link.



class GithubOrgPermissionsFetcher(collabs.GithubOrgCollaboratorsFetcher):
"""Fetch Github repository colaborators and forks."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Fetch Github repository colaborators and forks."""
"""Fetch Github repository permissions evidence."""

BTW you should turn auto spell check on in your IDE or editor (collaborators is misspelled). It would make the review go smoother.

Comment on lines 28 to 33
@classmethod
def setUpClass(cls):
"""Initialize the fetcher object with configuration settings."""
super(GithubOrgPermissionsFetcher, cls).setUpClass()
cls.gh_pool = {}
return cls
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this. You're just replicating the parent class method and running it twice.

Comment on lines 20 to 22
from compliance.evidence import DAY, RawEvidence, raw_evidence
from compliance.utils.data_parse import get_sha256_hash
from compliance.utils.services.github import Github
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you've imported the entire fetch_orgs_collaborators module as collabs you don't really need to reimport these. Instead you should be able to just reference them in the code via collabs like collabs.DAY, collabs.Github, etc...

"""Fetch Github repository forks."""
for config in self.config.get('org.permissions.org_integrity.orgs'):
host, org = config['url'].rsplit('/', 1)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain the purpose of having this empty line? If not then I suggest removing it.

Comment on lines 41 to 42
json_file = f'gh_forks_{url_hash}.json'
path = ['permissions', json_file]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
json_file = f'gh_forks_{url_hash}.json'
path = ['permissions', json_file]
path = ['permissions', f'gh_forks_{url_hash}.json']

Comment on lines 58 to 51
forks_url = f'repos/{org}/{repo}/forks'
forks[repo] = self.gh_pool[host].paginate_api(
forks_url
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
forks_url = f'repos/{org}/{repo}/forks'
forks[repo] = self.gh_pool[host].paginate_api(
forks_url
)
forks[repo] = self.gh_pool[host].paginate_api(
f'repos/{org}/{repo}/forks'
)

@mic67mel mic67mel force-pushed the add-ghe-forks-fetcher branch 2 times, most recently from 02ded94 to 95c7b01 Compare March 1, 2021 16:44
Copy link
Contributor

@alfinkel alfinkel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a deprecated decorator with a reason (similar to the entry in the CHANGES.md) to the org integrity collabs fetcher class? The deprecated module is already being brought into arboretum through the framework.

ref: https://github.com/tantale/deprecated#usage

see here for framework example.

I think once you do that we should be good to go.

CHANGES.md Outdated
@@ -1,3 +1,7 @@
# [0.11.0](https://github.com/ComplianceAsCode/auditree-arboretum/releases/tag/v0.11.0)

- [ADDED] Organization repository permissions fetcher added to `permissions`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add more detail to your [ADDED] entry and add a [DEPRECATED] entry for the org repo collabs fetcher and state that its functionality is now part of the permissions fetcher? Something like:

[ADDED] Github org integrity fetcher added to `permissions`.
[ADDED] Github org integrity forks fetcher functionality added Github org integrity fetcher.
[DEPRECATED] Github org integrity collaborators fetcher functionality moved to Github org integrity fetcher. 

host, org = config['url'].rsplit('/', 1)
url_hash = collabs.get_sha256_hash([config['url']], 10)
path = ['permissions', f'gh_forks_{url_hash}.json']
description = (f'Fork of the {org} GH org')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a bit of clarification and the parentheses aren't needed. Suggest:

Suggested change
description = (f'Fork of the {org} GH org')
description = f'Forks of repos in the {org} GH org'

Copy link
Contributor

@alfinkel alfinkel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If not for my dumb suggestions in the last review this would be good to go. Make the few tweaks I mentioned below and we can merge. Thanks!

CHANGES.md Outdated

- [ADDED] Github org integrity fetcher added to `permissions`.
- [ADDED] Github org integrity forks fetcher functionality added Github org integrity fetcher.
- [DEPRECATED] Github org integrity collaborators fetcher functionality moved to Github org integrity permissions fetcher.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- [DEPRECATED] Github org integrity collaborators fetcher functionality moved to Github org integrity permissions fetcher.
- [CHANGED] Github org integrity collaborators fetcher functionality added to Github org integrity permissions fetcher.

Comment on lines 26 to 34
from deprecated import deprecated


@deprecated(
reason=(
'Github org integrity collaborators fetcher functionality'
'moved to Github org integrity permissions fetcher'
)
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove all of this. We shouldn't deprecate the fetcher since it is the parent class for the new org integrity fetcher. Sorry for causing the unnecessary thrash here.

Copy link
Contributor

@alfinkel alfinkel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@alfinkel alfinkel merged commit 5d63dd0 into ComplianceAsCode:main Mar 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants