Add missing endpoints available in singer-io/tap-github #93

Merged
merged 50 commits into from
Apr 7, 2022
Changes from 48 commits
Commits
214bbbf
Add assignees stream
Ry-DS Mar 10, 2022
d2e10df
Fix issues with assignee stream
Ry-DS Mar 10, 2022
08aa7e7
Add collaborators stream
Ry-DS Mar 10, 2022
231596b
Add review comments and reviews stream
Ry-DS Mar 10, 2022
4d967cf
Fix review comments stream and use repo parent instead
Ry-DS Mar 10, 2022
e3b01bd
Fix mypy issue
Ry-DS Mar 10, 2022
f2d7f53
Fix tests
Ry-DS Mar 12, 2022
7f17a65
add milestone and commit comment streams
Ry-DS Mar 12, 2022
313b742
Fix mypy
Ry-DS Mar 12, 2022
9ea76f5
Fix tests
Ry-DS Mar 13, 2022
2b66a73
commit wip todo streams
Ry-DS Mar 13, 2022
b27871d
fix formatting
Ry-DS Mar 13, 2022
ff14da1
[ci skip] format todo file and fix arraytype usage
Ry-DS Mar 13, 2022
f146c2a
[ci skip] more regex magic to convert everything to classes
Ry-DS Mar 13, 2022
89fb408
Add paths [ci skip]
Ry-DS Mar 13, 2022
d0352c7
Move all streams to main file
Ry-DS Mar 13, 2022
fc70b2a
Add replication keys
Ry-DS Mar 13, 2022
7730b86
fix tests (change type to datetime)
Ry-DS Mar 14, 2022
f68c260
introduce streams enum
Ry-DS Mar 17, 2022
3132b36
Fix up organization stream
Ry-DS Mar 17, 2022
cf7c467
Reverse order of testing versions
Ry-DS Mar 18, 2022
9258fd3
remove unsupported types from class
Ry-DS Mar 18, 2022
38e41a0
Fix format
Ry-DS Mar 18, 2022
c3d621e
Try use capital types to pass ci
Ry-DS Mar 18, 2022
549e399
Fix tap not including org streams on organization given
Ry-DS Mar 18, 2022
47dc14d
Add test for org stream
Ry-DS Mar 18, 2022
46257e1
Add rest of org streams
Ry-DS Mar 18, 2022
8f702cb
[ci skip] Temp changes for testing
Ry-DS Mar 18, 2022
1f5aad3
Fix parent context being missing
Ry-DS Mar 18, 2022
25c05c5
Set ignore parent replication to true for project
Ry-DS Mar 18, 2022
abf8407
fix mypy issue
Ry-DS Mar 18, 2022
53cca95
fix mistyped params
Ry-DS Mar 19, 2022
17192aa
Add parent keys
Ry-DS Mar 19, 2022
b423cd2
Fix mistyped params
Ry-DS Mar 19, 2022
b35d675
Fix mistyped ids in events
Ry-DS Mar 19, 2022
66f050c
[ci skip] Remove pointless comment
Ry-DS Mar 21, 2022
ac6b554
Change ignore parent key to true
Ry-DS Mar 22, 2022
6b6d934
update ignore_parent_replication and remove unneeded import
Ry-DS Mar 25, 2022
ef162de
Simple comment [ci skip]
Ry-DS Mar 25, 2022
47e42d9
Work on comments [ci skip]
Ry-DS Mar 25, 2022
91bbcd3
Work on comments [ci skip]
Ry-DS Mar 25, 2022
e6298e0
Fix mistyped stuff (good catch Laurent) and more comment addressing
Ry-DS Mar 25, 2022
643c99b
Update fixture comment [ci skip]
Ry-DS Mar 27, 2022
45a2981
Merge remote-tracking branch 'oviohub/83-missing-streams' into 83-mis…
Ry-DS Mar 27, 2022
3ff49ec
Add bunch of meltano lab sample projects
Ry-DS Mar 27, 2022
0b3f52f
update state partitioning keys
Ry-DS Mar 27, 2022
36801e5
Merge branch 'main' into 83-missing-streams
Ry-DS Mar 31, 2022
0ec1d16
Fix merge
Ry-DS Mar 31, 2022
36136ed
Add ORG_LEVEL_TOKEN to be used only for specific streams
ericboucher Apr 5, 2022
202845b
Add docstring to alternative_sync_chidren
ericboucher Apr 6, 2022
6 changes: 4 additions & 2 deletions .github/workflows/test_tap.yml
@@ -15,7 +15,7 @@ jobs:
       GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}
     strategy:
       matrix:
-        python-version: [3.7, 3.8, 3.9, "3.10"]
+        python-version: ["3.10", 3.9, 3.8, 3.7]
       # run the matrix jobs one after the other so they can benefit from caching
       max-parallel: 1

@@ -28,7 +28,9 @@ jobs:
         path: '**/api_calls_tests_cache.sqlite'
         # github cache expires after 1wk, and we expire the content after 24h
         # this key should not need to change unless we need to clear the cache
-        key: api-cache-v2
+        key: api-cache-v3
+        restore-keys: |
+          api-cache-v2
     - name: Set up Python ${{ matrix.python-version }}
       uses: actions/setup-python@v2
       with:
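The hunk above bumps the cache key to `api-cache-v3` while listing `api-cache-v2` under `restore-keys`. The lookup order this relies on can be sketched roughly as follows: an exact match on `key` first, then each restore key tried in order as a prefix. This is only an illustration, not the `actions/cache` implementation. `resolve_cache` and the list of stored cache names are hypothetical, and the sketch picks the alphabetically last prefix match, whereas the real action prefers the most recently created one.

```python
from typing import List, Optional


def resolve_cache(
    stored: List[str], key: str, restore_keys: List[str]
) -> Optional[str]:
    """Return the name of the cache entry that would be restored, if any.

    `stored` is a hypothetical stand-in for the cache names saved for a repo.
    """
    # 1. An exact hit on the primary key always wins.
    if key in stored:
        return key
    # 2. Otherwise fall back to the restore keys, in the order listed,
    #    accepting any stored entry whose name starts with that prefix.
    for prefix in restore_keys:
        matches = sorted(name for name in stored if name.startswith(prefix))
        if matches:
            # Simplification: the real action picks the newest match.
            return matches[-1]
    return None


# First run after the bump: only the old v2 cache exists, so it is reused.
first_run = resolve_cache(["api-cache-v2"], "api-cache-v3", ["api-cache-v2"])

# Later runs: a v3 cache has been saved and is matched exactly.
later_run = resolve_cache(
    ["api-cache-v2", "api-cache-v3"], "api-cache-v3", ["api-cache-v2"]
)

print(first_run, later_run)
```

Under these assumptions, the first job after the key change can still start from the existing SQLite cache of API calls instead of an empty one.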
1 change: 1 addition & 0 deletions README.md
@@ -71,6 +71,7 @@ tap-github --config CONFIG --discover > ./catalog.json
```

## Contributing
+This project uses parent-child streams. Learn more about them [here](https://gitlab.com/meltano/sdk/-/blob/main/docs/parent_streams.md).

### Initialize your Development Environment

4 changes: 2 additions & 2 deletions tap_github/client.py
@@ -121,7 +121,7 @@ def get_next_page_token(
         return (previous_token or 1) + 1

     def get_url_params(
-        self, context: Optional[dict], next_page_token: Optional[Any]
+        self, context: Optional[Dict], next_page_token: Optional[Any]
     ) -> Dict[str, Any]:
         """Return a dictionary of values to be used in URL parameterization."""
         params: dict = {"per_page": self.MAX_PER_PAGE}
@@ -261,7 +261,7 @@ def parse_response(self, response: requests.Response) -> Iterable[dict]:
         yield from extract_jsonpath(self.query_jsonpath, input=resp_json)

     def get_url_params(
-        self, context: Optional[dict], next_page_token: Optional[Any]
+        self, context: Optional[Dict], next_page_token: Optional[Any]
     ) -> Dict[str, Any]:
         """Return a dictionary of values to be used in URL parameterization."""
         params = context or dict()
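The first `client.py` hunk shows only fragments of the page-number pagination: `(previous_token or 1) + 1` and a `per_page` parameter. A minimal standalone sketch of how those two pieces could fit together is given below. It is not the tap's actual code: the `has_more` flag, the `page` parameter handling, and the value `100` for `MAX_PER_PAGE` are assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional

MAX_PER_PAGE = 100  # assumption: stands in for self.MAX_PER_PAGE


def get_next_page_token(previous_token: Optional[int], has_more: bool) -> Optional[int]:
    """Return the next page number, or None when there is nothing left."""
    if not has_more:  # hypothetical flag; the real check is not in the diff
        return None
    # The first request has no token and implicitly fetches page 1,
    # so the first token handed out is page 2.
    return (previous_token or 1) + 1


def get_url_params(next_page_token: Optional[int]) -> Dict[str, Any]:
    """Build query parameters for one request."""
    params: Dict[str, Any] = {"per_page": MAX_PER_PAGE}
    if next_page_token:
        params["page"] = next_page_token  # assumption: sent as ?page=N
    return params


# Walk through three requests, the last of which reports no further pages.
token: Optional[int] = None
requests_made: List[Dict[str, Any]] = []
for has_more in (True, True, False):
    requests_made.append(get_url_params(token))
    token = get_next_page_token(token, has_more)

print(requests_made)
```

The `or 1` is what lets a missing token be treated as page 1 without a special case for the first request.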
164 changes: 164 additions & 0 deletions tap_github/organization_streams.py
@@ -0,0 +1,164 @@
"""Organization stream type classes for tap-github."""

from typing import Dict, List, Optional, Iterable, Any

from singer_sdk import typing as th # JSON Schema typing helpers

from tap_github.client import GitHubRestStream


class OrganizationStream(GitHubRestStream):
    """Defines a GitHub Organization Stream.
    API Reference: https://docs.github.com/en/rest/reference/orgs#get-an-organization
    """

    name = "organizations"
    path = "/orgs/{org}"

    @property
    def partitions(self) -> Optional[List[Dict]]:
        return [{"org": org} for org in self.config["organizations"]]

    def get_child_context(self, record: Dict, context: Optional[Dict]) -> dict:
        return {
            "org": record["login"],
        }

    def get_records(self, context: Optional[Dict]) -> Iterable[Dict[str, Any]]:
        """
        Override the parent method to allow skipping API calls
        if the stream is deselected and skip_parent_streams is True in config.
        This allows running the tap with fewer API calls and preserving
        quota when only syncing a child stream. Without this,
        the API call is sent but data is discarded.
        """
        if (
            not self.selected
            and self.config.get("skip_parent_streams", False)
            and context is not None
        ):
            # build a minimal mock record so that self._sync_records
            # can proceed with child streams; it must carry "login",
            # the field that get_child_context reads
            yield {
                "login": context["org"],
            }
        else:
            yield from super().get_records(context)

    schema = th.PropertiesList(
        th.Property("login", th.StringType),
        th.Property("id", th.IntegerType),
        th.Property("node_id", th.StringType),
        th.Property("url", th.StringType),
        th.Property("repos_url", th.StringType),
        th.Property("events_url", th.StringType),
        th.Property("hooks_url", th.StringType),
        th.Property("issues_url", th.StringType),
        th.Property("members_url", th.StringType),
        th.Property("public_members_url", th.StringType),
        th.Property("avatar_url", th.StringType),
        th.Property("description", th.StringType),
    ).to_dict()
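The `get_records` docstring in `OrganizationStream` describes skipping the parent API call when the stream is deselected and `skip_parent_streams` is set. The idea can be sketched without the SDK, as below. Everything here is a simplified stand-in: `fetch_org`, `api_calls`, and the plain-function signatures are hypothetical, the context is assumed to always be present, and the mock record is keyed on `login` because that is the field the child-context function reads.

```python
from typing import Any, Dict, Iterable, List

api_calls: List[str] = []  # records which orgs were actually requested


def fetch_org(org: str) -> Dict[str, Any]:
    """Hypothetical stand-in for the GET /orgs/{org} request."""
    api_calls.append(org)
    return {"login": org, "id": 1}


def get_records(
    selected: bool, config: Dict[str, Any], context: Dict[str, str]
) -> Iterable[Dict[str, Any]]:
    """Yield the organization record, or a mock one when the call can be skipped."""
    if not selected and config.get("skip_parent_streams", False):
        # Minimal mock record: just enough for get_child_context below.
        yield {"login": context["org"]}
    else:
        yield fetch_org(context["org"])


def get_child_context(record: Dict[str, Any]) -> Dict[str, str]:
    """Derive the context handed to child streams from a parent record."""
    return {"org": record["login"]}


# One partition per configured organization, as in the `partitions` property.
config = {"organizations": ["my-org"], "skip_parent_streams": True}
partitions = [{"org": org} for org in config["organizations"]]

child_contexts = [
    get_child_context(record)
    for ctx in partitions
    for record in get_records(False, config, ctx)
]
print(child_contexts, api_calls)
```

In this sketch the child streams still receive their `org` context, while `api_calls` stays empty because the deselected parent never hits the API.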


class TeamsStream(GitHubRestStream):
    """
    API Reference: https://docs.github.com/en/rest/reference/teams#list-teams
    """

    name = "teams"
    primary_keys = ["id"]
    path = "/orgs/{org}/teams"
    ignore_parent_replication_key = True
    parent_stream_type = OrganizationStream
    state_partitioning_keys = ["org"]

    def get_child_context(self, record: Dict, context: Optional[Dict]) -> dict:
        new_context = {"team_slug": record["slug"]}
        if context:
            return {
                **context,
                **new_context,
            }
        return new_context

    schema = th.PropertiesList(
        # Parent Keys
        th.Property("org", th.StringType),
        # Rest
        th.Property("id", th.IntegerType),
        th.Property("node_id", th.StringType),
        th.Property("url", th.StringType),
        th.Property("html_url", th.StringType),
        th.Property("name", th.StringType),
        th.Property("slug", th.StringType),
        th.Property("description", th.StringType),
        th.Property("privacy", th.StringType),
        th.Property("permission", th.StringType),
        th.Property("members_url", th.StringType),
        th.Property("repositories_url", th.StringType),
        th.Property("parent", th.StringType),
    ).to_dict()


class TeamMembersStream(GitHubRestStream):
    """
    API Reference: https://docs.github.com/en/rest/reference/teams#list-team-members
    """

    name = "team_members"
    primary_keys = ["id"]
    path = "/orgs/{org}/teams/{team_slug}/members"
    ignore_parent_replication_key = True
    parent_stream_type = TeamsStream
    state_partitioning_keys = ["team_slug", "org"]

    def get_child_context(self, record: Dict, context: Optional[Dict]) -> dict:
        new_context = {"username": record["login"]}
        if context:
            return {
                **context,
                **new_context,
            }
        return new_context

    schema = th.PropertiesList(
        # Parent keys
        th.Property("org", th.StringType),
        th.Property("team_slug", th.StringType),
        # Rest
        th.Property("login", th.StringType),
        th.Property("id", th.IntegerType),
        th.Property("node_id", th.StringType),
        th.Property("avatar_url", th.StringType),
        th.Property("gravatar_id", th.StringType),
        th.Property("url", th.StringType),
        th.Property("html_url", th.StringType),
        th.Property("type", th.StringType),
        th.Property("site_admin", th.BooleanType),
    ).to_dict()


class TeamRolesStream(GitHubRestStream):
    """
    API Reference: https://docs.github.com/en/rest/reference/teams#get-team-membership-for-a-user
    """

    name = "team_roles"
    path = "/orgs/{org}/teams/{team_slug}/memberships/{username}"
    ignore_parent_replication_key = True
    primary_keys = ["url"]
    parent_stream_type = TeamMembersStream
    state_partitioning_keys = ["username", "team_slug", "org"]

    schema = th.PropertiesList(
        # Parent keys
        th.Property("org", th.StringType),
        th.Property("team_slug", th.StringType),
        th.Property("username", th.StringType),
        # Rest
        th.Property("url", th.StringType),
        th.Property("role", th.StringType),
        th.Property("state", th.StringType),
    ).to_dict()
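The three team streams above pass an ever-growing context down the chain: `org`, then `team_slug`, then `username`. Each stream's `path` template is filled from that context. A small standalone sketch of the merging and path expansion follows. The `extend_context` helper and the sample values (`my-org`, `core`, `octocat`) are hypothetical, and plain `str.format` is used here in place of whatever substitution the SDK performs internally.

```python
from typing import Dict, Optional


def extend_context(context: Optional[Dict[str, str]], **new_keys: str) -> Dict[str, str]:
    """Merge new keys into the parent context, as the get_child_context methods do."""
    return {**(context or {}), **new_keys}


# organizations -> teams -> team_members -> team_roles
org_context = {"org": "my-org"}
team_context = extend_context(org_context, team_slug="core")
member_context = extend_context(team_context, username="octocat")

# Each child stream fills its path template from the accumulated context.
teams_path = "/orgs/{org}/teams".format(**org_context)
members_path = "/orgs/{org}/teams/{team_slug}/members".format(**team_context)
roles_path = "/orgs/{org}/teams/{team_slug}/memberships/{username}".format(
    **member_context
)

print(teams_path)
print(members_path)
print(roles_path)
```

Because each level copies the parent context rather than mutating it, sibling teams and members do not see each other's keys.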