fix: github adapter only applies to URLs formatted as expected (#1708)
- fixes part of #1607 

In the absence of other hints (e.g. a `Control` object associated with
another plugin), all URLs from `github.com` were being processed by the
`GithubPlugin` to create a GitHub Data Portal adapter.

With this PR, `GithubPlugin` only handles the URL if it has one of the
expected formats, "https://github.com/user_or_org/repo" or
"https://github.com/user_or_org"; otherwise it is passed on to be
treated as a plain URL.
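
The format check described above can be sketched in isolation as follows. This is a minimal, standalone illustration that mirrors the condition in the diff; the helper name `has_expected_github_format` is hypothetical and is not part of frictionless.

```python
from urllib.parse import urlparse


def has_expected_github_format(source: str) -> bool:
    """Hypothetical helper mirroring the check added in this commit.

    Returns True when the URL is on github.com and its path has one or
    two segments (user_or_org, optionally followed by repo).
    """
    parsed_url = urlparse(source)
    # The path starts with "/", so the first element of the split is an
    # empty string; drop it to keep only the path segments.
    path_parts = parsed_url.path.split("/")[1:]
    return parsed_url.netloc == "github.com" and 1 <= len(path_parts) <= 2


if __name__ == "__main__":
    for url in (
        "https://github.com/user_or_org",
        "https://github.com/user_or_org/repo",
        "https://github.com/user_or_org/repo/blob/main/data.csv",
        "https://example.com/user_or_org/repo",
    ):
        print(url, has_expected_github_format(url))
```

Under this check, a deeper path such as a link to a single file inside a repository no longer matches, so it would fall through to plain URL handling.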

I also refactored along the way to make the code more readable and
less nested.

As the GitHub portal tests are currently skipped, I checked manually
that this functionality still works.
pierrecamilleri authored Nov 22, 2024
1 parent 72b74a8 commit d97ae46
Showing 1 changed file with 37 additions and 12 deletions.
49 changes: 37 additions & 12 deletions frictionless/portals/github/plugin.py
@@ -24,21 +24,46 @@ def create_adapter(
         basepath: Optional[str] = None,
         packagify: bool = False,
     ):
-        if isinstance(source, str):
-            parsed = urlparse(source)
-            if not control or isinstance(control, GithubControl):
-                if parsed.netloc == "github.com":
-                    control = control or GithubControl()
-                    splited_url = parsed.path.split("/")[1:]
-                    if len(splited_url) == 1:
-                        control.user = splited_url[0]
-                        return GithubAdapter(control)
-                    if len(splited_url) == 2:
-                        control.user, control.repo = splited_url
-                        return GithubAdapter(control)
+        """Checks if the source is meant to access a Github Data portal, and returns the adapter if applicable
+        The source is expected to be in one of the formats :
+        - https://github.com/user_or_org
+        - https://github.com/user_or_org/repo
+        Alternatively, user and/or repo information can be provided in the
+        GithubControl, with an empty source.
+        """
+        if control and not isinstance(control, GithubControl):
+            # Explicit control for other plugin
+            return
+
         if source is None and isinstance(control, GithubControl):
+            # Source informations are inside the control
             return GithubAdapter(control=control)

+        if not isinstance(source, str):
+            return
+
+        parsed_url = urlparse(source)
+        splited_url = parsed_url.path.split("/")[1:]
+
+        has_expected_format = (
+            parsed_url.netloc == "github.com"
+            and len(splited_url) >= 1
+            and len(splited_url) <= 2
+        )
+
+        if has_expected_format:
+            control = control or GithubControl()
+            control.user = splited_url[0]
+
+            if len(splited_url) == 2:
+                control.repo = splited_url[1]
+
+            return GithubAdapter(control)
+
     def select_control_class(self, type: Optional[str] = None):
         if type == "github":
             return GithubControl
