Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strip URLs before adding slashes #2015

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

peregrineshahin
Copy link
Contributor

this adds the practice of always stripping before adding slashes to the URLs.. one example is that the run page doesn't work with a repo with a slash at the end.

@peregrineshahin peregrineshahin force-pushed the stripUrls branch 2 times, most recently from fd5c322 to f954b95 Compare May 16, 2024 05:15
@dubslow
Copy link
Contributor

dubslow commented May 16, 2024

In general, URL manipulation has specified RFC semantics, and we should be using libraries to do that, e.g. https://docs.python.org/3/library/urllib.parse.html (specifically urlsplit and urljoin)

For example:

def get_sha(branch, repo_url):
    """Resolves the git branch to sha commit"""
    urlparts = urllib.urlsplit(api_url)
    urlparts.netloc = urlparts.netloc.replace(
        "github.com", "api.github.com"
    )
    api_url = urllib.urlunsplit(urlparts)
    get_url = urllib.urljoin(api_url, f"/repos/commits/{branch}")
    try:
        commit = requests.get(get_url).json()
    except:
        raise Exception("Unable to access developer repository")
    if "sha" in commit:
        return commit["sha"], commit["commit"]["message"].split("\n")[0]
    else:
        return "", ""

that way we delegate the handling of slashes to RFC standards to people who've already done the legwork

@peregrineshahin
Copy link
Contributor Author

that's a good suggestion, I will draft this and code that later.

@peregrineshahin peregrineshahin marked this pull request as draft May 16, 2024 05:37
@peregrineshahin peregrineshahin force-pushed the stripUrls branch 5 times, most recently from 999899b to e6c16b2 Compare July 1, 2024 06:05
@ppigazzini ppigazzini marked this pull request as ready for review July 3, 2024 09:14
@ppigazzini ppigazzini added enhancement worker update code changes requiring a worker update labels Jul 3, 2024
@ppigazzini
Copy link
Collaborator

ppigazzini commented Jul 3, 2024

The worker (python 3.6.x) stops, wrong url, missing the repo info. (all urls before the wrong one are fine).

  • PR, wrong:
Running master vs master
Variable task sizes used. Opening offset = 14196
Downloading https://api.github.com/zipball/6138a0fd0e43753a86e4a170a5f6e2b7b6752677
Exception in requests.get():
404 Client Error: Not Found for url: https://api.github.com/zipball/6138a0fd0e43753a86e4a170a5f6e2b7b6752677

Exception running games:
Get request to https://api.github.com/zipball/6138a0fd0e43753a86e4a170a5f6e2b7b6752677 failed
  • master:
Running master vs master
Variable task sizes used. Opening offset = 14022
Downloading https://api.github.com/repos/official-stockfish/Stockfish/zipball/6138a0fd0e43753a86e4a170a5f6e2b7b6752677
Build uses default net: nn-ddcfb9224cdb.nnue

@peregrineshahin
Copy link
Contributor Author

peregrineshahin commented Jul 3, 2024

it looks like this library gives 3 different combination results of joining two URLs when using .join(),

  1. if the base URL ends with a slash
  2. if the second URL begins with a slash
  3. if no slashes are used in either
    so stripping the slash at the end is still necessary and it didn't solve anything.

hopefully I'm missing something

@peregrineshahin
Copy link
Contributor Author

peregrineshahin commented Jul 3, 2024

@dubslow unfortunately your example seems to give three different results based on the slashes, I don't think splitting and then unsplitting, and then joining is reasonable even if it's done in a helper function..

    api_url = urllib.urlunsplit(urlparts)
    get_url = urllib.urljoin(api_url, f"/repos/commits/{branch}")

@peregrineshahin peregrineshahin force-pushed the stripUrls branch 5 times, most recently from 9a6b1a7 to 49f4d6d Compare July 4, 2024 11:05
this adds the practice of always stripping before adding slashes to the URLs.. one example is that the run page doesn't work with a repo with a slash at the end.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement worker update code changes requiring a worker update
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants