Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve error handling and propagation #79

Merged
merged 3 commits into from
Sep 27, 2024
Merged

Conversation

minrk
Copy link
Contributor

@minrk minrk commented Sep 27, 2024

Changes:

  • proxy: relay all kubernetes API errors back to the user, not just create_namespaced_job
  • client: show error status and detail without traceback for HTTPStatus errors, and without HTTP request logs

As a result, invalid input like

kbatch job submit --name=List-files --command='["ls", "-lh"]' --image=alpine

goes from producing large amounts of output with no actionable info because of a generic 500 error and traceback:

[18:02:21] INFO     HTTP Request: GET https://pccompute.westeurope.cloudapp.azure.com/compute/services/kbatch/jobs/logs/add-coastline-distances-job-q6pbj/ "HTTP/1.1 500 Internal Server Error"      _client.py:1026
Traceback (most recent call last):
  File "/Users/calkoen/mambaforge/envs/jl-full/bin/kbatch", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "/Users/calkoen/mambaforge/envs/jl-full/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/calkoen/mambaforge/envs/jl-full/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/Users/calkoen/mambaforge/envs/jl-full/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/calkoen/mambaforge/envs/jl-full/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/calkoen/mambaforge/envs/jl-full/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/calkoen/mambaforge/envs/jl-full/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/calkoen/mambaforge/envs/jl-full/lib/python3.11/site-packages/kbatch/cli.py", line 348, in logs
    result = _core.logs(pod_name, kbatch_url, token, read_timeout=read_timeout)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/calkoen/mambaforge/envs/jl-full/lib/python3.11/site-packages/kbatch/_core.py", line 196, in logs
    result = next(gen)
             ^^^^^^^^^
  File "/Users/calkoen/mambaforge/envs/jl-full/lib/python3.11/site-packages/kbatch/_core.py", line 233, in _logs
    r.raise_for_status()
  File "/Users/calkoen/mambaforge/envs/jl-full/lib/python3.11/site-packages/httpx/_models.py", line 761, in raise_for_status
    raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Server error '500 Internal Server Error' for url 'https://pccompute.westeurope.cloudapp.azure.com/compute/services/kbatch/jobs/logs/add-coastline-distances-job-q6pbj/'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/500

to a short(-ish, depending on the kubernetes error), actionable error:

$ kbatch job submit --name=List-files --command='["ls", "-lh"]' --image=alpine
kbatch-proxy error 422: Job.batch "List-files-45l74" is invalid: [metadata.generateName: Invalid value: "List-files-": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*'), metadata.name: Invalid value: "List-files-45l74": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')]

This is especially improved for 404s, where kbatch pod logs nosuchpod would produce the same big 500 traceback, but now gets the short and clear:

kbatch-proxy error 404: pods "nosuchpod" not found

I'd say this closes #52 . I don't think we need to support names k8s doesn't, and this gives users the right error to know what they need to change.

turns lots of user errors from vague 500 into actionable errors

e.g. 404 not found, invalid names, configmaps too big, etc.
show only the status and error message
@yuvipanda yuvipanda merged commit 1186f5c into kbatch-dev:main Sep 27, 2024
2 checks passed
This was referenced Sep 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Can't have names with uppercase letters
2 participants