[Bug] DAG Run not being marked as successful due to failing telemetry call #1438
Comments
The issue you're experiencing is related to the telemetry feature in Cosmos, which attempts to send data over HTTPS. The error occurs because the [...]. To resolve this, you can disable telemetry by setting the environment variable [...]. For a more permanent solution, you could improve the error handling in the [...].
Closes: #1438

A user reported that httpx raised an exception while trying to emit telemetry, which prevented the DAG run from being marked as successful. This PR aims to solve the issue:

```
  File "/usr/local/lib/python3.12/site-packages/cosmos/listeners/dag_run_listener.py", line 60, in on_dag_run_success
    telemetry.emit_usage_metrics_if_enabled(DAG_RUN, additional_telemetry_metrics)
  File "/usr/local/lib/python3.12/site-packages/cosmos/telemetry.py", line 73, in emit_usage_metrics_if_enabled
    is_success = emit_usage_metrics(metrics)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/cosmos/telemetry.py", line 50, in emit_usage_metrics
    response = httpx.get(telemetry_url, timeout=constants.TELEMETRY_TIMEOUT, follow_redirects=True)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/httpx/_api.py", line 198, in get
    return request(
           ^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/httpx/_api.py", line 106, in request
    return client.request(
           ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/httpx/_client.py", line 827, in request
    return self.send(request, auth=auth, follow_redirects=follow_redirects)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/httpx/_client.py", line 914, in send
    response = self._send_handling_auth(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/httpx/_client.py", line 942, in _send_handling_auth
    response = self._send_handling_redirects(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/httpx/_client.py", line 979, in _send_handling_redirects
    response = self._send_single_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/httpx/_client.py", line 1015, in _send_single_request
    response = transport.handle_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/httpx/_transports/default.py", line 232, in handle_request
    with map_httpcore_exceptions():
         ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/contextlib.py", line 158, in __exit__
    self.gen.throw(value)
  File "/usr/local/lib/python3.12/site-packages/httpx/_transports/default.py", line 86, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate in certificate chain (_ssl.c:1000)
```
Thank you very much for reporting this, @mjohansenwork!
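For readers following along, here is a minimal sketch of the kind of guard discussed in this thread: the telemetry sender is run inside a `try`/`except`, so a failure is logged and reported as `False` instead of propagating into the DAG run listener. This is not the code from the PR. The name `emit_safely` and the injected `send` callable are hypothetical, chosen so the sketch runs without httpx installed; in Cosmos the sender would be the `httpx.get(...)` call shown in the traceback, and the exception type caught there may be narrower than the broad `Exception` used here.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def emit_safely(send: Callable[[], bool]) -> bool:
    """Run a telemetry sender and return False instead of raising.

    `send` is a hypothetical stand-in for the real HTTP call. It should
    return True on success.
    """
    try:
        return bool(send())
    except Exception as exc:
        # Deliberately broad (an assumption of this sketch): telemetry is
        # best-effort and should never fail the surrounding DAG run.
        logger.warning("Telemetry call failed and was ignored: %s", exc)
        return False


def failing_sender() -> bool:
    # Simulates the SSL failure from the report without touching the network.
    raise ConnectionError("certificate verify failed")


print(emit_safely(failing_sender))
print(emit_safely(lambda: True))
```

With a guard like this, the listener hook receives a boolean and can carry on whether or not the telemetry endpoint is reachable.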
Handle errors similar to:

```
  File "/usr/local/lib/python3.12/site-packages/httpx/_api.py", line 198, in get
    return request(
           ^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/httpx/_api.py", line 106, in request
    return client.request(
           ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/httpx/_client.py", line 827, in request
    return self.send(request, auth=auth, follow_redirects=follow_redirects)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/httpx/_client.py", line 914, in send
    response = self._send_handling_auth(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/httpx/_client.py", line 942, in _send_handling_auth
    response = self._send_handling_redirects(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/httpx/_client.py", line 979, in _send_handling_redirects
    response = self._send_single_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/httpx/_client.py", line 1015, in _send_single_request
    response = transport.handle_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/httpx/_transports/default.py", line 232, in handle_request
    with map_httpcore_exceptions():
         ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/contextlib.py", line 158, in __exit__
    self.gen.throw(value)
  File "/usr/local/lib/python3.12/site-packages/httpx/_transports/default.py", line 86, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate in certificate chain (_ssl.c:1000)
```

As observed in Cosmos: astronomer/astronomer-cosmos#1438
Astronomer Cosmos Version
1.8.0
dbt-core version
1.8.6
Versions of dbt adapters
No response
LoadMode
AUTOMATIC
ExecutionMode
AWS_EKS
InvocationMode
None
airflow version
2.10.3
Operating System
Astro docker image
If you think it's a UI issue, what browsers are you seeing the problem on?
No response
Deployment
Other Docker-based deployment
Deployment details
Astronomer local docker image
What happened?
After updating cosmos from 1.7.0 to 1.8.0, the DAG run is not being marked as successful even after the final task succeeds. The scheduler logs show an HTTPS error when the DAG run telemetry listener tries to submit telemetry data. This is likely because the telemetry URL is not permitted by the firewall on my machine. The issue appears to be related to the telemetry support added in #1397. Specifically, it seems like the `on_dag_run_success()` hook throws an exception when the HTTPS call fails, which prevents Airflow from proceeding with the DAG run lifecycle and actually marking the DAG run as successful.

I tried setting the env var `DO_NOT_TRACK=True` to disable telemetry collection and that seems to have resolved the issue.
Relevant log output
How to reproduce
This depends on the particular firewall configuration on our machines, so it's likely not easily reproducible by external parties.
Anything else :)?
It looks like there is some error handling code here https://github.com/astronomer/astronomer-cosmos/pull/1397/files#diff-e39094327c419564d75b9530a764f213c57e83b62492e39d3fd042344b779458R50 but it doesn't seem to handle the case where `httpx.ConnectError: [SSL: CERTIFICATE_VERIFY_FAILED]` is raised by the HTTP call. Perhaps the call needs to be wrapped in a `try`/`except` block.
Are you willing to submit PR?
Contact Details
No response