Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

python(go?): Error when using TLS with FlightSQL ADBC driver #715

Closed
aiguofer opened this issue May 30, 2023 · 8 comments
Closed

python(go?): Error when using TLS with FlightSQL ADBC driver #715

aiguofer opened this issue May 30, 2023 · 8 comments

Comments

@aiguofer
Copy link
Contributor

I've been trying to do some testing using the Python ADBC driver against our FlightSQL service and I've run into issues when TLS is enabled. If I disable TLS and test locally it works as expected, but with TLS enabled I get some very strange errors.

If I try to hit our service in Prod, where the TLS handshake happens in our load balancer, this is the error on the client:

Traceback (most recent call last):
  File "/Users/aiguofer/work/dbtlabs/sl-example-clients/python/src/adbc_example.py", line 49, in <module>
    main(sys.argv)
  File "/Users/aiguofer/work/dbtlabs/sl-example-clients/python/src/adbc_example.py", line 40, in main
    print(conn.adbc_get_info())
          ^^^^^^^^^^^^^^^^^^^^
  File "/Users/aiguofer/Library/Application Support/hatch/env/virtual/afs-example/vPTxZbV9/afs-example/lib/python3.11/site-packages/adbc_driver_manager/dbapi.py", line 348, in adbc_get_info
    handle = self._conn.get_info()
             ^^^^^^^^^^^^^^^^^^^^^
  File "adbc_driver_manager/_lib.pyx", line 688, in adbc_driver_manager._lib.AdbcConnection.get_info
  File "adbc_driver_manager/_lib.pyx", line 385, in adbc_driver_manager._lib.check_error
adbc_driver_manager._lib.ProgrammingError: ADBC_STATUS_UNAUTHENTICATED (13): [FlightSQL] rpc error: code = Unauthenticated desc = Invalid credentials

However, when I look in DataDog I see the request and the correct response with a status 0. What's strange is that the Invalid credentials error is one that we actually do send back from the authenticate function, but I don't see any traces or logs in DataDog returning this error when I test this.

I tried to debug this locally by enabling TLS with a self signed cert. When I try this, the logs in my service seem like everything worked fine:

13:54:15.485 [grpc-nio-worker-ELG-3-3] DEBUG io.grpc.netty.NettyServerHandler [id: 0x86eb57a0, L:/127.0.0.1:9876 - R:/127.0.0.1:64926] OUTBOUND HEADERS: streamId=3 headers=GrpcHttp2OutboundHeaders[:status: 200, content-type: application/grpc, grpc-encoding: identity, grpc-accept-encoding: gzip] padding=0 endStream=false | MDC:  | extra: 
13:54:15.485 [flight-server-default-executor-1] INFO  c.d.s.logging.LoggingMiddleware Call completed for method='GET_FLIGHT_INFO' | extra: {duration=2, callStatus={code=OK, cause=null, description=, metadata=org.apache.arrow.flight.ErrorFlightMetadata@12087791}}
13:54:15.485 [grpc-nio-worker-ELG-3-3] DEBUG io.grpc.netty.NettyServerHandler [id: 0x86eb57a0, L:/127.0.0.1:9876 - R:/127.0.0.1:64926] OUTBOUND DATA: streamId=3 padding=0 endStream=false length=1143 bytes=00000004720aa007ffffffff980300001000000000000a000e0006000d0008000a000000000004001000000000010a000c000000080004000a00000008000000... | MDC:  | extra: 
13:54:15.485 [grpc-nio-worker-ELG-3-3] DEBUG io.grpc.netty.NettyServerHandler [id: 0x86eb57a0, L:/127.0.0.1:9876 - R:/127.0.0.1:64926] OUTBOUND HEADERS: streamId=3 headers=GrpcHttp2OutboundHeaders[grpc-status: 0] padding=0 endStream=true | MDC:  | extra: 
13:54:15.486 [grpc-nio-worker-ELG-3-3] DEBUG io.grpc.netty.NettyServerHandler [id: 0x86eb57a0, L:/127.0.0.1:9876 - R:/127.0.0.1:64926] INBOUND WINDOW_UPDATE: streamId=0 windowSizeIncrement=1143 | MDC:  | extra: 
13:54:15.486 [grpc-nio-worker-ELG-3-3] DEBUG io.grpc.netty.NettyServerHandler [id: 0x86eb57a0, L:/127.0.0.1:9876 - R:/127.0.0.1:64926] INBOUND PING: ack=false bytes=145258749040133895 | MDC:  | extra: 
13:54:15.486 [grpc-nio-worker-ELG-3-3] DEBUG io.grpc.netty.NettyServerHandler [id: 0x86eb57a0, L:/127.0.0.1:9876 - R:/127.0.0.1:64926] OUTBOUND PING: ack=true bytes=145258749040133895 | MDC:  | extra: 

But on the client side I get:

Traceback (most recent call last):
  File "/Users/aiguofer/work/dbtlabs/sl-example-clients/python/src/adbc_example.py", line 52, in <module>
    main(sys.argv)
  File "/Users/aiguofer/work/dbtlabs/sl-example-clients/python/src/adbc_example.py", line 43, in main
    print(conn.adbc_get_info())
          ^^^^^^^^^^^^^^^^^^^^
  File "/Users/aiguofer/Library/Application Support/hatch/env/virtual/afs-example/vPTxZbV9/afs-example/lib/python3.11/site-packages/adbc_driver_manager/dbapi.py", line 348, in adbc_get_info
    handle = self._conn.get_info()
             ^^^^^^^^^^^^^^^^^^^^^
  File "adbc_driver_manager/_lib.pyx", line 688, in adbc_driver_manager._lib.AdbcConnection.get_info
  File "adbc_driver_manager/_lib.pyx", line 385, in adbc_driver_manager._lib.check_error
adbc_driver_manager._lib.OperationalError: ADBC_STATUS_IO (10): [FlightSQL] rpc error: code = Unavailable desc = connection error: desc = "error reading server preface: EOF"

Here's the code for the example client, and it takes in a single argument which is a JDBC connection string like jdbc:arrow-flight-sql://localhost:9876?customId=<id>&token=<token>.

import sys
from dataclasses import dataclass
from urllib.parse import parse_qs, urlparse

from adbc_driver_flightsql import DatabaseOptions
from adbc_driver_flightsql.dbapi import connect


@dataclass
class ConnAttr:
    host: str
    params: dict
    auth_header: str


def parse_jdbc_uri(uri):
    parsed = urlparse(uri)
    params = {k.lower(): v[0] for k, v in parse_qs(parsed.query).items()}
    return ConnAttr(
        host=parsed.path.replace("arrow-flight-sql", "grpc")
        if params.pop("useencryption") == "false"
        else parsed.path.replace("arrow-flight-sql", "grpc+tls"),
        params=params,
        auth_header=f"Bearer {params.pop('token')}",
    )


def main(args):
    conn_attr = parse_jdbc_uri(args[1])
    with connect(
        conn_attr.host,
        db_kwargs={
            DatabaseOptions.AUTHORIZATION_HEADER.value: conn_attr.auth_header,
            DatabaseOptions.TLS_SKIP_VERIFY.value: "true",
            **{
                f"{DatabaseOptions.RPC_CALL_HEADER_PREFIX.value}{k}": v
                for k, v in conn_attr.params.items()
            },
        },
    ) as conn:
        print(conn.adbc_get_info())


if __name__ == "__main__":
    main(sys.argv)

I feel like if I knew Go I might be able to try some debugging within the Go driver, but unfortunately I've got no experience with it. Any help on how I might further debug would be greatly appreciated!

@lidavidm lidavidm added this to the ADBC Libraries 0.5.0 milestone May 31, 2023
@lidavidm
Copy link
Member

Hmm, the Go driver needs to include more diagnostic info about which server it's talking to.

But one off-the-top-of-my-head guess is that the FlightInfo might be using a grpc:// scheme instead of grpc+tls:// (or vice versa)?

@lidavidm
Copy link
Member

Also, I agree the driver should make it easier to debug what's going on...There is an issue for adding more logging in #492

@aiguofer
Copy link
Contributor Author

My initial guess was that it might be something related to the scheme, but it seems to establish the TLS connection just fine and the server handles the request correctly and responds correctly (based on logs). The issue seems to be while the client receives the response. Some logging would certainly help!

@lidavidm
Copy link
Member

There's two requests. The initial connection may be fine but when the client goes to fetch the data, it connects based on the info in the server's FlightInfo response. If the info in that response is using the wrong scheme you may see errors like this.

@aiguofer
Copy link
Contributor Author

Ahhh that was it! We were using the same Location (Location.forGrpcInsecure("0.0.0.0", config.applicationPort)) for both FlightServer.builder and for the FlightEndpoint lists that we send back for FlightInfo requests in the FlightSqlProducer. The reason I was seeing the invalid credentials error when hitting prod is because I had my local server running, so the DO_GET request was going to my local machine using the prod credentials.

Interestingly enough, this works fine when using the JDBC driver.... does that mean that the JDBC driver is ignoring the flight endpoints provided by GET_FLIGHT_INFO?

@lidavidm
Copy link
Member

Unfortunately, yes. The JDBC driver is not compliant with the Flight SQL spec: apache/arrow#34532

@lidavidm
Copy link
Member

And yes, stuff like this is why I'd like to improve the logging and add more context in error messages so it's clear what failed in a potentially multi-request chain (e.g. with a prepared statement, you might see 5 requests...)

@aiguofer
Copy link
Contributor Author

Ahhh that makes sense! Thanks for the help! Good thing for us we're just using the same server for now, but if I get some time I might try to fix that on the JDBC side. The only potential problem would be if servers aren't correctly sending the location back in FlightInfo requests this will cause clients to break, so it might need some sort of fallback logic if it detects that the main server is hosted elsewhere and the FlightInfo specifies a localhost endpoint.

Closing this issue now that I know what's going on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants