Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attach OS user information to requests #371

Closed
georgipeev opened this issue Jun 27, 2024 · 6 comments · Fixed by #378
Closed

Attach OS user information to requests #371

georgipeev opened this issue Jun 27, 2024 · 6 comments · Fixed by #378
Labels
enhancement New feature or request

Comments

@georgipeev
Copy link

georgipeev commented Jun 27, 2024

It would be beneficial if clickhouse-connect included the OS user information in the requests made to the Clickhouse server, as that would improve server-side functionality related to logging, usage, compliance, security, audit trails, etc. For example, the related clickhouse-driver library does so.

@georgipeev georgipeev added the enhancement New feature or request label Jun 27, 2024
@genzgd
Copy link
Collaborator

genzgd commented Jun 30, 2024

Note that you can put whatever value you want (including the os.user or other identifying information) in the client_name argument when calling get_client, and that client name will be included in the HTTP User-Agent header. Using that header is closest we can get to the native protocol os_user field.

The default HTTP User-Agent header is set in clickhouse_connect/common.py

def build_client_name(client_name: str):
    product_name = get_setting('product_name')
    product_name = product_name.strip() + ' ' if product_name else ''
    client_name = client_name.strip() + ' ' if client_name else ''
    py_version = sys.version.split(' ', maxsplit=1)[0]
    return f'{client_name}{product_name}clickhouse-connect/{version()} (lv:py/{py_version}; os:{sys.platform})'

I'm a little nervous about automatically adding the os user to that string, both because it's already pretty long and there's arguably a security/privacy concern with passing that over the internet, but I could go either way. Is that the enhancement you're proposing?

@aadant
Copy link

aadant commented Jul 1, 2024

I agree that the client_name is not the right place to add this information.
Actually CH query log has a os_user field, you can just use that. It also have a query_id. The driver should be able to override it.

@genzgd
Copy link
Collaborator

genzgd commented Jul 1, 2024

That os_user field will not be populated by clickhouse-connect. It's only populated when using the TCP Native protocol.

@aadant
Copy link

aadant commented Jul 1, 2024

@genzgd good point, thanks ! I did not realize one was http protocol only and the other was native only.
see also #91

@aadant
Copy link

aadant commented Jul 1, 2024

so I guess the only option is to add it to the http_user_agent enabled by default with an option to disable it.

FYI I saw that in a CH instance recently :

http_user_agent: ClickHouse-JdbcDriver/0.6.0-patch5 (Windows 11/10.0; OpenJDK 64-Bit Server VM/JBR-17.0.11+1-1207.24-jcef; HttpURLConnection; rv:067ae0b)

@aadant
Copy link

aadant commented Jul 1, 2024

The concern is security behind a load balancer, if you have a generic user and a production issue, you want as much information as possible to trace it without going to potentially massive logs + it gets to the system.query_log. We get that "feature" with the clickhouse-driver over the native protocol and clickhouse-client.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants