Fingerprinting with user_agent (GDPR) #5264

Closed
remusao opened this issue Apr 17, 2018 · 4 comments
Labels
auto-locked Outdated issues that have been locked by automation

Comments

@remusao

remusao commented Apr 17, 2018

  • Pip version: 10.0.0 (and before)
  • Python version:
  • Operating system:

Description:

I stumbled upon the user_agent function from pip/_internal/download and the combination of values returned seems to be pretty detailed. In my case it looks like this:

pip/10.0.0 {"cpu":"x86_64","distro":{"id":"Xenial Xerus","libc":{"lib":"glibc","version":"2.23"},"name":"Ubuntu","version":"16.04"},"implementation":{"name":"CPython","version":"3.6.5"},"installer":{"name":"pip","version":"10.0.0"},"openssl_version":"OpenSSL 1.0.2g  1 Mar 2016","python":"3.6.5","setuptools_version":"39.0.1","system":{"name":"Linux","release":"4.13.0-38-generic"}}

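You can obtain yours using the following command (on pip 10; on pip 9 and earlier the function lives in pip.download instead):

$ python -c 'from pip._internal.download import user_agent; print(user_agent())'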

It contains the following information:

  • pip version
  • cpu architecture
  • operating system (Ubuntu 16.04)
  • glibc version
  • CPython version
  • openssl version
  • setuptools version
  • kernel version
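
To make the list above concrete, here is a minimal sketch that splits a pip User-Agent string into its product token and JSON payload, then prints every field. It assumes the `pip/<version> <json>` layout seen in the sample string quoted earlier in this report; `parse_pip_user_agent` and `flatten` are hypothetical helper names, not part of pip.

```python
import json

# Sample User-Agent string copied from this report.
SAMPLE_UA = (
    'pip/10.0.0 {"cpu":"x86_64","distro":{"id":"Xenial Xerus",'
    '"libc":{"lib":"glibc","version":"2.23"},"name":"Ubuntu","version":"16.04"},'
    '"implementation":{"name":"CPython","version":"3.6.5"},'
    '"installer":{"name":"pip","version":"10.0.0"},'
    '"openssl_version":"OpenSSL 1.0.2g  1 Mar 2016","python":"3.6.5",'
    '"setuptools_version":"39.0.1",'
    '"system":{"name":"Linux","release":"4.13.0-38-generic"}}'
)


def parse_pip_user_agent(ua):
    """Split 'pip/<version> <json>' into name, version and decoded payload."""
    product, _, payload = ua.partition(" ")
    name, _, version = product.partition("/")
    return name, version, json.loads(payload)


def flatten(data, prefix=""):
    """Yield (dotted_key, value) pairs for every leaf of a nested dict."""
    for key, value in data.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            yield from flatten(value, path + ".")
        else:
            yield path, value


name, version, info = parse_pip_user_agent(SAMPLE_UA)
fields = dict(flatten(info))
for key in sorted(fields):
    print(f"{key}: {fields[key]}")
```

Each printed line is one attribute that travels with every request pip makes to the package index.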

I would expect such a combination to be unique, or at least shared by only a small number of users, in which case it could serve as an implicit identifier of users. Even so, I assume the information linked to it is not sensitive, although that is hard to say without access to the logs. If it can be used as an identifier (even an implicit one), then the data collection may fall under the General Data Protection Regulation (GDPR), at least in Europe. I understand this data is very valuable, but to stay on the safe side it would be nice to understand the following:

  1. Can these fingerprints be used to identify users (at least some of them)?
  2. Does this fall under GDPR?
  3. What would be the implications regarding this data collection (opt-out, access to the data for users, etc.)?
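
As a rough illustration of the first question, the sketch below hashes each User-Agent string into a short fingerprint and counts how many log entries share it. The log entries are made up for illustration and are not real data; `fingerprint` and `anonymity_sets` are hypothetical names. Counting requests is only a proxy for counting users, so this shows the idea rather than measuring anything.

```python
import hashlib
from collections import Counter


def fingerprint(user_agent):
    """Return a short, stable digest of a User-Agent string."""
    return hashlib.sha256(user_agent.encode("utf-8")).hexdigest()[:16]


def anonymity_sets(user_agents):
    """Count how many log entries share each fingerprint."""
    return Counter(fingerprint(ua) for ua in user_agents)


# Hypothetical request log: these strings are invented for illustration only.
HYPOTHETICAL_LOG = [
    'pip/10.0.0 {"cpu":"x86_64","python":"3.6.5"}',
    'pip/10.0.0 {"cpu":"x86_64","python":"3.6.5"}',
    'pip/9.0.3 {"cpu":"armv7l","python":"2.7.14"}',
]

sets = anonymity_sets(HYPOTHETICAL_LOG)
unique = [fp for fp, count in sets.items() if count == 1]
print(f"{len(sets)} distinct fingerprints, {len(unique)} seen only once")
```

A fingerprint that only ever appears for a single source is the case where the string could act as an implicit identifier; the more entries share a fingerprint, the weaker that link becomes.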

Thank you for your time.

@jwilk
Contributor

jwilk commented Apr 28, 2018

See #4265.

@pfmoore
Member

pfmoore commented May 10, 2018

If you have concerns about this, you would need to seek legal advice. The pip committers are not lawyers, and so cannot really comment.

@dstufft
Member

dstufft commented May 10, 2018

I'm going to go ahead and close this, for the reasons @pfmoore mentioned, and also because, as I understand it, GDPR targets service operators, not software. For a similar issue, but on PyPI (with the assumption you're using PyPI), see pypi/warehouse#3532.

dstufft closed this as completed May 10, 2018
@lock

lock bot commented Jun 2, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot added the auto-locked (Outdated issues that have been locked by automation) label Jun 2, 2019
lock bot locked as resolved and limited conversation to collaborators Jun 2, 2019