API key dispenser #2145

Open: wants to merge 9 commits into base: master

Conversation

drolbr
Contributor

@drolbr drolbr commented Feb 15, 2019

This is an implementation of the API key dispenser intended to help with GDPR requirements.

It is as minimal as possible:

  • OSM users can register services by submitting a URI for the service.
  • OSM users can request an API key for each registered service.
  • For each registered service there is a stream of new and revoked keys, exposed via the new API endpoint third_party_services/keys with a mechanism similar to the minute updates.

The data held in the main DB is:

  • a table of services where the URI is unique
  • a ledger of keys, implemented as two tables: third_party_keys holds the actual data, and third_party_key_events is used as a PG sequence to properly order create and revoke events
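
To illustrate the ledger idea, here is a minimal in-memory sketch in Python. It is not the Rails code from this pull request: the class and method names (KeyLedger, create_key, revoke_key, events_since) are hypothetical, and a plain counter stands in for the PG sequence. It only shows how a monotonically increasing event id lets a service replay create and revoke events in order, picking up where it left off.

```python
import itertools
import secrets


class KeyLedger:
    """Hypothetical stand-in for the key ledger; not the PR's schema."""

    def __init__(self) -> None:
        # Plays the role of the sequence: every event gets the next id.
        self._next_event_id = itertools.count(1)
        self.events = []  # list of (event_id, kind, key) tuples

    def create_key(self) -> str:
        key = secrets.token_hex(20)  # random opaque key (assumed format)
        self.events.append((next(self._next_event_id), "create", key))
        return key

    def revoke_key(self, key: str) -> None:
        self.events.append((next(self._next_event_id), "revoke", key))

    def events_since(self, last_seen_id: int) -> list:
        """Return all events newer than the id the consumer last processed."""
        return [event for event in self.events if event[0] > last_seen_id]
```

A consuming service would remember the highest event id it has applied and ask only for newer events on the next poll.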

Tests with full coverage have been added. Please still take the whole thing with a grain of salt: I have little Rails experience. For example, CanCan insisted in various error messages that I should add skip_authorization_check to every controller, but this does not sound like the proper solution.

@woodpeck
Contributor

Can you explain the "big picture" of this, or if it has already been explained elsewhere, link to it?

If we implement access restrictions to certain parts of the web interface as detailed in https://wiki.openstreetmap.org/wiki/GDPR/Affected_Services, is this API key scheme expected to let people access the various endpoints listed there? If yes, how would you suggest that the API key is transported from the client to the server, and how would it be checked?

If an API key gives you special powers (namely to access data governed by an agreement that you as a logged-in user have "signed", to safeguard GDPR limits), then should there perhaps be an "explainer" when you create an API key that basically says: be careful with this, by generating it you take responsibility for all access made with that key, etc.?

Would it make sense for API keys to expire if not renewed regularly?

@tomhughes
Member

tomhughes commented Feb 15, 2019

Yes this is definitely a case where the problem, and proposed solution, should have been discussed before starting work!

@mmd-osm
Contributor

mmd-osm commented Feb 15, 2019

I think Roland described the concept here: https://lists.openstreetmap.org/pipermail/osmf-talk/2018-October/005323.html, and more specifically here: https://dev.overpass-api.de/misc/authentication.pdf

More details and a prototype implementation: https://wiki.openstreetmap.org/wiki/Overpass_API/GDPR

I would have preferred a solution based on a more standard protocol, like JSON Web Tokens - https://jwt.io/ - and assert the origin of the tokens by means of public/private keys only.

@tomhughes
Member

Yes rolling our own cryptographic protocol is absolutely definitely a terrible idea.

@tomhughes
Member

tomhughes commented Feb 15, 2019

Note that OAuth 2.0 actually uses JWT as part of the implementation and, contrary to what that document suggests, we have no objection to OAuth 2.0 as far as I know. I have always assumed we should move to it but have just never got around to doing so!

@mmd-osm
Contributor

mmd-osm commented Feb 15, 2019

For the prototype, the overall process looked as follows. I assume that the Rails based implementation is very similar to this.

  1. Log on to https://olbricht.nrw/mapper.html
  2. Request a new API key for dev.overpass-api.de
  3. Use this key in an Overpass API query, e.g. http://overpass-turbo.eu/s/G8R (does not work atm)
  4. The Overpass API server checks whether the API key is known; if not, it connects back to the central key server for validation. Once the key has been retrieved successfully, it is cached locally on the Overpass API instance.
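
Step 4 can be sketched roughly as follows. This is only an illustration in Python, not the prototype's code: KeyChecker and the fetch_from_central callback are hypothetical names, and the callback stands in for whatever request the Overpass API instance would make to the central key server.

```python
from typing import Callable, Set


class KeyChecker:
    """Hypothetical local cache in front of a central key server lookup."""

    def __init__(self, fetch_from_central: Callable[[str], bool]) -> None:
        self._fetch_from_central = fetch_from_central
        self._known_keys: Set[str] = set()

    def is_valid(self, api_key: str) -> bool:
        # Known keys are answered locally, without contacting the central server.
        if api_key in self._known_keys:
            return True
        # Unknown keys trigger a call back to the central server;
        # only successfully validated keys are cached.
        if self._fetch_from_central(api_key):
            self._known_keys.add(api_key)
            return True
        return False
```

Note that this sketch never forgets a key, so revocations would have to reach the cache through some other channel.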

My suggestion was that a central server (the OSM website) issues a JWT with a limited validity period. This token is signed by the OSM central server; the Overpass API server holds a public key to verify the origin of the token.

When sending a query to the Overpass API, the client needs to provide the JWT token along with the query. All subsequent queries don't require any connection to the central server, as long as the token is still valid. Typically, such a token could be valid for a few hours, or maybe days.
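
A rough sketch of the offline check, using only the Python standard library. Important assumption: the standard library has no public-key signatures, so HMAC-SHA256 (the HS256 JWT algorithm) stands in here. With the public/private key setup suggested above, the Overpass side would hold only a public key and use an asymmetric algorithm instead. The function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(secret: bytes, lifetime_s: int, now: float) -> str:
    """Issuer side: sign a token that carries only an expiry time."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps({"exp": int(now) + lifetime_s}).encode())
    signing_input = f"{header}.{payload}".encode("ascii")
    signature = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(signature)}"


def verify_token(secret: bytes, token: str, now: float) -> bool:
    """Verifier side: check signature and expiry without any network call."""
    try:
        header, payload, signature = token.split(".")
        signing_input = f"{header}.{payload}".encode("ascii")
        expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(signature)):
            return False
        claims = json.loads(_b64url_decode(payload))
        return now < claims["exp"]
    except (ValueError, KeyError, TypeError):
        # Malformed tokens are simply treated as invalid.
        return False
```

The point of the design is that verify_token needs no connection to the central server as long as the token has not expired.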

@drolbr
Contributor Author

drolbr commented Feb 15, 2019

This is unrelated to the above-mentioned document.

This is about
https://wiki.openstreetmap.org/wiki/Overpass_API/GDPR

Baseline: Overpass API (and potentially other downstream data processors) shall hand out user data only to those of its clients that are OSM users. The solution approved by the OSMF board is that the service sees an anonymized key from the client, and the OSMF wants to keep the association between key and user name private, in particular hidden from the service. This pull request is intended to implement that.

@drolbr
Contributor Author

drolbr commented Feb 15, 2019

By the way: Can someone please have a look why Travis complains? I see "1199 runs, 363556 assertions, 0 failures, 0 errors, 0 skips" as the result of the job log.

@mmd-osm
Contributor

mmd-osm commented Feb 15, 2019

> Can someone please have a look why Travis complains? I see "1199 runs, 363556 assertions, 0 failures, 0 errors, 0 skips" as the result of the job log.

That's caused by 128 code style issues, starting here: https://travis-ci.org/openstreetmap/openstreetmap-website/builds/493663483#L2958

@mmd-osm
Contributor

mmd-osm commented Feb 16, 2019

> Would it make sense for API keys to expire if not renewed regularly?

Most definitely yes.

As it stands, allowing API keys to be embedded in an Overpass query (e.g. [api_key:"e10359b7d832e21c04c3aa17cbab45c37d7d6244"];) is asking for trouble. An unsuspecting user may hit the "Share" button in overpass turbo, at which time that key can be trivially harvested by any third party.

overpass turbo's share links were never designed to store any kind of secret information. What's worse, once the API key has leaked this way, the unsuspecting user still has absolutely no idea that someone else might abuse their API key (!).

I find it really hard to imagine that all users manage to keep their API key secret, and know when to create a new API key which would then invalidate their old API key.

Time limiting any kind of token would be one essential step to limit the effects of a leaked token.

@tomhughes
Member

Well a "share" button that shares a link that doesn't work is kind of useless anyway! So expiring the token and hence breaking the link seems like the wrong solution.

@mmd-osm
Contributor

mmd-osm commented Feb 16, 2019

Well, the proper solution in that case would be some additional logic in overpass turbo to inject the API key of the respective user when calling the Overpass API, and never store the API key in the query itself, let alone include it in a share link.

@drolbr
Contributor Author

drolbr commented Feb 17, 2019

> Well a "share" button that shares a link that doesn't work is kind of useless anyway! So expiring the token and hence breaking the link seems like the wrong solution.

There is no GDPR-friendly way to have such a link anyway. Osm.org, Overpass API, or both would need to start tracking users on third-party sites to make such a link work, depending on the implementation. An important concern in such a situation is a substantial and permanent increase in support requests, because the tracking is at odds with browser settings or extensions.

Once Overpass Turbo uses the usual moustache syntax, i.e. [api_key:{{apikey}}], and stores the key as a first-party cookie (or whatever), we get out of the tracking situation because it is bilateral between the user and Overpass Turbo.
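
As a sketch of what that substitution could look like: the stored or shared query keeps only the {{apikey}} placeholder, and the key is filled in just before the request is sent. This is written in Python purely for illustration (overpass turbo itself is a browser app), inject_api_key is a hypothetical name, and quoting the key is an assumption based on the [api_key:"..."] form quoted earlier in this thread.

```python
import re


def inject_api_key(query: str, api_key: str) -> str:
    """Replace the {{apikey}} placeholder with the user's quoted key."""
    # A function is used as the replacement so the key is inserted literally,
    # without being interpreted as a regex replacement template.
    return re.sub(r"\{\{\s*apikey\s*\}\}", lambda match: '"' + api_key + '"', query)


shared_query = "[api_key:{{apikey}}];node(1);out meta;"
sent_query = inject_api_key(shared_query, "abc123")
```

Only sent_query ever contains the key; shared_query, which is what a share link would carry, does not.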

@mmd-osm
Contributor

mmd-osm commented Dec 19, 2019

https://www.openstreetmap.org/user/@kevin_bullock/diary/391652 brings up more use cases for API keys (management). Along with the ideas to introduce some API key support for tile.osm.org, it would probably make sense to move towards one simple solution that covers a number of use cases.

/cc @grischard, @simonpoole

@westnordost

Just want to add another use case that I am not sure this PR covers:

An API that provides OSM-derived data (e.g. from changesets) wants to verify whether the client querying the data has an access token for OSM.
For example, think of an API for HDYC.

@mmd-osm
Contributor

mmd-osm commented Oct 11, 2020

I believe once we have OAuth 2.0 in place, we could validate an oauth2 token that has been issued by osm.org by calling the introspect endpoint (assuming this will be available in the future).

Here's a quick example for illustration purposes (actual output, not faked):

  1. Overpass API receives a query with an HTTP Authorization header: Bearer ASIKSMtZ67n2d7FaM5pYRQOLkNqZOfaYDQn-aB1OCCE

  2. Overpass API validates the token against the OAuth introspection endpoint on osm.org:

curl -F client_id=zQyq4UbbrCMjShugI1BbYmJ_JQZKnDLj3iZjMVSEB8o -F client_secret=rTDU2cPJ284WL41yYIiPXqzvre2MXjovU3B4WX-zbN4 -F token=ASIKSMtZ67n2d7FaM5pYRQOLkNqZOfaYDQn-aB1OCCE -X POST http://localhost:3000/oauth2/introspect

Token valid:

{"active":true,"scope":"read","client_id":"zQyq4UbbrCMjShugI1BbYmJ_JQZKnDLj3iZjMVSEB8o","token_type":"Bearer","exp":1602439905,"iat":1602432705,"username":"mmd3"}

In case of an invalid token, the server would return:

{"active":false}

"username":"mmd3" is not part of the doorkeeper.rb default and was only added here as an example.

Results could probably be cached for some time to avoid repeatedly calling the osm.org endpoint.
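
One possible shape for such a cache, sketched in Python. The names (IntrospectionCache, the introspect callback, max_age_s) are hypothetical; the callback stands in for the POST to the introspection endpoint and is expected to return the parsed JSON shown above. A cached answer is reused for at most max_age_s seconds, and for active tokens never beyond the token's own exp.

```python
from typing import Callable, Dict, Tuple


class IntrospectionCache:
    """Hypothetical cache for token introspection results."""

    def __init__(self, introspect: Callable[[str], dict], max_age_s: int = 300) -> None:
        self._introspect = introspect
        self._max_age_s = max_age_s
        # token -> (timestamp until which the cached answer may be used, answer)
        self._entries: Dict[str, Tuple[float, bool]] = {}

    def is_active(self, token: str, now: float) -> bool:
        cached = self._entries.get(token)
        if cached is not None and now < cached[0]:
            return cached[1]
        response = self._introspect(token)
        active = bool(response.get("active", False))
        valid_until = now + self._max_age_s
        if active and "exp" in response:
            # Never trust a cached "active" answer past the token's expiry.
            valid_until = min(valid_until, float(response["exp"]))
        self._entries[token] = (valid_until, active)
        return active
```

The trade-off is that a revoked token may still be accepted until its cache entry runs out, so max_age_s bounds how stale an answer can be.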

client_id and secret have been defined as part of an "Overpass test" application in doorkeeper before (http://localhost:3000/oauth2/applications).

[Screenshot from 2020-10-11 19-18-20]

@tomhughes
Member

Surely if you're going to use OAuth 2 you would just ask for the read prefs scope and then fetch the user details like everybody else that authenticates against osm - no need for token introspection.

@mmd-osm
Contributor

mmd-osm commented Oct 11, 2020

Yes, Overpass could use that token to query the user details endpoint (see screenshot below).

The only difference I see is that Overpass wouldn't be able to figure out whether the token was requested for an entirely different app and just passed on to Overpass. With the introspection endpoint, you could ensure that the token matches the application that was previously registered for Overpass purposes. I think that was one of the assumptions Roland made, hence I included it in my example.
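
In code, that extra check is small. A Python sketch, assuming the introspection response has the shape shown in the earlier example (an active flag plus a client_id); the function name is hypothetical:

```python
import json


def token_issued_for(response: dict, expected_client_id: str) -> bool:
    """Accept only active tokens that were issued to the expected application."""
    return response.get("active") is True and response.get("client_id") == expected_client_id


# Response copied from the introspection example earlier in this thread.
example = json.loads(
    '{"active":true,"scope":"read",'
    '"client_id":"zQyq4UbbrCMjShugI1BbYmJ_JQZKnDLj3iZjMVSEB8o",'
    '"token_type":"Bearer","exp":1602439905,"iat":1602432705,"username":"mmd3"}'
)
overpass_client_id = "zQyq4UbbrCMjShugI1BbYmJ_JQZKnDLj3iZjMVSEB8o"
```

A token that is active but was issued to some other client_id would be rejected, which is exactly what the user details endpoint alone cannot tell you.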

[Screenshot from 2020-10-11 19-40-03]

@tomhughes
Member

I'm pretty sure you can't just pass tokens around - the request will have to be signed by the client that the token was issued to!

@mmd-osm
Contributor

mmd-osm commented Oct 11, 2020

Signing requests was a topic for OAuth 1.0a. Bearer tokens in 2.0 would give you access to a resource server (see example below), that's why you need to safeguard them. RFC 6750 has more details on security threats related to Bearer Token usage.

curl  -H "Authorization: Bearer oUA-D-78IXuB9c2TM5BdGtAdLcUih5FXUIWl6Lb8V0g" http://localhost:3000/api/0.6/user/details.json
{"user":{"id":1,"display_name":"mmd2","account_created":"2017-12-05T17:28:53Z","description":"Hello!","contributor_terms":{"agreed":true},"roles":["moderator","administrator"],"changesets":{"count":1706},"traces":{"count":78},"blocks":{"received":{"count":1,"active":0},"issued":{"count":20,"active":0}}}}

@tomhughes
Member

But even then that's only an issue if (a) a site exposes the authentication token to the end user and (b) another site allows the user to provide their own token rather than requiring them to go through the authentication flow to generate a token.

So even if the first type of site exists, any site that wants to ensure it has a valid token can just not provide a way for you to input a token and instead require you to go through the Authorization Code Flow to generate one.

@mmd-osm
Contributor

mmd-osm commented Oct 11, 2020

I thought I could add some more details about the main use case we have in mind.

Let's assume an overpass turbo user wants to query for some OSM data that includes metadata. Due to GDPR requirements, metadata wouldn't be available to anonymous users anymore in the future. Here's where Overpass API would require the user to present a proof of an osm.org account, ideally without disclosing the user's id.

Steps are:

  1. download overpass turbo as a single page web-app from overpass-turbo.eu, app lives in your browser only
  2. in the app: authorize access to an Overpass API application against osm.org. Application has been registered upfront via osm.org/oauth2/applications (not yet available)
  3. browser to send query along with a token to Overpass API, running on overpass-api.de
  4. Overpass API needs to find out that the token is valid and that the user sending the request has an account on osm.org. The actual user name is not needed
  5. Overpass API returns additional metadata in the positive case.

From what I've read, the Authorization Code Flow with PKCE would be relevant here.
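
For reference, the client-side part of PKCE (RFC 7636, S256 method) is just a random verifier and its hashed challenge. A minimal Python sketch follows; the function names are hypothetical, and a browser app such as overpass turbo would do the equivalent in JavaScript.

```python
import base64
import hashlib
import secrets


def challenge_for(verifier: str) -> str:
    """S256 code challenge: base64url(SHA-256(verifier)) without padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> tuple:
    """Return (code_verifier, code_challenge) for one authorization request."""
    # 32 random bytes give a 43-character verifier, the minimum length allowed.
    verifier = secrets.token_urlsafe(32)
    return verifier, challenge_for(verifier)
```

The app sends the challenge with the authorization request and the verifier with the token request, so an intercepted authorization code is useless without the verifier and no client secret has to be shipped to the browser.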

@mmd-osm
Contributor

mmd-osm commented Jul 2, 2021

(moved to #3245)

@tomhughes
Member

Oh I didn't realise there was a technical problem - if you'd made that clear I'd have dealt with it as part of the original pull request!

When you raised it before I thought you were just asking about whether we should enable an optional feature of doorkeeper.

What is the use case for this anyway?

@tomhughes
Member

Also what on earth does it have to do with this PR exactly? If you find a problem please raise an issue for it, don't pollute unrelated discussions...

@mmd-osm
Contributor

mmd-osm commented Dec 4, 2022

Needs a decision from maintainers: go with a standards-based approach (OAuth 2.0 or OpenID Connect), or run some custom protocol instead.

5 participants