API key dispenser #2145
Can you explain the "big picture" of this, or, if it has already been explained elsewhere, link to it? If we implement access restrictions to certain parts of the web interface as detailed in https://wiki.openstreetmap.org/wiki/GDPR/Affected_Services, is this API key scheme expected to let people access the various endpoints listed there? If yes, how would you suggest that the API key is transported from the client to the server, and how would it be checked? If an API key gives you special powers (namely to access data governed by an agreement that you as a logged-in user have "signed", to safeguard GDPR limits), then should there perhaps be an "explainer" when you create an API key that basically says: be careful with this; by generating it you take responsibility for all access made with that key, and so on? Would it make sense for API keys to expire if not renewed regularly? |
Yes this is definitely a case where the problem, and proposed solution, should have been discussed before starting work! |
I think Roland described the concept here: https://lists.openstreetmap.org/pipermail/osmf-talk/2018-October/005323.html, and more specifically here: https://dev.overpass-api.de/misc/authentication.pdf More details and a prototype implementation: https://wiki.openstreetmap.org/wiki/Overpass_API/GDPR I would have preferred a solution based on a more standard protocol, like JSON Web Tokens - https://jwt.io/ - and to assert the origin of the tokens by means of public/private keys only. |
Yes rolling our own cryptographic protocol is absolutely definitely a terrible idea. |
Note that OAuth 2.0 actually uses JWT as part of the implementation and, contrary to what that document suggests, we have no objection to OAuth 2.0 as far as I know. I have always assumed we should move to it but have just never got around to doing so! |
For the prototype, the overall process looked as follows. I assume that the Rails based implementation is very similar to this.
My suggestion was that a central server (OSM website) issues a JWT token with a limited validity period. This token is signed by the OSM central server, the Overpass API server holds a public key to assert the origin of the token. When sending a query to the Overpass API, the client needs to provide the JWT token along with the query. All subsequent queries don't require any connection to the central server, as long as the token is still valid. Typically, such a token could be valid for a few hours, or maybe days. |
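To make the offline check concrete, here is a minimal sketch of issuing and verifying a time-limited token. It is not the proposed implementation: to stay within the Python standard library it signs with HS256 (a shared secret) rather than the public/private key pair suggested above, so in a real deployment the verifier would hold only a public key (for example with RS256 or ES256). The names `issue_token`, `verify_token` and `secret` are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(secret: bytes, lifetime_s: int, now: Optional[float] = None) -> str:
    """Issuer side: sign a token that carries only an expiry time."""
    now = time.time() if now is None else now
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps({"exp": int(now) + lifetime_s}).encode())
    signing_input = f"{header}.{payload}".encode("ascii")
    signature = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(signature)}"


def verify_token(secret: bytes, token: str, now: Optional[float] = None) -> bool:
    """Verifier side: check signature and expiry without contacting the issuer."""
    now = time.time() if now is None else now
    try:
        header, payload, signature = token.split(".")
        signing_input = f"{header}.{payload}".encode("ascii")
        expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(signature)):
            return False
        return json.loads(_b64url_decode(payload))["exp"] > now
    except (ValueError, KeyError, TypeError):
        # Malformed tokens are simply rejected.
        return False
```

Because the expiry travels inside the signed payload, the verifier needs no per-request call to the issuer; the trade-off is that a token cannot be revoked before it expires.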
This is unrelated to the above-mentioned document. The baseline is this: Overpass API (and potentially other downstream data processors) shall hand out user data only to those of its clients that are OSM users. The solution approved by the OSMF board is that the service sees an anonymized key from the client, and the OSMF wants to keep the association between key and user name private, in particular hidden from the service. This pull request is intended to implement that. |
By the way: can someone please have a look at why Travis complains? I see "1199 runs, 363556 assertions, 0 failures, 0 errors, 0 skips" as the result of the job log. |
That's caused by 128 code style issues, starting here: https://travis-ci.org/openstreetmap/openstreetmap-website/builds/493663483#L2958 |
Most definitely yes. As it stands, API keys can be embedded in an Overpass query, yet overpass turbo's share links, for example, were never designed to store any kind of secret information. What's worse, once the API key has leaked this way, the unsuspecting user still has absolutely no idea that someone else might abuse their API key (!). I find it really hard to imagine that all users manage to keep their API key secret, and know when to create a new API key, which would then invalidate their old one. Time-limiting any kind of token would be one essential step to limit the effects of a leaked token. |
Well a "share" button that shares a link that doesn't work is kind of useless anyway! So expiring the token and hence breaking the link seems like the wrong solution. |
Well, the proper solution in that case would be some additional logic in overpass turbo to inject the API key of the respective user when calling the Overpass API, and never store the API key in the query itself, let alone include it in a share link |
There is no GDPR-friendly way to have such a link anyway. Osm.org, Overpass API, or both would need to start tracking users on third-party sites to make such a link work, depending on the implementation. An important concern in such a situation is a substantial and permanent increase in support requests, because the tracking is at odds with browser settings or extensions. Once Overpass Turbo uses the usual mustache syntax, i.e. [api_key:{{apikey}}], and stores the key as a first-party cookie (or whatever), we get out of the tracking situation because it is bilateral between the user and Overpass Turbo. |
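A minimal sketch of the placeholder idea, assuming the client keeps the key in local storage and substitutes it only when a query is sent. The function name `render_query`, the sample query and the key value are hypothetical; only the `[api_key:{{apikey}}]` syntax comes from the comment above.

```python
PLACEHOLDER = "{{apikey}}"


def render_query(template: str, api_key: str) -> str:
    """Fill in the locally stored key just before the query is sent."""
    return template.replace(PLACEHOLDER, api_key)


# The template is what would be saved or put into a share link.
template = "[api_key:{{apikey}}];node(1);out meta;"

# The rendered query exists only for the outgoing request.
sent = render_query(template, "hypothetical-key-123")
```

Since only the template is ever stored or shared, a recipient of a share link gets the placeholder and has to supply their own key.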
https://www.openstreetmap.org/user/@kevin_bullock/diary/391652 brings up more use cases for API keys (management). Along with the ideas to introduce some API key support for tile.osm.org, it would probably make sense to move towards one simple solution that covers a number of use cases. //cc; @grischard , @simonpoole |
Just want to add another use case that I am not sure this PR covers: an API that provides OSM-derived data (e.g. from changesets) wants to verify whether the client querying the data has an access token for OSM. |
I believe once we have OAuth 2.0 in place, we could validate an OAuth 2.0 token that has been issued by osm.org by calling the introspect endpoint (assuming this will be available in the future). Here's a quick example for illustration purposes (actual output, not faked):
Token valid:
In case of an invalid token, the server would return:
Results could probably be cached for some time to avoid repeatedly calling the osm.org endpoint. client_id and secret have been defined as part of an "Overpass test" application in doorkeeper before (http://localhost:3000/oauth2/applications). |
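The caching idea could look roughly like the sketch below. It assumes the actual introspection call is wrapped in a caller-supplied function that returns whether the token is active (RFC 7662 responses carry an `active` field); `CachedIntrospector`, `introspect` and `ttl_s` are hypothetical names, and no osm.org endpoint is called here.

```python
import time
from typing import Callable, Dict, Tuple


class CachedIntrospector:
    """Remember introspection results for a limited time."""

    def __init__(self, introspect: Callable[[str], bool], ttl_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._introspect = introspect  # performs the real remote call
        self._ttl_s = ttl_s
        self._clock = clock
        self._cache: Dict[str, Tuple[bool, float]] = {}

    def is_active(self, token: str) -> bool:
        now = self._clock()
        hit = self._cache.get(token)
        if hit is not None and now - hit[1] < self._ttl_s:
            return hit[0]  # still fresh, skip the remote call
        active = self._introspect(token)
        self._cache[token] = (active, now)
        return active
```

The time-to-live is a trade-off: a longer value saves calls to the issuing server, but a revoked token keeps being accepted until its cache entry expires.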
Surely if you're going to use OAuth 2 you would just ask for the read prefs scope and then fetch the user details like everybody else that authenticates against OSM - no need for token introspection. |
Yes, Overpass could use that token to query the user details endpoint (see screenshot below). The only difference I see is that Overpass wouldn't be able to figure out whether the token has been requested for an entirely different app and just passed on to Overpass. With the introspection endpoint, you could ensure that the token matches the application that was registered for Overpass purposes. I think that was one of the assumptions Roland made, hence I included it in my example. |
I'm pretty sure you can't just pass tokens around - the request will have to be signed by the client that the token was issued to! |
Signing requests was a topic for OAuth 1.0a. Bearer tokens in 2.0 would give you access to a resource server (see example below), that's why you need to safeguard them. RFC 6750 has more details on security threats related to Bearer Token usage.
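As a minimal sketch of what "bearer" means in practice: the request carries the token in an `Authorization` header and is not signed, so whoever holds the token can use it. The URL and token value below are placeholders, and the request is only built, not sent.

```python
import urllib.request


def bearer_request(url: str, access_token: str) -> urllib.request.Request:
    """Build a request that presents the token as described in RFC 6750, section 2.1."""
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {access_token}"}
    )


# Hypothetical resource server URL and token, for illustration only.
req = bearer_request("https://overpass.example/api/interpreter", "hypothetical-token")
```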
|
But even then that's only an issue if (a) a site exposes the authentication token to the end user and (b) another site allows the user to provide their own token rather than requiring them to go through the authentication flow to generate a token. So even if the first type of site exists, any site that wants to ensure it has a valid token can simply not provide a way for you to input a token and instead require you to go through the Authorization Code Flow to generate one. |
I thought I could add some more details about the main use case we have in mind. Let's assume an overpass turbo user wants to query for some OSM data that includes metadata. Due to GDPR requirements, metadata wouldn't be available to anonymous users anymore in the future. Here's where Overpass API would require the user to present a proof of an osm.org account, ideally without disclosing the user's id. Steps are:
From what I've read, Authorization Code PKCE Flow would be relevant here. |
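For reference, the client-side part of PKCE (RFC 7636) is small. The sketch below generates a code verifier and derives the S256 code challenge from it; the function names are hypothetical, and the rest of the flow (authorization request, token exchange) is left out.

```python
import base64
import hashlib
import secrets


def make_code_verifier() -> str:
    """Random verifier; 32 random bytes yield 43 base64url characters."""
    return secrets.token_urlsafe(32)


def code_challenge_s256(verifier: str) -> str:
    """code_challenge = BASE64URL(SHA256(verifier)), without padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The client sends the challenge with the authorization request and the verifier with the token request, so an intercepted authorization code is useless without the verifier.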
(moved to #3245) |
Oh I didn't realise there was a technical problem - if you'd made that clear I'd have dealt with it as part of the original pull request! When you raised it before I thought you were just asking about whether we should enable an optional feature of doorkeeper. What is the use case for this anyway? |
Also what on earth does it have to do with this PR exactly? If you find a problem please raise an issue for it, don't pollute unrelated discussions... |
Needs a decision from the maintainers: go with a standards-based approach (OAuth 2.0 or OpenID Connect), or run some custom protocol instead. |
This is an implementation of the API key dispenser intended to help with GDPR requirements.
It is as minimal as possible:
The data held on the Main Db is
Tests with full coverage have been added. Please still take the whole thing with a grain of salt: I have little Rails experience. For example, CanCan insisted, in various error messages, that I should add skip_authorization_check to every controller, but this does not sound like the proper solution.