Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use challenge response for authentication #1586

Closed
wants to merge 21 commits into from

Conversation

rbehjati
Copy link
Contributor

@rbehjati rbehjati commented Oct 14, 2020

Summary of the changes:

  • Adds a user module, which defines a virtual node (Implement downgrading privilege for user principals authenticated over gRPC / HTTP #1452) named UserNode, to oak_runtime/node/http. An instance of UserNode is created for every incoming HTTP request. This node gets the identity of the user as its declassification and endorsement privileges. The functionality of UserNode is mostly copied from HttpServerNode, in particular, the function inject_http_request. In HttpServerNode the function inject_http_request is updated to instead create the UserNode instance and forward the HTTP request and the required invocation channels to it.
    • Something similar is needed for gRPC.
    • Does this node need to be destroyed when the handling of the request is completed, or is the Runtime smart enough to remove the node itself?
  • Adds a simple implementation of the challenge response (Implement support for generic challenge-response style authentication #1357), using a fixed challenge phrase (oak-challenge)
    • The signed challenge is added as an HTTP header in the examples and unit tests, but for now the code assumes that the signed challenge is optional. Not sure if this a valid assumption though.
    • New protobuf messages SignedChallenge, UserIdentityTag, UserNodeConfiguration, and OuterHttpInvocation are added.
  • The UserNode with its privilege uses the user's identity to correctly set the labels on the invocation channels used for the interaction with the Oak node (Set the user's identity as the confidentiality tag in invocation channels for HTTP and gRPC server nodes #1428).
  • Tests are updated accordingly.
  • Documentation will be updated in a separate PR.

Here is an attempt to visualize this change:

before

after

Checklist

  • Pull request affects core Oak functionality (e.g. runtime, SDK, ABI)
    • I have written/updated tests that cover the code changes.
    • I have checked that these tests are run by
      Cloudbuild
    • I have updated documentation accordingly.
    • I have raised an issue to
      cover any TODOs and/or unfinished work.
  • Pull request includes prototype/experimental work that is under
    construction.

@google-cla google-cla bot added the cla: yes label Oct 14, 2020
@rbehjati rbehjati added the WIP Work in progress label Oct 14, 2020
@rbehjati rbehjati force-pushed the oak-1357-challenge-resp branch 6 times, most recently from 13d9fab to 333ada4 Compare October 20, 2020 12:05
@rbehjati rbehjati force-pushed the oak-1357-challenge-resp branch from a0bb0bc to 2352508 Compare October 21, 2020 19:28
@rbehjati rbehjati force-pushed the oak-1357-challenge-resp branch from 2352508 to 7109974 Compare October 22, 2020 11:39
@rbehjati rbehjati marked this pull request as ready for review October 22, 2020 12:46
@rbehjati rbehjati removed the WIP Work in progress label Oct 22, 2020
Copy link
Collaborator

@conradgrobler conradgrobler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not involved in the original design discussions, so I am unsure on the background. Could you please add some information on why these label calculations are done in a separate node rather than in the HTTP server pseudo-node (similar to the current gRPC pseudo-node implementation)?

Is this just to make a logical separation more clear in the code, or is there a technical reason that would make this approach more secure (or some other technical advantage)?

startup_handle: oak_abi::Handle,
_notify_receiver: oneshot::Receiver<()>,
) {
let _unit_result = self.try_run(runtime, startup_handle);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems weird that try_run returns a result, but is only used in one place where the result is ignored.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did this to be able to use ? in try_run instead of using a match or if let for every statement in the function. This is changed now, but I think in general using anyhow or a specific error type can fix this issue.

handle: response_reader,
});
let response = response_receiver.receive(&runtime).map_err(|err| {
error!(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional: I personally don't like this pattern of using map_err to do logging. I think it is cleaner if mapper functions like map_err do not produce side-effects. One option might be to define an error type that contains the message and it's severity. This can then be logged as eitehr a warning or error at the higher level where the result is processed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will anyhow::Context fit here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of using anyhow, but anyhow::Context is only implemented for Result types that use StdError as the error type. To be able to use anyhow::Context here, we have to wrap the OakStatus in an anyhow::Error. I also noticed anyhow is not used in the oak_runtime. Not sure if it is something in our backlog, or if the plan is to avoid anyhow in the oak_runtime.

In my new PR, I have added a specific error type (HttpError), and have replaced all such uses of map_err.

let tag = match config.privilege.is_empty() {
true => None,
false => Some(oak_abi::label::tag::Tag::UserIdentityTag(UserIdentityTag {
public_key: config.privilege,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The naming seems confusing: using privilege in the proro to represent a public key in the config, but later it is used to represent the entire node privilege.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UserNode is removed from my new PR. So this issue should be resolved.

.collect(),
};
let response_writer_label = Label {
confidentiality_tags: privilege
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this check whether user identity tag was included in the original label? If the user did not include the identity in the original request, this could unnecessarily increase the confidentiality.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a chat with @tiziano88 and I think it is correct to set the confidentiality tag of the response channel to the user's identity regardless of the request label. Checking the label should then be done in the Router node, implemented by the application developers.

But I agree with you. If the request label does not have the user's identity, the application won't be able to provide the user with a useful response (for instance, even a bad request response, in case the request is missing a certain header, is not possible). But IIUC this is intentional, and the point about using IFC and enforcing unidirectional communication. I think the solution is to instruct the application developers to make sure that the clients they develop include the user's identity as a confidentiality tag in the request label. I believe, disjunctions are needed to allow/simplify this.

&self,
runtime: &RuntimeProxy,
invocation_channel: oak_abi::Handle,
) -> Result<(), ()> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Result<(),()> does not seem like a particularly useful type as it does not really convey much information and the results do not seem to be used anywhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It allows using the ? operator. I am now using HttpError as the error type.

@rbehjati
Copy link
Contributor Author

I was not involved in the original design discussions, so I am unsure on the background. Could you please add some information on why these label calculations are done in a separate node rather than in the HTTP server pseudo-node (similar to the current gRPC pseudo-node implementation)?

According to the docs, the integrity component of the request receiver channel must be set to the identity of the user. However, the HTTP server pseudo-Node cannot create a channel with a non-empty integrity component. The user node has the privilege that allows it to create such channels. Does the gRPC implementation actually set the integrity of the request receiver channel to the user's identity? Do we have examples that show that it works?

I had a discussion with @tiziano88 and realized that the OuterHttpInvocation that the HTTP server pseudo-node sends to the UserNode needs to be changed, as currently it publicly exposes the request.

The other issue is the one that I am experimenting with in #1614. If the request label is non-empty, the Oak node won't be able to read the request. This means that neither the before approach nor the new approach work!

@conradgrobler
Copy link
Collaborator

According to the docs, the integrity component of the request receiver channel must be set to the identity of the user. However, the HTTP server pseudo-Node cannot create a channel with a non-empty integrity component. The user node has the privilege that allows it to create such channels. Does the gRPC implementation actually set the integrity of the request receiver channel to the user's identity? Do we have examples that show that it works?

Integrity label support is not currently implemented for the gRPC server node, but it should be possible to do so. Seeing that the gRPC and HTTP server pseudo-nodes are part of the runtime and therefore the TCB they should in effect have the "top" integrity label. Therefore if we had a way of representing "top" as a label, it would resolve the issue. That seems to me to be a cleaner approach than creating a new node that gets assigned per-user integrity by the runtime. In both cases the runtime must be trusted to create the appropriate label, but in the case of the extra node there is significant overhead without clear security or safety improvements.

I had a discussion with @tiziano88 and realized that the OuterHttpInvocation that the HTTP server pseudo-node sends to the UserNode needs to be changed, as currently it publicly exposes the request.

I agree.

The other issue is the one that I am experimenting with in #1614. If the request label is non-empty, the Oak node won't be able to read the request. This means that neither the before approach nor the new approach work!

Any user-specific label on the request requires a "router" pattern where the router node (public) would create a new node with the right label and forward the invocation to the new node which should handle the request and produce the response.

@rbehjati
Copy link
Contributor Author

Integrity label support is not currently implemented for the gRPC server node, but it should be possible to do so. Seeing that the gRPC and HTTP server pseudo-nodes are part of the runtime and therefore the TCB they should in effect have the "top" integrity label. Therefore if we had a way of representing "top" as a label, it would resolve the issue. That seems to me to be a cleaner approach than creating a new node that gets assigned per-user integrity by the runtime. In both cases the runtime must be trusted to create the appropriate label, but in the case of the extra node there is significant overhead without clear security or safety improvements.

What do you mean by "top" integrity label? Currently, the default choice for labelling the new nodes and channels is public_untrusted. Are you suggesting that is should be changed to public_fully_trusted (for those nodes and channels that belong to the TCB)?

I agree that that would be a more elegant solution, but I don't know what are its implications for our security guarantees, and NI proofs.

Any user-specific label on the request requires a "router" pattern where the router node (public) would create a new node with the right label and forward the invocation to the new node which should handle the request and produce the response.

Currently (with public_untrusted being the default label, and the label of the router node), this would not work in general. We may need intermediate nodes with specific privileges to be able to create user/request-specific Oak nodes with the right labels.

@conradgrobler
Copy link
Collaborator

What do you mean by "top" integrity label? Currently, the default choice for labelling the new nodes and channels is public_untrusted. Are you suggesting that is should be changed to public_fully_trusted (for those nodes and channels that belong to the TCB)?

Yes.

I agree that that would be a more elegant solution, but I don't know what are its implications for our security guarantees, and NI proofs.

I don't think it is a problem. The TCB is fully trusted, so to explicitly assign a "fully_trusted" integrity label to parts of it should be fine.

Any user-specific label on the request requires a "router" pattern where the router node (public) would create a new node with the right label and forward the invocation to the new node which should handle the request and produce the response.

Currently (with public_untrusted being the default label, and the label of the router node), this would not work in general. We may need intermediate nodes with specific privileges to be able to create user/request-specific Oak nodes with the right labels.

I don't think intermediate nodes will help. The untrusted intial node will not be able to create channels to communicate with the more trusted intermediate nodes (and probably not be able to create these nodes either). If there is some exception that allows it, it breaks IFC (or implicitly grants the router "fully_trusted" integrity). The router node also needs to be "public_fully_trusted" to function correctly. As an aside, this is part of the reason for the configuration-based router node RFC: a well-reviewed, reusable node that can act in a fully trusted way.

Integrity tags in general are currently still problematic. E.g. we don't have a way to create the initial Wasm node with any level of trusted integrity (apart from perhaps its own Wasm hash), so it severely limits the propagation of integrity labels through the system.

@rbehjati
Copy link
Contributor Author

I don't think intermediate nodes will help. The untrusted intial node will not be able to create channels to communicate with the more trusted intermediate nodes (and probably not be able to create these nodes either).

The intermediate nodes won't be more trusted, but they'd have extra privileges. I don't think we have any IFC rules that restrict the privileges of a node. At least currently, while we still have separate privileges instead of robust declassification and transparent endorsement.

@conradgrobler
Copy link
Collaborator

The intermediate nodes won't be more trusted, but they'd have extra privileges. I don't think we have any IFC rules that restrict the privileges of a node. At least currently, while we still have separate privileges instead of robust declassification and transparent endorsement.

Good point. This is the case currently. But it seems that is still does not solve the problem. We can't currently assign arbitrary privilege to any nodes outside of the TCB, so these nodes must be generic nodes in the TCB. How will they decide whether or not they should create the more trusted nodes? If they always just create the nodes that they are told to create, it is equivalent to just granting the privilege to the original node that creates the intermediate node. Then just granting "fully_trusted" privilege to the original node is a simpler, but security-wise equivalent solution.

@rbehjati
Copy link
Contributor Author

Then just granting "fully_trusted" privilege to the original node is a simpler, but security-wise equivalent solution.

Good point :)

@rbehjati
Copy link
Contributor Author

@conradgrobler @daviddrysdale @ipetr0v @tiziano88 Following the discussions on Slack I think it is best to abandon this PR for now. If I leave out the integrity labels, then it would be possible to implement the first increment of the challenge-response without the virtual UserNode (or a node with fully_trusted integrity label). So, the PR could split into two.

@tiziano88
Copy link
Collaborator

I agree, I think this may be split even further:

  • implement proper privilege for the bearer token auth (though this may be throwaway work if we decide to remove it)
  • implement challenge / response mechanism, and assign privilege correctly (only for confidentiality labels)
  • separately look into integrity labels (I think this is not a priority for now)
  • separately look into splitting nodes per user (or we can discuss on an issue / slack thread if necessary)
  • fix tests related to assigning labels based on HTTP headers (already started in Adds a new HTTP test with a non-empty request label #1614)

@@ -82,3 +83,11 @@ message TlsEndpointTag {
// using the set of Certificate Authorities (CA) that are available to it.
string authority = 1;
}

// Policies related to user identification.
message UserIdentityTag {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I am rewriting the chat app to use labels, it turns out that we use public keys to represent things other than users (e.g. chat rooms), so I think we should use a more generic terminology, so perhaps just use PublicKeyIdentityTag or similar?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I'll change the name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed the name in my new PR, but I now wonder if PublicKeyIdentityTag still corresponds to the user sub-lattice. I guess there should be a one-to-one correspondence between the tag types and the principals. Or do we consider chat room to have the same principal nature as users?

Copy link
Contributor Author

@rbehjati rbehjati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reviews. I have created a new PR (#1652) that leaves out setting the user's identity in the integrity label of the request channel. I have applied the comments in the new PR.

let tag = match config.privilege.is_empty() {
true => None,
false => Some(oak_abi::label::tag::Tag::UserIdentityTag(UserIdentityTag {
public_key: config.privilege,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UserNode is removed from my new PR. So this issue should be resolved.

handle: response_reader,
});
let response = response_receiver.receive(&runtime).map_err(|err| {
error!(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of using anyhow, but anyhow::Context is only implemented for Result types that use StdError as the error type. To be able to use anyhow::Context here, we have to wrap the OakStatus in an anyhow::Error. I also noticed anyhow is not used in the oak_runtime. Not sure if it is something in our backlog, or if the plan is to avoid anyhow in the oak_runtime.

In my new PR, I have added a specific error type (HttpError), and have replaced all such uses of map_err.

startup_handle: oak_abi::Handle,
_notify_receiver: oneshot::Receiver<()>,
) {
let _unit_result = self.try_run(runtime, startup_handle);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did this to be able to use ? in try_run instead of using a match or if let for every statement in the function. This is changed now, but I think in general using anyhow or a specific error type can fix this issue.

.collect(),
};
let response_writer_label = Label {
confidentiality_tags: privilege
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a chat with @tiziano88 and I think it is correct to set the confidentiality tag of the response channel to the user's identity regardless of the request label. Checking the label should then be done in the Router node, implemented by the application developers.

But I agree with you. If the request label does not have the user's identity, the application won't be able to provide the user with a useful response (for instance, even a bad request response, in case the request is missing a certain header, is not possible). But IIUC this is intentional, and the point about using IFC and enforcing unidirectional communication. I think the solution is to instruct the application developers to make sure that the clients they develop include the user's identity as a confidentiality tag in the request label. I believe, disjunctions are needed to allow/simplify this.

&self,
runtime: &RuntimeProxy,
invocation_channel: oak_abi::Handle,
) -> Result<(), ()> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It allows using the ? operator. I am now using HttpError as the error type.

@@ -82,3 +83,11 @@ message TlsEndpointTag {
// using the set of Certificate Authorities (CA) that are available to it.
string authority = 1;
}

// Policies related to user identification.
message UserIdentityTag {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed the name in my new PR, but I now wonder if PublicKeyIdentityTag still corresponds to the user sub-lattice. I guess there should be a one-to-one correspondence between the tag types and the principals. Or do we consider chat room to have the same principal nature as users?

@rbehjati
Copy link
Contributor Author

rbehjati commented Nov 3, 2020

Replaced with #1652

@rbehjati rbehjati closed this Nov 3, 2020
@ipetr0v ipetr0v removed their request for review November 3, 2020 14:01
@rbehjati rbehjati deleted the oak-1357-challenge-resp branch March 25, 2021 14:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants