-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use challenge response for authentication #1586
Conversation
13d9fab
to
333ada4
Compare
a0bb0bc
to
2352508
Compare
2352508
to
7109974
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was not involved in the original design discussions, so I am unsure on the background. Could you please add some information on why these label calculations are done in a separate node rather than in the HTTP server pseudo-node (similar to the current gRPC pseudo-node implementation)?
Is this just to make a logical separation more clear in the code, or is there a technical reason that would make this approach more secure (or some other technical advantage)?
startup_handle: oak_abi::Handle, | ||
_notify_receiver: oneshot::Receiver<()>, | ||
) { | ||
let _unit_result = self.try_run(runtime, startup_handle); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems weird that try_run
returns a result, but is only used in one place where the result is ignored.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did this to be able to use ?
in try_run
instead of using a match
or if let
for every statement in the function. This is changed now, but I think in general using anyhow
or a specific error type can fix this issue.
handle: response_reader, | ||
}); | ||
let response = response_receiver.receive(&runtime).map_err(|err| { | ||
error!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional: I personally don't like this pattern of using map_err
to do logging. I think it is cleaner if mapper functions like map_err
do not produce side-effects. One option might be to define an error type that contains the message and it's severity. This can then be logged as eitehr a warning or error at the higher level where the result is processed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will anyhow::Context
fit here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of using anyhow
, but anyhow::Context
is only implemented for Result
types that use StdError
as the error type. To be able to use anyhow::Context
here, we have to wrap the OakStatus
in an anyhow::Error
. I also noticed anyhow
is not used in the oak_runtime
. Not sure if it is something in our backlog, or if the plan is to avoid anyhow
in the oak_runtime
.
In my new PR, I have added a specific error type (HttpError
), and have replaced all such uses of map_err
.
let tag = match config.privilege.is_empty() { | ||
true => None, | ||
false => Some(oak_abi::label::tag::Tag::UserIdentityTag(UserIdentityTag { | ||
public_key: config.privilege, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The naming seems confusing: using privilege
in the proro to represent a public key in the config, but later it is used to represent the entire node privilege.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UserNode
is removed from my new PR. So this issue should be resolved.
.collect(), | ||
}; | ||
let response_writer_label = Label { | ||
confidentiality_tags: privilege |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this check whether user identity tag was included in the original label? If the user did not include the identity in the original request, this could unnecessarily increase the confidentiality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a chat with @tiziano88 and I think it is correct to set the confidentiality tag of the response channel to the user's identity regardless of the request label. Checking the label should then be done in the Router node, implemented by the application developers.
But I agree with you. If the request label does not have the user's identity, the application won't be able to provide the user with a useful response (for instance, even a bad request response, in case the request is missing a certain header, is not possible). But IIUC this is intentional, and the point about using IFC and enforcing unidirectional communication. I think the solution is to instruct the application developers to make sure that the clients they develop include the user's identity as a confidentiality tag in the request label. I believe, disjunctions are needed to allow/simplify this.
&self, | ||
runtime: &RuntimeProxy, | ||
invocation_channel: oak_abi::Handle, | ||
) -> Result<(), ()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Result<(),()> does not seem like a particularly useful type as it does not really convey much information and the results do not seem to be used anywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It allows using the ?
operator. I am now using HttpError
as the error type.
According to the docs, the integrity component of the I had a discussion with @tiziano88 and realized that the The other issue is the one that I am experimenting with in #1614. If the |
Integrity label support is not currently implemented for the gRPC server node, but it should be possible to do so. Seeing that the gRPC and HTTP server pseudo-nodes are part of the runtime and therefore the TCB they should in effect have the "top" integrity label. Therefore if we had a way of representing "top" as a label, it would resolve the issue. That seems to me to be a cleaner approach than creating a new node that gets assigned per-user integrity by the runtime. In both cases the runtime must be trusted to create the appropriate label, but in the case of the extra node there is significant overhead without clear security or safety improvements.
I agree.
Any user-specific label on the request requires a "router" pattern where the router node (public) would create a new node with the right label and forward the invocation to the new node which should handle the request and produce the response. |
What do you mean by "top" integrity label? Currently, the default choice for labelling the new nodes and channels is I agree that that would be a more elegant solution, but I don't know what are its implications for our security guarantees, and NI proofs.
Currently (with |
Yes.
I don't think it is a problem. The TCB is fully trusted, so to explicitly assign a "fully_trusted" integrity label to parts of it should be fine.
I don't think intermediate nodes will help. The untrusted intial node will not be able to create channels to communicate with the more trusted intermediate nodes (and probably not be able to create these nodes either). If there is some exception that allows it, it breaks IFC (or implicitly grants the router "fully_trusted" integrity). The router node also needs to be "public_fully_trusted" to function correctly. As an aside, this is part of the reason for the configuration-based router node RFC: a well-reviewed, reusable node that can act in a fully trusted way. Integrity tags in general are currently still problematic. E.g. we don't have a way to create the initial Wasm node with any level of trusted integrity (apart from perhaps its own Wasm hash), so it severely limits the propagation of integrity labels through the system. |
The intermediate nodes won't be more trusted, but they'd have extra privileges. I don't think we have any IFC rules that restrict the privileges of a node. At least currently, while we still have separate privileges instead of robust declassification and transparent endorsement. |
Good point. This is the case currently. But it seems that is still does not solve the problem. We can't currently assign arbitrary privilege to any nodes outside of the TCB, so these nodes must be generic nodes in the TCB. How will they decide whether or not they should create the more trusted nodes? If they always just create the nodes that they are told to create, it is equivalent to just granting the privilege to the original node that creates the intermediate node. Then just granting "fully_trusted" privilege to the original node is a simpler, but security-wise equivalent solution. |
Good point :) |
@conradgrobler @daviddrysdale @ipetr0v @tiziano88 Following the discussions on Slack I think it is best to abandon this PR for now. If I leave out the integrity labels, then it would be possible to implement the first increment of the challenge-response without the virtual |
I agree, I think this may be split even further:
|
@@ -82,3 +83,11 @@ message TlsEndpointTag { | |||
// using the set of Certificate Authorities (CA) that are available to it. | |||
string authority = 1; | |||
} | |||
|
|||
// Policies related to user identification. | |||
message UserIdentityTag { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I am rewriting the chat app to use labels, it turns out that we use public keys to represent things other than users (e.g. chat rooms), so I think we should use a more generic terminology, so perhaps just use PublicKeyIdentityTag
or similar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I'll change the name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have changed the name in my new PR, but I now wonder if PublicKeyIdentityTag
still corresponds to the user
sub-lattice. I guess there should be a one-to-one correspondence between the tag types and the principals. Or do we consider chat room
to have the same principal nature as users?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the reviews. I have created a new PR (#1652) that leaves out setting the user's identity in the integrity label of the request channel. I have applied the comments in the new PR.
let tag = match config.privilege.is_empty() { | ||
true => None, | ||
false => Some(oak_abi::label::tag::Tag::UserIdentityTag(UserIdentityTag { | ||
public_key: config.privilege, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UserNode
is removed from my new PR. So this issue should be resolved.
handle: response_reader, | ||
}); | ||
let response = response_receiver.receive(&runtime).map_err(|err| { | ||
error!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of using anyhow
, but anyhow::Context
is only implemented for Result
types that use StdError
as the error type. To be able to use anyhow::Context
here, we have to wrap the OakStatus
in an anyhow::Error
. I also noticed anyhow
is not used in the oak_runtime
. Not sure if it is something in our backlog, or if the plan is to avoid anyhow
in the oak_runtime
.
In my new PR, I have added a specific error type (HttpError
), and have replaced all such uses of map_err
.
startup_handle: oak_abi::Handle, | ||
_notify_receiver: oneshot::Receiver<()>, | ||
) { | ||
let _unit_result = self.try_run(runtime, startup_handle); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did this to be able to use ?
in try_run
instead of using a match
or if let
for every statement in the function. This is changed now, but I think in general using anyhow
or a specific error type can fix this issue.
.collect(), | ||
}; | ||
let response_writer_label = Label { | ||
confidentiality_tags: privilege |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a chat with @tiziano88 and I think it is correct to set the confidentiality tag of the response channel to the user's identity regardless of the request label. Checking the label should then be done in the Router node, implemented by the application developers.
But I agree with you. If the request label does not have the user's identity, the application won't be able to provide the user with a useful response (for instance, even a bad request response, in case the request is missing a certain header, is not possible). But IIUC this is intentional, and the point about using IFC and enforcing unidirectional communication. I think the solution is to instruct the application developers to make sure that the clients they develop include the user's identity as a confidentiality tag in the request label. I believe, disjunctions are needed to allow/simplify this.
&self, | ||
runtime: &RuntimeProxy, | ||
invocation_channel: oak_abi::Handle, | ||
) -> Result<(), ()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It allows using the ?
operator. I am now using HttpError
as the error type.
@@ -82,3 +83,11 @@ message TlsEndpointTag { | |||
// using the set of Certificate Authorities (CA) that are available to it. | |||
string authority = 1; | |||
} | |||
|
|||
// Policies related to user identification. | |||
message UserIdentityTag { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have changed the name in my new PR, but I now wonder if PublicKeyIdentityTag
still corresponds to the user
sub-lattice. I guess there should be a one-to-one correspondence between the tag types and the principals. Or do we consider chat room
to have the same principal nature as users?
Replaced with #1652 |
Summary of the changes:
user
module, which defines a virtual node (Implement downgrading privilege for user principals authenticated over gRPC / HTTP #1452) namedUserNode
, tooak_runtime/node/http
. An instance ofUserNode
is created for every incoming HTTP request. This node gets the identity of the user as its declassification and endorsement privileges. The functionality ofUserNode
is mostly copied fromHttpServerNode
, in particular, the functioninject_http_request
. InHttpServerNode
the functioninject_http_request
is updated to instead create theUserNode
instance and forward the HTTP request and the required invocation channels to it.SignedChallenge
,UserIdentityTag
,UserNodeConfiguration
, andOuterHttpInvocation
are added.UserNode
with its privilege uses the user's identity to correctly set the labels on the invocation channels used for the interaction with the Oak node (Set the user's identity as the confidentiality tag in invocation channels for HTTP and gRPC server nodes #1428).Here is an attempt to visualize this change:
Checklist
Cloudbuild
cover any TODOs and/or unfinished work.
construction.