Policies are used to decide whether a given call is allowed or not.
They are defined in module Policy
.
For each compartment, a policy defines a list of exported procedures and a list of procedures imported from other compartments. Procedures are refered to by their public identifier.
Policies can be seen as records with two fields:
policy_export : compartment → list ident
, a partial map that associates a list of exported procedures to each compartmentpolicy_import : compartment → list (compartment * ident)
, a partial map that associates a list of imported procedures from particular compartments to each compartment
(these are in fact PTrees
, the efficient implementation of finite maps provided by CompCert)
The function Policy.eqb
defines an equivalence between two policies: two
policies are equivalent iff for each compartment, they define the same exported
and imported procedures
The actual definition of whether a call is allowed or not is provided in common/Globalenv.v.
We modify the definition of programs so that they include a policy, which will govern which calls are allowed.
Record program (F V: Type) : Type := mkprogram {
prog_defs: list (ident * globdef F V);
prog_public: list ident;
prog_main: ident;
prog_pol: Policy.t
}.
A program transformation always leaves the policy unchanged. Similarly, policies are not used inside the transformations, so it’s not possible to use the information of these policies to perform optimizations.
Linking two programs p1
and p2
now requires that the policies pol1
pol2
defined by the two programs agree, i.e. that Policy.eqb pol1 pol2
, or in
words, that they defined the same exported and imported procedures for each
compartment.
Definition link_prog :=
if ident_eq p1.(prog_main) p2.(prog_main)
&& PTree_Properties.for_all dm1 link_prog_check
&& Policy.eqb p1.(prog_pol) p2.(prog_pol) then (* HERE *)
Some {| … |}
else
None.
The global environments are modified so they now contain a policy
Record Genv.t: Type := mkgenv {
…
genv_policy: Policy.t;
…
}
The global environment’s policy comes from the program:
Program Definition empty_genv (pub: list ident) (pol: Policy.t): t :=
@mkgenv pub (PTree.empty _) (PTree.empty _) 1%positive pol _ _ _.
Definition globalenv (p: program F V) :=
add_globals (empty_genv p.(prog_public) p.(prog_pol)) p.(prog_defs).
The definition of what calls are allowed is given by Genv.allowed_call
.
Definition allowed_call (ge: t) (cp: compartment) (vf: val) :=
Some default_compartment = find_comp ge vf \/ (* 1 *)
Some cp = find_comp ge vf \/ (* 2 *)
allowed_cross_call ge cp vf. (* 3 *)
To determine whether a call is or isn’t allowed, we look at two pieces of
information: the calling compartment, cp: compartment
, and a value vf: val
,
which should be a function pointer to the callee. The reason we use a function
pointer is that it allows us to use this definition even at the lowest levels,
where jumps could be used to circumvent a protection purely based on checking
identifiers.
A call is allowed if any of these 3 cases holds: (1) the procedure being called belongs to the default compartment (2) the procedure being called belongs to the same compartment as the caller (3) the call is an inter-compartment call and is allowed by the policy
Case (1) treats the default compartment as a public compartment that can be called anytime. This is useful to assign the same compartments to all builtin procedures that are eventually simplified or transformed into proper calls.
Case (3) makes use of the policy, and is more interesting. The definition of
allowed_cross_call
is:
Definition allowed_cross_call (ge: t) (cp: compartment) (vf: val) :=
match vf with
| Vptr b _ =>
exists i cp',
invert_symbol ge b = Some i /\
find_comp ge vf = Some cp' /\
match (Policy.policy_import ge.(genv_policy)) ! cp with
| Some l => In (cp', i) l
| None => False
end /\
match (Policy.policy_export ge.(genv_policy)) ! cp' with
| Some l => In i l
| None => False
end
| _ => False
end.
This definition does the following:
- it checks that the callee is indeed a pointer to block
b
- it checks that this block corresponds to a identifier
i
- it obtains the compartment
cp'
of this block - it checks that the pair
(cp', i)
is indeed imported bycp
- it checks that the identifier
i
is indeed exported bycp'
An executable version with boolean value is given (allowed_call_b
) and it is
proven equivalent to the version in Prop (lemma allowed_call_reflect
).
Given two “matching” programs p
and tp
, the following lemma
can be used to show that any allowed call for p
is also allowed
for tp
.
(* … *)
Variable match_fundef: C -> F1 -> F2 -> Prop.
Variable match_varinfo: V1 -> V2 -> Prop.
Variable ctx: C.
Hypothesis progmatch: match_program_gen match_fundef match_varinfo ctx p tp.
Lemma match_genvs_allowed_calls:
forall cp vf,
allowed_call (globalenv p) cp vf ->
allowed_call (globalenv tp) cp vf.
There are also special cases of this lemma for transformations that do not depend on the compilation unit:
Theorem allowed_call_transf_partial:
forall cp vf,
allowed_call (globalenv p) cp vf -> allowed_call (globalenv tp) cp vf.
Theorem allowed_call_transf:
forall cp vf,
allowed_call (globalenv p) cp vf -> allowed_call (globalenv tp) cp vf.
The syntax of all languages isn’t changed.
The semantics of most languages are modified in the following way:
- when the small-step semantics already used function pointers to
perform calls (for instance, in the case of Cminor), we add
a condition
Genv.allowed_call ge (comp_of f) vf
to thestep_call
rule that checks that the call tovf
is allowed according to the policy of the global environment - similarly, we add the same condition to the
step_tailcall
rule. In the case ofstep_tailcall
, we additionally require that the tail call is internal (comp_of fd = (comp_of f)
wherefd
is the procedure being called andf
is the current procedure), and thatneeds_calling_comp (comp_of f) = false
(not sure what this one is about) We require that the tail call is internal to avoid problems when the tail call will be optimized to a jump in the later phases. - we always allow calls using the special “call to built-ins” instructions, as we assume these are only used to call built-ins. This is something for which we should add a syntactic check.
Some languages have big-step semantics or executable versions of the semantics, these are updated in the same way.
Some particular cases:
This is not a particular interesting change, but there might be an opportunity to factorize some code here.
Some languages don’t use function pointers directly in the semantics to perform
calls (it is instead hidden under another condition). For instance, in LTL.v
,
the function definition is accessed not by following a function pointer vf
but
instead by using find_function: (mreg + ident) -> locset -> option fundef
. To
perform the policy checks, we define a function find_function_ptr: (mreg +
ident) -> locset -> option val
that returns the function pointer corresponding
to the callee. Usually this is not difficult, because find_function
already
look the function pointer up before obtaining a fundef
. For instance, in
LTL.v
:
Definition find_function_ptr (ros: mreg + ident) (rs: locset) : option val :=
match ros with
| inl r => Some (rs (R r))
| inr symb =>
match Genv.find_symbol ge symb with
| Some b => Some (Vptr b Ptrofs.zero)
| None => None
end
end.
Definition find_function (ros: mreg + ident) (rs: locset) : option fundef :=
match ros with
| inl r => Genv.find_funct ge (rs (R r))
| inr symb =>
match Genv.find_symbol ge symb with
| None => None
| Some b => Genv.find_funct_ptr ge b
end
end.
Lemma find_function_find_function_ptr: forall ros rs fd,
find_function ros rs = Some fd ->
exists vf, find_function_ptr ros rs = Some vf.
During the inlining and tailcall optimization phases in the backend, there are inserted checks to only inline and tailcall functions that belong to the same compartment.
The Risc-V backend assembly language’s syntax and semantics are the most important modification.
At this level, we need to check at every step of execution that we are not changing compartment without respecting the policy. To do so, we modify the syntax and the semantics.
As a first step, we need to modify the rule
exec_step_internal:
forall b ofs f i rs m rs' m',
rs PC = Vptr b ofs ->
Genv.find_funct_ptr ge b = Some (Internal f) ->
find_instr (Ptrofs.unsigned ofs) (fn_code f) = Some i ->
exec_instr f i rs m = Next rs' m' ->
step (State rs m) E0 (State rs' m')
which handles stepping in the code.
We add the check (ALLOWED: Genv.allowed_call ge (comp_of f) (Vptr b' ofs'))
where f
is the currently executing function and Vptr b' ofs'
is the new PC.
Adding this condition prevents all unwanted compartment changes.
However, this doesn’t allow all returns: for instance, consider the case where a
procedure P_1
of a compartment C_1
calls a procedure P_2
of C_2
. When
C_2
tries to return to P_1
, it is not necessary the case that this is
allowed by the policy (it does not have to be symmetric, and indeed it shouldn’t
be in general. C_2
could be an untrusted compartment that is not allowed to call
anything from C_1
).
We tried first one solution that doesn’t work: when jumping to the return address RA
,
the semantics could have a special case that do allow the return, on the condition
that the call from the compartment of RA
to the PC from which we are returning
is allowed (i.e. check the policy in reverse).
However, this is not sufficent. Consider the following example:
P_1
from C_1
calls P_2
from C_2
, which then do a tailcall to
P_2'
(still from C_2
). Then, when P_2'
returns, it tries to return
to P_1
, but the call from C_1
to P_2'
was not necessarily allowed!
(for instance, if P_2'
is private to C_2
).
We settled on the following solution: the risc-V backend makes use of a virtual cross-compartment stack that stores return addresses and stack pointers of cross-compartment calls. When doing cross-compartment calls, we add a frame to this cross-compartment stack. When doing cross-compartment returns, we check that the new PC points to the return address stored on the stack, and also that the new SP points to the address stored on the stack. This should prevent an attacker to bypass the policy by disguising calls into returns.
While we could do these checks at every step, this is not efficient and would be difficult
to implement on a micro-policy backend. Instead, we now tag some of the Risc-V instructions
as calls or returns. Instructions Pjal_s
and Pjal_r
can be tagged as calls, and
instruction Pj_r
(jump to content of register) can be tagged as a return. No other
instruction is tagged. These tags are set by the Mach
to Asm
compiler when it
compiles calls and returns.
The previous rule becomes the following three:
| exec_step_internal:
forall b ofs f i rs m rs' m' b' ofs' st,
rs PC = Vptr b ofs ->
Genv.find_funct_ptr ge b = Some (Internal f) ->
find_instr (Ptrofs.unsigned ofs) (fn_code f) = Some i ->
exec_instr f i rs m = Next rs' m' ->
is_call i = false ->
is_return i = false ->
forall (NEXTPC: rs' PC = Vptr b' ofs'),
forall (ALLOWED: Genv.allowed_call ge (comp_of f) (Vptr b' ofs')),
step (State st rs m) E0 (State st rs' m')
| exec_step_internal_call:
forall b ofs f i rs m rs' m' b' ofs' cp st st',
rs PC = Vptr b ofs ->
Genv.find_funct_ptr ge b = Some (Internal f) ->
find_instr (Ptrofs.unsigned ofs) (fn_code f) = Some i ->
exec_instr f i rs m = Next rs' m' ->
is_call i = true ->
forall (NEXTPC: rs' PC = Vptr b' ofs'),
forall (ALLOWED: Genv.allowed_call ge (comp_of f) (Vptr b' ofs')),
forall (CURCOMP: Genv.find_comp ge (Vptr b Ptrofs.zero) = Some cp),
(* Is a call, we update the stack *)
forall (STUPD: update_stack_call st cp rs' = Some st'),
step (State st rs m) E0 (State st' rs' m')
| exec_step_internal_return:
forall b ofs f i rs m rs' m' cp cp' st st',
rs PC = Vptr b ofs ->
Genv.find_funct_ptr ge b = Some (Internal f) ->
find_instr (Ptrofs.unsigned ofs) (fn_code f) = Some i ->
exec_instr f i rs m = Next rs' m' ->
is_return i = true ->
forall (CURCOMP: Genv.find_comp ge (rs PC) = Some cp),
forall (NEXTCOMP: Genv.find_comp ge (rs' PC) = Some cp'),
(* We only impose conditions on when returns can be executed for cross-compartment
returns. These conditions are that we restore the previous RA and SP *)
forall (PC_RA: cp <> cp' -> rs' PC = asm_parent_ra st),
forall (RESTORE_SP: cp <> cp' -> rs' SP = asm_parent_sp st),
(* Note that in the same manner, this definition only updates the stack when doing
cross-compartment returns *)
forall (STUPD: update_stack_return st cp rs' = Some st'),
step (State st rs m) E0 (State st' rs' m')
- Rule
exec_step_internal
is used for every step that is not a call or a return. It still does a policy check. (TODO: can we get rid of this policy check and instead simply check that we do not change compartment?) - Rule
exec_step_internal_call
is used when the instruction is tagged as a call. It checks that the call is allowed, and update the stack (functionupdate_stack_call
). Note thatupdate_stack_call
does not modify the stack if the call is intra-compartment. - Rule
exec_step_interal_return
is used when the instruction is tagged as a return. It checks that the return is allowed, in the sense that it restores the return address and the stack pointer stored on the cross-compartment stack. Again, the stack is only modified byupdate_stack_return
when the return is cross-compartment.
In terms of compilation: calls and returns are compiled to instructions tagged as calls or returns:
| Mcall sig (inl r) =>
do r1 <- ireg_of r; OK (Pjal_r r1 sig true :: k)
| Mcall sig (inr symb) =>
OK (Pjal_s symb sig true :: k)
(* … *)
| Mreturn => |
OK (make_epilogue f (Pj_r RA f.(Mach.fn_sig) true :: k))
Note that tailcalls are never compiled to instructions tagged as calls, because they can only occur intra-compartment.
To show that the simulation lemmas still hold, we need to show that if a call is allowed in the source, then it is allowed in the target
For instance, in Cminorgenproof.v
, we use the lemma allowed_call_transl
:
Lemma allowed_call_transl: forall cenv f vf sz tfn,
Genv.allowed_call ge (comp_of f) vf ->
transl_funbody cenv sz f = OK tfn ->
Genv.allowed_call tge (comp_of tfn) vf.
In this case, the lemma is easy to prove, as the function pointer is trivially the same in both the source and target.
Sometimes, we also need to prove the equality of function pointers between the source
and target. For instance, in SimplLocalsproof.v
, we prove the following lemma:
Lemma match_cont_find_funct_eq:
forall f cenv k tk m bound tbound vf fd tvf,
match_cont f cenv k tk m bound tbound ->
Genv.find_funct ge vf = Some fd ->
Val.inject f vf tvf ->
vf = tvf.
This kind of lemma is usually easy to prove. Most of the time, there is already a lemma that has the same premises, and that proves something like:
exists tfd, Genv.find_funct tge tvf = Some tfd /\ transf_fundef fd = OK tfd
That is that by following the target function pointer, we find the translation of the source
procedure. To prove the match_cont_find_funct_eq
we follow the same proof structure, except
we stop a bit earlier.
Some goals also require proving equality of compartment between source and target. We use
lemmas such as comp_transl
or comp_transl_partial
to do so.
Particular cases:
This proof is a bit special, as it relies on a custom relation between
the source and target global environment instead of the generic one.
As a result one has to prove again some results that were proven for
the generic match_genvs
relation. However, nothing is particularly
surprising in these proofs.
Because of the new stack, we need to make a few modifications to the simulation proof.
We introduce a new simulation invariant:
Inductive match_stacks: val -> list Mach.stackframe -> stack -> Prop :=
| match_stacks_nil:
forall pc,
match_stacks pc nil nil
| match_stacks_intra_compartment:
(* Intra-compartment calls create a new frame in the source, not the target *)
forall newpc pc s s' f,
match_stacks pc s s' ->
Genv.find_comp ge (val_of_stackframe f) = Genv.find_comp ge pc ->
Genv.find_comp ge (val_of_stackframe f) = Genv.find_comp ge newpc ->
(* no condition on the frame *)
match_stacks newpc (f :: s) s'
| match_stacks_cross_compartment:
(* Cross-compartment calls create a new frame in both the source and the target *)
forall newpc pc s s' f f',
match_stacks pc s s' ->
Genv.find_comp ge (val_of_stackframe f) = Genv.find_comp ge pc ->
Genv.find_comp ge (val_of_stackframe f) <> Genv.find_comp ge newpc ->
match_stackframe f f' ->
match_stacks newpc (f :: s) (f' :: s')
.
This invariant relate source (Mach
) stack and target (Asm
) cross-compartment stack,
as well as the current PC.
- Empty stacks are always related
- When the
Mach
program does an intra-compartment call, then a new frame is created inMach
, but not inAsm
(rulematch_stacks_intra_compartment
). - When the
Mach
program does a cross-compartment call, then theAsm
program also does a cross-compartment call as well. A new frame is added in both stacks (rulematch_stacks_cross_compartment
).
We can prove (match_stacks_same_compartment
) that one can change the PC arbitrarily
as long as it stays in the same compartment.
TODO: explain the proof.