Skip to content

Latest commit

 

History

History
441 lines (386 loc) · 18.6 KB

summary-interfaces.org

File metadata and controls

441 lines (386 loc) · 18.6 KB

common/AST.v

Module Policy

Policies are used to decide whether a given call is allowed or not. They are defined in module Policy.

For each compartment, a policy defines a list of exported procedures and a list of procedures imported from other compartments. Procedures are refered to by their public identifier.

Policies can be seen as records with two fields:

  • policy_export : compartment → list ident, a partial map that associates a list of exported procedures to each compartment
  • policy_import : compartment → list (compartment * ident), a partial map that associates a list of imported procedures from particular compartments to each compartment

(these are in fact PTrees, the efficient implementation of finite maps provided by CompCert)

The function Policy.eqb defines an equivalence between two policies: two policies are equivalent iff for each compartment, they define the same exported and imported procedures

The actual definition of whether a call is allowed or not is provided in common/Globalenv.v.

Programs contain a policy

We modify the definition of programs so that they include a policy, which will govern which calls are allowed.

Record program (F V: Type) : Type := mkprogram {
  prog_defs: list (ident * globdef F V);
  prog_public: list ident;
  prog_main: ident;
  prog_pol: Policy.t
}.

Program transformations do not use or modify the policy

A program transformation always leaves the policy unchanged. Similarly, policies are not used inside the transformations, so it’s not possible to use the information of these policies to perform optimizations.

common/Linking.v

Linking two programs p1 and p2 now requires that the policies pol1 pol2 defined by the two programs agree, i.e. that Policy.eqb pol1 pol2, or in words, that they defined the same exported and imported procedures for each compartment.

Definition link_prog :=
  if ident_eq p1.(prog_main) p2.(prog_main)
     && PTree_Properties.for_all dm1 link_prog_check
     && Policy.eqb p1.(prog_pol) p2.(prog_pol) then (* HERE *)
    Some {| … |}
  else
    None.

common/Globalenvs.v

The global environments are modified so they now contain a policy

Record Genv.t: Type := mkgenv {
  …
  genv_policy: Policy.t;
  …
}

The global environment’s policy comes from the program:

Program Definition empty_genv (pub: list ident) (pol: Policy.t): t :=
  @mkgenv pub (PTree.empty _) (PTree.empty _) 1%positive pol _ _ _.

Definition globalenv (p: program F V) :=
  add_globals (empty_genv p.(prog_public) p.(prog_pol)) p.(prog_defs).

The definition of what calls are allowed is given by Genv.allowed_call.

Definition allowed_call (ge: t) (cp: compartment) (vf: val) :=
  Some default_compartment = find_comp ge vf \/ (* 1 *)
  Some cp = find_comp ge vf \/                  (* 2 *)
  allowed_cross_call ge cp vf.                 (* 3 *)

To determine whether a call is or isn’t allowed, we look at two pieces of information: the calling compartment, cp: compartment, and a value vf: val, which should be a function pointer to the callee. The reason we use a function pointer is that it allows us to use this definition even at the lowest levels, where jumps could be used to circumvent a protection purely based on checking identifiers.

A call is allowed if any of these 3 cases holds: (1) the procedure being called belongs to the default compartment (2) the procedure being called belongs to the same compartment as the caller (3) the call is an inter-compartment call and is allowed by the policy

Case (1) treats the default compartment as a public compartment that can be called anytime. This is useful to assign the same compartments to all builtin procedures that are eventually simplified or transformed into proper calls.

Case (3) makes use of the policy, and is more interesting. The definition of allowed_cross_call is:

Definition allowed_cross_call (ge: t) (cp: compartment) (vf: val) :=
  match vf with
  | Vptr b _ =>
    exists i cp',
    invert_symbol ge b = Some i /\
    find_comp ge vf = Some cp' /\
    match (Policy.policy_import ge.(genv_policy)) ! cp with
    | Some l => In (cp', i) l
    | None => False
    end /\
    match (Policy.policy_export ge.(genv_policy)) ! cp' with
    | Some l => In i l
    | None => False
    end
  | _ => False
  end.

This definition does the following:

  • it checks that the callee is indeed a pointer to block b
  • it checks that this block corresponds to a identifier i
  • it obtains the compartment cp' of this block
  • it checks that the pair (cp', i) is indeed imported by cp
  • it checks that the identifier i is indeed exported by cp'

An executable version with boolean value is given (allowed_call_b) and it is proven equivalent to the version in Prop (lemma allowed_call_reflect).

Given two “matching” programs p and tp, the following lemma can be used to show that any allowed call for p is also allowed for tp.

(**)
Variable match_fundef: C -> F1 -> F2 -> Prop.
Variable match_varinfo: V1 -> V2 -> Prop.
Variable ctx: C.
Hypothesis progmatch: match_program_gen match_fundef match_varinfo ctx p tp.

Lemma match_genvs_allowed_calls:
  forall cp vf,
    allowed_call (globalenv p) cp vf ->
    allowed_call (globalenv tp) cp vf.

There are also special cases of this lemma for transformations that do not depend on the compilation unit:

Theorem allowed_call_transf_partial:
  forall cp vf,
    allowed_call (globalenv p) cp vf -> allowed_call (globalenv tp) cp vf.

Theorem allowed_call_transf:
  forall cp vf,
    allowed_call (globalenv p) cp vf -> allowed_call (globalenv tp) cp vf.

Languages

The syntax of all languages isn’t changed.

The semantics of most languages are modified in the following way:

  • when the small-step semantics already used function pointers to perform calls (for instance, in the case of Cminor), we add a condition Genv.allowed_call ge (comp_of f) vf to the step_call rule that checks that the call to vf is allowed according to the policy of the global environment
  • similarly, we add the same condition to the step_tailcall rule. In the case of step_tailcall, we additionally require that the tail call is internal (comp_of fd = (comp_of f) where fd is the procedure being called and f is the current procedure), and that needs_calling_comp (comp_of f) = false (not sure what this one is about) We require that the tail call is internal to avoid problems when the tail call will be optimized to a jump in the later phases.
  • we always allow calls using the special “call to built-ins” instructions, as we assume these are only used to call built-ins. This is something for which we should add a syntactic check.

Some languages have big-step semantics or executable versions of the semantics, these are updated in the same way.

Some particular cases:

Languages that don’t use function pointers directly to perform calls

This is not a particular interesting change, but there might be an opportunity to factorize some code here.

Some languages don’t use function pointers directly in the semantics to perform calls (it is instead hidden under another condition). For instance, in LTL.v, the function definition is accessed not by following a function pointer vf but instead by using find_function: (mreg + ident) -> locset -> option fundef. To perform the policy checks, we define a function find_function_ptr: (mreg + ident) -> locset -> option val that returns the function pointer corresponding to the callee. Usually this is not difficult, because find_function already look the function pointer up before obtaining a fundef. For instance, in LTL.v:

Definition find_function_ptr (ros: mreg + ident) (rs: locset) : option val :=
  match ros with
  | inl r => Some (rs (R r))
  | inr symb =>
    match Genv.find_symbol ge symb with
    | Some b => Some (Vptr b Ptrofs.zero)
    | None => None
    end
  end.

Definition find_function (ros: mreg + ident) (rs: locset) : option fundef :=
  match ros with
  | inl r => Genv.find_funct ge (rs (R r))
  | inr symb =>
      match Genv.find_symbol ge symb with
      | None => None
      | Some b => Genv.find_funct_ptr ge b
      end
  end.

Lemma find_function_find_function_ptr: forall ros rs fd,
    find_function ros rs = Some fd ->
    exists vf, find_function_ptr ros rs = Some vf.

Case of inlining and tailcall optimization

During the inlining and tailcall optimization phases in the backend, there are inserted checks to only inline and tailcall functions that belong to the same compartment.

Case of the Risc-V backend

The Risc-V backend assembly language’s syntax and semantics are the most important modification.

At this level, we need to check at every step of execution that we are not changing compartment without respecting the policy. To do so, we modify the syntax and the semantics.

As a first step, we need to modify the rule

exec_step_internal:
  forall b ofs f i rs m rs' m',
  rs PC = Vptr b ofs ->
  Genv.find_funct_ptr ge b = Some (Internal f) ->
  find_instr (Ptrofs.unsigned ofs) (fn_code f) = Some i ->
  exec_instr f i rs m = Next rs' m' ->
  step (State rs m) E0 (State rs' m')

which handles stepping in the code. We add the check (ALLOWED: Genv.allowed_call ge (comp_of f) (Vptr b' ofs')) where f is the currently executing function and Vptr b' ofs' is the new PC. Adding this condition prevents all unwanted compartment changes.

However, this doesn’t allow all returns: for instance, consider the case where a procedure P_1 of a compartment C_1 calls a procedure P_2 of C_2. When C_2 tries to return to P_1, it is not necessary the case that this is allowed by the policy (it does not have to be symmetric, and indeed it shouldn’t be in general. C_2 could be an untrusted compartment that is not allowed to call anything from C_1).

We tried first one solution that doesn’t work: when jumping to the return address RA, the semantics could have a special case that do allow the return, on the condition that the call from the compartment of RA to the PC from which we are returning is allowed (i.e. check the policy in reverse).

However, this is not sufficent. Consider the following example: P_1 from C_1 calls P_2 from C_2, which then do a tailcall to P_2' (still from C_2). Then, when P_2' returns, it tries to return to P_1, but the call from C_1 to P_2' was not necessarily allowed! (for instance, if P_2' is private to C_2).

We settled on the following solution: the risc-V backend makes use of a virtual cross-compartment stack that stores return addresses and stack pointers of cross-compartment calls. When doing cross-compartment calls, we add a frame to this cross-compartment stack. When doing cross-compartment returns, we check that the new PC points to the return address stored on the stack, and also that the new SP points to the address stored on the stack. This should prevent an attacker to bypass the policy by disguising calls into returns.

While we could do these checks at every step, this is not efficient and would be difficult to implement on a micro-policy backend. Instead, we now tag some of the Risc-V instructions as calls or returns. Instructions Pjal_s and Pjal_r can be tagged as calls, and instruction Pj_r (jump to content of register) can be tagged as a return. No other instruction is tagged. These tags are set by the Mach to Asm compiler when it compiles calls and returns.

The previous rule becomes the following three:

| exec_step_internal:
    forall b ofs f i rs m rs' m' b' ofs' st,
    rs PC = Vptr b ofs ->
    Genv.find_funct_ptr ge b = Some (Internal f) ->
    find_instr (Ptrofs.unsigned ofs) (fn_code f) = Some i ->
    exec_instr f i rs m = Next rs' m' ->
    is_call i = false ->
    is_return i = false ->
    forall (NEXTPC: rs' PC = Vptr b' ofs'),
    forall (ALLOWED: Genv.allowed_call ge (comp_of f) (Vptr b' ofs')),
    step (State st rs m) E0 (State st rs' m')
| exec_step_internal_call:
    forall b ofs f i rs m rs' m' b' ofs' cp st st',
    rs PC = Vptr b ofs ->
    Genv.find_funct_ptr ge b = Some (Internal f) ->
    find_instr (Ptrofs.unsigned ofs) (fn_code f) = Some i ->
    exec_instr f i rs m = Next rs' m' ->
    is_call i = true ->
    forall (NEXTPC: rs' PC = Vptr b' ofs'),
    forall (ALLOWED: Genv.allowed_call ge (comp_of f) (Vptr b' ofs')),
    forall (CURCOMP: Genv.find_comp ge (Vptr b Ptrofs.zero) = Some cp),
    (* Is a call, we update the stack *)
    forall (STUPD: update_stack_call st cp rs' = Some st'),
    step (State st rs m) E0 (State st' rs' m')
| exec_step_internal_return:
    forall b ofs f i rs m rs' m' cp cp' st st',
    rs PC = Vptr b ofs ->
    Genv.find_funct_ptr ge b = Some (Internal f) ->
    find_instr (Ptrofs.unsigned ofs) (fn_code f) = Some i ->
    exec_instr f i rs m = Next rs' m' ->
    is_return i = true ->
    forall (CURCOMP: Genv.find_comp ge (rs PC) = Some cp),
    forall (NEXTCOMP: Genv.find_comp ge (rs' PC) = Some cp'),
    (* We only impose conditions on when returns can be executed for cross-compartment
       returns. These conditions are that we restore the previous RA and SP *)
    forall (PC_RA: cp <> cp' -> rs' PC = asm_parent_ra st),
    forall (RESTORE_SP: cp <> cp' -> rs' SP = asm_parent_sp st),
    (* Note that in the same manner, this definition only updates the stack when doing
       cross-compartment returns *)
    forall (STUPD: update_stack_return st cp rs' = Some st'),
    step (State st rs m) E0 (State st' rs' m')
  • Rule exec_step_internal is used for every step that is not a call or a return. It still does a policy check. (TODO: can we get rid of this policy check and instead simply check that we do not change compartment?)
  • Rule exec_step_internal_call is used when the instruction is tagged as a call. It checks that the call is allowed, and update the stack (function update_stack_call). Note that update_stack_call does not modify the stack if the call is intra-compartment.
  • Rule exec_step_interal_return is used when the instruction is tagged as a return. It checks that the return is allowed, in the sense that it restores the return address and the stack pointer stored on the cross-compartment stack. Again, the stack is only modified by update_stack_return when the return is cross-compartment.

In terms of compilation: calls and returns are compiled to instructions tagged as calls or returns:

| Mcall sig (inl r) =>
    do r1 <- ireg_of r; OK (Pjal_r r1 sig true :: k)
| Mcall sig (inr symb) =>
    OK (Pjal_s symb sig true :: k)
(**)
| Mreturn => |
    OK (make_epilogue f (Pj_r RA f.(Mach.fn_sig) true :: k))

Note that tailcalls are never compiled to instructions tagged as calls, because they can only occur intra-compartment.

Simulation proofs

To show that the simulation lemmas still hold, we need to show that if a call is allowed in the source, then it is allowed in the target

For instance, in Cminorgenproof.v, we use the lemma allowed_call_transl:

Lemma allowed_call_transl: forall cenv f vf sz tfn,
  Genv.allowed_call ge (comp_of f) vf ->
  transl_funbody cenv sz f = OK tfn ->
  Genv.allowed_call tge (comp_of tfn) vf.

In this case, the lemma is easy to prove, as the function pointer is trivially the same in both the source and target.

Sometimes, we also need to prove the equality of function pointers between the source and target. For instance, in SimplLocalsproof.v, we prove the following lemma:

Lemma match_cont_find_funct_eq:
  forall f cenv k tk m bound tbound vf fd tvf,
  match_cont f cenv k tk m bound tbound ->
  Genv.find_funct ge vf = Some fd ->
  Val.inject f vf tvf ->
  vf = tvf.

This kind of lemma is usually easy to prove. Most of the time, there is already a lemma that has the same premises, and that proves something like:

exists tfd, Genv.find_funct tge tvf = Some tfd /\ transf_fundef fd = OK tfd

That is that by following the target function pointer, we find the translation of the source procedure. To prove the match_cont_find_funct_eq we follow the same proof structure, except we stop a bit earlier.

Some goals also require proving equality of compartment between source and target. We use lemmas such as comp_transl or comp_transl_partial to do so.

Particular cases:

Case of Unusedglob (Elimination of unreferenced static definitions)

This proof is a bit special, as it relies on a custom relation between the source and target global environment instead of the generic one. As a result one has to prove again some results that were proven for the generic match_genvs relation. However, nothing is particularly surprising in these proofs.

Compilation from Mach to Asm

Because of the new stack, we need to make a few modifications to the simulation proof.

We introduce a new simulation invariant:

Inductive match_stacks: val -> list Mach.stackframe -> stack -> Prop :=
| match_stacks_nil:
    forall pc,
    match_stacks pc nil nil
| match_stacks_intra_compartment:
    (* Intra-compartment calls create a new frame in the source, not the target *)
    forall newpc pc s s' f,
    match_stacks pc s s' ->
    Genv.find_comp ge (val_of_stackframe f) = Genv.find_comp ge pc ->
    Genv.find_comp ge (val_of_stackframe f) = Genv.find_comp ge newpc ->
    (* no condition on the frame *)
    match_stacks newpc (f :: s) s'
| match_stacks_cross_compartment:
    (* Cross-compartment calls create a new frame in both the source and the target *)
    forall newpc pc s s' f f',
    match_stacks pc s s' ->
    Genv.find_comp ge (val_of_stackframe f) = Genv.find_comp ge pc ->
    Genv.find_comp ge (val_of_stackframe f) <> Genv.find_comp ge newpc ->
    match_stackframe f f' ->
    match_stacks newpc (f :: s) (f' :: s')
.

This invariant relate source (Mach) stack and target (Asm) cross-compartment stack, as well as the current PC.

  • Empty stacks are always related
  • When the Mach program does an intra-compartment call, then a new frame is created in Mach, but not in Asm (rule match_stacks_intra_compartment).
  • When the Mach program does a cross-compartment call, then the Asm program also does a cross-compartment call as well. A new frame is added in both stacks (rule match_stacks_cross_compartment).

We can prove (match_stacks_same_compartment) that one can change the PC arbitrarily as long as it stays in the same compartment.

TODO: explain the proof.