Use widened bounds to update bounds in context #821

Merged (14 commits) on May 11, 2020

Conversation

@mgrang (Author) commented Apr 17, 2020

Use the widened bounds calculated by the analysis in BoundsAnalysis.cpp to
widen the bounds of nt_array_ptr.

@mgrang requested review from dtarditi and kkjeer and removed the request for dtarditi on April 17, 2020 at 23:52
@dtarditi (Member) left a comment:

Overall looks good.

@@ -2123,6 +2123,71 @@ namespace {
}
}

void ResetKilledBounds(StmtDeclSetTy &KilledBounds,
Member commented:

Could you add a comment describing what this function is doing?

StmtDeclSetTy KilledBounds = BA.GetKilledBounds(Block);
// Update the bounds in context (BlockState.UC) using the bounds
// widening info calculated above.
UpdateBoundsInContext(WidenedBounds, BlockState.UC);
Contributor commented:

After more discussion on the relationship between widened bounds, target bounds, and the bounds context, I'm currently updating the way the bounds context is handled. In particular, CheckDeclRefExpr will no longer use the BlockState.UC context to return the lvalue target bounds (since the widened bounds should not affect the target bounds for a variable). We may need to reevaluate how to use the widened bounds as the observed rvalue bounds for the value of a variable in light of my proposed updates to the bounds context.

Expr *Upper = RBE->getUpperExpr();

// NewUpperBound = UpperBound + WidenedOffset.
Expr *NewUpper = ExprCreatorUtil::CreateBinaryOperator(
@kkjeer (Contributor) commented Apr 30, 2020:

I tested this locally and ran into some issues with checking an assignment using widened bounds.

In this example:

void widen_bounds_bug(nt_array_ptr<int> a : bounds(a, a + 1),
                      nt_array_ptr<int> p : bounds(p, p + 0),
                      nt_array_ptr<int> b : bounds(b, b + (0 + 1))) {
  if (*p) {
    // Bounds of p have been widened to (p, (p + 0) + 1).
    // The target bounds of a are (a, a + 1).
    // Because the base of p's bounds is p + 0,
    // ProveBoundsDeclValidity is unable to create a base range for p's bounds,
    // and is therefore unable to prove the validity of a's bounds after the assignment.
    a = p;

    // Bounds of b are (b, b + (0 + 1)).
    // The target bounds of a are (a, a + 1).
    // Because the base of b's bounds is b,
    // ProveBoundsDeclValidity is able to create a base range for b's bounds,
    // and is therefore able to prove the validity of a's bounds after the assignment.
    a = b;
  }
}

The assignment a = p results in a warning "unable to prove the declared bounds of a are valid after the assignment". The reason is that the new upper expression for p's widened bounds (p, (p + 0) + 1) is left-associative, so the base of the widened bounds is p + 0. If the upper expression were right-associative (p, p + (0 + 1)), as in the declared bounds for b, then the assignment a = p would be fine (as in the assignment a = b).

I quickly put together a local fix involving SplitIntoBaseAndOffset to split the upper expression of a variable's bounds and use the resulting base and offset to create the new upper expression for the widened bounds, but I haven't fully tested it.

@mgrang (Author) replied Apr 30, 2020:

Yes, SplitIntoBaseAndOffset does not currently handle associativity and multiple arithmetic operators correctly. In BoundsAnalysis I ran into a similar issue wherein I had to correctly determine the offset of the expression being dereferenced. So I wrote an implementation of SplitIntoBaseAndOffset (see https://github.com/microsoft/checkedc-clang/blob/master/clang/lib/Sema/BoundsAnalysis.cpp#L340). This function associates all variables to the LHS and all constants to the RHS. Ideally, we want to merge both these functions into one.

One thing to keep in mind with changing associativity of operators is overflow. The original expression may not overflow but if we change the associativity then an overflow/underflow may occur.

}
}

void UpdateBoundsInContext(BoundsMapTy &WidenedBounds,
Contributor commented:

Could you add a comment to this function as well as ResetKilledBounds?

@@ -2223,6 +2288,10 @@ namespace {
BoundsContextTy InitialObservedBounds = BlockState.ObservedBounds;
BlockState.G.clear();

// If any bounds are killed by statement S, remove their bounds
// from the ObservedBounds.
ResetKilledBounds(KilledBounds, S, BlockState.ObservedBounds);
@kkjeer (Contributor) commented May 6, 2020:

I think this should be done before saving InitialObservedBounds = BlockState.ObservedBounds. InitialObservedBounds should contain the contents of ObservedBounds before calling Check(S, CSS, BlockState), so that the observed bounds can be reset after the Check call.

For example:

void pointer_widening() {
  nt_array_ptr<char> p : count(0) = "a";

  if (p[0]) {
    // This assignment kills the widened bounds of p.
    // InitialObservedBounds should contain p => bounds(p, p + 0),
    // not p => bounds(p, p + 0 + 1).
    // After checking this assignment, BlockState.ObservedBounds should contain
    // p => bounds(p, p + 0).
    p = "";
    if (p[1]) {} // observed bounds of p are (p, p + 0), so this is an out-of-bounds memory access
  }
}

// statement S. After removal, the bounds of these variables would default
// to the user declared bounds in DeclaredBounds.
for (const VarDecl *V : I->second)
ObservedBounds.erase(V);
@kkjeer (Contributor) commented May 6, 2020:

I think this should reset ObservedBounds[V] to the normalized declared bounds of V, rather than erasing V from the context. Before calling Check on a statement S, the ObservedBounds context should contain, for each variable V that is in scope at S:

  1. If V has widened bounds (lb, ub + i) at S and S does not make an assignment to V, ObservedBounds[V] => bounds(lb, ub + i)
  2. Otherwise, if V has declared bounds (lb, ub), ObservedBounds[V] => bounds(lb, ub)

The InitialObservedBounds variable should also contain these bounds. After calling Check on S, ObservedBounds may contain updated bounds due to any assignments in S. These updated observed bounds should be used to check that they imply the declared bounds for each variable. After this check, ObservedBounds should be reset to InitialObservedBounds and contain the bounds described above.

For example:

void pointer_widening() {
  nt_array_ptr<char> p : count(0) = "a";

  if (p[0]) {
    // This assignment kills the widened bounds (p, p + 1) of p.
    // Before checking this assignment, ObservedBounds should contain p => (p, p + 0).
    // The original value of p in this assignment is p - 1.
    // p - 1 should replace p in the observed bounds of p.
    // After checking this assignment, ObservedBounds should contain p => (p - 1, p - 1 + 0).
    // The updated observed bounds (p - 1, p - 1 + 0) should be used to check that they imply
    // the declared bounds (p, p + 0) for p. After this check, ObservedBounds should be reset
    // to the initial observed bounds before checking this assignment: (p, p + 0).
    p = p + 1;
  }
}

@mgrang (Author) commented May 7, 2020:

This depends on #829

// TODO: update BlockState.ObservedBounds to reset any widened
// bounds that are killed by S to the declared variable bounds.

// If any bounds are killed by statement S, remove their bounds
Contributor commented:

This comment no longer seems quite accurate, since the bounds that are killed by S are being reset to their declared bounds rather than removed from ObservedBounds.

@dtarditi (Member) left a comment:

This looks great. Thank you!

There are places where we repeatedly allocate AST nodes. We should avoid doing that, because those AST nodes are permanently allocated, which increases the compiler's memory usage. Please open an issue to track this. It can be addressed in a separate PR.

// variables killed by the statement S to the declared bounds.
for (const VarDecl *V : I->second) {
if (const BoundsExpr *Bounds = V->getBoundsExpr())
ObservedBounds[V] = S.ExpandBoundsToRange(V, Bounds);
Member commented:

It is inefficient for memory use to call ExpandBoundsToRange repeatedly on the bounds for a variable. ExpandBoundsToRange can allocate AST data structures, and we don't want to do that in a dataflow analysis. Could you open a follow-up issue about reducing allocation of AST nodes and add a TODO here? The solution is to compute the expansion once and attach the canonicalized value to the VarDecl.

This change is beyond the scope of this PR, so I think it should be done separately.

Author replied:

Created issue to track this: #830


// We normalize the declared bounds to a RangeBoundsExpr here so that we
// can easily apply the offset to the upper bound.
BoundsExpr *Bounds = S.ExpandBoundsToRange(V, V->getBoundsExpr());
Member commented:

Here too we may be allocating AST data structures multiple times. We want to avoid doing that, and do the expansion only once.

@mgrang (Author) commented May 11, 2020:

This PR depends on checkedc/checkedc#400
