-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use widened bounds to update bounds in context #821
Conversation
Use the widened bounds calculated by the analysis in BoundsAnalysis.cpp to widen the bounds of nt_array_ptr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good.
clang/lib/Sema/SemaBounds.cpp
Outdated
@@ -2123,6 +2123,71 @@ namespace { | |||
} | |||
} | |||
|
|||
void ResetKilledBounds(StmtDeclSetTy &KilledBounds, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a comment describing what this function is doing?
clang/lib/Sema/SemaBounds.cpp
Outdated
StmtDeclSetTy KilledBounds = BA.GetKilledBounds(Block); | ||
// Update the bounds in context (BlockState.UC) using the bounds | ||
// widening info calculated above. | ||
UpdateBoundsInContext(WidenedBounds, BlockState.UC); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After more discussion on the relationship between widened bounds, target bounds, and the bounds context, I'm currently updating the way the bounds context is handled. In particular, CheckDeclRefExpr will no longer use the BlockState.UC context to return the lvalue target bounds (since the widened bounds should not affect the target bounds for a variable). We may need to reevaluate how to use the widened bounds as the observed rvalue bounds for the value of a variable in light of my proposed updates to the bounds context.
clang/lib/Sema/SemaBounds.cpp
Outdated
Expr *Upper = RBE->getUpperExpr(); | ||
|
||
// NewUpperBound = UpperBound + WidenedOffset. | ||
Expr *NewUpper = ExprCreatorUtil::CreateBinaryOperator( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested this locally and ran into some issues with checking an assignment using widened bounds.
In this example:
void widen_bounds_bug(nt_array_ptr<int> a : bounds(a, a + 1),
nt_array_ptr<int> p : bounds(p, p + 0),
nt_array_ptr<int> b : bounds(b, b + (0 + 1))) {
if (*p) {
// Bounds of p have been widened to (p, (p + 0) + 1).
// The target bounds of a are (a, a + 1).
// Because the base of p's bounds is p + 0,
// ProveBoundsDeclValidity is unable to create a base range for p's bounds,
// and is therefore unable to prove the validity of a's bounds after the assignment.
a = p;
// Bounds of b are (b, b + (0 + 1)).
// The target bounds of a are (a, a + 1).
// Because the base of b's bounds is b,
// ProveBoundsDeclValidity is able to create a base range for b's bounds,
// and is therefore able to prove the validity of a's bounds after the assignment.
a = b;
}
}
The assignment a = p
results in a warning "unable to prove the declared bounds of a are valid after the assignment". The reason is that the new upper expression for p's widened bounds (p, (p + 0) + 1)
is left-associative, so the base of the widened bounds is p + 0
. If the upper expression were right-associative (p, p + (0 + 1))
, as in the declared bounds for b
, then the assignment a = p
would be fine (as in the assignment a = b
).
I quickly put together a local fix involving SplitIntoBaseAndOffset
to split the upper expression of a variable's bounds and use the resulting base and offset to create the new upper expression for the widened bounds, but I haven't fully tested it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, SplitIntoBaseAndOffset does not currently handle associativity and multiple arithmetic operators correctly. In BoundsAnalysis I ran into a similar issue wherein I had to correctly determine the offset of the expression being dereferenced. So I wrote an implementation of SplitIntoBaseAndOffset (see https://github.com/microsoft/checkedc-clang/blob/master/clang/lib/Sema/BoundsAnalysis.cpp#L340). This function associates all variables to the LHS and all constants to the RHS. Ideally, we want to merge both these functions into one.
One thing to keep in mind with changing associativity of operators is overflow. The original expression may not overflow but if we change the associativity then an overflow/underflow may occur.
clang/lib/Sema/SemaBounds.cpp
Outdated
} | ||
} | ||
|
||
void UpdateBoundsInContext(BoundsMapTy &WidenedBounds, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a comment to this function as well as ResetKilledBounds?
@@ -2223,6 +2288,10 @@ namespace { | |||
BoundsContextTy InitialObservedBounds = BlockState.ObservedBounds; | |||
BlockState.G.clear(); | |||
|
|||
// If any bounds are killed by statement S, remove their bounds | |||
// from the ObservedBounds. | |||
ResetKilledBounds(KilledBounds, S, BlockState.ObservedBounds); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be done before saving InitialObservedBounds = BlockState.ObservedBounds
. InitialBounds
should contain the contents of ObservedBounds
before calling Check(S, CSS, BlockState)
in order for the observed bounds to be reset after the Check
call.
For example:
void pointer_widening() {
nt_array_ptr<char> p : count(0) = "a";
if (p[0]) {
// This assignment kills the widened bounds of p.
// InitialObservedBounds should contain p => bounds(p, p + 0),
// not p => bounds(p, p + 0 + 1).
// After checking this assignment, BlockState.ObservedBounds should contain
// p => bounds(p, p + 0).
p = "";
if (p[1]) // observed bounds of p are (p, p + 0), so this is an out-of-bounds memory access
}
}
clang/lib/Sema/SemaBounds.cpp
Outdated
// statement S. After removal, the bounds of these variables would default | ||
// to the user declared bounds in DeclaredBounds. | ||
for (const VarDecl *V : I->second) | ||
ObservedBounds.erase(V); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should reset ObservedBounds[V]
to the normalized declared bounds of V
, rather than erasing V
from the context. Before calling Check
on a statement S
, the ObservedBounds
context should contain, for each variable V
that is in scope at S
:
- If
V
has widened bounds(lb, ub + i
) atS
andS
does not make an assignment toV
,ObservedBounds[V] => bounds(lb, ub + i)
- Otherwise, if
V
has declared bounds(lb, ub)
,ObservedBounds[V] => bounds(lb, ub)
The InitialObservedBounds
variable should also contain these bounds. After calling Check
on S
, ObservedBounds
may contain updated bounds due to any assignments in S
. These updated observed bounds should be used to check that they imply the declared bounds for each variable. After this check, ObservedBounds
should be reset to InitialObservedBounds
and contain the bounds described above.
For example:
void pointer_widening() {
nt_array_ptr<char> p : count(0) = "a";
if (p[0]) {
// This assignment kills the widened bounds (p, p + 1) of p.
// Before checking this assignment, ObservedBounds should contain p => (p, p + 0).
// The original value of p in this assignment is p - 1.
// p - 1 should replace p in the observed bounds of p.
// After checking this assignment, ObservedBounds should contain p => (p - 1, p - 1 + 0).
// The updated observed bounds (p - 1, p - 1 + 0) should be used to check that they imply
// the declared bounds (p, p + 0) for p. After this check, ObservedBounds should be reset
// to the initial observed bounds before checking this assignment: (p, p + 0).
p = p + 1;
}
}
This depends on #829 |
clang/lib/Sema/SemaBounds.cpp
Outdated
// TODO: update BlockState.ObservedBounds to reset any widened | ||
// bounds that are killed by S to the declared variable bounds. | ||
|
||
// If any bounds are killed by statement S, remove their bounds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment no longer seems quite accurate, since the bounds that are killed by S are being reset to their declared bounds rather than removed from ObservedBounds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great. Thank you!
There are places we are repeatedly allocating AST nodes. We should avoid doing that because those AST nodes are permanently allocated, increased memory usage of the compiler. Please open an issue to track this. It can be addressed in a separate PR.
// variables killed by the statement S to the declared bounds. | ||
for (const VarDecl *V : I->second) { | ||
if (const BoundsExpr *Bounds = V->getBoundsExpr()) | ||
ObservedBounds[V] = S.ExpandBoundsToRange(V, Bounds); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is inefficient for memory use to call ExpandBoundsToRange repeatedly on the bounds for a variable. ExpandBoundsToRange can allocate AST data structures, and we don't want to do that in a dataflow analysis. Could you open a follow-up issue about reducing allocation of AST nodes and add a TODO here. The solution is to compute it once and attach the canoncialized value to the VarDecl.
This change is beyond the scope of this PR, so I think it should be done separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created issue to track this: #830
|
||
// We normalize the declared bounds to RangBoundsExpr here so that we | ||
// can easily apply the offset to the upper bound. | ||
BoundsExpr *Bounds = S.ExpandBoundsToRange(V, V->getBoundsExpr()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here too we may be allocating AST data structures multiple times. We want to avoid doing that, and do the expansion only once.
This PR depends on checkedc/checkedc#400 |
Use the widened bounds calculated by the analysis in BoundsAnalysis.cpp to
widen the bounds of nt_array_ptr.