[analyzer][solver][NFC] Introduce ConstraintAssignor

The new component is a symmetric response to SymbolicRangeInferrer.
While the latter is the unified component, which answers all the
questions what does the solver knows about a particular symbolic
expression, assignor associates new constraints (aka "assumes")
with symbolic expressions and can imply additional knowledge that
the solver can extract and use later on.

- Why do we need it and why is SymbolicRangeInferrer not enough?

As it is noted before, the inferrer only helps us to get the most
precise range information based on the existing knowledge and on the
mathematical foundations of different operations that symbolic
expressions actually represent.  It doesn't introduce new constraints.

The assignor, on the other hand, can impose constraints on other
symbols using the same domain knowledge.

- But for some expressions, SymbolicRangeInferrer looks into constraints
  for similar expressions, why can't we do that for all the cases?

That's correct!  But in order to do something like this, we should
have a finite number of possible "similar expressions".

Let's say we are asked about `$a - $b` and we know something about
`$b - $a`.  The inferrer can invert this expression and check
constraints for `$b - $a`.  This is simple!
But let's say we are asked about `$a` and we know that `$a * $b != 0`.
In this situation, we can imply that `$a != 0`, but the inferrer shouldn't
try every possible symbolic expression `X` to check if `$a * X` or
`X * $a` is constrained to non-zero.

With the assignor mechanism, we can catch this implication right at
the moment we associate `$a * $b` with non-zero range, and set similar
constraints for `$a` and `$b` as well.

Differential Revision: https://reviews.llvm.org/D105692
This commit is contained in:
Valeriy Savchenko 2021-07-08 20:09:04 +03:00
parent 2bc07083a2
commit f26deb4e6b

View File

@ -669,6 +669,17 @@ LLVM_NODISCARD inline const RangeSet *getConstraint(ProgramStateRef State,
return getConstraint(State, EquivalenceClass::find(State, Sym));
}
LLVM_NODISCARD ProgramStateRef setConstraint(ProgramStateRef State,
EquivalenceClass Class,
RangeSet Constraint) {
return State->set<ConstraintRange>(Class, Constraint);
}
LLVM_NODISCARD ProgramStateRef setConstraints(ProgramStateRef State,
ConstraintRangeTy Constraints) {
return State->set<ConstraintRange>(Constraints);
}
//===----------------------------------------------------------------------===//
// Equality/diseqiality abstraction
//===----------------------------------------------------------------------===//
@ -1373,6 +1384,182 @@ RangeSet SymbolicRangeInferrer::VisitBinaryOperator<BO_Rem>(Range LHS,
return {RangeFactory, ValueFactory.getValue(Min), ValueFactory.getValue(Max)};
}
//===----------------------------------------------------------------------===//
// Constraint assignment logic
//===----------------------------------------------------------------------===//
/// ConstraintAssignorBase is a small utility class that unifies visitor
/// for ranges with a visitor for constraints (rangeset/range/constant).
///
/// It is designed to have one derived class, but generally it can have more.
/// Derived class can control which types we handle by defining methods of the
/// following form:
///
/// bool handle${SYMBOL}To${CONSTRAINT}(const SYMBOL *Sym,
/// CONSTRAINT Constraint);
///
/// where SYMBOL is the type of the symbol (e.g. SymSymExpr, SymbolCast, etc.)
/// CONSTRAINT is the type of constraint (RangeSet/Range/Const)
/// return value signifies whether we should try other handle methods
/// (i.e. false would mean to stop right after calling this method)
template <class Derived> class ConstraintAssignorBase {
public:
using Const = const llvm::APSInt &;
#define DISPATCH(CLASS) return assign##CLASS##Impl(cast<CLASS>(Sym), Constraint)
#define ASSIGN(CLASS, TO, SYM, CONSTRAINT) \
if (!static_cast<Derived *>(this)->assign##CLASS##To##TO(SYM, CONSTRAINT)) \
return false
void assign(SymbolRef Sym, RangeSet Constraint) {
assignImpl(Sym, Constraint);
}
bool assignImpl(SymbolRef Sym, RangeSet Constraint) {
switch (Sym->getKind()) {
#define SYMBOL(Id, Parent) \
case SymExpr::Id##Kind: \
DISPATCH(Id);
#include "clang/StaticAnalyzer/Core/PathSensitive/Symbols.def"
}
llvm_unreachable("Unknown SymExpr kind!");
}
#define DEFAULT_ASSIGN(Id) \
bool assign##Id##To##RangeSet(const Id *Sym, RangeSet Constraint) { \
return true; \
} \
bool assign##Id##To##Range(const Id *Sym, Range Constraint) { return true; } \
bool assign##Id##To##Const(const Id *Sym, Const Constraint) { return true; }
// When we dispatch for constraint types, we first try to check
// if the new constraint is the constant and try the corresponding
// assignor methods. If it didn't interrupt, we can proceed to the
// range, and finally to the range set.
#define CONSTRAINT_DISPATCH(Id) \
if (const llvm::APSInt *Const = Constraint.getConcreteValue()) { \
ASSIGN(Id, Const, Sym, *Const); \
} \
if (Constraint.size() == 1) { \
ASSIGN(Id, Range, Sym, *Constraint.begin()); \
} \
ASSIGN(Id, RangeSet, Sym, Constraint)
// Our internal assign method first tries to call assignor methods for all
// constraint types that apply. And if not interrupted, continues with its
// parent class.
#define SYMBOL(Id, Parent) \
bool assign##Id##Impl(const Id *Sym, RangeSet Constraint) { \
CONSTRAINT_DISPATCH(Id); \
DISPATCH(Parent); \
} \
DEFAULT_ASSIGN(Id)
#define ABSTRACT_SYMBOL(Id, Parent) SYMBOL(Id, Parent)
#include "clang/StaticAnalyzer/Core/PathSensitive/Symbols.def"
// Default implementations for the top class that doesn't have parents.
bool assignSymExprImpl(const SymExpr *Sym, RangeSet Constraint) {
CONSTRAINT_DISPATCH(SymExpr);
return true;
}
DEFAULT_ASSIGN(SymExpr);
#undef DISPATCH
#undef CONSTRAINT_DISPATCH
#undef DEFAULT_ASSIGN
#undef ASSIGN
};
/// A little component aggregating all of the reasoning we have about
/// assigning new constraints to symbols.
///
/// The main purpose of this class is to associate constraints to symbols,
/// and impose additional constraints on other symbols, when we can imply
/// them.
///
/// It has a nice symmetry with SymbolicRangeInferrer. When the latter
/// can provide more precise ranges by looking into the operands of the
/// expression in question, ConstraintAssignor looks into the operands
/// to see if we can imply more from the new constraint.
class ConstraintAssignor : public ConstraintAssignorBase<ConstraintAssignor> {
public:
template <class ClassOrSymbol>
LLVM_NODISCARD static ProgramStateRef
assign(ProgramStateRef State, SValBuilder &Builder, RangeSet::Factory &F,
ClassOrSymbol CoS, RangeSet NewConstraint) {
if (!State || NewConstraint.isEmpty())
return nullptr;
ConstraintAssignor Assignor{State, Builder, F};
return Assignor.assign(CoS, NewConstraint);
}
inline bool assignSymExprToConst(const SymExpr *Sym, Const Constraint);
private:
ConstraintAssignor(ProgramStateRef State, SValBuilder &Builder,
RangeSet::Factory &F)
: State(State), Builder(Builder), RangeFactory(F) {}
using Base = ConstraintAssignorBase<ConstraintAssignor>;
/// Base method for handling new constraints for symbols.
LLVM_NODISCARD ProgramStateRef assign(SymbolRef Sym, RangeSet NewConstraint) {
// All constraints are actually associated with equivalence classes, and
// that's what we are going to do first.
State = assign(EquivalenceClass::find(State, Sym), NewConstraint);
if (!State)
return nullptr;
// And after that we can check what other things we can get from this
// constraint.
Base::assign(Sym, NewConstraint);
return State;
}
/// Base method for handling new constraints for classes.
LLVM_NODISCARD ProgramStateRef assign(EquivalenceClass Class,
RangeSet NewConstraint) {
// There is a chance that we might need to update constraints for the
// classes that are known to be disequal to Class.
//
// In order for this to be even possible, the new constraint should
// be simply a constant because we can't reason about range disequalities.
if (const llvm::APSInt *Point = NewConstraint.getConcreteValue()) {
ConstraintRangeTy Constraints = State->get<ConstraintRange>();
ConstraintRangeTy::Factory &CF = State->get_context<ConstraintRange>();
// Add new constraint.
Constraints = CF.add(Constraints, Class, NewConstraint);
for (EquivalenceClass DisequalClass : Class.getDisequalClasses(State)) {
RangeSet UpdatedConstraint = SymbolicRangeInferrer::inferRange(
RangeFactory, State, DisequalClass);
UpdatedConstraint = RangeFactory.deletePoint(UpdatedConstraint, *Point);
// If we end up with at least one of the disequal classes to be
// constrained with an empty range-set, the state is infeasible.
if (UpdatedConstraint.isEmpty())
return nullptr;
Constraints = CF.add(Constraints, DisequalClass, UpdatedConstraint);
}
assert(areFeasible(Constraints) && "Constraint manager shouldn't produce "
"a state with infeasible constraints");
return setConstraints(State, Constraints);
}
return setConstraint(State, Class, NewConstraint);
}
ProgramStateRef State;
SValBuilder &Builder;
RangeSet::Factory &RangeFactory;
};
//===----------------------------------------------------------------------===//
// Constraint manager implementation details
//===----------------------------------------------------------------------===//
@ -1449,6 +1636,10 @@ private:
RangeSet getRange(ProgramStateRef State, SymbolRef Sym);
RangeSet getRange(ProgramStateRef State, EquivalenceClass Class);
ProgramStateRef setRange(ProgramStateRef State, SymbolRef Sym,
RangeSet Range);
ProgramStateRef setRange(ProgramStateRef State, EquivalenceClass Class,
RangeSet Range);
RangeSet getSymLTRange(ProgramStateRef St, SymbolRef Sym,
const llvm::APSInt &Int,
@ -1490,7 +1681,7 @@ private:
// This is an infeasible assumption.
return nullptr;
if (ProgramStateRef NewState = setConstraint(State, Sym, NewConstraint)) {
if (ProgramStateRef NewState = setRange(State, Sym, NewConstraint)) {
if (auto Equality = EqualityInfo::extract(Sym, Int, Adjustment)) {
// If the original assumption is not Sym + Adjustment !=/</> Int,
// we should invert IsEquality flag.
@ -1520,86 +1711,37 @@ private:
SymbolRef RHS) {
return EquivalenceClass::merge(F, State, LHS, RHS);
}
LLVM_NODISCARD ProgramStateRef setConstraint(ProgramStateRef State,
EquivalenceClass Class,
RangeSet Constraint) {
ConstraintRangeTy Constraints = State->get<ConstraintRange>();
ConstraintRangeTy::Factory &CF = State->get_context<ConstraintRange>();
assert(!Constraint.isEmpty() && "New constraint should not be empty");
// Add new constraint.
Constraints = CF.add(Constraints, Class, Constraint);
// There is a chance that we might need to update constraints for the
// classes that are known to be disequal to Class.
//
// In order for this to be even possible, the new constraint should
// be simply a constant because we can't reason about range disequalities.
if (const llvm::APSInt *Point = Constraint.getConcreteValue())
for (EquivalenceClass DisequalClass : Class.getDisequalClasses(State)) {
RangeSet UpdatedConstraint = getRange(State, DisequalClass);
UpdatedConstraint = F.deletePoint(UpdatedConstraint, *Point);
// If we end up with at least one of the disequal classes to be
// constrained with an empty range-set, the state is infeasible.
if (UpdatedConstraint.isEmpty())
return nullptr;
Constraints = CF.add(Constraints, DisequalClass, UpdatedConstraint);
}
assert(areFeasible(Constraints) && "Constraint manager shouldn't produce "
"a state with infeasible constraints");
return State->set<ConstraintRange>(Constraints);
}
// Associate a constraint to a symbolic expression. First, we set the
// constraint in the State, then we try to simplify existing symbolic
// expressions based on the newly set constraint.
LLVM_NODISCARD inline ProgramStateRef
setConstraint(ProgramStateRef State, SymbolRef Sym, RangeSet Constraint) {
assert(State);
State = setConstraint(State, EquivalenceClass::find(State, Sym), Constraint);
if (!State)
return nullptr;
// We have a chance to simplify existing symbolic values if the new
// constraint is a constant.
if (!Constraint.getConcreteValue())
return State;
llvm::SmallSet<EquivalenceClass, 4> SimplifiedClasses;
// Iterate over all equivalence classes and try to simplify them.
ClassMembersTy Members = State->get<ClassMembers>();
for (std::pair<EquivalenceClass, SymbolSet> ClassToSymbolSet : Members) {
EquivalenceClass Class = ClassToSymbolSet.first;
State = Class.simplify(getSValBuilder(), F, State);
if (!State)
return nullptr;
SimplifiedClasses.insert(Class);
}
// Trivial equivalence classes (those that have only one symbol member) are
// not stored in the State. Thus, we must skim through the constraints as
// well. And we try to simplify symbols in the constraints.
ConstraintRangeTy Constraints = State->get<ConstraintRange>();
for (std::pair<EquivalenceClass, RangeSet> ClassConstraint : Constraints) {
EquivalenceClass Class = ClassConstraint.first;
if (SimplifiedClasses.count(Class)) // Already simplified.
continue;
State = Class.simplify(getSValBuilder(), F, State);
if (!State)
return nullptr;
}
return State;
}
};
bool ConstraintAssignor::assignSymExprToConst(const SymExpr *Sym,
const llvm::APSInt &Constraint) {
llvm::SmallSet<EquivalenceClass, 4> SimplifiedClasses;
// Iterate over all equivalence classes and try to simplify them.
ClassMembersTy Members = State->get<ClassMembers>();
for (std::pair<EquivalenceClass, SymbolSet> ClassToSymbolSet : Members) {
EquivalenceClass Class = ClassToSymbolSet.first;
State = Class.simplify(Builder, RangeFactory, State);
if (!State)
return false;
SimplifiedClasses.insert(Class);
}
// Trivial equivalence classes (those that have only one symbol member) are
// not stored in the State. Thus, we must skim through the constraints as
// well. And we try to simplify symbols in the constraints.
ConstraintRangeTy Constraints = State->get<ConstraintRange>();
for (std::pair<EquivalenceClass, RangeSet> ClassConstraint : Constraints) {
EquivalenceClass Class = ClassConstraint.first;
if (SimplifiedClasses.count(Class)) // Already simplified.
continue;
State = Class.simplify(Builder, RangeFactory, State);
if (!State)
return false;
}
return true;
}
} // end anonymous namespace
std::unique_ptr<ConstraintManager>
@ -2244,6 +2386,18 @@ RangeSet RangeConstraintManager::getRange(ProgramStateRef State,
return SymbolicRangeInferrer::inferRange(F, State, Class);
}
ProgramStateRef RangeConstraintManager::setRange(ProgramStateRef State,
SymbolRef Sym,
RangeSet Range) {
return ConstraintAssignor::assign(State, getSValBuilder(), F, Sym, Range);
}
ProgramStateRef RangeConstraintManager::setRange(ProgramStateRef State,
EquivalenceClass Class,
RangeSet Range) {
return ConstraintAssignor::assign(State, getSValBuilder(), F, Class, Range);
}
//===------------------------------------------------------------------------===
// assumeSymX methods: protected interface for RangeConstraintManager.
//===------------------------------------------------------------------------===/
@ -2397,7 +2551,7 @@ RangeConstraintManager::assumeSymGE(ProgramStateRef St, SymbolRef Sym,
const llvm::APSInt &Int,
const llvm::APSInt &Adjustment) {
RangeSet New = getSymGERange(St, Sym, Int, Adjustment);
return New.isEmpty() ? nullptr : setConstraint(St, Sym, New);
return setRange(St, Sym, New);
}
RangeSet
@ -2441,7 +2595,7 @@ RangeConstraintManager::assumeSymLE(ProgramStateRef St, SymbolRef Sym,
const llvm::APSInt &Int,
const llvm::APSInt &Adjustment) {
RangeSet New = getSymLERange(St, Sym, Int, Adjustment);
return New.isEmpty() ? nullptr : setConstraint(St, Sym, New);
return setRange(St, Sym, New);
}
ProgramStateRef RangeConstraintManager::assumeSymWithinInclusiveRange(
@ -2451,7 +2605,7 @@ ProgramStateRef RangeConstraintManager::assumeSymWithinInclusiveRange(
if (New.isEmpty())
return nullptr;
RangeSet Out = getSymLERange([&] { return New; }, To, Adjustment);
return Out.isEmpty() ? nullptr : setConstraint(State, Sym, Out);
return setRange(State, Sym, Out);
}
ProgramStateRef RangeConstraintManager::assumeSymOutsideInclusiveRange(
@ -2460,7 +2614,7 @@ ProgramStateRef RangeConstraintManager::assumeSymOutsideInclusiveRange(
RangeSet RangeLT = getSymLTRange(State, Sym, From, Adjustment);
RangeSet RangeGT = getSymGTRange(State, Sym, To, Adjustment);
RangeSet New(F.add(RangeLT, RangeGT));
return New.isEmpty() ? nullptr : setConstraint(State, Sym, New);
return setRange(State, Sym, New);
}
//===----------------------------------------------------------------------===//