llvm-capstone

mirror of https://github.com/capstone-engine/llvm-capstone.git synced 2024-11-23 13:50:11 +00:00

Author	SHA1	Message	Date
Alex Richardson	a602f76a24	[clang][TargetInfo] Use LangAS for getPointer{Width,Align}() Mixing LLVM and Clang address spaces can result in subtle bugs, and there is no need for this hook to use the LLVM IR level address spaces. Most of this change is just replacing zero with LangAS::Default, but it also allows us to remove a few calls to getTargetAddressSpace(). This also removes a stale comment+workaround in CGDebugInfo::CreatePointerLikeType(): ASTContext::getTypeSize() does return the expected size for ReferenceType (and handles address spaces). Differential Revision: https://reviews.llvm.org/D138295	2022-11-30 20:24:01 +00:00
Balazs Benics	dbb94b415a	[analyzer] Remove the unused LocalCheckers.h header	2022-11-28 13:08:38 +01:00
Kazu Hirata	20ba079dda	[StaticAnalyzer] Don't use Optional::create (NFC) Note that std::optional does not offer create(). This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-11-25 15:38:53 -08:00
Balazs Benics	097ce76165	[analyzer] Deprecate FAM analyzer-config, recommend -fstrict-flex-arrays instead By default, clang assumes that all trailing array objects could be a FAM. So, an array of undefined size, size 0, size 1, or even size 42 is considered a FAM, at least for optimizations. One needs to override the default behavior by supplying the `-fstrict-flex-arrays=<N>` flag, with `N > 0` value to reduce the set of FAM candidates. Value `3` is the most restrictive and `0` is the most permissive on this scale. 0: all trailing arrays are FAMs 1: only incomplete, zero and one-element arrays are FAMs 2: only incomplete, zero-element arrays are FAMs 3: only incomplete arrays are FAMs If the user is happy with considering single-element arrays as FAMs, they just need to remove the `consider-single-element-arrays-as-flexible-array-members` from the command line. Otherwise, if they don't want to recognize such cases as FAMs, they should specify `-fstrict-flex-arrays` anyway, which will be picked up by CSA. Any use of the deprecated analyzer-config value will trigger a warning explaining what to use instead. The `-analyzer-config-help` is updated accordingly. Depends on D138657 Reviewed By: xazax.hun Differential Revision: https://reviews.llvm.org/D138659	2022-11-25 10:24:56 +01:00
Balazs Benics	93b98eb399	[analyzer] getBinding should auto-detect type only if it was not given Casting a pointer to a suitably large integral type by reinterpret-cast should result in the same value as by using the `__builtin_bit_cast()`. The compiler exploits this: https://godbolt.org/z/zMP3sG683 However, the analyzer does not bind the same symbolic value to these expressions, resulting in weird situations, such as failing equality checks and even results in crashes: https://godbolt.org/z/oeMP7cj8q Previously, in the `RegionStoreManager::getBinding()` even if `T` was non-null, we replaced it with `TVR->getValueType()` in case the `MR` was `TypedValueRegion`. It doesn't make much sense to auto-detect the type if the type is already given. By not doing the auto-detection, we would just do the right thing and perform the load by that type. This means that we will cast the value to that type. So, in this patch, I'm proposing to do auto-detection only if the type was null. Here is a snippet of code, annotated by the previous and new dump values. `LocAsInteger` should wrap the `SymRegion`, since we want to load the address as if it was an integer. In none of the following cases should type auto-detection be triggered, hence we should eventually reach an `evalCast()` to lazily cast the loaded value into that type. 
```lang=C++
void LValueToRValueBitCast_dumps(void *p, char (*array)[8]) {
  clang_analyzer_dump(p);
  // remained: &SymRegion{reg_$0<void * p>}
  clang_analyzer_dump(array);
  // remained: {{&SymRegion{reg_$1<char (*)[8] array>}
  clang_analyzer_dump((unsigned long)p);
  // remained: {{&SymRegion{reg_$0<void * p>} [as 64 bit integer]}}
  clang_analyzer_dump(__builtin_bit_cast(unsigned long, p)); <--------- change #1
  // previously: {{&SymRegion{reg_$0<void * p>}}}
  // now: {{&SymRegion{reg_$0<void * p>} [as 64 bit integer]}}
  clang_analyzer_dump((unsigned long)array);
  // remained: {{&SymRegion{reg_$1<char (*)[8] array>} [as 64 bit integer]}}
  clang_analyzer_dump(__builtin_bit_cast(unsigned long, array)); <--------- change #2
  // previously: {{&SymRegion{reg_$1<char (*)[8] array>}}}
  // now: {{&SymRegion{reg_$1<char (*)[8] array>} [as 64 bit integer]}}
}
```
Reviewed By: xazax.hun Differential Revision: https://reviews.llvm.org/D136603	2022-11-23 15:52:11 +01:00
Vaibhav Yenamandra	7b6fe711b2	Refactor StaticAnalyzer to use `clang::SarifDocumentWriter` Refactor StaticAnalyzer to use clang::SarifDocumentWriter for serializing sarif diagnostics. Uses clang::SarifDocumentWriter to generate SARIF output in the StaticAnalyzer. Various bugfixes are also made to clang::SarifDocumentWriter. Summary of changes: clang/lib/Basic/Sarif.cpp: * Fix bug in adjustColumnPos introduced from prev move, it now uses FullSourceLoc::getDecomposedExpansionLoc which provides the correct location (in the presence of macros) instead of FullSourceLoc::getDecomposedLoc. * Fix createTextRegion so that it handles caret ranges correctly, this should bring it to parity with the previous implementation. clang/test/Analysis/diagnostics/Inputs/expected-sarif: * Update the schema URL to the official website * Add the emitted defaultConfiguration sections to all rules * Annotate results with the "level" property clang/lib/StaticAnalyzer/Core/SarifDiagnostics.cpp: * Update SarifDiagnostics class to hold a clang::SarifDocumentWriter that it uses to convert diagnostics to SARIF.	2022-11-17 14:47:02 -05:00
Tomasz Kamiński	2fb3bec932	[analyzer] Fix crash for array-delete of UnknownVal values. We now skip the destruction of array elements for `delete[] p`, if the value of `p` is UnknownVal and does not have a corresponding region. This eliminates the crash in `getDynamicElementCount` on that region and matches the behavior for deleting the array of non-constant range. Reviewed By: isuckatcs Differential Revision: https://reviews.llvm.org/D136671	2022-11-09 15:06:46 +01:00
Rageking8	94738a5ac3	Fix duplicate word typos; NFC This revision fixes typos where there are 2 consecutive words which are duplicated. There should be no code changes in this revision (only changes to comments and docs). Do let me know if there are any undesirable changes in this revision. Thanks.	2022-11-08 07:21:23 -05:00
Nathan James	108e41d962	[clang][NFC] Use c++17 style variable type traits This was done as a test for D137302 and it makes sense to push these changes Reviewed By: shafik Differential Revision: https://reviews.llvm.org/D137491	2022-11-07 18:25:48 +00:00
Jennifer Yu	ea64e66f7b	[OPENMP]Initial support for error directive. Differential Revision: https://reviews.llvm.org/D137209	2022-11-02 14:25:28 -07:00
Bill Wendling	7f93ae8086	[clang] Implement -fstrict-flex-arrays=3 The -fstrict-flex-arrays=3 is the most restrictive type of flex arrays. No number, including 0, is allowed in the FAM. In the cases where a "0" is used, the resulting size is the same as if a zero-sized object were substituted. This is needed for proper _FORTIFY_SOURCE coverage in the Linux kernel, among other reasons. So while the only reason for specifying a zero-length array at the end of a structure is to specify a FAM, treating it as such will cause _FORTIFY_SOURCE not to work correctly; __builtin_object_size will report -1 instead of 0 for a destination buffer size to keep any kernel internals from using the deprecated members as fake FAMs. For example: struct broken { int foo; int fake_fam[0]; struct something oops; }; There have been bugs where the above struct was created because "oops" was added after "fake_fam" by someone not realizing. Under _FORTIFY_SOURCE, doing: memcpy(p->fake_fam, src, len); raises no warnings when __builtin_object_size(p->fake_fam, 1) returns -1 and may stomp on "oops." Omitting a warning when using the (invalid) zero-length array is how GCC treats -fstrict-flex-arrays=3. A warning in that situation is likely an irritant, because requesting this option level is explicitly requesting this behavior. Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836 Differential Revision: https://reviews.llvm.org/D134902	2022-10-27 10:50:04 -07:00
Kristóf Umann	a504ddc8bf	[analyzer] Initialize regions returned by CXXNew to undefined Discourse mail: https://discourse.llvm.org/t/analyzer-why-do-we-suck-at-modeling-c-dynamic-memory/65667 malloc() returns a piece of uninitialized dynamic memory. We were (almost) always able to model this behaviour. Its C++ counterpart, operator new is a lot more complex, because it allows for initialization, the most complicated of which is the usage of constructors. We gradually became better in modeling constructors, but for some reason, most likely for reasons lost in history, we never actually modeled the case when the memory returned by operator new was just simply uninitialized. This patch (attempts) to fix this tiny little error. Differential Revision: https://reviews.llvm.org/D135375	2022-10-26 17:22:12 +02:00
Gabor Marton	82a50812f7	[analyzer][StdLibraryFunctionsChecker] Add NoteTags for applied arg constraints In this patch I add a new NoteTag for each applied argument constraint. This way, any other checker that reports a bug - where the applied constraint is relevant - will display the corresponding note. With this change we provide more information for the users to understand some bug reports easier. Differential Revision: https://reviews.llvm.org/D101526 Reviewed By: NoQ	2022-10-26 16:33:25 +02:00
Balazs Benics	aa12a48c82	[analyzer] Fix assertion failure with conflicting prototype calls It turns out we can reach the `Init.castAs<nonloc::CompoundVal>()` expression with other kinds of SVals. Such as by `nonloc::ConcreteInt` in this example: https://godbolt.org/z/s4fdxrcs9 ```lang=C++ int buffer[10]; void b(); void top() { b(&buffer); } void b(int *c) { *c = 42; // would crash } ``` In this example, we try to store `42` to the `Elem{buffer, 0}`. This situation can appear if the CallExpr refers to a function declaration without prototype. In such cases, the engine will pick the redecl of the referred function decl which has function body, hence has a function prototype. This weird situation will have an interesting effect on the AST, such as the argument at the callsite will miss a cast, which would cast the `int (*)[10]` expression into `int *`, which means that when we evaluate the `*c = 42` expression, we want to bind `42` to an array, causing the crash. Look at the AST of the callsite with and without the function prototype: https://godbolt.org/z/Gncebcbdb The only difference is that without the proper function prototype, we will not have the `ImplicitCastExpr` `BitCasting` from `int (*)[10]` to `int *` to match the expected type of the parameter declaration. In this patch, I'm proposing to emit a cast in the mentioned edge-case, to bind the argument value of the expected type to the parameter. I'm only proposing this if the runtime definition has exactly the same number of parameters as the callsite feeds it by arguments. If that's not the case, I believe, we are better off by binding `Unknown` to those parameters. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D136162	2022-10-26 11:27:01 +02:00
Tomasz Kamiński	6194229c62	[analyzer] Make directly bounded LazyCompoundVal as lazily copied Previously, `LazyCompoundVal` bindings to subregions referred by `LazyCompoundVals`, were not marked as //lazily copied//. This change returns `LazyCompoundVals` from `getInterestingValues()`, so their regions can be marked as //lazily copied// in `RemoveDeadBindingsWorker::VisitBinding()`. Depends on D134947 Authored by: Tomasz Kamiński <tomasz.kamiński@sonarsource.com> Reviewed By: martong Differential Revision: https://reviews.llvm.org/D135136	2022-10-19 16:06:32 +02:00
Tomasz Kamiński	a6b42040ad	[analyzer] Fix the liveness of Symbols for values in regions referred by LazyCompoundVal To illustrate our current understanding, let's start with the following program: https://godbolt.org/z/33f6vheh1 ```lang=c++ void clang_analyzer_printState(); struct C { int x; int y; int more_padding; }; struct D { C c; int z; }; C foo(D d, int new_x, int new_y) { d.c.x = new_x; // B1 assert(d.c.x < 13); // C1 C c = d.c; // L assert(d.c.y < 10); // C2 assert(d.z < 5); // C3 d.c.y = new_y; // B2 assert(d.c.y < 10); // C4 return c; // R } ``` In the code, we create a few bindings to subregions of root region `d` (`B1`, `B2`), constraints on the values (`C1`, `C2`, …), and create a `lazyCompoundVal` for the part of the region `d` at point `L`, which is returned at point `R`. Now, the question is which of these should remain live as long as the return value of the `foo` call is live. In a perfect world we should preserve: # only the bindings of the subregions of `d.c`, which were created before the copy at `L`. In our example, this includes `B1`, and not `B2`. In other words, `new_x` should be live but `new_y` shouldn’t. # constraints on the values of `d.c`, that are reachable through `c`. This can be created both before the point of making the copy (`L`) or after. In our case, that would be `C1` and `C2`. But not `C3` (`d.z` value is not reachable through `c`) and `C4` (the original value of `d.c.y` was overridden at `B2` after the creation of `c`). The current code in the `RegionStore` covers the use case (1), by using the `getInterestingValues()` to extract bindings to parts of the referred region present in the store at the point of copy. This also partially covers point (2), in case when constraints are applied to a location that has binding at the point of the copy (in our case `d.c.x` in `C1` that has value `new_x`), but it fails to preserve the constraints that require creating a new symbol for location (`d.c.y` in `C2`). 
We introduce the concept of //lazily copied// locations (regions) to the `SymbolReaper`, i.e. for which a program can access the value stored at that location, but not its address. These locations are constructed as a set of regions referred to by `lazyCompoundVal`. A //readable// location (region) is a location that is //live// or //lazily copied//. And symbols that refer to values in regions are alive if the region is //readable//. For simplicity, we follow the current approach to live regions and mark the base region as //lazily copied//, and consider any subregions as //readable//. This makes some symbols falsely live (`d.z` in our example) and keeps the corresponding constraints alive. The rename of `Regions` to `LiveRegions` inside `RegionStore` is an NFC change, done to make clear the difference between the regions stored in these two sets. Regression Test: https://reviews.llvm.org/D134941 Co-authored-by: Balazs Benics <benicsbalazs@gmail.com> Reviewed By: martong, xazax.hun Differential Revision: https://reviews.llvm.org/D134947	2022-10-19 16:06:32 +02:00
Kazu Hirata	08901c8a98	[clang] Use llvm::reverse (NFC)	2022-10-15 21:54:13 -07:00
Matheus Izvekov	bcd9ba2b7e	[clang] Track the templated entity in type substitution. This is a change to how we represent type substitution in the AST. Instead of only storing the replaced type, we track the templated entity we are substituting, plus an index. We modify MLTAL to track the templated entity at each level. Otherwise, it's much more expensive to go from the template parameter back to the templated entity, and not possible to do in some cases, as when we instantiate outer templates, parameters might still reference the original entity. This also allows us to very cheaply lookup the templated entity we saw in the naming context and find the corresponding argument it was replaced from, such as for implementing template specialization resugaring. Signed-off-by: Matheus Izvekov <mizvekov@gmail.com> Differential Revision: https://reviews.llvm.org/D131858	2022-10-15 22:08:36 +02:00
Balazs Benics	b062ee7dc4	[analyzer] Workaround crash on encountering Class non-type template parameters The Clang Static Analyzer will crash on this code: ```lang=C++ struct Box { int value; }; template <Box V> int get() { return V.value; } template int get<Box{-1}>(); ``` https://godbolt.org/z/5Yb1sMMMb The problem is that we don't account for encountering `TemplateParamObjectDecl`s within the `DeclRefExpr` handler in the `ExprEngine`. IMO we should create a new memregion for representing such template param objects, to model their language semantics. Such as: - it should have global static storage - for two identical values, their addresses should be identical as well http://eel.is/c%2B%2Bdraft/temp.param#8 I was thinking of introducing a `TemplateParamObjectRegion` under `DeclRegion` for this purpose. It could have `TemplateParamObjectDecl` as a field. The `TemplateParamObjectDecl::getValue()` returns `APValue`, which might represent multiple levels of structures, unions and other goodies - making the transformation from `APValue` to `SVal` a bit complicated. That being said, for now, I think having `Unknowns` for such cases is definitely an improvement to crashing, hence I'm proposing this patch. Reviewed By: xazax.hun Differential Revision: https://reviews.llvm.org/D135763	2022-10-13 08:41:31 +02:00
Arseniy Zaostrovnykh	ec6da3fb9d	Fix false positive related to handling of [[noreturn]] function pointers Before this change, the `NoReturnFunctionChecker` was missing function pointers with a `[[noreturn]]` attribute, while `CFG` was constructed taking that into account, which leads CSA to take impossible paths. The reason was that the `NoReturnFunctionChecker` was looking for the attribute in the type of the entire call expression rather than the type of the function being called. This change makes the `[[noreturn]]` attribute of a function pointer visible to `NoReturnFunctionChecker`. This leads to a more coherent behavior of the CSA on the AST involving. Reviewed By: xazax.hun Differential Revision: https://reviews.llvm.org/D135682	2022-10-12 14:46:32 +02:00
Soumi Manna	3b652fc6d6	[analyzer] Fix static code analysis concerns ProcessMemberDtor(), ProcessDeleteDtor(), and ProcessAutomaticObjDtor(): Fix static analyzer warnings with suspicious dereference of pointer 'Pred' in function call before NULL checks - NFCI Differential Revision: https://reviews.llvm.org/D135290	2022-10-07 16:58:37 +02:00
Bill Wendling	7404b855e5	[clang][NFC] Use enum for -fstrict-flex-arrays Use enums for the strict flex arrays flag so that it's more readable. Differential Revision: https://reviews.llvm.org/D135107	2022-10-06 10:45:41 -07:00
Argyrios Kyrtzidis	371883f46d	[clang/Sema] Fix non-deterministic order for certain kind of diagnostics In the context of caching clang invocations it is important to emit diagnostics in deterministic order; the same clang invocation should result in the same diagnostic output. rdar://100336989 Differential Revision: https://reviews.llvm.org/D135118	2022-10-05 12:58:01 -07:00
Tomasz Kamiński	4ff836a138	[analyzer] Pass correct bldrCtx to computeObjectUnderConstruction In case when the prvalue is returned from the function (kind is one of `SimpleReturnedValueKind`, `CXX17ElidedCopyReturnedValueKind`), then its construction happens in the context of the caller. We pass `BldrCtx` explicitly, as `currBldrCtx` will always refer to callee context. In the following example: ``` struct Result {int value; }; Result create() { return Result{10}; } int accessValue(Result r) { return r.value; } void test() { for (int i = 0; i < 2; ++i) accessValue(create()); } ``` In case when the returned object was constructed directly into the argument to a function call `accessValue(create())`, this led to inappropriate value of `blockCount` being used to locate parameter region, and as a consequence resulting object (from `create()`) was constructed into a different region, that was later read by inlined invocation of outer function (`accessValue`). This manifests itself only in case when calling block is visited more than once (loop in above example), as otherwise there is no difference in `blockCount` value between callee and caller context. This happens only in case when copy elision is disabled (before C++17). Reviewed By: NoQ Differential Revision: https://reviews.llvm.org/D132030	2022-09-26 11:39:10 +02:00
Jan Korous	85d97aac80	[analyzer] Support implicit parameter 'self' in path note showBRParamDiagnostics assumed stores happen only via function parameters while that can also happen via implicit parameters like 'self' or 'this'. The regression test caused a failed assert in the original cast to ParmVarDecl. Differential Revision: https://reviews.llvm.org/D133815	2022-09-21 17:26:09 -07:00
isuckatcs	6931d311ea	[analyzer] Cleanup some artifacts from non-POD array evaluation Most of the state traits used for non-POD array evaluation were only cleaned up if the ctors/dtors were inlined, since the cleanup happened in ExprEngine::processCallExit(). This patch makes sure they are removed even if said functions are not inlined. Differential Revision: https://reviews.llvm.org/D133643	2022-09-17 22:46:27 +02:00
Kazu Hirata	8009d236e5	[clang] Don't include SetVector.h (NFC)	2022-09-17 13:36:13 -07:00
Balazs Benics	7cddf9cad1	[analyzer] Dump the environment entry kind as well By this change the `exploded-graph-rewriter` will display the class kind of the expression of the environment entry. It makes easier to decide if the given entry corresponds to the lvalue or to the rvalue of some expression. It turns out the rewriter already had support for visualizing it, but probably was never actually used? Reviewed By: martong Differential Revision: https://reviews.llvm.org/D132109	2022-09-13 09:04:27 +02:00
Balazs Benics	afcd862b2e	[analyzer] LazyCompoundVals should be always bound as default bindings `LazyCompoundVals` should only appear as `default` bindings in the store. This fixes the second case in this patch-stack. Depends on: D132142 Reviewed By: xazax.hun Differential Revision: https://reviews.llvm.org/D132143	2022-09-13 08:58:46 +02:00
Balazs Benics	f8643a9b31	[analyzer] Prefer wrapping SymbolicRegions by ElementRegions It turns out that in certain cases `SymbolRegions` are wrapped by `ElementRegions`; in others, it's not. This discrepancy can cause the analyzer not to recognize if the two regions are actually referring to the same entity, which then can lead to unreachable paths discovered. Consider this example: ```lang=C++ struct Node { int* ptr; }; void with_structs(Node* n1) { Node c = *n1; // copy Node* n2 = &c; clang_analyzer_dump(*n1); // lazy... clang_analyzer_dump(*n2); // lazy... clang_analyzer_dump(n1->ptr); // rval(n1->ptr): reg_$2<int * SymRegion{reg_$0<struct Node * n1>}.ptr> clang_analyzer_dump(n2->ptr); // rval(n2->ptr): reg_$1<int * Element{SymRegion{reg_$0<struct Node * n1>},0 S64b,struct Node}.ptr> clang_analyzer_eval(n1->ptr != n2->ptr); // UNKNOWN, bad! (void)(*n1); (void)(*n2); } ``` The copy of `n1` will insert a new binding to the store; but for doing that it actually must create a `TypedValueRegion` which it could pass to the `LazyCompoundVal`. Since the memregion in question is a `SymbolicRegion` - which is untyped, it needs to first wrap it into an `ElementRegion` basically implementing this untyped -> typed conversion for the sake of passing it to the `LazyCompoundVal`. So, this is why we have `Element{SymRegion{.}, 0,struct Node}` for `n1`. The problem appears if the analyzer evaluates a read from the expression `n1->ptr`. The same logic won't apply for `SymbolRegionValues`, since they accept raw `SubRegions`, hence the `SymbolicRegion` won't be wrapped into an `ElementRegion` in that case. Later when we arrive at the equality comparison, we cannot prove that they are equal. For more details check the corresponding thread on discourse: https://discourse.llvm.org/t/are-symbolicregions-really-untyped/64406 --- In this patch, I'm eagerly wrapping each `SymbolicRegion` by an `ElementRegion`; basically canonicalizing to this form. 
It seems reasonable to do so since any object can be thought of as a single array of that object; so this should not make much of a difference. The tests also underpin this assumption, as only a few were broken by this change; and actually fixed a FIXME along the way. About the second example, which does the same copy operation - but on the heap - it will be fixed by the next patch. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D132142	2022-09-13 08:58:46 +02:00
isuckatcs	a11e51e91f	[analyzer] Track trivial copy/move constructors and initializer lists in the BugReporter If an object has a trivial copy/move constructor, it's not inlined on invocation but a trivial copy is performed instead. This patch handles trivial copies in the bug reporter by matching the field regions of the 2 objects involved in the copy/move construction, and tracking the appropriate region further. This patch also introduces some support for tracking values in initializer lists. Differential Revision: https://reviews.llvm.org/D131262	2022-09-05 17:06:27 +02:00
isuckatcs	a46154cb1c	[analyzer] Warn if the size of the array in `new[]` is undefined This patch introduces a new checker, called NewArraySize checker, which detects if the expression that yields the element count of the array in new[], results in an Undefined value. Differential Revision: https://reviews.llvm.org/D131299	2022-09-04 23:06:58 +02:00
Kazu Hirata	b7a7aeee90	[clang] Qualify auto in range-based for loops (NFC)	2022-09-03 23:27:27 -07:00
isuckatcs	b5147937b2	[analyzer] Add more information to the Exploded Graph This patch dumps every state trait in the egraph. Also the empty state traits are no longer dumped, instead they are treated as null by the egraph rewriter script, which solves reverse compatibility issues. Differential Revision: https://reviews.llvm.org/D131187	2022-09-03 00:21:05 +02:00
Balázs Kéri	d56a1c6824	[clang][analyzer] Errno modeling code refactor (NFC). Some of the code used in StdLibraryFunctionsChecker is applicable to other checkers, this is put into common functions. Errno related parts of the checker are simplified and renamed. Documentations in errno_modeling functions are updated. This change makes it available to have more checkers that perform modeling of some standard functions. These can set the errno state with common functions and the bug report messages (note tags) can look similar. Reviewed By: steakhal, martong Differential Revision: https://reviews.llvm.org/D131879	2022-09-01 09:05:59 +02:00
Martin Storsjö	efc76a1ac5	[analyzer] Silence GCC warnings about unused variables. NFC. Use `isa<T>()` instead of `Type *Var = dyn_cast<T>()` when the result of the cast isn't used.	2022-08-29 13:26:13 +03:00
ziqingluo-90	a5e354ec4d	[analyzer] Fixing a bug raising false positives of stack block object leaking in ARC mode When ARC (automatic reference count) is enabled, (objective-c) block objects are automatically retained and released thus they do not leak. Without ARC, they still can leak from an expiring stack frame like other stack variables. With this commit, the static analyzer now puts a block object in an "unknown" region if ARC is enabled because it is up to the implementation to choose whether to put the object on stack initially (then move to heap when needed) or in heap directly under ARC. Therefore, the `StackAddrEscapeChecker` has no need to know specifically about ARC at all and it will not report errors on objects in "unknown" regions. Reviewed By: NoQ (Artem Dergachev) Differential Revision: https://reviews.llvm.org/D131009	2022-08-26 12:19:32 -07:00
isuckatcs	e3e9082b01	[analyzer] Fix for incorrect handling of 0 length non-POD array construction Prior to this patch when the analyzer encountered a non-POD 0 length array, it still invoked the constructor for 1 element, which led to false positives. This patch makes sure that we no longer construct any elements when we see a 0 length array. Differential Revision: https://reviews.llvm.org/D131501	2022-08-25 12:42:02 +02:00
isuckatcs	aac73a31ad	[analyzer] Process non-POD array element destructors The constructors of non-POD array elements are evaluated under certain conditions. This patch makes sure that in such cases we also evaluate the destructors. Differential Revision: https://reviews.llvm.org/D130737	2022-08-24 01:28:21 +02:00
Fred Tingaud	16cb3be626	[analyzer] Deadstore static analysis: Fix false positive on C++17 assignments Dead store detection automatically checks that an expression is a CXXConstructor and skips it because of potential side effects. In C++17, with guaranteed copy elision, this check can fail because we actually receive the implicit cast of a CXXConstructor. Most checks in the dead store analysis were already stripping all casts and parenthesis and those that weren't were either forgotten (like the constructor) or would not suffer from it, so this patch proposes to factorize the stripping. It has an impact on where the dead store warning is reported in the case of an explicit cast, from auto a = static_cast<B>(A()); ^~~~~~~~~~~~~~~~~~~ to auto a = static_cast<B>(A()); ^~~ which we think is an improvement. Patch By: frederic-tingaud-sonarsource Reviewed By: steakhal Differential Revision: https://reviews.llvm.org/D126534	2022-08-23 18:33:26 +02:00
isuckatcs	c81bf940c7	[analyzer] Handling non-POD multidimensional arrays in ArrayInitLoopExpr This patch makes it possible for lambdas, implicit copy/move ctors and structured bindings to handle non-POD multidimensional arrays. Differential Revision: https://reviews.llvm.org/D131840	2022-08-22 13:53:53 +02:00
isuckatcs	3c482632e6	[analyzer] Remove pattern matching of lambda capture initializers Prior to this patch we handled lambda captures based on their initializer expression, which resulted in pattern matching. With C++17 copy elision the initializer expression can be anything, and this approach proved to be fragile and a source of crashes. This patch removes pattern matching and only checks whether the object is under construction or not. Differential Revision: https://reviews.llvm.org/D131944	2022-08-22 13:00:31 +02:00
isuckatcs	a47ec1b797	[analyzer][NFC] Be more descriptive when we replay without inlining This patch adds a ProgramPointTag to the EpsilonPoint created before we replay a call without inlining. Differential Revision: https://reviews.llvm.org/D132246	2022-08-19 18:05:52 +02:00
isuckatcs	b4e3e3a3eb	[analyzer] Fix a crash on copy elided initialized lambda captures Inside `ExprEngine::VisitLambdaExpr()` we weren't prepared for a copy elided initialized capture's `InitExpr`. This patch teaches the analyzer how to handle such a situation. Differential Revision: https://reviews.llvm.org/D131784	2022-08-13 00:22:01 +02:00
Denys Petrov	adcd4b1c0b	[analyzer] [NFC] Fix comments into more regular form.	2022-08-11 21:28:23 +03:00
malavikasamak	c74a204826	[analyzer] Fix false positive in use-after-move checker Differential Revision: https://reviews.llvm.org/D131525	2022-08-09 17:26:30 -07:00
Fangrui Song	32197830ef	[clang][clang-tools-extra] LLVM_NODISCARD => [[nodiscard]]. NFC	2022-08-09 07:11:18 +00:00
Fangrui Song	3f18f7c007	[clang] LLVM_FALLTHROUGH => [[fallthrough]]. NFC With C++17 there is no Clang pedantic warning or MSVC C5051. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D131346	2022-08-08 09:12:46 -07:00
Balázs Kéri	501faaa0d6	[clang][analyzer] Add more wide-character functions to CStringChecker Support for functions wmempcpy, wmemmove, wmemcmp is added to the checker. The same tests are copied that exist for the non-wide versions, with non-wide functions and character types changed to the wide version. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D130470	2022-08-05 10:32:53 +02:00
Corentin Jabot	127bf44385	[Clang][C++20] Support capturing structured bindings in lambdas This completes the implementation of P1091R3 and P1381R1. This patch allows the capture of structured bindings both for C++20+ and C++17, with extension/compat warning. In addition, capturing an anonymous union member, a bitfield, or a structured binding thereof now has a better diagnostic. We only support structured bindings - as opposed to other kinds of structured statements/blocks. We still emit an error for those. In addition, support for structured bindings capture is entirely disabled in OpenMP mode as this needs more investigation - a specific diagnostic indicates the feature is not yet supported there. Note that the rest of P1091R3 (static/thread_local structured bindings) was already implemented. At the request of @shafik, I can confirm the correct behavior of lldb with this change. Fixes https://github.com/llvm/llvm-project/issues/54300 Fixes https://github.com/llvm/llvm-project/issues/52720 Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D122768	2022-08-04 10:12:53 +02:00
Corentin Jabot	a274219600	Revert "[Clang][C++20] Support capturing structured bindings in lambdas" This reverts commit `44f2baa380`. Breaks self builds and seems to have conformance issues.	2022-08-03 21:00:29 +02:00
Corentin Jabot	44f2baa380	[Clang][C++20] Support capturing structured bindings in lambdas This completes the implementation of P1091R3 and P1381R1. This patch allows the capture of structured bindings both for C++20+ and C++17, with extension/compat warning. In addition, capturing an anonymous union member, a bitfield, or a structured binding thereof now has a better diagnostic. We only support structured bindings - as opposed to other kinds of structured statements/blocks. We still emit an error for those. In addition, support for structured bindings capture is entirely disabled in OpenMP mode as this needs more investigation - a specific diagnostic indicates the feature is not yet supported there. Note that the rest of P1091R3 (static/thread_local structured bindings) was already implemented. At the request of @shafik, I can confirm the correct behavior of lldb with this change. Fixes https://github.com/llvm/llvm-project/issues/54300 Fixes https://github.com/llvm/llvm-project/issues/52720 Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D122768	2022-08-03 20:00:01 +02:00
isuckatcs	10a7ee0bac	[analyzer] Fix for the crash in #56873 In ExprEngine::bindReturnValue() we cast an SVal to DefinedOrUnknownSVal, however this SVal can also be Undefined, which leads to an assertion failure. Fixes: #56873 Differential Revision: https://reviews.llvm.org/D130974	2022-08-03 19:25:02 +02:00
Gabriel Ravier	5674a3c880	Fixed a number of typos I went over the output of the following mess of a command: (ulimit -m 2000000; ulimit -v 2000000; git ls-files -z | parallel --xargs -0 cat | aspell list --mode=none --ignore-case | grep -E '^[A-Za-z][a-z]*$' | sort | uniq -c | sort -n | grep -vE '.{25}' | aspell pipe -W3 | grep : | cut -d' ' -f2 | less) and proceeded to spend a few days looking at it to find probable typos and fixed a few hundred of them in all of the llvm project (note, the ones I found are not anywhere near all of them, but it seems like a good start). Differential Revision: https://reviews.llvm.org/D130827	2022-08-01 13:13:18 -04:00
Matheus Izvekov	15f3cd6bfc	[clang] Implement ElaboratedType sugaring for types written bare Without this patch, clang will not wrap in an ElaboratedType node types written without a keyword and nested name qualifier, which goes against the intent that we should produce an AST which retains enough details to recover how things are written. The lack of this sugar is incompatible with the intent of the type printer default policy, which is to print types as written, but to fall back and print them fully qualified when they are desugared. An ElaboratedTypeLoc without keyword / NNS uses no storage by itself, but still requires pointer alignment due to a pre-existing bug in the TypeLoc buffer handling. --- Troubleshooting list to deal with any breakage seen with this patch: 1) The most likely effect one would see by this patch is a change in how a type is printed. The type printer will, by design and default, print types as written. There are customization options there, but not that many, and they mainly apply to how to print a type that we somehow failed to track how it was written. This patch fixes a problem where we failed to distinguish between a type that was written without any elaborated-type qualifiers, such as 'struct'/'class' tags and name specifiers such as 'std::', and one that has been stripped of any 'metadata' that identifies such, the so-called canonical types. Example: ``` namespace foo { struct A {}; A a; }; ``` If one were to print the type of `foo::a`, prior to this patch, this would result in `foo::A`. This is how the type printer would have, by default, printed the canonical type of A as well. As soon as you add any name qualifiers to A, the type printer would suddenly start accurately printing the type as written. This patch will make it print it accurately even when written without qualifiers, so we will just print `A` for the initial example, as the user did not really write that `foo::` namespace qualifier. 
2) This patch could expose a bug in some AST matcher. Matching types is harder to get right when there is sugar involved. For example, if you want to match a type against being a pointer to some type A, then you have to account for getting a type that is sugar for a pointer to A, or being a pointer to sugar to A, or both! Usually you would get the second part wrong, and this would work for a very simple test where you don't use any name qualifiers, but you would discover it is broken when you do. The usual fix is to either use the matcher which strips sugar, which is annoying to use as for example if you match an N level pointer, you have to put N+1 such matchers in there, beginning to end and between all those levels. But in a lot of cases, if the property you want to match is present in the canonical type, it's easier and faster to just match on that... This goes with what is said in 1), if you want to match against the name of a type, and you want the name string to be something stable, perhaps matching on the name of the canonical type is the better choice. 3) This patch could expose a bug in how you get the source range of some TypeLoc. For some reason, a lot of code is using getLocalSourceRange(), which only looks at the given TypeLoc node. This patch introduces a new, and more common TypeLoc node which contains no source locations on itself. This is not an innovation here, and some other, more rare TypeLoc nodes could also have this property, but if you use getLocalSourceRange on them, it's not going to return any valid locations, because it doesn't have any. The right fix here is to always use getSourceRange() or getBeginLoc/getEndLoc which will dive into the inner TypeLoc to get the source range if it doesn't find it on the top level one. You can use getLocalSourceRange if you are really into micro-optimizations and you have some outside knowledge that the TypeLocs you are dealing with will always include some source location. 
4) Exposed a bug somewhere in the use of the normal clang type class API, where you have some type, you want to see if that type is some particular kind, you try a `dyn_cast` such as `dyn_cast<TypedefType>` and that fails because now you have an ElaboratedType which has a TypedefType inside of it, which is what you wanted to match. Again, like 2), this would usually have been tested poorly with some simple tests with no qualifications, and would have been broken had there been any other kind of type sugar, be it an ElaboratedType or a TemplateSpecializationType or a SubstTemplateParmType. The usual fix here is to use `getAs` instead of `dyn_cast`, which will look deeper into the type. Or use `getAsAdjusted` when dealing with TypeLocs. For some reason the API is inconsistent there and on TypeLocs getAs behaves like a dyn_cast. 5) It could be a bug in this patch perhaps. Let me know if you need any help! Signed-off-by: Matheus Izvekov <mizvekov@gmail.com> Differential Revision: https://reviews.llvm.org/D112374	2022-07-27 11:10:54 +02:00
Chuanqi Xu	5588985212	[NFC] Convert a dyn_cast<> to an isa<>	2022-07-27 13:56:38 +08:00
Balazs Benics	a80418eec0	[analyzer] Improve loads from reinterpret-cast fields Consider this example: ```lang=C++ struct header { unsigned a : 1; unsigned b : 1; }; struct parse_t { unsigned bits0 : 1; unsigned bits2 : 2; // <-- header unsigned bits4 : 4; }; int parse(parse_t *p) { unsigned copy = p->bits2; clang_analyzer_dump(copy); // expected-warning@-1 {{reg_$1<unsigned int SymRegion{reg_$0<struct Bug_55934::parse_t * p>}.bits2>}} header *bits = (header *)&copy; clang_analyzer_dump(bits->b); // <--- Was UndefinedVal previously. // expected-warning@-1 {{derived_$2{reg_$1<unsigned int SymRegion{reg_$0<struct Bug_55934::parse_t * p>}.bits2>,Element{copy,0 S64b,struct Bug_55934::header}.b}}} return bits->b; // no-warning: it's not UndefinedVal } ``` `bits->b` should have the same content as the second bit of `p->bits2` (assuming that the bitfields are in spelling order). --- The `Store` has the correct bindings. The problem is with the load of `bits->b`. It will eventually reach `RegionStoreManager::getBindingForField()` with `Element{copy,0 S64b,struct header}.b`, which is a `FieldRegion`. It did not find any direct bindings, so the `getBindingForFieldOrElementCommon()` gets called. That won't find any bindings, but it sees that the variable is on the //stack//, thus it must be an uninitialized local variable; thus it returns `UndefinedVal`. Instead of doing this, it should have created a //derived symbol// representing the slice of the region corresponding to the member. So, if the value of `copy` is `reg1`, then the value of `bits->b` should be `derived{reg1, elem{copy,0, header}.b}`. Actually, the `getBindingForElement()` already does exactly this for reinterpret-casts, so I decided to hoist that and reuse the logic. Fixes #55934 Reviewed By: martong Differential Revision: https://reviews.llvm.org/D128535	2022-07-26 12:31:21 +02:00
Benjamin Kramer	ad17e69923	[analyzer] Fix unused variable warning in release builds. NFC.	2022-07-26 11:29:38 +02:00
David Spickett	f3fbbe1cf3	[clang][analyzer][NFC] Use value_or instead of ValueOr The latter is deprecated.	2022-07-26 09:16:45 +00:00
isuckatcs	a618d5e0dd	[analyzer] Structured binding to tuple-like types Introducing support for creating structured binding to tuple-like types. Differential Revision: https://reviews.llvm.org/D128837	2022-07-26 10:24:29 +02:00
isuckatcs	996b092c5e	[analyzer] Lambda capture non-POD type array This patch introduces a new `ConstructionContext` for lambda capture. This `ConstructionContext` allows the analyzer to construct the captured object directly into its final region, and makes it possible to capture non-POD arrays. Differential Revision: https://reviews.llvm.org/D129967	2022-07-26 09:40:25 +02:00
isuckatcs	8a13326d18	[analyzer] ArrayInitLoopExpr with array of non-POD type This patch introduces the evaluation of ArrayInitLoopExpr in case of structured bindings and implicit copy/move constructor. The idea is to call the copy constructor for every element in the array. The parameter of the copy constructor is also manually selected, as it is not a part of the CFG. Differential Revision: https://reviews.llvm.org/D129496	2022-07-26 09:07:22 +02:00
Kazu Hirata	3f3930a451	Remove redundant virtual specifiers (NFC) Identified with tidy-modernize-use-override.	2022-07-25 23:00:59 -07:00
Kazu Hirata	ae002f8bca	Use isa instead of dyn_cast (NFC)	2022-07-25 23:00:58 -07:00
Balázs Kéri	94ca2beccc	[clang][analyzer] Added partial wide character support to CStringChecker Support for functions wmemcpy, wcslen, wcsnlen is added to the checker. Documentation and tests are updated and extended with the new functions. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D130091	2022-07-25 09:23:14 +02:00
Kazu Hirata	95a932fb15	Remove redundant override specifiers (NFC) Identified with modernize-use-override.	2022-07-24 22:28:11 -07:00
Kazu Hirata	a210f404da	[clang] Remove redundant virtual specifiers (NFC) Identified with modernize-use-override.	2022-07-24 22:02:58 -07:00
Kazu Hirata	9e88cbcc40	Use any_of (NFC)	2022-07-24 14:48:11 -07:00
Denys Petrov	a364987368	[analyzer][NFC] Use `SValVisitor` instead of explicit helper functions Summary: Get rid of explicit function splitting in favor of specifically designed Visitor. Move logic from a family of `evalCastKind` and `evalCastSubKind` helper functions to `SValVisitor`. Differential Revision: https://reviews.llvm.org/D130029	2022-07-19 23:10:00 +03:00
serge-sans-paille	f764dc99b3	[clang] Introduce -fstrict-flex-arrays=<n> for stricter handling of flexible arrays Some code [0] considers that trailing arrays are flexible, whatever their size. Support for this legacy code was introduced in `f8f6324983`, but it prevents evaluation of __builtin_object_size and __builtin_dynamic_object_size in some legit cases. Introduce -fstrict-flex-arrays=<n> to have stricter conformance when it is desirable. n = 0: current behavior, any trailing array member is a flexible array. The default. n = 1: any trailing array member of undefined, 0 or 1 size is a flexible array member n = 2: any trailing array member of undefined or 0 size is a flexible array member This takes into account two specificities of clang: array bounds as macro id disqualify FAM, as well as non standard layout. A similar patch for gcc is discussed here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836 [0] https://docs.freebsd.org/en/books/developers-handbook/sockets/#sockets-essential-functions	2022-07-18 12:45:52 +02:00
Denys Petrov	bc08c3cb7f	[analyzer] Add new function `clang_analyzer_value` to ExprInspectionChecker Summary: Introduce a new function 'clang_analyzer_value'. It emits a report that in turn prints a RangeSet or APSInt associated with SVal. If there is no associated value, prints "n/a".	2022-07-15 20:07:04 +03:00
Denys Petrov	82f76c0477	[analyzer][NFC] Tidy up handler-functions in SymbolicRangeInferrer Summary: Sorted some handler-functions into more appropriate visitor functions of the SymbolicRangeInferrer. - Spread `getRangeForNegatedSub` body over several visitor functions: `VisitSymExpr`, `VisitSymIntExpr`, `VisitSymSymExpr`. - Moved `getRangeForComparisonSymbol` from `infer` to `VisitSymSymExpr`. Differential Revision: https://reviews.llvm.org/D129678	2022-07-15 19:24:57 +03:00
Fangrui Song	3c849d0aef	Modernize Optional::{getValueOr,hasValue}	2022-07-15 01:20:39 -07:00
Jonas Devlieghere	888673b6e3	Revert "[clang] Implement ElaboratedType sugaring for types written bare" This reverts commit `7c51f02eff` because it still breaks the LLDB tests. This was re-landed without addressing the issue or even agreement on how to address the issue. More details and discussion in https://reviews.llvm.org/D112374.	2022-07-14 21:17:48 -07:00
Matheus Izvekov	7c51f02eff	[clang] Implement ElaboratedType sugaring for types written bare Without this patch, clang will not wrap in an ElaboratedType node types written without a keyword and nested name qualifier, which goes against the intent that we should produce an AST which retains enough details to recover how things are written. The lack of this sugar is incompatible with the intent of the type printer default policy, which is to print types as written, but to fall back and print them fully qualified when they are desugared. An ElaboratedTypeLoc without keyword / NNS uses no storage by itself, but still requires pointer alignment due to a pre-existing bug in the TypeLoc buffer handling. --- Troubleshooting list to deal with any breakage seen with this patch: 1) The most likely effect one would see by this patch is a change in how a type is printed. The type printer will, by design and default, print types as written. There are customization options there, but not that many, and they mainly apply to how to print a type that we somehow failed to track how it was written. This patch fixes a problem where we failed to distinguish between a type that was written without any elaborated-type qualifiers, such as 'struct'/'class' tags and name specifiers such as 'std::', and one that has been stripped of any 'metadata' that identifies such, the so-called canonical types. Example: ``` namespace foo { struct A {}; A a; }; ``` If one were to print the type of `foo::a`, prior to this patch, this would result in `foo::A`. This is how the type printer would have, by default, printed the canonical type of A as well. As soon as you add any name qualifiers to A, the type printer would suddenly start accurately printing the type as written. This patch will make it print it accurately even when written without qualifiers, so we will just print `A` for the initial example, as the user did not really write that `foo::` namespace qualifier. 
2) This patch could expose a bug in some AST matcher. Matching types is harder to get right when there is sugar involved. For example, if you want to match a type against being a pointer to some type A, then you have to account for getting a type that is sugar for a pointer to A, or being a pointer to sugar to A, or both! Usually you would get the second part wrong, and this would work for a very simple test where you don't use any name qualifiers, but you would discover it is broken when you do. The usual fix is to either use the matcher which strips sugar, which is annoying to use as for example if you match an N level pointer, you have to put N+1 such matchers in there, beginning to end and between all those levels. But in a lot of cases, if the property you want to match is present in the canonical type, it's easier and faster to just match on that... This goes with what is said in 1), if you want to match against the name of a type, and you want the name string to be something stable, perhaps matching on the name of the canonical type is the better choice. 3) This patch could expose a bug in how you get the source range of some TypeLoc. For some reason, a lot of code is using getLocalSourceRange(), which only looks at the given TypeLoc node. This patch introduces a new, and more common TypeLoc node which contains no source locations on itself. This is not an innovation here, and some other, more rare TypeLoc nodes could also have this property, but if you use getLocalSourceRange on them, it's not going to return any valid locations, because it doesn't have any. The right fix here is to always use getSourceRange() or getBeginLoc/getEndLoc which will dive into the inner TypeLoc to get the source range if it doesn't find it on the top level one. You can use getLocalSourceRange if you are really into micro-optimizations and you have some outside knowledge that the TypeLocs you are dealing with will always include some source location. 
4) Exposed a bug somewhere in the use of the normal clang type class API, where you have some type, you want to see if that type is some particular kind, you try a `dyn_cast` such as `dyn_cast<TypedefType>` and that fails because now you have an ElaboratedType which has a TypedefType inside of it, which is what you wanted to match. Again, like 2), this would usually have been tested poorly with some simple tests with no qualifications, and would have been broken had there been any other kind of type sugar, be it an ElaboratedType or a TemplateSpecializationType or a SubstTemplateParmType. The usual fix here is to use `getAs` instead of `dyn_cast`, which will look deeper into the type. Or use `getAsAdjusted` when dealing with TypeLocs. For some reason the API is inconsistent there and on TypeLocs getAs behaves like a dyn_cast. 5) It could be a bug in this patch perhaps. Let me know if you need any help! Signed-off-by: Matheus Izvekov <mizvekov@gmail.com> Differential Revision: https://reviews.llvm.org/D112374	2022-07-15 04:16:55 +02:00
isuckatcs	b032e3ff61	[analyzer] Evaluate construction of non-POD type arrays Introducing the support for evaluating the constructor of every element in an array. The idea is to record the index of the current array member being constructed and create a loop during the analysis. We loop over the same CXXConstructExpr as many times as the array has elements. Differential Revision: https://reviews.llvm.org/D127973	2022-07-14 23:30:21 +02:00
Ella Ma	32fe1a4be9	[analyzer] Fixing SVal::getType returns Null Type for NonLoc::ConcreteInt in boolean type In method `TypeRetrievingVisitor::VisitConcreteInt`, `ASTContext::getIntTypeForBitwidth` is used to get the type for `ConcreteInt`s. However, the getter in ASTContext cannot handle the boolean type with the bit width of 1, which will make method `SVal::getType` return a Null `Type`. In this patch, a check for this case is added to fix this problem by returning the bool type directly when the bit width is 1. Differential Revision: https://reviews.llvm.org/D129737	2022-07-14 22:00:38 +08:00
Kazu Hirata	cb2c8f694d	[clang] Use value instead of getValue (NFC)	2022-07-13 23:39:33 -07:00
einvbri	1d7e58cfad	[analyzer] Fix use of length in CStringChecker CStringChecker is using getByteLength to get the length of a string literal. For targets where a "char" is 8-bits, getByteLength() and getLength() will be equal for a C string, but for targets where a "char" is 16-bits getByteLength() returns the size in octets. This is verified in our downstream target, but we have no way to add a test case for this case since there is no target supporting 16-bit "char" upstream. Since this cannot have a test case, I'm asserting this change is "correct by construction", and visually inspected to be correct by way of the following example where this was found. The case that shows this fails using a target with 16-bit chars is here. getByteLength() for the string literal returns 4, which fails when checked against "char x[4]". With the change, the string literal is evaluated to a size of 2 which is a correct number of "char"'s for a 16-bit target. ``` void strcpy_no_overflow_2(char y) { char x[4]; strcpy(x, "12"); // with getByteLength(), returns 4 using 16-bit chars } ``` This change exposed that embedded nulls within the string are not handled. This is documented as a FIXME for a future fix. ``` void strcpy_no_overflow_3(char y) { char x[3]; strcpy(x, "12\0"); } ``` Reviewed By: martong Differential Revision: https://reviews.llvm.org/D129269	2022-07-13 19:19:23 -05:00
Jonas Devlieghere	3968936b92	Revert "[clang] Implement ElaboratedType sugaring for types written bare" This reverts commit `bdc6974f92` because it breaks all the LLDB tests that import the std module. import-std-module/array.TestArrayFromStdModule.py import-std-module/deque-basic.TestDequeFromStdModule.py import-std-module/deque-dbg-info-content.TestDbgInfoContentDequeFromStdModule.py import-std-module/forward_list.TestForwardListFromStdModule.py import-std-module/forward_list-dbg-info-content.TestDbgInfoContentForwardListFromStdModule.py import-std-module/list.TestListFromStdModule.py import-std-module/list-dbg-info-content.TestDbgInfoContentListFromStdModule.py import-std-module/queue.TestQueueFromStdModule.py import-std-module/stack.TestStackFromStdModule.py import-std-module/vector.TestVectorFromStdModule.py import-std-module/vector-bool.TestVectorBoolFromStdModule.py import-std-module/vector-dbg-info-content.TestDbgInfoContentVectorFromStdModule.py import-std-module/vector-of-vectors.TestVectorOfVectorsFromStdModule.py https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/45301/	2022-07-13 09:20:30 -07:00
Kazu Hirata	53daa177f8	[clang, clang-tools-extra] Use has_value instead of hasValue (NFC)	2022-07-12 22:47:41 -07:00
Matheus Izvekov	bdc6974f92	[clang] Implement ElaboratedType sugaring for types written bare Without this patch, clang will not wrap in an ElaboratedType node types written without a keyword and nested name qualifier, which goes against the intent that we should produce an AST which retains enough details to recover how things are written. The lack of this sugar is incompatible with the intent of the type printer default policy, which is to print types as written, but to fall back and print them fully qualified when they are desugared. An ElaboratedTypeLoc without keyword / NNS uses no storage by itself, but still requires pointer alignment due to pre-existing bug in the TypeLoc buffer handling. Signed-off-by: Matheus Izvekov <mizvekov@gmail.com> Differential Revision: https://reviews.llvm.org/D112374	2022-07-13 02:10:09 +02:00
Gabor Marton	2df120784a	[analyzer] Fix assertion in simplifySymbolCast Depends on D128068. Added a new test code that fails an assertion in the baseline. That is because `getAPSIntType` works only with integral types. Differential Revision: https://reviews.llvm.org/D126779	2022-07-05 19:00:23 +02:00
Gabor Marton	5d7fa481cf	[analyzer] Do not emit redundant SymbolCasts In `RegionStore::getBinding` we call `evalCast` unconditionally to align the stored value's type to the one that is being queried. However, the stored type might be the same, so we may end up having redundant `SymbolCasts` emitted. The solution is to check whether the `to` and `from` type are the same in `makeNonLoc`. Note, we can't just do type equivalence check at the beginning of `evalCast` because when `evalCast` is called from `getBinding` then the original type (`OriginalTy`) is not set, so one operand is missing for the comparison. In `evalCastSubKind(nonloc::SymbolVal)` when the original type is not set, we get the `from` type via `SymbolVal::getType()`. Differential Revision: https://reviews.llvm.org/D128068	2022-07-05 18:42:34 +02:00
Fazlay Rabbi	38bcd483dd	[OpenMP] Initial parsing and semantic support for 'parallel masked taskloop simd' construct This patch gives basic parsing and semantic support for "parallel masked taskloop simd" construct introduced in OpenMP 5.1 (section 2.16.10) Differential Revision: https://reviews.llvm.org/D128946	2022-07-01 08:57:15 -07:00
Fazlay Rabbi	d64ba896d3	[OpenMP] Initial parsing and sema support for 'parallel masked taskloop' construct This patch gives basic parsing and semantic support for "parallel masked taskloop" construct introduced in OpenMP 5.1 (section 2.16.9) Differential Revision: https://reviews.llvm.org/D128834	2022-06-30 11:44:17 -07:00
Corentin Jabot	64ab2b1dcc	Improve handling of static assert messages. Instead of dumping the string literal (which quotes it and escapes every non-ascii symbol), we can use the content of the string when it is a 8 byte string. Wide, UTF-8/UTF-16/32 strings are still completely escaped, until we clarify how these entities should behave (cf https://wg21.link/p2361). `FormatDiagnostic` is modified to escape non printable characters and invalid UTF-8. This ensures that unicode characters, spaces and new lines are properly rendered in static messages. This makes clang more consistent with other implementations and fixes this tweet https://twitter.com/jfbastien/status/1298307325443231744 :) Of note, `PaddingChecker` did print out new lines that were later removed by the diagnostic printing code. To be consistent with its tests, the new lines are removed from the diagnostic. Unicode tables updated to both use the Unicode definitions and the Unicode 14.0 data. U+00AD SOFT HYPHEN is still considered a print character to match existing practices in terminals, in addition to being considered a formatting character as per Unicode. Reviewed By: aaron.ballman, #clang-language-wg Differential Revision: https://reviews.llvm.org/D108469	2022-06-29 14:57:35 +02:00
isuckatcs	9d2e830737	[analyzer] Fix BindingDecl evaluation for reference types The case when the bound variable is of reference type in a BindingDecl wasn't handled, which led to false positives. Differential Revision: https://reviews.llvm.org/D128716	2022-06-29 13:01:19 +02:00
Fazlay Rabbi	73e5d7bdff	[OpenMP] Initial parsing and sema support for 'masked taskloop simd' construct This patch gives basic parsing and semantic support for "masked taskloop simd" construct introduced in OpenMP 5.1 (section 2.16.8) Differential Revision: https://reviews.llvm.org/D128693	2022-06-28 15:27:49 -07:00
Corentin Jabot	a774ba7f60	Revert "Improve handling of static assert messages." This reverts commit `870b6d2183`. This seems to break some libc++ tests, reverting while investigating	2022-06-29 00:03:23 +02:00
Corentin Jabot	870b6d2183	Improve handling of static assert messages. Instead of dumping the string literal (which quotes it and escapes every non-ascii symbol), we can use the content of the string when it is a 8 byte string. Wide, UTF-8/UTF-16/32 strings are still completely escaped, until we clarify how these entities should behave (cf https://wg21.link/p2361). `FormatDiagnostic` is modified to escape non printable characters and invalid UTF-8. This ensures that unicode characters, spaces and new lines are properly rendered in static messages. This makes clang more consistent with other implementations and fixes this tweet https://twitter.com/jfbastien/status/1298307325443231744 :) Of note, `PaddingChecker` did print out new lines that were later removed by the diagnostic printing code. To be consistent with its tests, the new lines are removed from the diagnostic. Unicode tables updated to both use the Unicode definitions and the Unicode 14.0 data. U+00AD SOFT HYPHEN is still considered a print character to match existing practices in terminals, in addition to being considered a formatting character as per Unicode. Reviewed By: aaron.ballman, #clang-language-wg Differential Revision: https://reviews.llvm.org/D108469	2022-06-28 22:26:00 +02:00
Vitaly Buka	cdfa15da94	Revert "[clang] Introduce -fstrict-flex-arrays=<n> for stricter handling of flexible arrays" This reverts D126864 and related fixes. This reverts commit `572b08790a`. This reverts commit `886715af96`.	2022-06-27 14:03:09 -07:00
Kazu Hirata	97afce08cb	[clang] Don't use Optional::hasValue (NFC) This patch replaces Optional::hasValue with the implicit cast to bool in conditionals only.	2022-06-25 22:26:24 -07:00
Kazu Hirata	3b7c3a654c	Revert "Don't use Optional::hasValue (NFC)" This reverts commit `aa8feeefd3`.	2022-06-25 11:56:50 -07:00
Kazu Hirata	aa8feeefd3	Don't use Optional::hasValue (NFC)	2022-06-25 11:55:57 -07:00
Fazlay Rabbi	42bb88e2aa	[OpenMP] Initial parsing and sema support for 'masked taskloop' construct This patch gives basic parsing and semantic support for "masked taskloop" construct introduced in OpenMP 5.1 (section 2.16.7) Differential Revision: https://reviews.llvm.org/D128478	2022-06-24 10:00:08 -07:00
serge-sans-paille	886715af96	[clang] Introduce -fstrict-flex-arrays=<n> for stricter handling of flexible arrays Some code [0] considers that trailing arrays are flexible, whatever their size. Support for this legacy code was introduced in `f8f6324983`, but it prevents evaluation of __builtin_object_size and __builtin_dynamic_object_size in some legit cases. Introduce -fstrict-flex-arrays=<n> to have stricter conformance when it is desirable. n = 0: current behavior, any trailing array member is a flexible array. The default. n = 1: any trailing array member of undefined, 0 or 1 size is a flexible array member n = 2: any trailing array member of undefined or 0 size is a flexible array member n = 3: any trailing array member of undefined size is a flexible array member (strict c99 conformance) A similar patch for gcc is discussed here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836 [0] https://docs.freebsd.org/en/books/developers-handbook/sockets/#sockets-essential-functions	2022-06-24 16:13:29 +02:00
isuckatcs	8ef628088b	[analyzer] Structured binding to arrays Introducing structured binding to data members and more. To handle binding to arrays, ArrayInitLoopExpr is also evaluated, which enables the analyzer to store information in two more cases. These are: - when a lambda-expression captures an array by value - in the implicit copy/move constructor for a class with an array member Differential Revision: https://reviews.llvm.org/D126613	2022-06-23 11:38:21 +02:00
Balázs Kéri	7dc81c6244	[clang][analyzer] Fix StdLibraryFunctionsChecker 'mkdir' return value. The functions 'mkdir', 'mknod', 'mkdirat', 'mknodat' return 0 on success and -1 on failure. The checker modeled these functions with a >= 0 return value on success which is changed to 0 only. This fix makes ErrnoChecker work better for these functions. Reviewed By: steakhal Differential Revision: https://reviews.llvm.org/D127277	2022-06-23 11:27:26 +02:00
Balázs Kéri	957014da2d	[clang][Analyzer] Add errno state to standard functions modeling. This updates StdLibraryFunctionsChecker to set the state of 'errno' by using the new errno_modeling functionality. The errno value is set in the PostCall callback. Setting it in call::Eval did not work for some reason and then every function should be EvalCallAsPure which may be bad to do. Now the errno value and state is not allowed to be checked in any PostCall checker callback because it is unspecified if the errno was set already or will be set later by this checker. Reviewed By: martong, steakhal Differential Revision: https://reviews.llvm.org/D125400	2022-06-21 08:56:41 +02:00
Kazu Hirata	ca4af13e48	[clang] Don't use Optional::getValue (NFC)	2022-06-20 22:59:26 -07:00
Kazu Hirata	0916d96d12	Don't use Optional::hasValue (NFC)	2022-06-20 20:17:57 -07:00
Kazu Hirata	064a08cd95	Don't use Optional::hasValue (NFC)	2022-06-20 20:05:16 -07:00
Kazu Hirata	5413bf1bac	Don't use Optional::hasValue (NFC)	2022-06-20 11:33:56 -07:00
Kazu Hirata	452db157c9	[clang] Don't use Optional::hasValue (NFC)	2022-06-20 10:51:34 -07:00
Balázs Kéri	60f3b07118	[clang][analyzer] Add checker for bad use of 'errno'. Extend checker 'ErrnoModeling' with a state of 'errno' to indicate the importance of the 'errno' value and how it should be used. Add a new checker 'ErrnoChecker' that observes use of 'errno' and finds possible wrong uses, based on the "errno state". The "errno state" should be set (together with value of 'errno') by other checkers (that perform modeling of the given function) in the future. Currently only a test function can set this value. The new checker has no user-observable effect yet. Reviewed By: martong, steakhal Differential Revision: https://reviews.llvm.org/D122150	2022-06-20 10:07:31 +02:00
Kazu Hirata	06decd0b41	[clang] Use value_or instead of getValueOr (NFC)	2022-06-18 23:21:34 -07:00
isuckatcs	e77ac66b8c	[Static Analyzer] Structured binding to data members Introducing structured binding to data members. Differential Revision: https://reviews.llvm.org/D127643	2022-06-17 19:50:10 +02:00
isuckatcs	92bf652d40	[Static Analyzer] Small array binding policy If a lazyCompoundVal to a struct is bound to the store, there is a policy which decides whether a copy gets created instead. This patch introduces a similar policy for arrays, which is required to model structured binding to arrays without false negatives. Differential Revision: https://reviews.llvm.org/D128064	2022-06-17 18:56:13 +02:00
Jennifer Yu	bb83f8e70b	[OpenMP] Initial parsing and sema for 'parallel masked' construct Differential Revision: https://reviews.llvm.org/D127454	2022-06-16 18:01:15 -07:00
Balazs Benics	929e60b6bd	[analyzer] Relax constraints on const qualified regions The arithmetic restriction seems to be artificial. The comment below seems to be stale. Thus, we remove both. Depends on D127306. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D127763	2022-06-15 17:08:27 +02:00
Balazs Benics	f4fc3f6ba3	[analyzer] Treat system globals as mutable if they are not const Previously, system globals were treated as immutable regions, unless it was the `errno` which is known to be frequently modified. D124244 wants to add a check for stores to immutable regions. It would basically turn all stores to system globals into an error even though we have no reason to believe that those mutable sys globals should be treated as if they were immutable. And this leads to false-positives if we apply D124244. In this patch, I'm proposing to treat mutable sys globals actually mutable, hence allocate them into the `GlobalSystemSpaceRegion`, UNLESS they were declared as `const` (and a primitive arithmetic type), in which case, we should use `GlobalImmutableSpaceRegion`. In any other cases, I'm using the `GlobalInternalSpaceRegion`, which is no different than the previous behavior. --- In the tests I added, only the last `expected-warning` was different, compared to the baseline. Which is this: ```lang=C++ void test_my_mutable_system_global_constraint() { assert(my_mutable_system_global > 2); clang_analyzer_eval(my_mutable_system_global > 2); // expected-warning {{TRUE}} invalidate_globals(); clang_analyzer_eval(my_mutable_system_global > 2); // expected-warning {{UNKNOWN}} It was previously TRUE. } void test_my_mutable_system_global_assign(int x) { my_mutable_system_global = x; clang_analyzer_eval(my_mutable_system_global == x); // expected-warning {{TRUE}} invalidate_globals(); clang_analyzer_eval(my_mutable_system_global == x); // expected-warning {{UNKNOWN}} It was previously TRUE. } ``` --- Unfortunately, the taint checker will be also affected. The `stdin` global variable is a pointer, which is assumed to be a taint source, and the rest of the taint propagation rules will propagate from it. However, since mutable variables are no longer treated immutable, they also get invalidated, when an opaque function call happens, such as the first `scanf(stdin, ...)`. This would effectively remove taint from the pointer, consequently disable all the rest of the taint propagations down the line from the `stdin` variable. All that said, I decided to look through `DerivedSymbol`s as well, to acquire the memregion in that case as well. This should preserve the previously existing taint reports. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D127306	2022-06-15 17:08:27 +02:00
Balazs Benics	96ccb690a0	[analyzer][NFC] Prefer using isa<> instead getAs<> in conditions Depends on D125709 Reviewed By: martong Differential Revision: https://reviews.llvm.org/D127742	2022-06-15 16:58:13 +02:00
Balazs Benics	481f860324	[analyzer][NFC] Remove dead field of UnixAPICheckers Initially, I thought there is some fundamental bug here by not using the bool fields, but it turns out D55425 split this checker into two separate ones; making these fields dead. Depends on D127836, which uncovered this issue. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D127838	2022-06-15 16:50:12 +02:00
Balazs Benics	6c4f9998ae	[analyzer] Fix StreamErrorState hash bug The `Profile` function was incorrectly implemented. The `StreamErrorState` has an implicit `bool` conversion operator, which will result in a different hash than faithfully hashing the raw value of the enum. I don't have a test for it, since it seems difficult to find one. Even if we would have one, any change in the hashing algorithm would have a chance of breaking it, so I don't think it would justify the effort. Depends on D127836, which uncovered this issue by marking the related `Profile` function dead. Reviewed By: martong, balazske Differential Revision: https://reviews.llvm.org/D127839	2022-06-15 16:50:12 +02:00
Balazs Benics	f1b18a79b7	[analyzer][NFC] Remove dead code and modernize surroundings Thanks @kazu for helping me clean these parts in D127799. I'm leaving the dump methods, along with the unused visitor handlers and the forwarding methods. The dead parts actually helped to uncover two bugs, to which I'm going to post separate patches. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D127836	2022-06-15 16:50:12 +02:00
Balazs Benics	40940fb2a6	[analyzer][NFC] Substitute the SVal::evalMinus and evalComplement functions Depends on D126127 Reviewed By: martong Differential Revision: https://reviews.llvm.org/D127734	2022-06-14 18:56:43 +02:00
Balazs Benics	cfc915149c	[analyzer][NFC] Relocate unary transfer functions This is an initial step of removing the SimpleSValBuilder abstraction. The SValBuilder alone should be enough. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D126127	2022-06-14 18:56:43 +02:00
Balazs Benics	de6ba9704d	[analyzer][Casting] Support isa, cast, dyn_cast of SVals This change specializes the LLVM RTTI mechanism for SVals. After this change, we can use the well-known `isa`, `cast`, `dyn_cast`. Examples: // SVal V = ...; // Loc MyLoc = ...; bool IsInteresting = isa<loc::MemRegionVal, loc::GotoLabel>(MyLoc); auto MRV = cast<loc::MemRegionVal>(MyLoc); Optional<loc::MemRegionVal> MaybeMRV = dyn_cast<loc::MemRegionVal>(V) The current `SVal::getAs` and `castAs` member functions are redundant at this point, but I believe that they are still handy. The member function version is terse and reads left-to-right, which IMO is a great plus. However, we should probably add a variadic `isa` member function version to have the same casting API in both cases. Thanks for the extensive TMP help @bzcheeseman! Reviewed By: bzcheeseman Differential Revision: https://reviews.llvm.org/D125709	2022-06-14 13:43:04 +02:00
Balazs Benics	ffe7950ebc	Reland "[analyzer] Deprecate `-analyzer-store region` flag" I'm trying to remove unused options from the `Analyses.def` file, then merge the rest of the useful options into the `AnalyzerOptions.def`. Then make sure one can set these by an `-analyzer-config XXX=YYY` style flag. Then surface the `-analyzer-config` to the `clang` frontend; After all of this, we can pursue the tablegen approach described https://discourse.llvm.org/t/rfc-tablegen-clang-static-analyzer-engine-options-for-better-documentation/61488 In this patch, I'm proposing flag deprecations. We should support deprecated analyzer flags for exactly one release. In this case I'm planning to drop this flag in `clang-16`. In the clang frontend, now we won't pass this option to the cc1 frontend, rather emit a warning diagnostic reminding the users about this deprecated flag, which will be turned into error in clang-16. Unfortunately, I had to remove all the tests referring to this flag, causing a mass change. I've also added a test for checking this warning. I've seen that `scan-build` also uses this flag, but I think we should remove that part only after we turn this into a hard error. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D126215	2022-06-14 09:20:41 +02:00
Kazu Hirata	f13019f836	[clang] Use any_of and none_of (NFC)	2022-06-12 10:17:12 -07:00
Kazu Hirata	f5ef2c5838	[clang] Convert for_each to range-based for loops (NFC)	2022-06-10 22:39:45 -07:00
Nico Weber	8406839d19	Revert "[analyzer] Deprecate `-analyzer-store region` flag" This reverts commit `d50d9946d1`. Broke check-clang, see comments on https://reviews.llvm.org/D126067 Also revert dependent change "[analyzer] Deprecate the unused 'analyzer-opt-analyze-nested-blocks' cc1 flag" This reverts commit `07b4a6d046`. Also revert "[analyzer] Fix buildbots after introducing a new frontend warning" This reverts commit `90374df15d`. (See https://reviews.llvm.org/rG90374df15ddc58d823ca42326a76f58e748f20eb)	2022-06-10 08:50:13 -04:00
Balazs Benics	b73c2280f5	[analyzer][NFC] Remove unused RegionStoreFeatures Reviewed By: martong Differential Revision: https://reviews.llvm.org/D126216	2022-06-10 13:02:26 +02:00
Balazs Benics	d50d9946d1	[analyzer] Deprecate `-analyzer-store region` flag I'm trying to remove unused options from the `Analyses.def` file, then merge the rest of the useful options into the `AnalyzerOptions.def`. Then make sure one can set these by an `-analyzer-config XXX=YYY` style flag. Then surface the `-analyzer-config` to the `clang` frontend; After all of this, we can pursue the tablegen approach described https://discourse.llvm.org/t/rfc-tablegen-clang-static-analyzer-engine-options-for-better-documentation/61488 In this patch, I'm proposing flag deprecations. We should support deprecated analyzer flags for exactly one release. In this case I'm planning to drop this flag in `clang-16`. In the clang frontend, now we won't pass this option to the cc1 frontend, rather emit a warning diagnostic reminding the users about this deprecated flag, which will be turned into error in clang-16. Unfortunately, I had to remove all the tests referring to this flag, causing a mass change. I've also added a test for checking this warning. I've seen that `scan-build` also uses this flag, but I think we should remove that part only after we turn this into a hard error. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D126215	2022-06-10 12:57:15 +02:00
Balazs Benics	07a7fd314a	[analyzer] Print the offending function at EndAnalysis crash I've faced crashes in the past multiple times when some `check::EndAnalysis` callback caused some crash. It's really annoying that it doesn't tell which function triggered this callback. This patch adds the well-known trace for that situation as well. Example: 1. <eof> parser at end of file 2. While analyzing stack: #0 Calling test11 Note that this does not have tests. I've considered `unittests` for this purpose, by using the `ASSERT_DEATH()` similarly to how we check double eval called functions in `ConflictingEvalCallsTest.cpp`, however, the testsuite won't invoke the custom handlers. Only the message of the `llvm_unreachable()` will be printed. Consequently, it's not applicable for us testing this feature. I've also considered using an end-to-end LIT test for this. For that, we would need to somehow overload the `clang_analyzer_crash()` `ExprInspection` handler, to get triggered by other events than the `EvalCall`. I'm not saying that we could not come up with a generic way of causing crash in a specific checker callback, but I'm not sure if that would be worth the effort. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D127389	2022-06-10 12:21:17 +02:00
Gabor Marton	bc2c759aee	[analyzer] Fix assertion failure after getKnownValue call Depends on D126560. `getKnownValue` has been changed by the parent patch in a way that simplification was removed. This is not correct when the function is called by the Checkers. Thus, a new internal function is introduced, `getConstValue`, which simply queries the constraint manager. This `getConstValue` is used internally in the `SimpleSValBuilder` when a binop is evaluated, this way we avoid the recursion into the `Simplifier`. Differential Revision: https://reviews.llvm.org/D127285	2022-06-09 16:13:57 +02:00
Vince Bridgers	c7fa4e8a8b	[analyzer] Fix null pointer deref in CastValueChecker A crash was seen in CastValueChecker due to a null pointer dereference. The fix uses QualType::getAsString to avoid the null dereference when a CXXRecordDecl cannot be obtained. A small reproducer is added, and cast value notes LITs are updated for the new debug messages. Reviewed By: steakhal Differential Revision: https://reviews.llvm.org/D127105	2022-06-07 13:34:06 -04:00
Gabor Marton	8131ee4c43	[analyzer] Remove NotifyAssumeClients Depends on D126560. Differential Revision: https://reviews.llvm.org/D126878	2022-06-07 13:02:03 +02:00
Gabor Marton	17e9ea6138	[analyzer][NFC] Add LLVM_UNLIKELY to assumeDualImpl Aligned with the measures we had in D124674, this condition seems to be unlikely. Nevertheless, I've made some new measurements with stats just for this, and data confirms this is indeed unlikely. Differential Revision: https://reviews.llvm.org/D127190	2022-06-07 12:48:48 +02:00
Gabor Marton	f66f4d3b07	[analyzer] Track assume call stack to detect fixpoint Assume functions might recurse (see `reAssume` or `tryRearrange`). During the recursion, the State might not change anymore, that means we reached a fixpoint. In this patch, we avoid infinite recursion of assume calls by checking already visited States on the stack of assume function calls. This patch renders the previous "workaround" solution (D47155) unnecessary. Note that this is not an NFC patch. If we were to limit the maximum stack depth of the assume calls to 1 then would it be equivalent with the previous solution in D47155. Additionally, in D113753, we simplify the symbols right at the beginning of evalBinOpNN. So, a call to `simplifySVal` in `getKnownValue` (added in D51252) is no longer needed. Fixes https://github.com/llvm/llvm-project/issues/55851 Differential Revision: https://reviews.llvm.org/D126560	2022-06-07 08:36:11 +02:00
Kazu Hirata	e0039b8d6a	Use llvm::less_second (NFC)	2022-06-04 22:48:32 -07:00
Kazu Hirata	4969a6924d	Use llvm::less_first (NFC)	2022-06-04 21:23:18 -07:00
Balazs Benics	7d24641f89	[llvm][analyzer][NFC] Introduce SFINAE for specializing FoldingSetTraits Reviewed By: martong Differential Revision: https://reviews.llvm.org/D126803	2022-06-02 19:46:38 +02:00
Balazs Benics	cf1f1b7240	[analyzer][NFC] Uplift checkers after D126801 Reviewed By: martong Differential Revision: https://reviews.llvm.org/D126802	2022-06-02 19:46:38 +02:00
Balazs Benics	33ca5a447e	[analyzer][NFC] Add partial specializations for ProgramStateTraits I'm also hoisting common code from the existing specializations into a common trait impl to reduce code duplication. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D126801	2022-06-02 19:46:38 +02:00
Gabor Marton	81e44414aa	[analyzer][NFC] Move overconstrained check from reAssume to assumeDualImpl Depends on D126406. Checking of the overconstrained property is much better suited here. Differential Revision: https://reviews.llvm.org/D126707	2022-06-02 11:41:19 +02:00
Gabor Marton	160798ab9b	[analyzer] Handle SymbolCast in SValBuilder Make the SimpleSValBuilder to be able to look up and use a constraint for an operand of a SymbolCast, when the operand is constrained to a const value. This part of the SValBuilder is responsible for constant folding. We need this constant folding, so the engine can work with less symbols, this way it can be more efficient. Whenever a symbol is constrained with a constant then we substitute the symbol with the corresponding integer. If a symbol is constrained with a range, then the symbol is kept and we fall-back to use the range based constraint manager, which is not that efficient. This patch is the natural extension of the existing constant folding machinery with the support of SymbolCast symbols. Differential Revision: https://reviews.llvm.org/D126481	2022-06-01 08:42:04 +02:00
Balazs Benics	a73b50ad06	Revert "[llvm][clang][bolt][NFC] Use llvm::less_first() when applicable" This reverts commit `3988bd1398`. Did not build on this bot: https://lab.llvm.org/buildbot#builders/215/builds/6372 /usr/include/c++/9/bits/predefined_ops.h:177:11: error: no match for call to ‘(llvm::less_first) (std::pair<long unsigned int, llvm::bolt::BinaryBasicBlock>&, const std::pair<long unsigned int, std::nullptr_t>&)’ 177 \| { return bool(_M_comp(__it, __val)); }	2022-05-27 11:19:18 +02:00
Balazs Benics	3988bd1398	[llvm][clang][bolt][NFC] Use llvm::less_first() when applicable One could reuse this functor instead of rolling out your own version. There were a couple other cases where the code was similar, but not quite the same, such as it might have an assertion in the lambda or other constructs. Thus, I've not touched any of those, as it might change the behavior in some way. As per https://discourse.llvm.org/t/submitting-simple-nfc-patches/62640/3?u=steakhal Chris Lattner > LLVM intentionally has a “yes, you can apply common sense judgement to > things” policy when it comes to code review. If you are doing mechanical > patches (e.g. adopting less_first) that apply to the entire monorepo, > then you don’t need everyone in the monorepo to sign off on it. Having > some +1 validation from someone is useful, but you don’t need everyone > whose code you touch to weigh in. Differential Revision: https://reviews.llvm.org/D126068	2022-05-27 11:15:23 +02:00
Balazs Benics	f13050eca3	[analyzer][NFCi] Annotate major nonnull returning functions This patch annotates the most important analyzer function APIs. Also adds a couple of assertions for uncovering any potential issues earlier in the constructor; in those cases, the member functions were already dereferencing the members unconditionally anyway. Measurements showed no performance impact, nor crashes. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D126198	2022-05-27 11:05:50 +02:00
Gabor Marton	6ab69efe61	[analyzer][NFC] Rename GREngine->CoreEngine, GRExprEngine->ExprEngine in comments and txt files fixes #115	2022-05-27 11:04:35 +02:00
Balazs Benics	3a666dd37a	[analyzer][NFC] Use MemRegion::getRegion()'s return value unconditionally Reviewed By: martong Differential Revision: https://reviews.llvm.org/D126123	2022-05-27 10:07:06 +02:00
Balazs Benics	813acb1297	[analyzer][NFC] Remove unused SVal::hasConjuredSymbol Reviewed By: martong Differential Revision: https://reviews.llvm.org/D126130	2022-05-27 10:07:06 +02:00
Balazs Benics	81066603a8	[analyzer][NFC] Remove unused nonloc::ConcreteInt::evalBinOp Reviewed By: martong Differential Revision: https://reviews.llvm.org/D126129	2022-05-27 10:07:06 +02:00
Balazs Benics	f6eab43764	[analyzer][NFC] Inline loc::ConcreteInt::evalBinOp This patch also refactored some of the enclosing parts. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D126128	2022-05-27 10:07:06 +02:00
Balazs Benics	ee8987d585	[analyzer][NFC] Inline ExprEngine::evalMinus Reviewed By: martong Differential Revision: https://reviews.llvm.org/D126125	2022-05-27 10:07:06 +02:00
Balazs Benics	7a2d6dea73	[analyzer][NFC] Inline ExprEngine::evalComplement Reviewed By: martong Differential Revision: https://reviews.llvm.org/D126124	2022-05-27 10:07:06 +02:00
Gabor Marton	88abc50398	[analyzer][solver] Handle UnarySymExpr in RangeConstraintSolver Fixes https://github.com/llvm/llvm-project/issues/55241 Differential Revision: https://reviews.llvm.org/D125395	2022-05-26 14:09:46 +02:00
Gabor Marton	b5b2aec1ff	[analyzer] Add UnarySymExpr This patch adds a new descendant to the SymExpr hierarchy. This way, now we can assign constraints to symbolic unary expressions. Only the unary minus and bitwise negation are handled. Differential Revision: https://reviews.llvm.org/D125318	2022-05-26 14:00:27 +02:00
Gabor Marton	ca3d962548	[analyzer] Return from reAssume if State is posteriorly overconstrained Depends on D124758. That patch introduced serious regression in the run-time in some special cases. This fixes that. Differential Revision: https://reviews.llvm.org/D126406	2022-05-26 13:50:40 +02:00
Gabor Marton	f75bc5bfc8	[analyzer] Fix symbol simplification assertion failure Fixes https://github.com/llvm/llvm-project/issues/55546 The assertion mentioned in the issue is triggered because an inconsistency is formed in the Sym->Class and Class->Sym relations. A simpler but similar inconsistency is demonstrated here: https://reviews.llvm.org/D114887 . Previously in `removeMember`, we didn't remove the old symbol's Sym->Class relation. Back then, we explained it with the following two bullet points: > 1) This way constraints for the old symbol can still be found via it's > equivalence class that it used to be the member of. > 2) Performance and resource reasons. We can spare one removal and thus one > additional tree in the forest of `ClassMap`. This patch does remove the old symbol's Sym->Class relation in order to keep the Sym->Class relation consistent with the Class->Sym relations. Point 2) above has negligible performance impact, empirical measurements do not show any noticeable difference in the run-time. Point 1) above seems to be a not well justified statement. This is because we cannot create a new symbol that would be equal to the old symbol after the simplification had happened. The reason for this is that the SValBuilder uses the available constant constraints for each sub-symbol. Differential Revision: https://reviews.llvm.org/D126281	2022-05-25 10:55:50 +02:00
Gabor Marton	96fba640cf	[analyzer][NFC] Factor out the copy-paste code repetition of assumeDual and assumeInclusiveRangeDual Depends on D125892. There might be efficiency and performance implications by using a lambda. Thus, I am going to conduct measurements to see if there is any noticeable impact. I've been thinking about two more alternatives: 1) Make `assumeDualImpl` a variadic template and (perfect) forward the arguments for the used `assume` function. 2) Use a macros. I have concerns though, whether these alternatives would deteriorate the readability of the code. Differential Revision: https://reviews.llvm.org/D125954	2022-05-23 09:32:44 +02:00
Gabor Marton	32f189b0d9	[analyzer] Implement assumeInclusiveRange in terms of assumeInclusiveRangeDual Depends on D124758. This is the very same thing we have done for assumeDual, but this time we do it for assumeInclusiveRange. This patch is basically a no-brainer copy of that previous patch. Differential Revision: https://reviews.llvm.org/D125892	2022-05-23 09:32:44 +02:00
Jay Foad	6bec3e9303	[APInt] Remove all uses of zextOrSelf, sextOrSelf and truncOrSelf Most clients only used these methods because they wanted to be able to extend or truncate to the same bit width (which is a no-op). Now that the standard zext, sext and trunc allow this, there is no reason to use the OrSelf versions. The OrSelf versions additionally have the strange behaviour of allowing extending to a smaller width, or truncating to a larger width, which are also treated as no-ops. A small amount of client code relied on this (ConstantRange::castOp and MicrosoftCXXNameMangler::mangleNumber) and needed rewriting. Differential Revision: https://reviews.llvm.org/D125557	2022-05-19 11:23:13 +01:00
Usama Hameed	dd7233bc67	[Analyzer] Remove extra space from NSErrorChecker message. Differential Revision: https://reviews.llvm.org/D125840	2022-05-18 14:35:12 -07:00
Gabor Marton	56b9b97c1e	[clang][analyzer][ctu] Make CTU a two phase analysis This new CTU implementation is the natural extension of the normal single TU analysis. The approach consists of two analysis phases. During the first phase, we do a normal single TU analysis. During this phase, if we find a foreign function (that could be inlined from another TU) then we don’t inline that immediately, we rather mark that to be analysed later. When the first phase is finished then we start the second phase, the CTU phase. In this phase, we continue the analysis from that point (exploded node) which had been enqueued during the first phase. We gradually extend the exploded graph of the single TU analysis with the new node that was created by the inlining of the foreign function. We count the number of analysis steps of the first phase and we limit the second (ctu) phase with this number. This new implementation makes it convenient for the users to run the single-TU and the CTU analysis in one go, they don't need to run the two analysis separately. Thus, we name this new implementation as "onego" CTU. Discussion: https://discourse.llvm.org/t/rfc-much-faster-cross-translation-unit-ctu-analysis-implementation/61728 Differential Revision: https://reviews.llvm.org/D123773	2022-05-18 10:35:52 +02:00
Balazs Benics	a1025e6ffe	[analyzer] Introduce clang_analyzer_dumpSvalType introspection function In some rare cases the type of an SVal might be interesting. This introspection function exposes this information in tests. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D125532	2022-05-13 17:07:58 +02:00
Balazs Benics	d5ffc1ed8b	[analyzer][NFC] Tighten some of the SValBuilder return types This is purely a cosmetic change. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D125463	2022-05-13 17:04:34 +02:00
Endre Fülöp	094fb13b88	[analyzer] Add taint to the BoolAssignmentChecker BoolAssignment checker is now taint-aware and warns if a tainted value is assigned. Original author: steakhal Reviewed By: martong Differential Revision: https://reviews.llvm.org/D125360	2022-05-13 09:27:28 +02:00
Tomasz Kamiński	14742443a2	Reland "[analyzer] Canonicalize SymIntExpr so the RHS is positive when possible" This PR changes the `SymIntExpr` so the expression that uses a negative value as `RHS`, for example: `x +/- (-N)`, is modeled as `x -/+ N` instead. This avoids producing a very large `RHS` when the symbol is cast to an unsigned number, and as a consequence makes the value more robust in presence of casts. Note that this change is not applied if `N` is the lowest negative value for which negation would not be representable. Reviewed By: steakhal Patch By: tomasz-kaminski-sonarsource! Differential Revision: https://reviews.llvm.org/D124658	2022-05-12 15:40:11 +02:00
Gabor Marton	34ac048aef	[analyzer] Replace adjacent assumeInBound calls to assumeInBoundDual This is to minimize superfluous assume calls. Depends on D124758 Differential Revision: https://reviews.llvm.org/D124761	2022-05-10 10:16:55 +02:00
Gabor Marton	1c1c1e25f9	[analyzer] Implement assume in terms of assumeDual Summary: By evaluating both children states, now we are capable of discovering infeasible parent states. In this patch, `assume` is implemented in terms of `assumeDual`. This might be suboptimal (e.g. where there are adjacent assume(true) and assume(false) calls; the next patches address that). This patch fixes a real CRASH. Fixes https://github.com/llvm/llvm-project/issues/54272 Differential Revision: https://reviews.llvm.org/D124758	2022-05-10 10:16:55 +02:00
Gabor Marton	c4fa05f5f7	[analyzer] Indicate if a parent state is infeasible In some cases a parent State is already infeasible, but we recognize this only if an additional constraint is added. This patch is the first of a series to address this issue. In this patch `assumeDual` is changed to clone the parent State but with an `Infeasible` flag set, and this infeasible-parent is returned both for the true and false case. Then when we add a new transition in the exploded graph and the destination is marked as infeasible, the node will be a sink node. Related bug: https://github.com/llvm/llvm-project/issues/50883 Actually, this patch does not solve that bug in the solver, rather with this patch we can handle the general parent-infeasible cases. Next step would be to change the State API and require all checkers to use the `assume*Dual` API and deprecate the simple `assume` calls. Hopefully, the next patch will introduce `assumeInBoundDual` and will solve the CRASH we have here: https://github.com/llvm/llvm-project/issues/54272 Differential Revision: https://reviews.llvm.org/D124674	2022-05-10 10:16:55 +02:00
Fred Tingaud	1ec1cdcfb4	[analyzer] Inline operator delete when MayInlineCXXAllocator is set. This patch restores the symmetry between how operator new and operator delete are handled by also inlining the content of operator delete when possible. Patch by Fred Tingaud. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D124845	2022-05-09 15:44:33 +02:00
Balazs Benics	da5b5ae852	Revert "[analyzer] Canonicalize SymIntExpr so the RHS is positive when possible" It seems like multiple users are affected by a crash introduced by this commit, thus I'm reverting it for the time being. Read more about the found reproducers at Phabricator. Differential Revision: https://reviews.llvm.org/D124658 This reverts commit `f0d6cb4a5c`.	2022-05-06 12:13:51 +02:00
Brian Tracy	87a55137e2	Fix "the the" typo in documentation and user facing strings There are many more instances of this pattern, but I chose to limit this change to .rst files (docs), anything in libcxx/include, and string literals. These have the highest chance of being seen by end users. Reviewed By: #libc, Mordante, martong, ldionne Differential Revision: https://reviews.llvm.org/D124708	2022-05-05 17:52:08 +02:00
Tomasz Kamiński	f0d6cb4a5c	[analyzer] Canonicalize SymIntExpr so the RHS is positive when possible This PR changes the `SymIntExpr` so the expression that uses a negative value as `RHS`, for example: `x +/- (-N)`, is modeled as `x -/+ N` instead. This avoids producing a very large `RHS` when the symbol is cast to an unsigned number, and as a consequence makes the value more robust in the presence of casts. Note that this change is not applied if `N` is the lowest negative value for which negation would not be representable. Reviewed By: steakhal Patch By: tomasz-kaminski-sonarsource! Differential Revision: https://reviews.llvm.org/D124658	2022-05-05 17:48:49 +02:00
einvbri	df5801806d	[analyzer] Get direct binding for specific punned case Region store was not able to see through this case to the actual initialized value of STRUCT ff. This change addresses this case by getting the direct binding. This was found and debugged in a downstream compiler, with debug guidance from @steakhal. A positive and negative test case is added. The specific case where this issue was exposed: typedef struct { int a:1; int b[2]; } STRUCT; int main() { STRUCT ff = {0}; STRUCT* pff = &ff; int a = *((int*)pff + 1); return a; } Reviewed By: steakhal, martong Differential Revision: https://reviews.llvm.org/D124349	2022-05-05 04:53:45 -05:00
Ali Shuja Siddiqui	cf7cd664f3	[analyzer] Check for std::__addressof for inner pointer checker This is an extension to diff D99260. This adds an additional exception for `std::__addressof` in `InnerPointerChecker`. Patch By alishuja (Ali Shuja Siddiqui)! Reviewed By: martong, alishuja Differential Revision: https://reviews.llvm.org/D109467	2022-05-03 14:05:19 +02:00
Marco Antognini	68ee5ec07d	[Analyzer] Fix assumptions about const field with member-initializer Essentially, having a default member initializer for a constant member does not necessarily imply the member will have the given default value. Remove part of `a2e053638b` ([analyzer] Treat more const variables and fields as known contants., 2018-05-04). Fix #47878 Reviewed By: r.stahl, steakhal Differential Revision: https://reviews.llvm.org/D124621	2022-05-03 11:27:45 +02:00
Marco Antognini	f34639828f	[Analyzer] Minor cleanups in StreamChecker Remove unnecessary conversion to Optional<> and incorrect assumption that BindExpr can return a null state. Reviewed By: steakhal Differential Revision: https://reviews.llvm.org/D124681	2022-05-02 17:50:10 +02:00
Marco Antognini	5a47accda8	[Analyzer] Fix clang::ento::taint::dumpTaint definition Ensure the definition is in the "taint" namespace, like its declaration. Reviewed By: steakhal Differential Revision: https://reviews.llvm.org/D124462	2022-05-02 17:44:06 +02:00
Balazs Benics	5ce7050f70	[analyzer] Allow exploded graph dumps in release builds Historically, exploded graph dumps were disabled in non-debug builds. It was done so probably because a regular user should not dump the internal representation of the analyzer anyway and the dump methods might introduce unnecessary binary size overhead. It turns out some of the users actually want to dump this. Note that e.g. `LiveExpressionsDumper`, `LiveVariablesDumper`, `ControlDependencyTreeDumper` etc. worked previously, and they are unaffected by this change. However, `CFGViewer` and `CFGDumper` still won't work for a similar reason. AFAIK only these two won't work after this change. Addresses #53873 --- baseline \| binary \| size \| size after strip \| \| clang \| 103M \| 83M \| \| clang-tidy \| 67M \| 54M \| after this change \| binary \| size \| size after strip \| \| clang \| 103M \| 84M \| \| clang-tidy \| 67M \| 54M \| CMake configuration: ``` cmake -S llvm -GNinja -DBUILD_SHARED_LIBS=OFF -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_COMPILER=clang -DLLVM_ENABLE_ASSERTIONS=OFF -DLLVM_USE_LINKER=lld -DLLVM_ENABLE_DUMP=OFF -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra" -DLLVM_ENABLE_Z3_SOLVER=ON -DLLVM_TARGETS_TO_BUILD="X86" ``` Built by `clang-14.0.0`. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D124442	2022-05-02 11:42:08 +02:00
Balazs Benics	fd7efe33f1	[analyzer] Fix cast evaluation on scoped enums in ExprEngine We ignored the cast if the enum was scoped. This is bad since there is no implicit conversion from the scoped enum to the corresponding underlying type. The fix is basically: isIntegralOrEnumerationType() -> isIntegralOrUnscopedEnumerationType() This materialized in crashes on analyzing the LLVM itself using the Z3 refutation. Refutation synthesized the given Z3 Binary expression (`BO_And` of `unsigned char` aka. 8 bits and an `int` 32 bits) with the wrong bitwidth in the end, which triggered an assert. Now, we evaluate the cast according to the standard. This bug could have been triggered using the Z3 CM according to https://bugs.llvm.org/show_bug.cgi?id=44030 Fixes #47570 #43375 Reviewed By: martong Differential Revision: https://reviews.llvm.org/D85528	2022-05-02 10:54:26 +02:00
Balazs Benics	5a2e595eb8	[analyzer] Fix Static Analyzer g_memdup false-positive `g_memdup()` allocates and copies memory, thus we should not assume that the returned memory region is uninitialized because it might not be the case. PS: It would be even better to copy the bindings to mimic the actual content of the buffer, but this works too. Fixes #53617 Reviewed By: martong Differential Revision: https://reviews.llvm.org/D124436	2022-05-02 10:35:51 +02:00
Andrew Ng	57c55165eb	[analyzer] Fix return of llvm::StringRef to destroyed std::string This issue was discovered whilst testing with ASAN. Differential Revision: https://reviews.llvm.org/D124683	2022-05-01 12:24:32 +01:00
Artem Dergachev	f68c0a2f58	[analyzer] Add path note tags to standard library function summaries. The patch is straightforward except the tiny fix in BugReporterVisitors.cpp that suppresses a default note for "Assuming pointer value is null" when a note tag from the checker is present. This is probably the right thing to do but also definitely not a complete solution to the problem of different sources of path notes being unaware of each other, which is a large and annoying issue that we have to deal with. Note tags really help there because they're nicely introspectable. The problem is demonstrated by the newly added getenv() test. Differential Revision: https://reviews.llvm.org/D122285	2022-04-28 17:17:05 -07:00
Balazs Benics	be744da01f	[analyzer] Fix ValistChecker false-positive involving symbolic pointers In the following example: int va_list_get_int(va_list va) { return va_arg(va, int); // FP } The `*va` expression will be something like `Element{SymRegion{va}, 0, va_list}`. We use `ElementRegions` for representing the result of the dereference. In this case, the `IsSymbolic` was set to `false` in the `getVAListAsRegion()`. Hence, before checking if the memregion is a SymRegion, we should take the base of that region. Analogously to the previous example, one can craft other cases: struct MyVaList { va_list l; }; int va_list_get_int(struct MyVaList va) { return va_arg(va.l, int); // FP } But it would also work if the `va_list` would be in the base or derived part of a class. `ObjCIvarRegions` are likely also susceptible. I'm not explicitly demonstrating these cases. PS: Check the `MemRegion::getBaseRegion()` definition. Fixes #55009 Reviewed By: xazax.hun Differential Revision: https://reviews.llvm.org/D124239	2022-04-26 08:49:05 +02:00
Vince Bridgers	3566bbe62f	[analyzer] Add option for AddrSpace in core.NullDereference check This change adds an option to detect all null dereferences for non-default address spaces, except for address spaces 256, 257 and 258. Those address spaces are special since null dereferences are not errors. All address spaces can be considered (except for 256, 257, and 258) by using -analyzer-config core.NullDereference:DetectAllNullDereferences=true. This option is false by default, retaining the original behavior. A LIT test was enhanced to cover this case, and the rst documentation was updated to describe this behavior. Reviewed By: steakhal Differential Revision: https://reviews.llvm.org/D122841	2022-04-24 03:51:49 -05:00
Vince Bridgers	5114db933d	[analyzer] Clean checker options from bool to DefaultBool (NFC) A recent review emphasized the preference to use DefaultBool instead of bool for checker options. This change is a NFC and cleans up some of the instances where bool was used, and could be changed to DefaultBool. Reviewed By: steakhal Differential Revision: https://reviews.llvm.org/D123464	2022-04-23 14:47:29 -05:00
Nathan James	cfb8169059	[clang] Add a raw_ostream operator<< overload for QualType Under the hood this prints the same as `QualType::getAsString()` but cuts out the middle-man when that string is sent to another raw_ostream. Also cleaned up all the call sites where this occurs. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D123926	2022-04-20 22:09:05 +01:00
Aaron Ballman	9955f14aaf	[C2x] Disallow functions without prototypes/functions with identifier lists WG14 has elected to remove support for K&R C functions in C2x. The feature was introduced into C89 already deprecated, so after this long of a deprecation period, the committee has made an empty parameter list mean the same thing in C as it means in C++: the function accepts no arguments exactly as if the function were written with (void) as the parameter list. This patch implements WG14 N2841 No function declarators without prototypes (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2841.htm) and WG14 N2432 Remove support for function definitions with identifier lists (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2432.pdf). It also adds the -fno-knr-functions command line option to opt into this behavior in other language modes. Differential Revision: https://reviews.llvm.org/D123955	2022-04-20 13:28:15 -04:00
Denys Petrov	e37726beb2	[analyzer] Implemented RangeSet::Factory::castTo function to perform promotions, truncations and conversions. Summary: Handle casts for ranges working similarly to APSIntType::apply function but for the whole range set. Support promotions, truncations and conversions. Example: promotion: char [0, 42] -> short [0, 42] -> int [0, 42] -> llong [0, 42] truncation: llong [4295033088, 4295033130] -> int [65792, 65834] -> short [256, 298] -> char [0, 42] conversion: char [-42, 42] -> uint [0, 42]U[4294967254, 4294967295] -> short[-42, 42] Differential Revision: https://reviews.llvm.org/D103094	2022-04-19 22:34:03 +03:00
Tom Ritter	82f3ed9904	[analyzer] Expose Taint.h to plugins Reviewed By: NoQ, xazax.hun, steakhal Differential Revision: https://reviews.llvm.org/D123155	2022-04-19 16:55:01 +02:00
Kristóf Umann	fd8e5762f8	[analyzer] Don't track function calls as control dependencies I recently evaluated ~150 bug reports on open source projects relating to my GSoC'19 project, which was about tracking control dependencies that were relevant to a bug report. Here is what I found: when the condition is a function call, the extra notes were almost always unimportant, and oftentimes intrusive: void f(int *x) { x = nullptr; if (alwaysTrue()) // We don't need a whole lot of explanation // here, the function name is good enough. *x = 5; } It almost always boiled down to a few "Returning null pointer, which participates in a condition later", or similar notes. I struggled to find a single case where the notes revealed anything interesting or some previously hidden correlation, which is kind of the point of condition tracking. This patch checks whether the condition is a function call, and if so, bails out. The argument against the patch is the popular feedback we hear from some of our users, namely that they can never have too much information. I was specifically fishing for examples that display best that my contribution did more good than harm, so admittedly I set the bar high, and one can argue that there can be non-trivial trickery inside functions, and function names may not be that descriptive. My argument for the patch is all those reports that got longer without any notable improvement in the report intelligibility. I think the few exceptional cases where this patch would remove notable information are an acceptable sacrifice in favor of more reports being leaner. Differential Revision: https://reviews.llvm.org/D116597	2022-04-08 10:16:58 +02:00
Gabor Marton	e63b81d10e	[analyzer][ctu] Only import const and trivial VarDecls Import the definition of an object from a foreign translation unit only if its type is const and trivial. Differential Revision: https://reviews.llvm.org/D122805	2022-04-01 13:49:39 +02:00
Vince Bridgers	4d5b824e3d	[analyzer] Avoid checking addrspace pointers in cstring checker This change fixes an assert that occurs in the SMT layer when refuting a finding that uses pointers of two different sizes. This was found in a downstream build that supports two different pointer sizes. The CString Checker was attempting to compute an overlap for the 'to' and 'from' pointers, where the pointers were of different sizes. In the downstream case where this was found, a specialized memcpy routine patterned after memcpy_special is used. The analyzer core hits on this builtin because it matches the 'memcpy' portion of that builtin. This cannot be duplicated in the upstream test since there are no specialized builtins that match that pattern, but the case does reproduce in the accompanying LIT test case. The amdgcn target was used for this reproducer. See the documentation for AMDGPU address spaces here https://llvm.org/docs/AMDGPUUsage.html#address-spaces. The assert seen is: `Solver->getSort(LHS) == Solver->getSort(RHS) && "AST's must have the same sort!"' Ack to steakhal for reviewing the fix, and creating the test case. Reviewed By: steakhal Differential Revision: https://reviews.llvm.org/D118050	2022-03-31 17:34:56 +02:00
Vince Bridgers	fe8b2236ef	[analyzer] Fix "RhsLoc and LhsLoc bitwidth must be same" clang: <root>/clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp:727: void assertEqualBitWidths(clang::ento::ProgramStateRef, clang::ento::Loc, clang::ento::Loc): Assertion `RhsBitwidth == LhsBitwidth && "RhsLoc and LhsLoc bitwidth must be same!"' This change adjusts the bitwidth of the smaller operand for an evalBinOp as a result of a comparison operation. This can occur in the specific case represented by the test cases for a target with different pointer sizes. Reviewed By: NoQ Differential Revision: https://reviews.llvm.org/D122513	2022-03-29 17:08:19 -05:00
Mike Rice	f82ec5532b	[OpenMP] Initial parsing/sema for the 'omp target parallel loop' construct Adds basic parsing/sema/serialization support for the #pragma omp target parallel loop directive. Differential Revision: https://reviews.llvm.org/D122359	2022-03-24 09:19:00 -07:00
Vince Bridgers	9ef7ac51af	[analyzer] Fix crash in RangedConstraintManager.cpp This change fixes a crash in RangedConstraintManager.cpp:assumeSym due to an unhandled BO_Div case. clang: <root>clang/lib/StaticAnalyzer/Core/RangedConstraintManager.cpp:51: virtual clang::ento::ProgramStateRef clang::ento::RangedConstraintManager::assumeSym(clang::ento::ProgramStateRef, clang::ento::SymbolRef, bool): Assertion `BinaryOperator::isComparisonOp(Op)' failed. Reviewed By: NoQ Differential Revision: https://reviews.llvm.org/D122277	2022-03-23 08:26:40 -05:00
Vince Bridgers	5fdc4dd777	[analyzer] refactor makeIntValWithPtrWidth, remove getZeroWithPtrWidth (NFC) This is a NFC refactoring to change makeIntValWithPtrWidth and remove getZeroWithPtrWidth to use types when forming values to match pointer widths. Some targets may have different pointer widths depending upon address space, so this needs to be comprehended. Reviewed By: steakhal Differential Revision: https://reviews.llvm.org/D120134	2022-03-23 08:26:37 -05:00
Mike Rice	2cedaee6f7	[OpenMP] Initial parsing/sema for the 'omp parallel loop' construct Adds basic parsing/sema/serialization support for the #pragma omp parallel loop directive. Differential Revision: https://reviews.llvm.org/D122247	2022-03-22 13:55:47 -07:00
Vince Bridgers	985888411d	[analyzer] Refactor makeNull to makeNullWithWidth (NFC) Usages of makeNull need to be deprecated in favor of makeNullWithWidth for architectures where the pointer size should not be assumed. This can occur when pointer sizes can be of different sizes, depending on address space for example. See https://reviews.llvm.org/D118050 as an example. This was uncovered initially in a downstream compiler project, and tested through those systems tests. steakhal performed systems testing across a large set of open source projects. Co-authored-by: steakhal Resolves: https://github.com/llvm/llvm-project/issues/53664 Reviewed By: NoQ, steakhal Differential Revision: https://reviews.llvm.org/D119601	2022-03-22 07:35:13 -05:00
Mike Rice	6bd8dc91b8	[OpenMP] Initial parsing/sema for the 'omp target teams loop' construct Adds basic parsing/sema/serialization support for the #pragma omp target teams loop directive. Differential Revision: https://reviews.llvm.org/D122028	2022-03-18 13:48:32 -07:00
Mike Rice	79f661edc1	[OpenMP] Initial parsing/sema for the 'omp teams loop' construct Adds basic parsing/sema/serialization support for the #pragma omp teams loop directive. Differential Revision: https://reviews.llvm.org/D121713	2022-03-16 14:39:18 -07:00
phyBrackets	90a6e35478	[analyzer][NFC] Merge similar conditional paths Reviewed By: aaron.ballman, steakhal Differential Revision: https://reviews.llvm.org/D121045	2022-03-07 22:05:27 +05:30
Endre Fülöp	4fd6c6e65a	[analyzer] Add more propagations to Taint analysis Add more functions as taint propagators to GenericTaintChecker. Reviewed By: steakhal Differential Revision: https://reviews.llvm.org/D120369	2022-03-07 13:18:54 +01:00
Shivam	56eaf869be	[analyzer] Done some changes to detect Uninitialized read by the char array manipulation functions A few weeks back I was experimenting with reading uninitialized values from src, which is actually a bug, but the CSA seems to give up at that point. I was curious about that and pinged @steakhal on Discord; according to him this is a genuine issue that needs to be fixed. So I went ahead with fixing this bug, with thanks to @steakhal, who helped me create this patch. This feature seems to break some tests, but this was a genuine problem and the broken tests also need to be fixed accordingly. I added a test, but we need more tests; I'll try to add more. Thanks. Reviewed By: steakhal, NoQ Differential Revision: https://reviews.llvm.org/D120489	2022-03-04 00:21:06 +05:30
Shivam	bd1917c88a	[analyzer] Done some changes to detect Uninitialized read by the char array manipulation functions A few weeks back I was experimenting with reading uninitialized values from src, which is actually a bug, but the CSA seems to give up at that point. I was curious about that and pinged @steakhal on Discord; according to him this is a genuine issue that needs to be fixed. So I went ahead with fixing this bug, with thanks to @steakhal, who helped me create this patch. This feature seems to break some tests, but this was a genuine problem and the broken tests also need to be fixed accordingly. I added a test, but we need more tests; I'll try to add more. Thanks. Reviewed By: steakhal, NoQ Differential Revision: https://reviews.llvm.org/D120489	2022-03-03 23:21:26 +05:30
Kristóf Umann	d832078904	[analyzer] Improve NoOwnershipChangeVisitor's understanding of deallocators The problem with leak bug reports is that the most interesting event in the code is likely the one that did not happen -- lack of ownership change and lack of deallocation, which is often present within the same function that the analyzer inlined anyway, but not on the path of execution on which the bug occurred. We struggle to understand that a function was responsible for freeing the memory, but failed. D105819 added a new visitor to improve memory leak bug reports. In addition to inspecting the ExplodedNodes of the bug path, the visitor tries to guess whether the function was supposed to free memory, but failed to. Initially (in D108753), this was done by checking whether a CXXDeleteExpr is present in the function. If so, we assume that the function was at least partly responsible, and prevent the analyzer from pruning bug report notes in it. This patch improves this heuristic by recognizing all deallocator functions that MallocChecker itself recognizes, by reusing MallocChecker::isFreeingCall. Differential Revision: https://reviews.llvm.org/D118880	2022-03-03 11:27:56 +01:00
Simon Pilgrim	ca94f28d15	[clang] ExprEngine::VisitCXXNewExpr - remove superfluous nullptr tests FD has already been dereferenced	2022-03-02 15:59:10 +00:00
Kristóf Umann	32ac21d049	[NFC][analyzer] Allow CallDescriptions to be matched with CallExprs Since CallDescriptions can only be matched against CallEvents that are created during symbolic execution, it was not possible to use them in syntactic-only contexts. For example, even though InnerPointerChecker can check with its set of CallDescriptions whether a function call is of interest during analysis, it is unable to check without hassle whether a non-analyzer piece of code also calls such a function. The patch adds the ability to use CallDescriptions in syntactic contexts as well. While we already have that in Signature, we still want to leverage the ability to use dynamic information when we have it (function pointers, for example). This could be done with Signature as well (StdLibraryFunctionsChecker does it), but it makes it even less of a drop-in replacement. Differential Revision: https://reviews.llvm.org/D119004	2022-03-01 17:13:04 +01:00
Balázs Kéri	d8a2afb244	[clang][analyzer] Add modeling of 'errno'. Add a checker to maintain the system-defined value 'errno'. The value is supposed to be set in the future by existing or new checkers that evaluate errno-modifying function calls. Reviewed By: NoQ, steakhal Differential Revision: https://reviews.llvm.org/D120310	2022-03-01 08:20:33 +01:00
Dawid Jurczak	b3e2dac27c	[NFC] Don't pass temporary LangOptions to Lexer Since https://reviews.llvm.org/D120334 we shouldn't pass temporary LangOptions to Lexer. This change fixes stack-use-after-scope UB in LocalizationChecker found by the sanitizer-x86_64-linux-fast buildbot and resolves a similar issue in HeaderIncludes.	2022-02-28 20:43:28 +01:00
Endre Fülöp	34a7387986	[analyzer] Add more sources to Taint analysis Add more functions as taint sources to GenericTaintChecker. Reviewed By: steakhal Differential Revision: https://reviews.llvm.org/D120236	2022-02-28 11:33:02 +01:00
Aaron Ballman	f9e8e92cf5	Revert "[clang][analyzer] Add modeling of 'errno'." This reverts commit `29b512ba32`. This broke several build bots: https://lab.llvm.org/buildbot/#/builders/86/builds/30183 https://lab.llvm.org/buildbot/#/builders/216/builds/488	2022-02-25 07:21:01 -05:00
Balázs Kéri	29b512ba32	[clang][analyzer] Add modeling of 'errno'. Add a checker to maintain the system-defined value 'errno'. The value is supposed to be set in the future by existing or new checkers that evaluate errno-modifying function calls. Reviewed By: NoQ, steakhal Differential Revision: https://reviews.llvm.org/D120310	2022-02-25 12:42:55 +01:00
Fangrui Song	ecff9b65b5	[analyzer] Just use default capture after `7fd60ee6e0`	2022-02-24 10:06:11 -08:00
Fangrui Song	7fd60ee6e0	[analyzer] Fix -Wunused-lambda-capture in -DLLVM_ENABLE_ASSERTIONS=off builds	2022-02-24 00:13:13 -08:00
Balazs Benics	7036413dc2	Revert "Revert "[analyzer] Fix taint rule of fgets and setproctitle_init"" This reverts commit `2acead35c1`. Let's try `REQUIRES: asserts`.	2022-02-23 12:55:31 +01:00
Balazs Benics	a848a5cf2f	Revert "Revert "[analyzer] Fix taint propagation by remembering to the location context"" This reverts commit `d16c5f4192`. Let's try `REQUIRES: asserts`.	2022-02-23 12:53:07 +01:00
Balazs Benics	fa0a80e017	Revert "Revert "[analyzer] Add failing test case demonstrating buggy taint propagation"" This reverts commit `b8ae323cca`. Let's try `REQUIRES: asserts`.	2022-02-23 10:48:06 +01:00
Artem Dergachev	e0e174845b	[analyzer] Fix a crash in NoStateChangeVisitor with body-farmed stack frames. LocationContext::getDecl() isn't useful for obtaining the "farmed" body because the (synthetic) body statement isn't actually attached to the (natural-grown) declaration in the AST. Differential Revision: https://reviews.llvm.org/D119509	2022-02-17 10:13:34 -08:00
Balazs Benics	b3c0014e5a	Revert "Revert "[analyzer] Prevent misuses of -analyze-function"" This reverts commit `620d99b7ed`. Let's see if removing the two offending RUN lines makes this patch pass. Not ideal to drop tests but, it's just a debugging feature, probably not that important.	2022-02-16 10:33:21 +01:00
Balazs Benics	b8ae323cca	Revert "[analyzer] Add failing test case demonstrating buggy taint propagation" This reverts commit `744745ae19`. I'm reverting this since this patch caused a build breakage. https://lab.llvm.org/buildbot/#/builders/91/builds/3818	2022-02-14 18:45:46 +01:00
Balazs Benics	d16c5f4192	Revert "[analyzer] Fix taint propagation by remembering to the location context" This reverts commit `b099e1e562`. I'm reverting this since the head of the patch stack caused a build breakage. https://lab.llvm.org/buildbot/#/builders/91/builds/3818	2022-02-14 18:45:46 +01:00
Balazs Benics	2acead35c1	Revert "[analyzer] Fix taint rule of fgets and setproctitle_init" This reverts commit `bf5963bf19`. I'm reverting this since the head of the patch stack caused a build breakage. https://lab.llvm.org/buildbot/#/builders/91/builds/3818	2022-02-14 18:45:46 +01:00
Balazs Benics	bf5963bf19	[analyzer] Fix taint rule of fgets and setproctitle_init There was a typo in the rule. `{{0}, ReturnValueIndex}` meant that the discrete index is `0` and the variadic index is `-1`. What we wanted instead is that both `0` and `-1` are in the discrete index list. Instead of this, we wanted to express that both `0` and the `ReturnValueIndex` is in the discrete arg list. The manual inspection revealed that `setproctitle_init` also suffered a probably incomplete propagation rule. Reviewed By: Szelethus, gamesh411 Differential Revision: https://reviews.llvm.org/D119129	2022-02-14 16:55:55 +01:00
Balazs Benics	b099e1e562	[analyzer] Fix taint propagation by remembering to the location context Fixes the issue D118987 by mapping the propagation to the callsite's LocationContext. This way we can keep track of the in-flight propagations. Note that empty propagation sets won't be inserted. Reviewed By: NoQ, Szelethus Differential Revision: https://reviews.llvm.org/D119128	2022-02-14 16:55:55 +01:00
Balazs Benics	744745ae19	[analyzer] Add failing test case demonstrating buggy taint propagation Recently we uncovered a serious bug in the `GenericTaintChecker`. It was already flawed before D116025, but that was the patch that turned this silent bug into a crash. It happens if the `GenericTaintChecker` has a rule for a function, which also has a definition. char *fgets(char *s, int n, FILE *fp) { nested_call(); // no parameters! return (char *)0; } // Within some function: fgets(..., tainted_fd); When the engine inlines the definition and finds a function call within that, the `PostCall` event for the call will get triggered sooner than the `PostCall` for the original function. This mismatch violates the assumption of the `GenericTaintChecker` which wants to propagate taint information from the `PreCall` event to the `PostCall` event, where it can actually bind taint to the return value of the same call. Let's get back to the example and go through step-by-step. The `GenericTaintChecker` will see the `PreCall<fgets(..., tainted_fd)>` event, so it would 'remember' that it needs to taint the return value and the buffer, from the `PostCall` handler, where it has access to the return value symbol. However, the engine will inline fgets and the `nested_call()` gets evaluated subsequently, which produces an unimportant `PreCall<nested_call()>`, then a `PostCall<nested_call()>` event, which is observed by the `GenericTaintChecker`, which will unconditionally mark tainted the 'remembered' arg indexes, trying to access a non-existing argument, resulting in a crash. If it doesn't crash, it will behave completely unintuitively, by marking completely unrelated memory regions tainted, which is even worse. The resulting assertion is something like this: Expr.h: const Expr *CallExpr::getArg(unsigned int) const: Assertion `Arg < getNumArgs() && "Arg access out of range!"' failed. 
The gist of the backtrace: CallExpr::getArg(unsigned int) const SimpleFunctionCall::getArgExpr(unsigned int) CallEvent::getArgSVal(unsigned int) const GenericTaintChecker::checkPostCall(const CallEvent &, CheckerContext&) const Prior to D116025, there was a check for the argument count before it applied taint, however, it still suffered from the same underlying issue/bug regarding propagation. This patch does not intend to fix the bug, but rather to start a discussion on how to fix this. --- Let me elaborate on how I see this problem. This pre-call, post-call juggling is just a workaround. The engine should by itself propagate taint where necessary right where it invalidates regions. For the tracked values, which potentially escape, we need to erase the information we know about them; and this is exactly what is done by invalidation. However, in the case of taint, we basically want to approximate from the opposite side of the spectrum. We want to preserve taint in most cases, rather than cleansing them. Now, we basically sanitize all escaping tainted regions implicitly, since invalidation binds a fresh conjured symbol for the given region, and that has not been associated with taint. IMO this is a bad default behavior, we should be more aggressive about preserving taint if not further spreading taint to the reachable regions. We have a couple of options for dealing with it (let's call it //tainting policy//): 1) Taint only the parameters which were tainted prior to the call. 2) Taint the return value of the call, since it likely depends on the tainted input - if any arguments were tainted. 3) Taint all escaped regions - (maybe transitively using the cluster algorithm) - if any arguments were tainted. 4) Not taint anything - this is what we do right now :D The `ExprEngine` should not deal with taint on its own. It should be done by a checker, such as the `GenericTaintChecker`. However, the `Pre`-`PostCall` checker callbacks are not designed for this. 
`RegionChanges` would be a much better fit for modeling taint propagation. What we would need in the `RegionChanges` callback is the `State` prior invalidation, the `State` after the invalidation, and a `CheckerContext` in which the checker can create transitions, where it would place `NoteTags` for the modeled taint propagations and report errors if a taint sink rule gets violated. In this callback, we could query from the prior State, if the given value was tainted; then act and taint if necessary according to the checker's tainting policy. By using RegionChanges for this, we would 'fix' the mentioned propagation bug 'by-design'. Reviewed By: Szelethus Differential Revision: https://reviews.llvm.org/D118987	2022-02-14 16:55:55 +01:00
phyBrackets	6745b6a0f1	[analyzer][NFCi] Use the correct BugType in CStringChecker. There are different bug types for different kinds of bugs, but emitAdditionOverflowBug uses the bug type BT_NotCString when it should use BT_AdditionOverflow. Reviewed By: steakhal Differential Revision: https://reviews.llvm.org/D119462	2022-02-14 20:54:59 +05:30
Balazs Benics	abc873694f	[analyzer] Restrict CallDescription fuzzy builtin matching `CallDescriptions` for builtin functions relax the match rules somewhat, so that the `CallDescription` will match for calls that have some prefix or suffix. This was achieved by doing a `StringRef::contains()`. However, this is somewhat problematic for builtins that are substrings of each other. Consider the following: `CallDescription{ builtin, "memcpy"}` will match for `__builtin_wmemcpy()` calls, which is unfortunate. This patch addresses/works around the issue by checking if the characters around the function's name are not part of the 'name' semantically. In other words, to accept a match for `"memcpy"` the call should not have alphanumeric (`[a-zA-Z]`) characters around the 'match'. So, `CallDescription{ builtin, "memcpy"}` will not match on: - `__builtin_wmemcpy`: there is a `w` alphanumeric character before the match. - `__builtin_memcpyFOoBar_inline`: there is an `F` character after the match. - `__builtin_memcpyX_inline`: there is an `X` character after the match. But it will still match for: - `memcpy`: exact match - `__builtin_memcpy`: there is an _ before the match - `__builtin_memcpy_inline`: there is an _ after the match - `memcpy_inline_builtinFooBar`: there is an _ after the match Reviewed By: NoQ Differential Revision: https://reviews.llvm.org/D118388	2022-02-11 10:45:18 +01:00
Sylvestre Ledru	f2c2e924e7	Fix a typo (occured => occurred) Reported: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1005195	2022-02-08 21:35:26 +01:00
Balazs Benics	620d99b7ed	Revert "[analyzer] Prevent misuses of -analyze-function" This reverts commit `841817b1ed`. Ah, it still fails on build bots for some reason. Pinning the target triple was not enough.	2022-02-08 17:42:46 +01:00
Balazs Benics	841817b1ed	[analyzer] Prevent misuses of -analyze-function Sometimes when I pass the mentioned option I forget about passing the parameter list for C++ sources. It would also be useful for newcomers to learn about this. This patch introduces some logic checking common misuses involving `-analyze-function`. Reviewed-By: martong Differential Revision: https://reviews.llvm.org/D118690	2022-02-08 17:27:57 +01:00
Jun Zhang	65adf7c211	[NFC][Analyzer] Use range based for loop. Use a range-based for loop to improve code readability. Differential Revision: https://reviews.llvm.org/D119103	2022-02-07 15:45:58 +08:00
Rashmi Mudduluru	faabdfcf7f	[analyzer] Add support for __attribute__((returns_nonnull)). Differential Revision: https://reviews.llvm.org/D118657	2022-02-02 11:46:52 -08:00
Balazs Benics	e99abc5d8a	Revert "[analyzer] Prevent misuses of -analyze-function" This reverts commit `9d6a615973`. Exit Code: 1 Command Output (stderr): -- /scratch/buildbot/bothome/clang-ve-ninja/llvm-project/clang/test/Analysis/analyze-function-guide.cpp:53:21: error: CHECK-EMPTY-NOT: excluded string found in input // CHECK-EMPTY-NOT: Every top-level function was skipped. ^ <stdin>:1:1: note: found here Every top-level function was skipped. ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Input file: <stdin> Check file: /scratch/buildbot/bothome/clang-ve-ninja/llvm-project/clang/test/Analysis/analyze-function-guide.cpp -dump-input=help explains the following input dump. Input was: <<<<<< 1: Every top-level function was skipped. not:53 !~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match expected 2: Pass the -analyzer-display-progress for tracking which functions are analyzed. >>>>>>	2022-02-02 11:44:27 +01:00
Balazs Benics	9d6a615973	[analyzer] Prevent misuses of -analyze-function Sometimes when I pass the mentioned option I forget about passing the parameter list for C++ sources. It would also be useful for newcomers to learn about this. This patch introduces some logic checking common misuses involving `-analyze-function`. Reviewed-By: martong Differential Revision: https://reviews.llvm.org/D118690	2022-02-02 11:31:22 +01:00
Tres Popp	262cc74e0b	Fix pair construction with an implicit constructor inside.	2022-01-18 18:01:52 +01:00
Endre Fülöp	17f74240e6	[analyzer][NFC] Refactor GenericTaintChecker to use CallDescriptionMap GenericTaintChecker now uses CallDescriptionMap to describe the possible operations in code that trigger the introduction (sources), the removal (filters), the passing along (propagations) and the detection (sinks) of tainted values. Reviewed By: steakhal, NoQ Differential Revision: https://reviews.llvm.org/D116025	2022-01-18 16:04:04 +01:00
Denys Petrov	d835dd4cf5	[analyzer] Produce SymbolCast symbols for integral types in SValBuilder::evalCast Summary: Produce SymbolCast for integral types in `evalCast` function. Apply several simplification techniques while producing the symbols. Added a boolean option `handle-integral-cast-for-ranges` under `-analyzer-config` flag. Disabled the feature by default. Differential Revision: https://reviews.llvm.org/D105340	2022-01-18 16:08:04 +02:00
Kazu Hirata	17d4bd3d78	[clang] Fix bugprone argument comments (NFC) Identified with bugprone-argument-comment.	2022-01-09 00:19:49 -08:00
Kazu Hirata	40446663c7	[clang] Use true/false instead of 1/0 (NFC) Identified with modernize-use-bool-literals.	2022-01-09 00:19:47 -08:00
Kazu Hirata	d1b127b5b7	[clang] Remove unused forward declarations (NFC)	2022-01-08 11:56:40 -08:00
Qiu Chaofan	c2cc70e4f5	[NFC] Fix endif comments to match with include guard	2022-01-07 15:52:59 +08:00
Kazu Hirata	d677a7cb05	[clang] Remove redundant member initialization (NFC) Identified with readability-redundant-member-init.	2022-01-02 10:20:23 -08:00
Kazu Hirata	298367ee6e	[clang] Use nullptr instead of 0 or NULL (NFC) Identified with modernize-use-nullptr.	2021-12-29 08:34:20 -08:00
Kazu Hirata	6c335b1a45	[clang] Remove unused "using" (NFC) Identified by misc-unused-using-decls.	2021-12-27 20:48:21 -08:00
Kazu Hirata	0542d15211	Remove redundant string initialization (NFC) Identified with readability-redundant-string-init.	2021-12-26 09:39:26 -08:00
Kazu Hirata	34558b039b	[StaticAnalyzer] Remove redundant declaration isStdSmartPtr (NFC) An identical declaration is present just a couple of lines above the line being removed in this patch. Identified with readability-redundant-declaration.	2021-12-25 00:35:41 -08:00
Sami Tolvanen	ec2e26eaf6	[Clang] Add __builtin_function_start Control-Flow Integrity (CFI) replaces references to address-taken functions with pointers to the CFI jump table. This is a problem for low-level code, such as operating system kernels, which may need the address of an actual function body without the jump table indirection. This change adds the __builtin_function_start() builtin, which accepts an argument that can be constant-evaluated to a function, and returns the address of the function body. Link: https://github.com/ClangBuiltLinux/linux/issues/1353 Depends on D108478 Reviewed By: pcc, rjmccall Differential Revision: https://reviews.llvm.org/D108479	2021-12-20 12:55:33 -08:00
Kazu Hirata	713ee230f8	[clang] Use llvm::reverse (NFC)	2021-12-17 16:51:42 -08:00
Denys Petrov	da8bd972a3	[analyzer][NFC] Change return value of StoreManager::attemptDownCast function from SVal to Optional<SVal> Summary: Refactor return value of `StoreManager::attemptDownCast` function by removing the last parameter `bool &Failed` and replace the return value `SVal` with `Optional<SVal>`. Make the function consistent with the family of `evalDerivedToBase` by renaming it to `evalBaseToDerived`. Aligned the code on the call side with these changes. Differential Revision: https://reviews.llvm.org/	2021-12-17 13:03:47 +02:00
Gabor Marton	bd9e23943a	[analyzer] Expand conversion check to check more expressions for overflow and underflow This expands checking for more expressions. This will check underflow and loss of precision when using call expressions like: void foo(unsigned); int i = -1; foo(i); This also includes other expressions as well, so it can catch negative indices to std::vector since it uses unsigned integers for [] and .at() function. Patch by: @pfultz2 Differential Revision: https://reviews.llvm.org/D46081	2021-12-15 11:41:34 +01:00
Denys Petrov	6a399bf4b3	[analyzer] Implemented RangeSet::Factory::unite function to handle intersections and adjacency Summary: Handle intersected and adjacent ranges uniting them into a single one. Example: intersection [0, 10] U [5, 20] = [0, 20] adjacency [0, 10] U [11, 20] = [0, 20] Differential Revision: https://reviews.llvm.org/D99797	2021-12-10 18:48:02 +02:00
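The two cases named in the commit above (intersection and adjacency) can be illustrated with a small standalone sketch. It assumes closed integer ranges stored as pairs; `Range` and `uniteRanges` are hypothetical names for this illustration and do not reflect the real `RangeSet::Factory` interface or its algorithm.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical representation: a closed range [first, second].
using Range = std::pair<long long, long long>;

// Unite a list of closed ranges, merging ranges that intersect
// ([0, 10] U [5, 20]) or are adjacent ([0, 10] U [11, 20]).
std::vector<Range> uniteRanges(std::vector<Range> Ranges) {
  std::sort(Ranges.begin(), Ranges.end());
  std::vector<Range> Result;
  for (const Range &R : Ranges) {
    if (!Result.empty()) {
      Range &Back = Result.back();
      // The second test only runs when R.first > Back.second, so
      // Back.second + 1 cannot overflow.
      if (R.first <= Back.second || R.first == Back.second + 1) {
        Back.second = std::max(Back.second, R.second);
        continue;
      }
    }
    Result.push_back(R);
  }
  return Result;
}
```

Ranges separated by a gap, such as `[0, 10]` and `[12, 20]`, stay separate in this sketch.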
Logan Smith	715c72b4fb	[NFC][analyzer] Return underlying strings directly instead of OS.str() This avoids an unnecessary copy required by 'return OS.str()', allowing instead for NRVO or implicit move. The .str() call (which flushes the stream) is no longer required since `65b13610a5`, which made raw_string_ostream unbuffered by default. Differential Revision: https://reviews.llvm.org/D115374	2021-12-09 16:05:46 -08:00
Gabor Marton	978431e80b	[Analyzer] SValBuilder: Simplify a SymExpr to the absolute simplest form Move the SymExpr simplification fixpoint logic into SValBuilder. Differential Revision: https://reviews.llvm.org/D114938	2021-12-07 10:02:32 +01:00
Balazs Benics	a6816b957d	[analyzer][solver] Fix assertion on (NonLoc, Op, Loc) expressions Previously, the `SValBuilder` could not encounter expressions of the following kind: NonLoc OP Loc Loc OP NonLoc Where the `Op` is other than `BO_Add`. As of now, due to the smarter simplification and the fixpoint iteration, it turns out we can. It can happen if the `Loc` was perfectly constrained to a concrete value (`nonloc::ConcreteInt`), thus the simplifier can do constant-folding in these cases as well. Unfortunately, this could cause assertion failures, since we assumed that the operator must be `BO_Add`, causing a crash. --- In the patch, I decided to preserve the original behavior (i.e. swap the operands if the operator is commutative), but if the `RHS` was a `loc::ConcreteInt`, call `evalBinOpNN()`. I think this interpretation of the arithmetic expression is closer to reality. I also tried naively introducing a separate handler for `loc::ConcreteInt` RHS, before handling the more generic `Loc` RHS case. However, it broke the `zoo1backwards()` test in the `nullptr.cpp` file. This highlighted for me the importance of preserving the original behavior for `BO_Add` at least. PS: Sorry for introducing yet another branch into this `evalBinOpXX` madness. I've got a couple of ideas about refactoring these. We'll see if I can get to it. The test file demonstrates the issue and makes sure nothing similar happens. The `no-crash` annotated lines show where we crashed before applying this patch. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D115149	2021-12-06 18:38:58 +01:00
Balazs Benics	9873ef409c	[analyzer] Ignore flex generated files Some projects [1,2,3] have flex-generated files besides bison-generated ones. Unfortunately, the comment `"/* A lexical scanner generated by flex */"` generated by the tools is not necessarily at the beginning of the file, thus we need to quickly skim through the file for this needle string. Luckily, StringRef can do this operation in an efficient way. That being said, now the bison comment is not required to be at the very beginning of the file. This allows us to detect a couple more cases [4,5,6]. Alternatively, we could say that we only allow whitespace characters before matching the bison/flex header comment. That would prevent the (probably) unnecessary string search in the buffer. However, I could not verify that these tools would actually respect this assumption. Additionally to this, e.g. the Twin project [1] has other non-whitespace characters (some preprocessor directives) before the flex-generated header comment. So the heuristic in the previous paragraph won't work with that. Thus, I would advocate the current implementation. According to my measurement, this patch won't introduce measurable performance degradation, even though we will do 2 linear scans. I introduce the ignore-bison-generated-files and ignore-flex-generated-files to disable skipping these files. Both of these options are true by default. [1]: https://github.com/cosmos72/twin/blob/master/server/rcparse_lex.cpp#L7 [2]: `22362cdcf9/sandbox/count-words/lexer.c (L6)` [3]: `11abdf6462/lab1/lex.yy.c (L6)` [4]: `47f5b2cfe2/B_yacc/1/y1.tab.h (L2)` [5]: `71d1bf9b1e/src/VBox/Additions/x11/x11include/xorg-server-1.8.0/parser.h (L2)` [6]: `3f773ceb13/Framework/OpenEars.framework/Versions/A/Headers/jsgf_parser.h (L2)` Reviewed By: xazax.hun Differential Revision: https://reviews.llvm.org/D114510	2021-12-06 10:20:17 +01:00
Gabor Marton	20f8733d4b	[Analyzer][solver] Simplification: Do a fixpoint iteration before the eq class merge This reverts commit `f02c5f3478` and addresses the issue mentioned in D114619 differently. Repeating the issue here: Currently, during symbol simplification we remove the original member symbol from the equivalence class (`ClassMembers` trait). However, we keep the reverse link (`ClassMap` trait), in order to be able to query the related constraints even for the old member. This asymmetry can lead to a problem when we merge equivalence classes: ``` ClassA: [a, b] // ClassMembers trait, a->a, b->a // ClassMap trait, a is the representative symbol ``` Now let's delete `a`: ``` ClassA: [b] a->a, b->a ``` Let's merge ClassA into the trivial class `c`: ``` ClassA: [c, b] c->c, b->c, a->a ``` Now, after the merge operation, `c` and `a` are actually in different equivalence classes, which is inconsistent. This issue manifests in a test case (added in D103317): ``` void recurring_symbol(int b) { if (b * b != b) if ((b * b) * b * b != (b * b) * b) if (b * b == 1) } ``` Before the simplification we have these equivalence classes: ``` trivial EQ1: [b * b != b] trivial EQ2: [(b * b) * b * b != (b * b) * b] ``` During the simplification with `b * b == 1`, EQ1 is merged with `1 != b` `EQ1: [b * b != b, 1 != b]` and we remove the complex symbol, so `EQ1: [1 != b]` Then we start to simplify the only symbol in EQ2: `(b * b) * b * b != (b * b) * b --> 1 * b * b != 1 * b --> b * b != b` But `b * b != b` is a symbol that had previously been removed from EQ1, thus we reach the above-mentioned inconsistency. This patch addresses the issue by making it impossible to synthesise a symbol that had been simplified before. We achieve this by simplifying the given symbol to the absolute simplest form. Differential Revision: https://reviews.llvm.org/D114887	2021-12-01 22:23:41 +01:00
Gabor Marton	0a17896fe6	[Analyzer][Core] Make SValBuilder to better simplify svals with 3 symbols in the tree Add the capability to simplify more complex constraints where there are 3 symbols in the tree. In this change I extend simplifySVal to query constraints of children sub-symbols in a symbol tree. (The constraint for the parent is asked in getKnownValue.) Differential Revision: https://reviews.llvm.org/D103317	2021-11-30 11:24:59 +01:00
Gabor Marton	f02c5f3478	[Analyzer][solver] Do not remove the simplified symbol from the eq class Currently, during symbol simplification we remove the original member symbol from the equivalence class (`ClassMembers` trait). However, we keep the reverse link (`ClassMap` trait), in order to be able to query the related constraints even for the old member. This asymmetry can lead to a problem when we merge equivalence classes: ``` ClassA: [a, b] // ClassMembers trait, a->a, b->a // ClassMap trait, a is the representative symbol ``` Now let's delete `a`: ``` ClassA: [b] a->a, b->a ``` Let's merge the trivial class `c` into ClassA: ``` ClassA: [c, b] c->c, b->c, a->a ``` Now after the merge operation, `c` and `a` are actually in different equivalence classes, which is inconsistent. One solution to this problem is to simply avoid removing the original member and this is what this patch does. Other options I have considered: 1) Always merge the trivial class into the non-trivial class. This might work most of the time, however, it will fail if we have to merge two non-trivial classes (in that case we can no longer track equivalences precisely). 2) In `removeMember`, update the reverse link as well. This would remove the inconsistency, but we'd lose precision since we could not query the constraints for the removed member. Differential Revision: https://reviews.llvm.org/D114619	2021-11-30 11:13:13 +01:00
Balazs Benics	af37d4b6fe	[analyzer][NFC] Refactor AnalysisConsumer::getModeForDecl() I just read this part of the code, and I found the nested ifs less readable. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D114441	2021-11-29 10:39:36 +01:00
Zarko Todorovski	d42a6432aa	[NFC][clang]Inclusive language: remove remaining uses of sanity Missed some uses of sanity check in previous commits.	2021-11-24 14:20:13 -05:00
Gabor Marton	12887a2024	[Analyzer][Core] Better simplification in SimpleSValBuilder::evalBinOpNN Make the SValBuilder capable of simplifying existing SVals based on newly added constraints when evaluating a BinOp. Before this patch, we called `simplify` only in some edge cases. However, we can and should investigate the constraints in all cases. Differential Revision: https://reviews.llvm.org/D113753	2021-11-23 16:38:01 +01:00
Gabor Marton	ffc32efd1c	[Analyzer][Core] Simplify IntSym in SValBuilder Make the SimpleSValBuilder capable of simplifying existing IntSym expressions based on a newly added constraint on the sub-expression. Differential Revision: https://reviews.llvm.org/D113754	2021-11-22 17:33:43 +01:00
Zarko Todorovski	d8e5a0c42b	[clang][NFC] Inclusive terms: replace some uses of sanity in clang Rewording of comments to avoid using `sanity test, sanity check`. Reviewed By: aaron.ballman, Quuxplusone Differential Revision: https://reviews.llvm.org/D114025	2021-11-19 14:58:35 -05:00
Balazs Benics	d5de568cc7	[analyzer][NFC] MaybeUInt -> MaybeCount I forgot to include this in D113594 Differential Revision: https://reviews.llvm.org/D113594	2021-11-19 18:36:55 +01:00
Balazs Benics	e6ef134f3c	[analyzer][NFC] Use enum for CallDescription flags Yeah, let's prefer a slightly stronger type representing this. Reviewed By: martong, xazax.hun Differential Revision: https://reviews.llvm.org/D113595	2021-11-19 18:32:13 +01:00
Balazs Benics	97f1bf15b1	[analyzer][NFC] Consolidate the inner representation of CallDescriptions `CallDescriptions` have a `RequiredArgs` and `RequiredParams` members, but they are of different types, `unsigned` and `size_t` respectively. In the patch I use only `unsigned` for both, that should be large enough anyway. I also introduce the `MaybeUInt` type alias for `Optional<unsigned>`. Additionally, I also avoid the use of the //smart// less-than operator. template <typename T> constexpr bool operator<=(const Optional<T> &X, const T &Y); Which would check if the optional has a value and compare the data only after. I found it surprising, thus I think we are better off without it. Reviewed By: martong, xazax.hun Differential Revision: https://reviews.llvm.org/D113594	2021-11-19 18:32:13 +01:00
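The surprise with the mixed optional/value comparison mentioned in the commit above can be reproduced with `std::optional`, used here only as a stand-in for `llvm::Optional`. The two helper functions are hypothetical and exist purely to contrast the implicit operator with an explicit check.

```cpp
#include <cassert>
#include <optional>

// Uses the library-provided mixed comparison. For std::optional, an empty
// optional compares as "less than" any value, so this returns true when
// MaybeCount is empty.
bool implicitAtMost(const std::optional<unsigned> &MaybeCount,
                    unsigned Limit) {
  return MaybeCount <= Limit;
}

// Spells out the intent: an empty optional never satisfies the bound.
bool explicitAtMost(const std::optional<unsigned> &MaybeCount,
                    unsigned Limit) {
  return MaybeCount.has_value() && *MaybeCount <= Limit;
}
```

For an engaged optional both helpers agree; they differ only for the empty case, which is where the implicit form is easy to misread.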
Balazs Benics	de9d7e42ac	[analyzer][NFC] CallDescription should own the qualified name parts Previously, CallDescription simply referred to the qualified name parts by `const char` pointers. In the future we might want to dynamically load and populate `CallDescriptionMaps`, hence we will need the `CallDescriptions` to actually own their qualified name parts. Reviewed By: martong, xazax.hun Differential Revision: https://reviews.llvm.org/D113593	2021-11-19 18:32:13 +01:00
Balazs Benics	9ad0a90baa	[analyzer][NFC] Demonstrate the use of CallDescriptionSet Reviewed By: martong, xazax.hun Differential Revision: https://reviews.llvm.org/D113592	2021-11-19 18:32:13 +01:00
Balazs Benics	f18da190b0	[analyzer][NFC] Switch to using CallDescription::matches() instead of isCalled() This patch replaces each use of the previous API with the new one. In variadic cases, it will use the ADL `matchesAny(Call, CDs...)` variadic function. Also simplifies some code involving such operations. Reviewed By: martong, xazax.hun Differential Revision: https://reviews.llvm.org/D113591	2021-11-19 18:32:13 +01:00
Balazs Benics	6c512703a9	[analyzer][NFC] Introduce CallDescription::matches() in addition to isCalled() This patch introduces `CallDescription::matches()` member function, accepting a `CallEvent`. Semantically, `Call.isCalled(CD)` is the same as `CD.matches(Call)`. The patch also introduces the `matchesAny()` variadic free function template. It accepts a `CallEvent` and at least one `CallDescription` to match against. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D113590	2021-11-19 18:32:13 +01:00
Balazs Benics	d448fcd9b2	[analyzer][NFC] Introduce CallDescriptionSets Sometimes we only want to decide if some function is called, and we don't care which one of the set. This `CallDescriptionSet` will have the same behavior, except instead of `lookup()` returning a pointer to the mapped value, `contains()` returns `bool`. Internally, it uses the `CallDescriptionMap<bool>` for implementing the behavior. It is preferable to reuse the generic `CallDescriptionMap::lookup()` logic, instead of duplicating it. The generic version might be improved by implementing a hash lookup or something along those lines. Reviewed By: martong, Szelethus Differential Revision: https://reviews.llvm.org/D113589	2021-11-19 18:32:13 +01:00
Kazu Hirata	74115602e8	[clang] Use range-based for loops with llvm::reverse (NFC)	2021-11-17 19:40:48 -08:00
Balazs Benics	0b9d3a6e53	[analyzer][NFC] Separate CallDescription from CallEvent `CallDescriptions` deserve their own translation unit. This patch simply moves the corresponding parts. Also includes the `CallDescription.h` where it's necessary. Reviewed By: martong, xazax.hun, Szelethus Differential Revision: https://reviews.llvm.org/D113587	2021-11-15 19:10:46 +01:00
Denys Petrov	f0bc7d2488	[analyzer] Fix region cast between the same types with different qualifiers. Summary: Specifically, this fixes the case when we access an array element through a pointer to the element. This covers several FIXMEs in https://reviews.llvm.org/D111654. Example: const int arr[4][2]; const int *ptr = arr[1]; // Fixes this. The issue is that `arr[1]` is `int*` (&Element{Element{glob_arr5,1 S64b,int[2]},0 S64b,int}), and `ptr` is `const int*`. We don't take qualifiers into account. Consequently, we don't match the types as the same ones. Differential Revision: https://reviews.llvm.org/D113480	2021-11-15 19:23:00 +02:00
Kazu Hirata	d0ac215dd5	[clang] Use isa instead of dyn_cast (NFC)	2021-11-14 09:32:40 -08:00
Gabor Marton	01c9700aaa	[analyzer][solver] Remove reference to RangedConstraintManager We no longer need a reference to RangedConstraintManager, we call top level `State->assume` functions. Differential Revision: https://reviews.llvm.org/D113261	2021-11-12 11:44:49 +01:00
Gabor Marton	806329da07	[analyzer][solver] Iterate to a fixpoint during symbol simplification with constants D103314 introduced symbol simplification when a new constant constraint is added. Currently, we simplify existing equivalence classes by iterating over all existing members of them and trying to simplify each member symbol with simplifySVal. At the end of such a simplification round we may end up introducing a new constant constraint. Example: ``` if (a + b + c != d) return; if (c + b != 0) return; // Simplification starts here. if (b != 0) return; ``` The `c == 0` constraint is the result of the first simplification iteration. However, we could do another round of simplification to reach the conclusion that `a == d`. Generally, we could keep iterating until we reach a fixpoint. We can reach a fixpoint by recursively calling `State->assume` on the newly simplified symbol. By calling `State->assume` we re-ignite the whole assume machinery (along with e.g. adjustment handling). Why should we do this? By reaching a fixpoint in simplification we are capable of discovering infeasible states at the moment of the introduction of the first constant constraint. Let's modify the previous example just a bit, and consider what happens without the fixpoint iteration. ``` if (a + b + c != d) return; if (c + b != 0) return; // Adding a new constraint. if (a == d) return; // This brings in a contradiction. if (b != 0) return; clang_analyzer_warnIfReached(); // This produces a warning. // The path is already infeasible... if (c == 0) // ...but we realize that only when we evaluate `c == 0`. return; ``` What happens currently, without the fixpoint iteration? As the inline comments suggest, without the fixpoint iteration we are doomed to realize that we are on an infeasible path only after we are already walking on it. With fixpoint iteration we can detect that before stepping on it. 
With fixpoint iteration, the `clang_analyzer_warnIfReached` does not warn in the above example b/c during the evaluation of `b == 0` we realize the contradiction. The engine and the checkers rely on the assumption that either `assume(Cond)` or `assume(!Cond)` is feasible. This is in fact assured by the so-called expensive checks (LLVM_ENABLE_EXPENSIVE_CHECKS). The StdLibraryFunctionsChecker is notably one of the checkers that has a very similar assertion. Before this patch, we simply added the simplified symbol to the equivalence class. In this patch, after we have added the simplified symbol, we remove the old (more complex) symbol from the members of the equivalence class (`ClassMembers`). Removing the old symbol is beneficial because during the next iteration of the simplification we don't have to consider the old symbol again. Contrary to how we handle `ClassMembers`, we don't remove the old Sym->Class relation from the `ClassMap`. This is important for two reasons: The constraints of the old symbol can still be found via the equivalence class that it used to be a member of (1). We can spare one removal and thus one additional tree in the forest of `ClassMap` (2). Performance and complexity: Let us assume that in a State we have N non-trivial equivalence classes and that all constraints and disequality info is related to non-trivial classes. In the worst case, we can simplify only one symbol of one class in each iteration. The number of symbols in one class cannot grow b/c we replace the old symbol with the simplified one. Also, the number of the equivalence classes can only decrease, b/c the algorithm optionally does a merge operation. We need N iterations in this case to reach the fixpoint. Thus, the number of steps needed in the worst case is proportional to `N*N`. Empirical results (attached) show that there is some hardly noticeable run-time and peak memory discrepancy compared to the baseline. 
In my opinion, these differences could be the result of measurement error. This worst-case scenario also extends to cases where trivial classes in the constraints and in the disequality map transform into a State with only non-trivial classes, b/c the algorithm does merge operations. A merge operation on two trivial classes results in one non-trivial class. Differential Revision: https://reviews.llvm.org/D106823	2021-11-12 11:44:49 +01:00
Denys Petrov	a12bfac292	[analyzer] Retrieve a value from list initialization of multi-dimensional array declaration. Summary: Add support of multi-dimensional arrays in `RegionStoreManager::getBindingForElement`. Handle nested ElementRegion's getting offsets and checking for being in bounds. Get values from the nested initialization lists using obtained offsets. Differential Revision: https://reviews.llvm.org/D111654	2021-11-08 16:17:55 +02:00
Balazs Benics	9b5c9c469d	[analyzer] Dump checker name if multiple checkers evaluate the same call Previously, if accidentally multiple checkers `eval::Call`-ed the same `CallEvent`, in debug builds the analyzer detected this and crashed with the message stating this. Unfortunately, the message did not state the offending checkers violating this invariant. This revision addresses this by printing a more descriptive message before aborting. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D112889	2021-11-02 14:42:14 +01:00
Kazu Hirata	4db2e4cebe	Use {DenseSet,SetVector,SmallPtrSet}::contains (NFC)	2021-10-30 19:00:19 -07:00
Zarko Todorovski	8659b241ae	[clang][NFC] Inclusive terms: Replace uses of whitelist in clang/lib/StaticAnalyzer Replace variable and functions names, as well as comments that contain whitelist with more inclusive terms. Reviewed By: aaron.ballman, martong Differential Revision: https://reviews.llvm.org/D112642	2021-10-29 16:51:36 -04:00
Denys Petrov	1deccd05ba	[analyzer] Retrieve a character from StringLiteral as an initializer for constant arrays. Summary: Assuming that values of constant arrays never change, we can retrieve the value at a specific position (index) right from the initializer, if present. Retrieve a character code by index from a StringLiteral which is an initializer of a constant array in global scope. This patch has a known issue of getting access to characters past the end of the literal. The declaration, in which the literal is used, is an implicit cast of kind `array-to-pointer`. The offset should be within the bounds of the literal's length. This should be distinguished from what the C++20 Standard states in [dcl.init.string] 9.4.2.3. Example: const char arr[42] = "123"; char c = arr[41]; // OK const char * const str = "123"; char c = str[41]; // NOK Differential Revision: https://reviews.llvm.org/D107339	2021-10-29 19:44:37 +03:00
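The OK case in the commit above can be checked directly in standalone C++. In this sketch the declarations are copied from the commit message; a character array initialized from a shorter string literal has its remaining elements zero-initialized, so `arr[41]` is a valid read, whereas `str[41]` would read past the 4-byte literal and is deliberately not evaluated.

```cpp
#include <cassert>

// Array of 42 chars: "123", the terminating '\0', then zero-initialized tail.
const char arr[42] = "123";

// Pointer to a 4-byte literal ("123" plus '\0'); indices above 3 are
// out of bounds and must not be read.
const char *const str = "123";

// Reads the last element of the array, which is in bounds.
char lastArrayElement() { return arr[41]; }
```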
Mike Rice	6f9c25167d	[OpenMP] Initial parsing/sema for the 'omp loop' construct Adds basic parsing/sema/serialization support for the #pragma omp loop directive. Differential Revision: https://reviews.llvm.org/D112499	2021-10-28 08:26:43 -07:00
Balazs Benics	49285f43e5	[analyzer] sprintf is a taint propagator not a source Due to a typo, `sprintf()` was recognized as a taint source instead of a taint propagator. It was because an empty taint source list - which is the first parameter of the `TaintPropagationRule` - encoded the unconditional taint sources. This typo effectively turned `sprintf()` into an unconditional taint source. This patch fixes that typo and demonstrates the correct behavior with tests. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D112558	2021-10-28 11:03:02 +02:00
Gabor Marton	a8297ed994	[Analyzer][solver] Handle adjustments in constraint assignor remainder We can reuse the "adjustment" handling logic in the higher level of the solver by calling `State->assume`. Differential Revision: https://reviews.llvm.org/D112296	2021-10-27 17:14:34 +02:00
Gabor Marton	888af47095	[Analyzer][solver] Simplification: reorganize equalities with adjustment Initiate the reorganization of the equality information during symbol simplification. E.g., if we bump into `c + 1 == 0` during simplification then we'd like to express that `c == -1`. It makes sense to do this only with `SymIntExpr`s. Reviewed By: steakhal Differential Revision: https://reviews.llvm.org/D111642	2021-10-27 16:48:55 +02:00
Balazs Benics	c18407217e	[analyzer] Fix StringChecker for Unknown params It seems like protobuf crashed the `std::string` checker. Somehow it acquired `UnknownVal` as the sole `std::string` constructor parameter, causing a crash in the `castAs<Loc>()`. This patch addresses this. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D112551	2021-10-26 18:15:00 +02:00
Denys Petrov	3b1165ba3d	[analyzer] Retrieve incomplete array extent from its redeclaration. Summary: Fix a case when the extent can not be retrieved correctly from incomplete array declaration. Use redeclaration to get the array extent. Differential Revision: https://reviews.llvm.org/D111542	2021-10-25 15:14:10 +03:00
Denys Petrov	44e803ef6d	[analyzer][NFCI] Move a block from `getBindingForElement` to separate functions Summary: 1. Improve readability by moving a deeply nested block of code from RegionStoreManager::getBindingForElement to new separate functions: - getConstantValFromConstArrayInitializer; - getSValFromInitListExpr. 2. Handle the case when the index is a symbolic value. Write specific test cases. 3. Add test cases for when there is no initialization expression present. This patch is intended to make the following patches clearer and easier to review. Differential Revision: https://reviews.llvm.org/D106681	2021-10-25 15:14:10 +03:00
Balazs Benics	e1fdec875f	[analyzer] Add std::string checker This patch adds a checker checking `std::string` operations. At first, it only checks the `std::string` single `const char *` constructor for nullness. If it might be `null`, it will constrain it to non-null and place a note tag there. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D111247	2021-10-25 11:15:40 +02:00
Balazs Benics	f9db6a44eb	Revert "[analyzer][solver] Introduce reasoning for not equal to operator" This reverts commit `cac8808f15`. #5 0x00007f28ec629859 abort (/lib/x86_64-linux-gnu/libc.so.6+0x25859) #6 0x00007f28ec629729 (/lib/x86_64-linux-gnu/libc.so.6+0x25729) #7 0x00007f28ec63af36 (/lib/x86_64-linux-gnu/libc.so.6+0x36f36) #8 0x00007f28ecc2cc46 llvm::APInt::compareSigned(llvm::APInt const&) const (libLLVMSupport.so.14git+0xeac46) #9 0x00007f28e7bbf957 (anonymous namespace)::SymbolicRangeInferrer::VisitBinaryOperator(clang::ento::RangeSet, clang::BinaryOperatorKind, clang::ento::RangeSet, clang::QualType) (libclangStaticAnalyzerCore.so.14git+0x1df957) #10 0x00007f28e7bbf2db (anonymous namespace)::SymbolicRangeInferrer::infer(clang::ento::SymExpr const) (libclangStaticAnalyzerCore.so.14git+0x1df2db) #11 0x00007f28e7bb2b5e (anonymous namespace)::RangeConstraintManager::assumeSymNE(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::SymExpr const, llvm::APSInt const&, llvm::APSInt const&) (libclangStaticAnalyzerCore.so.14git+0x1d2b5e) #12 0x00007f28e7bc67af clang::ento::RangedConstraintManager::assumeSymUnsupported(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::SymExpr const, bool) (libclangStaticAnalyzerCore.so.14git+0x1e67af) #13 0x00007f28e7be3578 clang::ento::SimpleConstraintManager::assumeAux(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::NonLoc, bool) (libclangStaticAnalyzerCore.so.14git+0x203578) #14 0x00007f28e7be33d8 clang::ento::SimpleConstraintManager::assume(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::NonLoc, bool) (libclangStaticAnalyzerCore.so.14git+0x2033d8) #15 0x00007f28e7be32fb clang::ento::SimpleConstraintManager::assume(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::DefinedSVal, bool) (libclangStaticAnalyzerCore.so.14git+0x2032fb) #16 0x00007f28e7b15dbc clang::ento::ConstraintManager::assumeDual(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::DefinedSVal) (libclangStaticAnalyzerCore.so.14git+0x135dbc) #17 0x00007f28e7b4780f clang::ento::ExprEngine::evalEagerlyAssumeBinOpBifurcation(clang::ento::ExplodedNodeSet&, clang::ento::ExplodedNodeSet&, clang::Expr const) (libclangStaticAnalyzerCore.so.14git+0x16780f) This is known to be triggered on curl, tinyxml2, tmux, twin and on xerces. But @bjope also reported similar crashes. So, I'm reverting it to make our internal bots happy again. Differential Revision: https://reviews.llvm.org/D106102	2021-10-23 21:01:59 +02:00
Kazu Hirata	d8e4170b0a	Ensure newlines at the end of files (NFC)	2021-10-23 08:45:29 -07:00
Manas	cac8808f15	[analyzer][solver] Introduce reasoning for not equal to operator Prior to this, the solver was only able to verify whether two symbols are equal/unequal, only when constants were involved. This patch allows the solver to work over ranges as well. Reviewed By: steakhal, martong Differential Revision: https://reviews.llvm.org/D106102 Patch by: @manas (Manas Gupta)	2021-10-22 12:00:08 +02:00
Gabor Marton	5f8dca0235	[Analyzer] Extend ConstraintAssignor to handle remainder op Summary: `a % b != 0` implies that `a != 0` for any `a` and `b`. This patch extends the ConstraintAssignor to do just that. In fact, we could do something similar with division and in case of multiplications we could have some other inferences, but I'd like to keep these for future patches. Fixes https://bugs.llvm.org/show_bug.cgi?id=51940 Reviewers: noq, vsavchenko, steakhal, szelethus, asdenyspetrov Subscribers: Differential Revision: https://reviews.llvm.org/D110357	2021-10-22 10:47:25 +02:00
Gabor Marton	e2a2c8328f	[Analyzer][NFC] Add RangedConstraintManager to ConstraintAssignor In this patch we store a reference to `RangedConstraintManager` in the `ConstraintAssignor`. This way it is possible to call back and reuse some functions of it. This patch is exclusively needed for its child patches, it is not intended to be a standalone patch. Differential Revision: https://reviews.llvm.org/D111640	2021-10-22 10:46:28 +02:00
Gabor Marton	01b4ddbfbb	[Analyzer][NFC] Move RangeConstraintManager's def before ConstraintAssignor's def In this patch we simply move the definition of RangeConstraintManager before the definition of ConstraintAssignor. This patch is exclusively needed for its child patch, so in the child the diff would be clean and the review would be easier. Differential Revision: https://reviews.llvm.org/D110387	2021-10-22 10:46:28 +02:00
Simon Pilgrim	7562f3df89	InvalidPtrChecker - don't dereference a dyn_cast<> - use cast<> instead. Avoid dereferencing a nullptr returned by dyn_cast<>, by using cast<> instead which asserts that the cast is valid.	2021-10-20 18:06:00 +01:00
Balazs Benics	16be17ad4b	[analyzer][NFC] Refactor llvm::isa<> usages in the StaticAnalyzer It turns out llvm::isa<> is variadic, and we could have used this at a lot of places. The following patterns: x && isa<T1>(x) || isa<T2>(x) ... Will be replaced by: isa_and_non_null<T1, T2, ...>(x) Sometimes it caused further simplifications, when it would cause even more code smell. Aside from this, keep in mind that within `assert()` or any macro functions, we need to wrap the isa<> expression within a parenthesis, due to the parsing of the comma. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D111982	2021-10-20 17:43:31 +02:00
Kazu Hirata	0abb5d293c	[Sema, StaticAnalyzer] Use StringRef::contains (NFC)	2021-10-20 08:02:36 -07:00
Balazs Benics	72d04d7b2b	[analyzer] Allow matching non-CallExprs using CallDescriptions Fallback to stringification and string comparison if we cannot compare the `IdentifierInfo`s, which is the case for C++ overloaded operators, constructors, destructors, etc. Examples: { "std", "basic_string", "basic_string", 2} // match the 2 param std::string constructor { "std", "basic_string", "~basic_string" } // match the std::string destructor { "aaa", "bbb", "operator int" } // matches the struct bbb conversion operator to int Reviewed By: martong Differential Revision: https://reviews.llvm.org/D111535	2021-10-18 14:57:24 +02:00
Balazs Benics	3ec7b91141	[analyzer][NFC] Refactor CallEvent::isCalled() Refactor the code to make it more readable. It will set up further changes, and improvements to this code in subsequent patches. This is a non-functional change. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D111534	2021-10-18 14:57:24 +02:00
Kazu Hirata	d245f2e859	[clang] Use llvm::erase_if (NFC)	2021-10-17 13:50:29 -07:00
Kazu Hirata	6a154e606e	[clang] Use llvm::is_contained (NFC)	2021-10-15 10:07:08 -07:00
Artem Dergachev	12cbc8cbf0	[analyzer] Fix property access kind detection inside parentheses. '(self.prop)' produces a surprising AST where ParenExpr resides inside `PseudoObjectExpr`. This breaks ObjCMethodCall::getMessageKind() which in turn causes us to perform unnecessary dynamic dispatch bifurcation when evaluating body-farmed property accessors, which in turn causes us to explore infeasible paths.	2021-10-14 21:07:19 -07:00
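
The inference described in commit 5f8dca0235 (`a % b != 0` implies `a != 0`) can be checked by brute force on ordinary integers. The following is only an illustrative sketch and is not code from the analyzer: the names `remainderImpliesNonZero` and `countCounterexamples` are hypothetical, and the check assumes a non-zero `b`, because `a % 0` is undefined behaviour in C++.

```cpp
#include <cassert>  // only needed for the assert-based usage described below

// Hypothetical helper, not part of the analyzer: returns true when the
// implication (a % b != 0) => (a != 0) holds for the given pair.
// Precondition: b != 0, because a % 0 is undefined behaviour.
constexpr bool remainderImpliesNonZero(int a, int b) {
  return !(a % b != 0) || (a != 0);
}

// Brute-force check over the window [-limit, limit], skipping b == 0.
// Returns the number of pairs that violate the implication.
int countCounterexamples(int limit) {
  int failures = 0;
  for (int a = -limit; a <= limit; ++a) {
    for (int b = -limit; b <= limit; ++b) {
      if (b == 0)
        continue;  // division by zero is excluded by the precondition
      if (!remainderImpliesNonZero(a, b))
        ++failures;
    }
  }
  return failures;
}
```

Calling `assert(countCounterexamples(50) == 0);` from a test or from `main` exercises the sketch. The reasoning behind it is the contrapositive: if `a == 0`, then `a % b == 0` for every non-zero `b`, so a non-zero remainder rules out `a == 0`.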

4997 Commits