Commit Graph

4997 Commits

Author SHA1 Message Date
Kazu Hirata
6ad0788c33 [clang] Use std::optional instead of llvm::Optional (NFC)
This patch replaces (llvm::|)Optional< with std::optional<.  I'll post
a separate patch to remove #include "llvm/ADT/Optional.h".

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2023-01-14 12:31:01 -08:00
Kazu Hirata
a1580d7b59 [clang] Add #include <optional> (NFC)
This patch adds #include <optional> to those files containing
llvm::Optional<...> or Optional<...>.

I'll post a separate patch to actually replace llvm::Optional with
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2023-01-14 11:07:21 -08:00
Benjamin Kramer
18b0d2c5d9 [analyzer] Fix a FIXME. NFCI 2023-01-13 16:15:16 +01:00
Alan Zhao
95a4c0c835 [clang] Reland parenthesized aggregate init patches
This commit relands the patches for implementing P0960R3 and P1975R0,
which describe initializing aggregates via a parenthesized list.

The relanded commits are:

* 40c52159d3 - P0960R3 and P1975R0: Allow initializing aggregates from
  a parenthesized list of values
* c77a91bb7b - Remove overly restrictive aggregate paren init logic
* 32d7aae04f - Fix a clang crash on invalid code in C++20 mode

This patch also fixes a crash in the original implementation.
Previously, if the input tried to call an implicitly deleted copy or
move constructor of a union, we would then try to initialize the union
by initializing its first element with a reference to a union. This
behavior is incorrect (we should fail to initialize) and if the type of
the first element has a constructor with a single template typename
parameter, then Clang will explode. This patch fixes that issue by
checking that constructor overload resolution did not result in a
deleted function before attempting parenthesized aggregate
initialization.

Additionally, this patch also includes D140159, which contains some
minor fixes made in response to code review comments in the original
implementation that were made after that patch was submitted.

Co-authored-by: Sheng <ox59616e@gmail.com>

Fixes #54040, Fixes #59675

Reviewed By: ilya-biryukov

Differential Revision: https://reviews.llvm.org/D141546
2023-01-12 09:58:15 -08:00
Balazs Benics
840edd8ab2 [analyzer] Don't escape local static memregions on bind
When the engine processes a store to a variable, it will eventually call
`ExprEngine::processPointerEscapedOnBind()`. This function is supposed to
invalidate (put the given locations to an escape list) the locations
which we cannot reason about.

Unfortunately, local static variables are also put into this list.

This patch relaxes the guard condition, so that beyond stack variables,
static local variables are also ignored.

Differential Revision: https://reviews.llvm.org/D139534
2023-01-12 10:42:57 +01:00
Balázs Kéri
570bf972f5 [clang][analyzer] Remove report of null stream from StreamChecker.
Previously, StreamChecker reported the case of a NULL stream passed to stream functions.
StdLibraryFunctionsChecker already checks the same condition, and checking it in
one place is enough. StreamChecker now stops the analysis when it encounters a NULL
stream argument but generates no report.
This change removes a dependency between the StdCLibraryFunctionArgs checker and
StreamChecker. Because StreamChecker no longer reports a more specific message,
the previous weak dependency is not needed, and StreamChecker can be used
without the StdCLibraryFunctions checker or its ModelPOSIX option.

Reviewed By: Szelethus

Differential Revision: https://reviews.llvm.org/D137790
2023-01-09 09:49:08 +01:00
Benjamin Kramer
b6942a2880 [NFC] Hide implementation details in anonymous namespaces 2023-01-08 17:37:02 +01:00
Balázs Kéri
5cf85323a0 [clang][analyzer] Extend StreamChecker with some new functions.
The stream handling functions `ftell`, `rewind`, `fgetpos`, `fsetpos`
are evaluated in the checker more exactly than before.
New tests are added to test behavior of the checker together with
StdLibraryFunctionsChecker. The option ModelPOSIX of that checker
affects if (most of) the stream functions are recognized, and checker
StdLibraryFunctionArgs generates warnings if constraints for arguments
are not satisfied. The state of `errno` is set by StdLibraryFunctionsChecker
too for every case in the stream functions.
StreamChecker works with the stream state only, does not set the errno state,
and is not dependent on other checkers.

Reviewed By: Szelethus

Differential Revision: https://reviews.llvm.org/D140395
2023-01-06 12:22:21 +01:00
Balázs Kéri
3c7fe7d09d [clang][analyzer] Add stream related functions to StdLibraryFunctionsChecker.
Additional stream handling functions are added.
These are partially evaluated by StreamChecker; the addition results in
checks for more preconditions and the construction of success and failure
branches with specific errno handling.

Reviewed By: Szelethus

Differential Revision: https://reviews.llvm.org/D140387
2023-01-06 11:04:24 +01:00
Alan Zhao
4e02ff2303 [clang] Revert parenthesized aggregate initialization patches
This feature causes clang to crash when compiling Chrome - see
https://crbug.com/1405031 and
https://github.com/llvm/llvm-project/issues/59675

Revert "[clang] Fix a clang crash on invalid code in C++20 mode."

This reverts commit 32d7aae04f.

Revert "[clang] Remove overly restrictive aggregate paren init logic"

This reverts commit c77a91bb7b.

Revert "[clang][C++20] P0960R3 and P1975R0: Allow initializing aggregates from a parenthesized list of values"

This reverts commit 40c52159d3.
2023-01-04 15:09:36 -08:00
Kazu Hirata
9cf4419e24 [clang] Use std::optional instead of llvm::Optional (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2023-01-02 15:54:57 -08:00
serge-sans-paille
d9ab3e82f3 [clang] Use a StringRef instead of a raw char pointer to store builtin and call information
This avoids recomputing string length that is already known at compile time.

It has a slight impact on preprocessing / compile time, see

https://llvm-compile-time-tracker.com/compare.php?from=3f36d2d579d8b0e8824d9dd99bfa79f456858f88&to=e49640c507ddc6615b5e503144301c8e41f8f434&stat=instructions:u

This is a recommit of e953ae5bbc and the subsequent fixes caa713559b and 06b90e2e9c.

The above patchset caused some versions of GCC to take eons to compile clang/lib/Basic/Targets/AArch64.cpp, as spotted in aa171833ab.
The fix is to make BuiltinInfo tables a compilation unit static variable, instead of a private static variable.

Differential Revision: https://reviews.llvm.org/D139881
2022-12-27 09:55:19 +01:00
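The idea behind d9ab3e82f3 can be illustrated with a small sketch. It is not the Clang code: `std::string_view` stands in for `llvm::StringRef`, and `BuiltinRecord`, `BuiltinTable`, `findBuiltin` and the ids are hypothetical names chosen for the example. A view stores the length next to the pointer, so a table built from string literals has its lengths fixed at compile time, and the table is declared as a file-scope static, roughly in the spirit of the fix described in the message.

```cpp
#include <string_view>

// Hypothetical record type; the view carries pointer and length together.
struct BuiltinRecord {
  std::string_view Name;
  int Id;
};

// File-scope static table (assumption: mirrors "compilation unit static").
static constexpr BuiltinRecord BuiltinTable[] = {
    {"__builtin_abs", 0},
    {"__builtin_memcpy", 1},
};

// The length is part of the constant; no strlen call is needed at run time.
static_assert(BuiltinTable[0].Name.size() == 13,
              "length is known at compile time");

// Linear lookup by name; returns -1 when the name is not in the table.
constexpr int findBuiltin(std::string_view Name) {
  for (const BuiltinRecord &R : BuiltinTable)
    if (R.Name == Name)
      return R.Id;
  return -1;
}
```

With a raw `const char *` field, each comparison would have to walk the string to find its end; with a view, the sizes can be compared first.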
Vitaly Buka
aa171833ab Revert "[clang] Use a StringRef instead of a raw char pointer to store builtin and call information"
Revert "Fix lldb option handling since e953ae5bbc (part 2)"
Revert "Fix lldb option handling since e953ae5bbc313fd0cc980ce021d487e5b5199ea4"

GCC build hangs on this bot https://lab.llvm.org/buildbot/#/builders/37/builds/19104
compiling CMakeFiles/obj.clangBasic.dir/Targets/AArch64.cpp.d

The bot uses GNU 11.3.0, but I can reproduce locally with gcc (Debian 12.2.0-3) 12.2.0.

This reverts commit caa713559b.
This reverts commit 06b90e2e9c.
This reverts commit e953ae5bbc.
2022-12-25 23:12:47 -08:00
serge-sans-paille
e953ae5bbc [clang] Use a StringRef instead of a raw char pointer to store builtin and call information
This avoids recomputing string length that is already known at compile
time.

It has a slight impact on preprocessing / compile time, see

https://llvm-compile-time-tracker.com/compare.php?from=3f36d2d579d8b0e8824d9dd99bfa79f456858f88&to=e49640c507ddc6615b5e503144301c8e41f8f434&stat=instructions:u

This is a recommit of 719d98dfa8 that takes into
account a GCC issue (probably
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92181) when dealing with
initializer_list and constant expressions.

Work around this by avoiding initializer list, at the expense of a
temporary plain old array.

Differential Revision: https://reviews.llvm.org/D139881
2022-12-24 10:25:06 +01:00
serge-sans-paille
07d9ab9aa5 Revert "[clang] Use a StringRef instead of a raw char pointer to store builtin and call information"
There are still remaining issues with GCC 12, see for instance

https://lab.llvm.org/buildbot/#/builders/93/builds/12669

This reverts commit 5ce4e92264.
2022-12-23 13:29:21 +01:00
serge-sans-paille
5ce4e92264 [clang] Use a StringRef instead of a raw char pointer to store builtin and call information
This avoids recomputing string length that is already known at compile
time.

It has a slight impact on preprocessing / compile time, see

https://llvm-compile-time-tracker.com/compare.php?from=3f36d2d579d8b0e8824d9dd99bfa79f456858f88&to=e49640c507ddc6615b5e503144301c8e41f8f434&stat=instructions:u

This is a recommit of 719d98dfa8 with a
change to llvm/utils/TableGen/OptParserEmitter.cpp to cope with GCC bug
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108158

Differential Revision: https://reviews.llvm.org/D139881
2022-12-23 12:48:17 +01:00
serge-sans-paille
b7065a31b5 Revert "[clang] Use a StringRef instead of a raw char pointer to store builtin and call information"
Failing builds: https://lab.llvm.org/buildbot#builders/9/builds/19030
This is GCC specific and has been reported upstream: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108158

This reverts commit 719d98dfa8.
2022-12-23 11:36:56 +01:00
serge-sans-paille
719d98dfa8 [clang] Use a StringRef instead of a raw char pointer to store builtin and call information
This avoids recomputing string length that is already known at compile
time.

It has a slight impact on preprocessing / compile time, see

https://llvm-compile-time-tracker.com/compare.php?from=3f36d2d579d8b0e8824d9dd99bfa79f456858f88&to=e49640c507ddc6615b5e503144301c8e41f8f434&stat=instructions:u

Differential Revision: https://reviews.llvm.org/D139881
2022-12-23 10:31:47 +01:00
Archibald Elliott
f09cf34d00 [Support] Move TargetParsers to new component
This is a fairly large changeset, but it can be broken into a few
pieces:
- `llvm/Support/*TargetParser*` are all moved from the LLVM Support
  component into a new LLVM Component called "TargetParser". This
  potentially enables using tablegen to maintain this information, as
  is shown in https://reviews.llvm.org/D137517. This cannot currently
  be done, as llvm-tblgen relies on LLVM's Support component.
- This also moves two files from Support which use and depend on
  information in the TargetParser:
  - `llvm/Support/Host.{h,cpp}` which contains functions for inspecting
    the current Host machine for info about it, primarily to support
    getting the host triple, but also for `-mcpu=native` support in e.g.
    Clang. This is fairly tightly intertwined with the information in
    `X86TargetParser.h`, so keeping them in the same component makes
    sense.
  - `llvm/ADT/Triple.h` and `llvm/Support/Triple.cpp`, which contains
    the target triple parser and representation. This is very intertwined
    with the Arm target parser, because the arm architecture version
    appears in canonical triples on arm platforms.
- I moved the relevant unittests to their own directory.

And so, we end up with a single component that has all the information
about the following, which to me seems like a unified component:
- Triples that LLVM Knows about
- Architecture names and CPUs that LLVM knows about
- CPU detection logic for LLVM

Given this, I have also moved `RISCVISAInfo.h` into this component, as
it seems to me to be part of that same set of functionality.

If you get link errors in your components after this patch, you likely
need to add TargetParser into LLVM_LINK_COMPONENTS in CMake.

Differential Revision: https://reviews.llvm.org/D137838
2022-12-20 11:05:50 +00:00
Benjamin Kramer
2916b99182 [ADT] Alias llvm::Optional to std::optional
This avoids the continuous API churn when upgrading things to use
std::optional and makes trivial string replace upgrades possible.

I tested this with GCC 7.5, the oldest supported GCC I had around.

Differential Revision: https://reviews.llvm.org/D140332
2022-12-20 01:01:46 +01:00
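A minimal sketch of what aliasing buys, in the spirit of 2916b99182. The namespace `demo` is a hypothetical stand-in for `llvm`, and `parseDigit`, `valueOrZero` and `checkAlias` are made-up helpers; the actual header may differ in detail. Once the old name is an alias template, code using either spelling interoperates because both name the same type.

```cpp
#include <cassert>
#include <optional>
#include <type_traits>

namespace demo { // hypothetical namespace standing in for llvm
// With an alias template, the old spelling names exactly the same type.
template <typename T> using Optional = std::optional<T>;
} // namespace demo

static_assert(std::is_same_v<demo::Optional<int>, std::optional<int>>,
              "alias and std::optional are one type");

// Code still written against the old spelling.
inline demo::Optional<int> parseDigit(char C) {
  if (C < '0' || C > '9')
    return std::nullopt;
  return C - '0';
}

// Code already migrated to the new spelling accepts the result unchanged.
inline int valueOrZero(const std::optional<int> &V) { return V ? *V : 0; }

// Small check that the two spellings compose without conversions.
inline void checkAlias() {
  assert(valueOrZero(parseDigit('7')) == 7);
  assert(valueOrZero(parseDigit('x')) == 0);
}
```

This is why the message says plain string replacement becomes possible: no call site needs a conversion between the two names.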
Balazs Benics
f61a08b67f [analyzer] Fix crash inside RangeConstraintManager.cpp introduced by D112621
It seems like `LHS` and `RHS` could be empty range sets.
This caused an assertion failure inside RangeConstraintManager.

I'm hoisting out the check from the function into the call-site.
This way we could assert that we only want to deal with non-empty range
sets.

The relevant part of the trace:
```
 #6 0x00007fe6ff5f81a6 __assert_fail_base (/lib64/libc.so.6+0x2f1a6)
 #7 0x00007fe6ff5f8252 (/lib64/libc.so.6+0x2f252)
 #8 0x00000000049caed2 (anonymous namespace)::SymbolicRangeInferrer::VisitBinaryOperator(clang::ento::RangeSet, clang::BinaryOperatorKind, clang::ento::RangeSet, clang::QualType) RangeConstraintManager.cpp:0:0
 #9 0x00000000049c9867 (anonymous namespace)::SymbolicRangeInferrer::infer(clang::ento::SymExpr const*) RangeConstraintManager.cpp:0:0
#10 0x00000000049bebf5 (anonymous namespace)::RangeConstraintManager::assumeSymNE(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::SymExpr const*, llvm::APSInt const&, llvm::APSInt const&) RangeConstraintManager.cpp:0:0
#11 0x00000000049d368c clang::ento::RangedConstraintManager::assumeSymUnsupported(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::SymExpr const*, bool) (../../main-github/llvm/build-all/bin/clang+0x49d368c)
#12 0x00000000049f0b09 clang::ento::SimpleConstraintManager::assumeAux(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::NonLoc, bool) (../../main-github/llvm/build-all/bin/clang+0x49f0b09)
#13 0x00000000049f096a clang::ento::SimpleConstraintManager::assume(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::NonLoc, bool) (../../main-github/llvm/build-all/bin/clang+0x49f096a)
#14 0x00000000049f086d clang::ento::SimpleConstraintManager::assumeInternal(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::DefinedSVal, bool) (../../main-github/llvm/build-all/bin/clang+0x49f086d)
#15 0x000000000492d3e3 clang::ento::ConstraintManager::assumeDual(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::DefinedSVal) (../../main-github/llvm/build-all/bin/clang+0x492d3e3)
#16 0x0000000004955b6d clang::ento::ExprEngine::evalEagerlyAssumeBinOpBifurcation(clang::ento::ExplodedNodeSet&, clang::ento::ExplodedNodeSet&, clang::Expr const*) (../../main-github/llvm/build-all/bin/clang+0x4955b6d)
#17 0x00000000049514b6 clang::ento::ExprEngine::Visit(clang::Stmt const*, clang::ento::ExplodedNode*, clang::ento::ExplodedNodeSet&) (../../main-github/llvm/build-all/bin/clang+0x49514b6)
#18 0x000000000494c73e clang::ento::ExprEngine::ProcessStmt(clang::Stmt const*, clang::ento::ExplodedNode*) (../../main-github/llvm/build-all/bin/clang+0x494c73e)
#19 0x000000000494c459 clang::ento::ExprEngine::processCFGElement(clang::CFGElement, clang::ento::ExplodedNode*, unsigned int, clang::ento::NodeBuilderContext*) (../../main-github/llvm/build-all/bin/clang+0x494c459)
#20 0x000000000492f3d0 clang::ento::CoreEngine::HandlePostStmt(clang::CFGBlock const*, unsigned int, clang::ento::ExplodedNode*) (../../main-github/llvm/build-all/bin/clang+0x492f3d0)
#21 0x000000000492e1f6 clang::ento::CoreEngine::ExecuteWorkList(clang::LocationContext const*, unsigned int, llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>) (../../main-github/llvm/build-all/bin/clang+0x492e1f6)
```

Differential Revision: https://reviews.llvm.org/D112621
2022-12-19 12:49:43 +01:00
Fangrui Song
53e5cd4d3e llvm::Optional::value => operator*/operator->
std::optional::value() has undesired exception checking semantics and is
unavailable in older Xcode (see _LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS). The
call sites block std::optional migration.

This makes `ninja clang` work in the absence of llvm::Optional::value.
2022-12-17 06:37:59 +00:00
Fangrui Song
21c4dc7997 std::optional::value => operator*/operator->
value() has undesired exception checking semantics and calls
__throw_bad_optional_access in libc++. Moreover, the API is unavailable without
_LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see
_LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).

This fixes clang.
2022-12-17 00:42:05 +00:00
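The two commits above (53e5cd4d3e and 21c4dc7997) replace `value()` with `operator*` and `operator->`. A small sketch of the access pattern, with hypothetical names (`Point`, `lengthUnchecked`, `sumCoords`): the caller checks for a value first, and the dereference itself has no throwing path.

```cpp
#include <cassert>
#include <optional>

// Unchecked access: the caller guarantees the optional holds a value.
inline int lengthUnchecked(const std::optional<int> &Len) {
  assert(Len && "caller must check for a value first");
  return *Len; // operator*: no std::bad_optional_access path
}

struct Point { // hypothetical payload type
  int X;
  int Y;
};

// Member access through operator-> instead of P.value().X.
inline int sumCoords(const std::optional<Point> &P) {
  if (!P)
    return 0;
  return P->X + P->Y;
}
```

Dereferencing an empty optional is undefined behavior, so this style relies on the explicit check or assertion before the access.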
Sprite
a9f9f3dff4 Correct typos (NFC)
Just found some typos while reading the llvm/circt project.

compliment -> complement
emitsd -> emits
2022-12-16 10:51:26 -08:00
Fangrui Song
b1df3a2c0b [Support] llvm::Optional => std::optional
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-16 08:49:10 +00:00
Kazu Hirata
6eb0b0a045 Don't include Optional.h
These files no longer use llvm::Optional.
2022-12-14 21:16:22 -08:00
Alan Zhao
40c52159d3 [clang][C++20] P0960R3 and P1975R0: Allow initializing aggregates from a parenthesized list of values
This patch implements P0960R3, which allows initialization of aggregates
via parentheses.

As an example:

```
struct S { int i, j; };
S s1(1, 1);

int arr1[2](1, 2);
```

This patch also implements P1975R0, which fixes the wording of P0960R3
for single-argument parenthesized lists so that statements like the
following are allowed:

```
S s2(1);
S s3 = static_cast<S>(1);
S s4 = (S)1;

int (&&arr2)[] = static_cast<int[]>(1);
int (&&arr3)[2] = static_cast<int[2]>(1);
```

This patch was originally authored by @0x59616e and completed by
@ayzhao.

Fixes #54040, Fixes #54041

Co-authored-by: Sheng <ox59616e@gmail.com>

Full write up : https://discourse.llvm.org/t/c-20-rfc-suggestion-desired-regarding-the-implementation-of-p0960r3/63744

Reviewed By: ilya-biryukov

Differential Revision: https://reviews.llvm.org/D129531
2022-12-14 07:54:15 -08:00
Balázs Kéri
da0660691f [clang][analyzer] No new nodes when bug is detected in StdLibraryFunctionsChecker.
The checker applies constraints in a sequence and adds new nodes for these states.
If a constraint violation is found, this sequence should be stopped with a sink
(error) node. Instead, `generateErrorNode` added the error node as a new
branch parallel to the existing node sequence; that other branch was not
stopped, and analysis continued on the invalid branch.
To add an error node after any previous node, a new version of `generateErrorNode`
is needed; this function is added here and used by `StdLibraryFunctionsChecker`.
The added test exercises a situation where the checker adds a number of
constraints before it finds a constraint violation.

Reviewed By: NoQ

Differential Revision: https://reviews.llvm.org/D137722
2022-12-14 09:51:43 +01:00
Paul Pelzl
6ab01d4a5c [analyzer] Nullability: Enable analysis of non-inlined nullable objc properties.
The NullabilityChecker has a very early policy decision that non-inlined
property accesses will be inferred as returning nonnull, despite nullability
annotations to the contrary. This decision eliminates false positives related
to very common code patterns that look like this:

if (foo.prop) {
    [bar doStuffWithNonnull:foo.prop];
}

While this probably represents a correct nil-check, the analyzer can't
determine correctness without gaining visibility into the property
implementation.

Unfortunately, inferring nullable properties as nonnull comes at the cost of
significantly reduced code coverage. My goal here is to enable detection of
many property-related nullability violations without a large increase
in false positives.

The approach is to introduce a heuristic: after accessing the value of
a property, if the analyzer at any time proves that the property value is
nonnull (which would happen in particular due to a nil-check conditional),
then subsequent property accesses on that code path will be *inferred*
as nonnull. This captures the pattern described above, which I believe
to be the dominant source of false positives in real code.

https://reviews.llvm.org/D131655
2022-12-12 14:19:26 -08:00
Kazu Hirata
3e572733d9 [StaticAnalyzer] Use std::optional in RetainCountDiagnostics.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-10 21:15:44 -08:00
Kazu Hirata
eacf7c874b [StaticAnalyzer] Use std::optional in MallocChecker.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-10 21:15:43 -08:00
Kazu Hirata
a67a11536e [StaticAnalyzer] Use std::optional in BugReporterVisitors.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-10 21:11:31 -08:00
Kazu Hirata
602af71c29 [StaticAnalyzer] Use std::optional in BugReporter.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-10 21:11:29 -08:00
Kazu Hirata
ec94a5b716 [StaticAnalyzer] Use std::optional in BugReporter.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-10 20:54:38 -08:00
Kazu Hirata
1cb7fba3e5 [StaticAnalyzer] Don't use Optional<T>::create (NFC)
std::optional<T> does not have an equivalent method.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-10 12:35:03 -08:00
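Since `std::optional<T>` has no `create` method, call sites need another spelling. The sketch below assumes that `create` built an optional from a possibly-null pointer; `fromPointer`, `widthOrDefault` and `checkFromPointer` are hypothetical helpers, not what 1cb7fba3e5 necessarily introduced.

```cpp
#include <cassert>
#include <optional>

// Hypothetical free helper: copy the pointee if the pointer is non-null,
// otherwise return an empty optional.
template <typename T> std::optional<T> fromPointer(const T *Ptr) {
  if (!Ptr)
    return std::nullopt;
  return *Ptr;
}

// Example call site: fall back to a default when no value was supplied.
inline int widthOrDefault(const int *Width) {
  std::optional<int> W = fromPointer(Width);
  return W ? *W : 80;
}

inline void checkFromPointer() {
  int X = 4;
  assert(widthOrDefault(&X) == 4);
  assert(widthOrDefault(nullptr) == 80);
}
```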
Krzysztof Parzyszek
29041bc050 [APInt] Convert GetMostSignificantDifferentBit to std::optional 2022-12-10 14:03:29 -06:00
Kazu Hirata
f7dffc28b3 Don't include None.h (NFC)
I've converted all known uses of None to std::nullopt, so we no longer
need to include None.h.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-10 11:24:26 -08:00
Kazu Hirata
628556b1c5 [Checkers] Use std::optional in UnixAPIChecker.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-10 08:10:24 -08:00
Kazu Hirata
b5fdd533e5 [Checkers] Use std::optional in StdLibraryFunctionsChecker.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-10 08:09:00 -08:00
Kazu Hirata
b5716decbd [RetainCountChecker] Use std::optional in RetainCountDiagnostics.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-10 08:07:12 -08:00
Kazu Hirata
02c905cd4d [Checkers] Use std::optional in MallocChecker.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-10 08:04:28 -08:00
Kazu Hirata
6c8b8a6a2a [Checkers] Use std::optional in GenericTaintChecker.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-10 08:00:24 -08:00
Kazu Hirata
9ddc8af97f [Checkers] Use std::optional in BasicObjCFoundationChecks.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-10 07:58:36 -08:00
Kazu Hirata
37a3e98c84 [clang] Use std::nullopt instead of None in comments (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-09 18:39:01 -08:00
Manas
77ab7281aa [analyzer][solver] Introduce reasoning for not equal to operator
With this patch, the solver can infer results for not equal (!=) operator
over Ranges as well. This also fixes the issue of comparison between
different types, by first converting the RangeSets to the resulting type,
which then can be used for comparisons.

Patch by Manas.

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D112621
2022-12-09 13:30:57 +01:00
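The kind of reasoning 77ab7281aa describes can be sketched on a much simpler model. The analyzer works on RangeSets and also converts both sides to a common type; the code below assumes a single inclusive interval per operand, and `Range` and `inferNotEqual` are hypothetical names. The comparison is decided only when the intervals are disjoint (always unequal) or both collapse to the same single value (never unequal).

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical simplified model: one inclusive interval [From, To].
struct Range {
  std::int64_t From;
  std::int64_t To;
};

// Returns true or false when `L != R` has the same answer for every pair of
// values drawn from the two ranges, and std::nullopt when it is undecided.
inline std::optional<bool> inferNotEqual(Range L, Range R) {
  assert(L.From <= L.To && R.From <= R.To && "ranges must be non-empty");
  // Disjoint ranges: no value can be shared, so != always holds.
  if (L.To < R.From || R.To < L.From)
    return true;
  // Overlapping single points are the same value, so != never holds.
  if (L.From == L.To && R.From == R.To)
    return false;
  return std::nullopt;
}
```

For example, `inferNotEqual({0, 3}, {5, 9})` is decided as true, while `{0, 5}` against `{3, 9}` stays undecided because the ranges share the values 3 through 5.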
Kazu Hirata
c25cc84b87 [clang] Don't include None.h (NFC)
These source files no longer use None, so they do not need to include
None.h.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-08 23:36:50 -08:00
Kazu Hirata
22731dbd75 [clang] Use std::nullopt instead of None in comments (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-04 20:31:05 -08:00
Kazu Hirata
35b4fbb559 [clang] Use std::nullopt instead of None in comments (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-04 15:57:24 -08:00
Kazu Hirata
180600660b [StaticAnalyzer] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-03 11:34:24 -08:00
Jan Svoboda
abf0c6c0c0 Use CTAD on llvm::SaveAndRestore
Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D139229
2022-12-02 15:36:12 -08:00
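To show what CTAD changes at a call site, here is a stand-in class written for this example; it is not the LLVM header, and `depthInsideScope` and `checkSaveAndRestore` are hypothetical. With class template argument deduction (C++17), the guard can be declared without spelling out the template argument.

```cpp
#include <cassert>

// Stand-in RAII guard: sets a new value and restores the old one on exit.
template <typename T> class SaveAndRestore {
public:
  SaveAndRestore(T &X, const T &NewValue) : Ref(X), Old(X) { Ref = NewValue; }
  ~SaveAndRestore() { Ref = Old; }
  SaveAndRestore(const SaveAndRestore &) = delete;
  SaveAndRestore &operator=(const SaveAndRestore &) = delete;

private:
  T &Ref;
  T Old;
};

inline int depthInsideScope(int &Depth) {
  // Before C++17: SaveAndRestore<int> Guard(Depth, Depth + 1);
  SaveAndRestore Guard(Depth, Depth + 1); // CTAD deduces SaveAndRestore<int>
  return Depth; // value seen while the guard is alive
}

inline void checkSaveAndRestore() {
  int Depth = 2;
  assert(depthInsideScope(Depth) == 3);
  assert(Depth == 2); // restored when the guard went out of scope
}
```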
Alex Richardson
a602f76a24 [clang][TargetInfo] Use LangAS for getPointer{Width,Align}()
Mixing LLVM and Clang address spaces can result in subtle bugs, and there
is no need for this hook to use the LLVM IR level address spaces.
Most of this change is just replacing zero with LangAS::Default,
but it also allows us to remove a few calls to getTargetAddressSpace().

This also removes a stale comment+workaround in
CGDebugInfo::CreatePointerLikeType(): ASTContext::getTypeSize() does
return the expected size for ReferenceType (and handles address spaces).

Differential Revision: https://reviews.llvm.org/D138295
2022-11-30 20:24:01 +00:00
Balazs Benics
dbb94b415a [analyzer] Remove the unused LocalCheckers.h header 2022-11-28 13:08:38 +01:00
Kazu Hirata
20ba079dda [StaticAnalyzer] Don't use Optional::create (NFC)
Note that std::optional does not offer create().

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-25 15:38:53 -08:00
Balazs Benics
097ce76165 [analyzer] Deprecate FAM analyzer-config, recommend -fstrict-flex-arrays instead
By default, clang assumes that all trailing array objects could be a
FAM. So, an array of undefined size, size 0, size 1, or even size 42 is
considered a FAM, at least for optimization purposes.

One needs to override the default behavior by supplying the
`-fstrict-flex-arrays=<N>` flag, with `N > 0` value to reduce the set of
FAM candidates. Value `3` is the most restrictive and `0` is the most
permissive on this scale.

0: all trailing arrays are FAMs
1: only incomplete, zero and one-element arrays are FAMs
2: only incomplete, zero-element arrays are FAMs
3: only incomplete arrays are FAMs

If the user is happy with considering single-element arrays as FAMs, they
just need to remove the
`consider-single-element-arrays-as-flexible-array-members` from the
command line.
Otherwise, if they don't want to recognize such cases as FAMs, they
should specify `-fstrict-flex-arrays` anyway, which will be picked up by
CSA.

Any use of the deprecated analyzer-config value will trigger a warning
explaining what to use instead.
The `-analyzer-config-help` is updated accordingly.

Depends on D138657

Reviewed By: xazax.hun

Differential Revision: https://reviews.llvm.org/D138659
2022-11-25 10:24:56 +01:00
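The four levels listed in 097ce76165 can be restated as a small predicate. This is only a model of the table in the message, not Clang's implementation; `TrailingArray` and `isFamCandidate` are hypothetical names, and returning false for levels outside 0..3 is an assumption made for the example.

```cpp
// Hypothetical classification of a struct's trailing array member.
enum class TrailingArray {
  Incomplete, // int data[];
  ZeroLength, // int data[0];
  OneElement, // int data[1];
  Larger      // int data[42];
};

// Whether a trailing array counts as a FAM at a given
// -fstrict-flex-arrays level, following the table in the commit message.
constexpr bool isFamCandidate(int Level, TrailingArray Kind) {
  switch (Level) {
  case 0: // all trailing arrays are FAMs
    return true;
  case 1: // only incomplete, zero and one-element arrays
    return Kind != TrailingArray::Larger;
  case 2: // only incomplete and zero-element arrays
    return Kind == TrailingArray::Incomplete ||
           Kind == TrailingArray::ZeroLength;
  case 3: // only incomplete arrays
    return Kind == TrailingArray::Incomplete;
  default: // assumption: other levels are not described in the source
    return false;
  }
}
```

Read row by row, each higher level removes one more kind of array from the candidate set, which matches the "most permissive to most restrictive" scale in the message.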
Balazs Benics
93b98eb399 [analyzer] getBinding should auto-detect type only if it was not given
Casting a pointer to a suitably large integral type by reinterpret-cast
should result in the same value as by using the `__builtin_bit_cast()`.
The compiler exploits this: https://godbolt.org/z/zMP3sG683

However, the analyzer does not bind the same symbolic value to these
expressions, resulting in weird situations, such as failing equality
checks and even results in crashes: https://godbolt.org/z/oeMP7cj8q

Previously, in the `RegionStoreManager::getBinding()` even if `T` was
non-null, we replaced it with `TVR->getValueType()` in case the `MR` was
`TypedValueRegion`.
It doesn't make much sense to auto-detect the type if the type is
already given. By not doing the auto-detection, we would just do the
right thing and perform the load by that type.
This means that we will cast the value to that type.

So, in this patch, I'm proposing to do auto-detection only if the type
was null.

Here is a snippet of code, annotated by the previous and new dump values.
`LocAsInteger` should wrap the `SymRegion`, since we want to load the
address as if it was an integer.
In none of the following cases should type auto-detection be triggered,
hence we should eventually reach an `evalCast()` to lazily cast the loaded
value into that type.

```lang=C++
void LValueToRValueBitCast_dumps(void *p, char (*array)[8]) {
  clang_analyzer_dump(p);     // remained: &SymRegion{reg_$0<void * p>}
  clang_analyzer_dump(array); // remained: {{&SymRegion{reg_$1<char (*)[8] array>}
  clang_analyzer_dump((unsigned long)p);
  // remained: {{&SymRegion{reg_$0<void * p>} [as 64 bit integer]}}
  clang_analyzer_dump(__builtin_bit_cast(unsigned long, p));     <--------- change #1
  // previously: {{&SymRegion{reg_$0<void * p>}}}
  // now:        {{&SymRegion{reg_$0<void * p>} [as 64 bit integer]}}
  clang_analyzer_dump((unsigned long)array); // remained: {{&SymRegion{reg_$1<char (*)[8] array>} [as 64 bit integer]}}
  clang_analyzer_dump(__builtin_bit_cast(unsigned long, array)); <--------- change #2
  // previously: {{&SymRegion{reg_$1<char (*)[8] array>}}}
  // now:        {{&SymRegion{reg_$1<char (*)[8] array>} [as 64 bit integer]}}
}
```

Reviewed By: xazax.hun

Differential Revision: https://reviews.llvm.org/D136603
2022-11-23 15:52:11 +01:00
Vaibhav Yenamandra
7b6fe711b2 Refactor StaticAnalyzer to use clang::SarifDocumentWriter
Refactor StaticAnalyzer to use clang::SarifDocumentWriter for
serializing sarif diagnostics.

Uses clang::SarifDocumentWriter to generate SARIF output in the
StaticAnalyzer.

Various bugfixes are also made to clang::SarifDocumentWriter.

Summary of changes:

clang/lib/Basic/Sarif.cpp:
  * Fix a bug in adjustColumnPos introduced by the previous move; it now uses
    FullSourceLoc::getDecomposedExpansionLoc which provides the correct
    location (in the presence of macros) instead of
    FullSourceLoc::getDecomposedLoc.
  * Fix createTextRegion so that it handles caret ranges correctly,
    this should bring it to parity with the previous implementation.

clang/test/Analysis/diagnostics/Inputs/expected-sarif:
  * Update the schema URL to the official website
  * Add the emitted defaultConfiguration sections to all rules
  * Annotate results with the "level" property

clang/lib/StaticAnalyzer/Core/SarifDiagnostics.cpp:
  * Update SarifDiagnostics class to hold a clang::SarifDocumentWriter
    that it uses to convert diagnostics to SARIF.
2022-11-17 14:47:02 -05:00
Tomasz Kamiński
2fb3bec932 [analyzer] Fix crash for array-delete of UnknownVal values.
We now skip the destruction of array elements for `delete[] p`
if the value of `p` is UnknownVal and does not have a corresponding region.
This eliminates the crash in `getDynamicElementCount` on that
region and matches the behavior for deleting the array of
non-constant range.

Reviewed By: isuckatcs

Differential Revision: https://reviews.llvm.org/D136671
2022-11-09 15:06:46 +01:00
Rageking8
94738a5ac3 Fix duplicate word typos; NFC
This revision fixes typos where there are 2 consecutive words which are
duplicated. There should be no code changes in this revision (only
changes to comments and docs). Do let me know if there are any
undesirable changes in this revision. Thanks.
2022-11-08 07:21:23 -05:00
Nathan James
108e41d962 [clang][NFC] Use c++17 style variable type traits
This was done as a test for D137302 and it makes sense to push these changes

Reviewed By: shafik

Differential Revision: https://reviews.llvm.org/D137491
2022-11-07 18:25:48 +00:00
Jennifer Yu
ea64e66f7b [OPENMP]Initial support for error directive.
Differential Revision: https://reviews.llvm.org/D137209
2022-11-02 14:25:28 -07:00
Bill Wendling
7f93ae8086 [clang] Implement -fstrict-flex-arrays=3
The -fstrict-flex-arrays=3 is the most restrictive type of flex arrays.
No number, including 0, is allowed in the FAM. In the cases where a "0"
is used, the resulting size is the same as if a zero-sized object were
substituted.

This is needed for proper _FORTIFY_SOURCE coverage in the Linux kernel,
among other reasons. So while the only reason for specifying a
zero-length array at the end of a structure is to specify a FAM,
treating it as such will cause _FORTIFY_SOURCE not to work correctly;
__builtin_object_size will report -1 instead of 0 for a destination
buffer size to keep any kernel internals from using the deprecated
members as fake FAMs.

For example:

  struct broken {
      int foo;
      int fake_fam[0];
      struct something oops;
  };

There have been bugs where the above struct was created because "oops"
was added after "fake_fam" by someone not realizing. Under
_FORTIFY_SOURCE, doing:

  memcpy(p->fake_fam, src, len);

raises no warnings when __builtin_object_size(p->fake_fam, 1) returns -1
and may stomp on "oops."

Omitting a warning when using the (invalid) zero-length array is how GCC
treats -fstrict-flex-arrays=3. A warning in that situation is likely an
irritant, because requesting this option level is explicitly requesting
this behavior.

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836

Differential Revision: https://reviews.llvm.org/D134902
2022-10-27 10:50:04 -07:00
Kristóf Umann
a504ddc8bf [analyzer] Initialize regions returned by CXXNew to undefined
Discourse mail:
https://discourse.llvm.org/t/analyzer-why-do-we-suck-at-modeling-c-dynamic-memory/65667

malloc() returns a piece of uninitialized dynamic memory. We were (almost)
always able to model this behaviour. Its C++ counterpart, operator new is a
lot more complex, because it allows for initialization, the most complicated of which is the usage of constructors.

We gradually became better in modeling constructors, but for some reason, most
likely for reasons lost in history, we never actually modeled the case when the
memory returned by operator new was just simply uninitialized. This patch
(attempts) to fix this tiny little error.
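
As an illustration (not taken from the patch), the sketch below contrasts the forms of `new` discussed above. The helper names are hypothetical. Only the first and last functions return memory whose contents are indeterminate, which is the case this patch is meant to model.

```cpp
#include <cassert>

struct Point {
  int x;
  int y;
};

// Default-initialization: the int is left indeterminate, like malloc().
// Reading *p before writing to it would be undefined behaviour.
int *makeUninitialized() { return new int; }

// Value-initialization: the int is zeroed.
int *makeZeroed() { return new int(); }

// Direct-list-initialization with an explicit value.
int *makeWithValue(int v) { return new int{v}; }

// Aggregate with no initializer: both members are indeterminate.
Point *makeUninitializedPoint() { return new Point; }
```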

Differential Revision: https://reviews.llvm.org/D135375
2022-10-26 17:22:12 +02:00
Gabor Marton
82a50812f7 [analyzer][StdLibraryFunctionsChecker] Add NoteTags for applied arg
constraints

In this patch I add a new NoteTag for each applied argument constraint.
This way, any other checker that reports a bug - where the applied
constraint is relevant - will display the corresponding note. With this
change we provide more information for the users to understand some
bug reports easier.

Differential Revision: https://reviews.llvm.org/D101526

Reviewed By: NoQ
2022-10-26 16:33:25 +02:00
Balazs Benics
aa12a48c82 [analyzer] Fix assertion failure with conflicting prototype calls
It turns out we can reach the `Init.castAs<nonloc::CompoundVal>()`
expression with other kinds of SVals. Such as by `nonloc::ConcreteInt`
in this example: https://godbolt.org/z/s4fdxrcs9

```lang=C++
int buffer[10];
void b();
void top() {
  b(&buffer);
}
void b(int *c) {
  *c = 42; // would crash
}
```
In this example, we try to store `42` to the `Elem{buffer, 0}`.

This situation can appear if the CallExpr refers to a function
declaration without prototype. In such cases, the engine will pick the
redecl of the referred function decl which has function body, hence has
a function prototype.

This weird situation will have an interesting effect to the AST, such as
the argument at the callsite will miss a cast, which would cast the
`int (*)[10]` expression into `int *`, which means that when we evaluate
the `*c = 42` expression, we want to bind `42` to an array, causing the
crash.

Look at the AST of the callsite with and without the function prototype:
https://godbolt.org/z/Gncebcbdb
The only difference is that without the proper function prototype, we
will not have the `ImplicitCastExpr` `BitCasting` from `int (*)[10]`
to `int *` to match the expected type of the parameter declaration.

In this patch, I'm proposing to emit a cast in the mentioned edge-case,
to bind the argument value of the expected type to the parameter.

I'm only proposing this if the runtime definition has exactly the same
number of parameters as the callsite feeds it by arguments.
If that's not the case, I believe, we are better off by binding `Unknown`
to those parameters.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D136162
2022-10-26 11:27:01 +02:00
Tomasz Kamiński
6194229c62 [analyzer] Make directly bounded LazyCompoundVal as lazily copied
Previously, `LazyCompoundVal` bindings to subregions referred to by
`LazyCompoundVals` were not marked as //lazily copied//.

This change returns `LazyCompoundVals` from `getInterestingValues()`,
so their regions can be marked as //lazily copied// in `RemoveDeadBindingsWorker::VisitBinding()`.

Depends on D134947

Authored by: Tomasz Kamiński <tomasz.kamiński@sonarsource.com>

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D135136
2022-10-19 16:06:32 +02:00
Tomasz Kamiński
a6b42040ad [analyzer] Fix the liveness of Symbols for values in regions referred by LazyCompoundVal
To illustrate our current understanding, let's start with the following program:
https://godbolt.org/z/33f6vheh1
```lang=c++
void clang_analyzer_printState();

struct C {
   int x;
   int y;
   int more_padding;
};

struct D {
   C c;
   int z;
};

C foo(D d, int new_x, int new_y) {
   d.c.x = new_x;       // B1
   assert(d.c.x < 13);  // C1

   C c = d.c;           // L

   assert(d.c.y < 10);  // C2
   assert(d.z < 5);     // C3

   d.c.y = new_y;       // B2

   assert(d.c.y < 10);  // C4

   return c;  // R
}
```
In the code, we create a few bindings to subregions of root region `d` (`B1`, `B2`), constraints on the values (`C1`, `C2`, ...), and a `LazyCompoundVal` for the part of the region `d` at point `L`, which is returned at point `R`.

Now, the question is which of these should remain live as long as the return value of the `foo` call is live. In a perfect world we should preserve:

  # only the bindings of the subregions of `d.c`, which were created before the copy at `L`. In our example, this includes `B1`, and not `B2`.  In other words, `new_x` should be live but `new_y` shouldn’t.

  # constraints on the values of `d.c` that are reachable through `c`. These can be created both before the point of making the copy (`L`) or after. In our case, that would be `C1` and `C2`, but not `C3` (the `d.z` value is not reachable through `c`) or `C4` (the original value of `d.c.y` was overridden at `B2` after the creation of `c`).

The current code in the `RegionStore` covers the use case (1), by using the `getInterestingValues()` to extract bindings to parts of the referred region present in the store at the point of copy. This also partially covers point (2), in case when constraints are applied to a location that has binding at the point of the copy (in our case `d.c.x` in `C1` that has value `new_x`), but it fails to preserve the constraints that require creating a new symbol for location (`d.c.y` in `C2`).

We introduce the concept of //lazily copied// locations (regions) to the `SymbolReaper`, i.e. locations for which a program can access the stored value, but not the address. These locations are constructed as a set of regions referred to by `LazyCompoundVal`. A //readable// location (region) is a location that is //live// or //lazily copied//. Symbols that refer to values in regions are alive if the region is //readable//.

For simplicity, we follow the current approach to live regions and mark the base region as //lazily copied//, and consider any subregions as //readable//. This makes some symbols falsely live (`d.z` in our example) and keeps the corresponding constraints alive.
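
The rule can be pictured with a toy model. This is only a sketch under stated assumptions: `ToyReaper` and its members are hypothetical, regions are plain strings, and nothing here mirrors the real `SymbolReaper` interface.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy stand-in for the reaper's bookkeeping; regions are plain strings here.
struct ToyReaper {
  std::set<std::string> Live;         // regions whose address is still reachable
  std::set<std::string> LazilyCopied; // base regions referred to by a lazy copy

  // A region is readable if it is live or lazily copied.
  bool isReadable(const std::string &BaseRegion) const {
    return Live.count(BaseRegion) != 0 || LazilyCopied.count(BaseRegion) != 0;
  }

  // A symbol for a value stored in some subregion stays alive as long as
  // the base region is readable (the simplification described above).
  bool isSymbolLive(const std::string &BaseRegionOfSymbol) const {
    return isReadable(BaseRegionOfSymbol);
  }
};
```

In this model, marking `d` as lazily copied keeps symbols for `d.c.x`, `d.c.y` and also `d.z` alive, which matches the over-approximation noted above.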

The rename of `Regions` to `LiveRegions` inside `RegionStore` is an NFC change that was done to make clear the difference between the regions stored in these two sets.

Regression Test: https://reviews.llvm.org/D134941
Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>

Reviewed By: martong, xazax.hun

Differential Revision: https://reviews.llvm.org/D134947
2022-10-19 16:06:32 +02:00
Kazu Hirata
08901c8a98 [clang] Use llvm::reverse (NFC) 2022-10-15 21:54:13 -07:00
Matheus Izvekov
bcd9ba2b7e [clang] Track the templated entity in type substitution.
This is a change to how we represent type substitution in the AST.
Instead of only storing the replaced type, we track the templated
entity we are substituting, plus an index.
We modify MLTAL to track the templated entity at each level.

Otherwise, it's much more expensive to go from the template parameter back
to the templated entity, and not possible to do in some cases, as when
we instantiate outer templates, parameters might still reference the
original entity.

This also allows us to very cheaply lookup the templated entity we saw in
the naming context and find the corresponding argument it was replaced
from, such as for implementing template specialization resugaring.

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>

Differential Revision: https://reviews.llvm.org/D131858
2022-10-15 22:08:36 +02:00
Balazs Benics
b062ee7dc4 [analyzer] Workaround crash on encountering Class non-type template parameters
The Clang Static Analyzer will crash on this code:
```lang=C++
struct Box {
  int value;
};
template <Box V> int get() {
  return V.value;
}
template int get<Box{-1}>();
```
https://godbolt.org/z/5Yb1sMMMb

The problem is that we don't account for encountering `TemplateParamObjectDecl`s
within the `DeclRefExpr` handler in the `ExprEngine`.

IMO we should create a new memregion for representing such template
param objects, to model their language semantics.
Such as:
 - it should have global static storage
 - for two identical values, their addresses should be identical as well
http://eel.is/c%2B%2Bdraft/temp.param#8

I was thinking of introducing a `TemplateParamObjectRegion` under `DeclRegion`
for this purpose. It could have `TemplateParamObjectDecl` as a field.

The `TemplateParamObjectDecl::getValue()` returns `APValue`, which might
represent multiple levels of structures, unions and other goodies -
making the transformation from `APValue` to `SVal` a bit complicated.

That being said, for now, I think having `Unknowns` for such cases is
definitely an improvement to crashing, hence I'm proposing this patch.

Reviewed By: xazax.hun

Differential Revision: https://reviews.llvm.org/D135763
2022-10-13 08:41:31 +02:00
Arseniy Zaostrovnykh
ec6da3fb9d Fix false positive related to handling of [[noreturn]] function pointers
Before this change, the `NoReturnFunctionChecker` was missing function pointers
with a `[[noreturn]]` attribute, while `CFG` was constructed taking that into
account, which leads CSA to take impossible paths. The reason was that the
`NoReturnFunctionChecker` was looking for the attribute in the type of the
entire call expression rather than the type of the function being called.

This change makes the `[[noreturn]]` attribute of a function pointer visible
to `NoReturnFunctionChecker`. This leads to a more coherent behavior of the
CSA on the AST involving.

Reviewed By: xazax.hun

Differential Revision: https://reviews.llvm.org/D135682
2022-10-12 14:46:32 +02:00
Soumi Manna
3b652fc6d6 [analyzer] Fix static code analysis concerns
ProcessMemberDtor(), ProcessDeleteDtor(), and ProcessAutomaticObjDtor():
Fix static analyzer warnings with suspicious dereference of pointer
'Pred' in function call before NULL checks - NFCI

Differential Revision: https://reviews.llvm.org/D135290
2022-10-07 16:58:37 +02:00
Bill Wendling
7404b855e5 [clang][NFC] Use enum for -fstrict-flex-arrays
Use enums for the strict flex arrays flag so that it's more readable.

Differential Revision: https://reviews.llvm.org/D135107
2022-10-06 10:45:41 -07:00
Argyrios Kyrtzidis
371883f46d [clang/Sema] Fix non-deterministic order for certain kind of diagnostics
In the context of caching clang invocations it is important to emit diagnostics in deterministic order;
the same clang invocation should result in the same diagnostic output.

rdar://100336989

Differential Revision: https://reviews.llvm.org/D135118
2022-10-05 12:58:01 -07:00
Tomasz Kamiński
4ff836a138 [analyzer] Pass correct bldrCtx to computeObjectUnderConstruction
In case when the prvalue is returned from the function (kind is one
of `SimpleReturnedValueKind`, `CXX17ElidedCopyReturnedValueKind`),
then its construction happens in the context of the caller.
We pass `BldrCtx` explicitly, as `currBldrCtx` will always refer to callee
context.

In the following example:
```
struct Result {int value; };
Result create() { return Result{10}; }
int accessValue(Result r) { return r.value; }

void test() {
   for (int i = 0; i < 2; ++i)
      accessValue(create());
}
```

In case when the returned object was constructed directly into the
argument to a function call `accessValue(create())`, this led to
inappropriate value of `blockCount` being used to locate parameter region,
and as a consequence resulting object (from `create()`) was constructed
into a different region, that was later read by inlined invocation of
outer function (`accessValue`).
This manifests itself only in case when calling block is visited more
than once (loop in above example), as otherwise there is no difference
in `blockCount` value between callee and caller context.
This happens only in case when copy elision is disabled (before C++17).

Reviewed By: NoQ

Differential Revision: https://reviews.llvm.org/D132030
2022-09-26 11:39:10 +02:00
Jan Korous
85d97aac80 [analyzer] Support implicit parameter 'self' in path note
showBRParamDiagnostics assumed stores happen only via function parameters while that
can also happen via implicit parameters like 'self' or 'this'.
The regression test caused a failed assert in the original cast to ParmVarDecl.

Differential Revision: https://reviews.llvm.org/D133815
2022-09-21 17:26:09 -07:00
isuckatcs
6931d311ea [analyzer] Cleanup some artifacts from non-POD array evaluation
Most of the state traits used for non-POD array evaluation were
only cleaned up if the ctors/dtors were inlined, since the cleanup
happened in ExprEngine::processCallExit(). This patch makes sure
they are removed even if said functions are not inlined.

Differential Revision: https://reviews.llvm.org/D133643
2022-09-17 22:46:27 +02:00
Kazu Hirata
8009d236e5 [clang] Don't include SetVector.h (NFC) 2022-09-17 13:36:13 -07:00
Balazs Benics
7cddf9cad1 [analyzer] Dump the environment entry kind as well
By this change the `exploded-graph-rewriter` will display the class kind
of the expression of the environment entry. It makes easier to decide if
the given entry corresponds to the lvalue or to the rvalue of some
expression.

It turns out the rewriter already had support for visualizing it, but
probably was never actually used?

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D132109
2022-09-13 09:04:27 +02:00
Balazs Benics
afcd862b2e [analyzer] LazyCompoundVals should be always bound as default bindings
`LazyCompoundVals` should only appear as `default` bindings in the
store. This fixes the second case in this patch-stack.

Depends on: D132142

Reviewed By: xazax.hun

Differential Revision: https://reviews.llvm.org/D132143
2022-09-13 08:58:46 +02:00
Balazs Benics
f8643a9b31 [analyzer] Prefer wrapping SymbolicRegions by ElementRegions
It turns out that in certain cases `SymbolicRegions` are wrapped by
`ElementRegions`; in others, they are not. This discrepancy can cause the
analyzer not to recognize that the two regions actually refer to
the same entity, which can then lead to unreachable paths being discovered.

Consider this example:

```lang=C++
struct Node { int* ptr; };
void with_structs(Node* n1) {
  Node c = *n1; // copy
  Node* n2 = &c;
  clang_analyzer_dump(*n1); // lazy...
  clang_analyzer_dump(*n2); // lazy...
  clang_analyzer_dump(n1->ptr); // rval(n1->ptr): reg_$2<int * SymRegion{reg_$0<struct Node * n1>}.ptr>
  clang_analyzer_dump(n2->ptr); // rval(n2->ptr): reg_$1<int * Element{SymRegion{reg_$0<struct Node * n1>},0 S64b,struct Node}.ptr>
  clang_analyzer_eval(n1->ptr != n2->ptr); // UNKNOWN, bad!
  (void)(*n1);
  (void)(*n2);
}
```

The copy of `n1` will insert a new binding to the store; but for doing
that it actually must create a `TypedValueRegion` which it could pass to
the `LazyCompoundVal`. Since the memregion in question is a
`SymbolicRegion` - which is untyped, it needs to first wrap it into an
`ElementRegion` basically implementing this untyped -> typed conversion
for the sake of passing it to the `LazyCompoundVal`.
So, this is why we have `Element{SymRegion{.}, 0,struct Node}` for `n1`.

The problem appears if the analyzer evaluates a read from the expression
`n1->ptr`. The same logic won't apply for `SymbolRegionValues`, since
they accept raw `SubRegions`, hence the `SymbolicRegion` won't be
wrapped into an `ElementRegion` in that case.

Later when we arrive at the equality comparison, we cannot prove that
they are equal.

For more details check the corresponding thread on discourse:
https://discourse.llvm.org/t/are-symbolicregions-really-untyped/64406

---

In this patch, I'm eagerly wrapping each `SymbolicRegion` by an
`ElementRegion`; basically canonicalizing to this form.
It seems reasonable to do so since any object can be thought of as a single
array of that object; so this should not make much of a difference.
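
The language-level intuition behind this can be sketched in plain C++ (an illustration only; the helper names are hypothetical and this is not analyzer code): for pointer arithmetic, a pointer to a single object behaves like a pointer to the first element of a one-element array.

```cpp
#include <cassert>

struct Node {
  int *ptr;
};

// Index 0 of a pointer to a single object designates that same object.
Node *firstElement(Node *n) { return &n[0]; }

// Forming the one-past-the-end pointer of a single object is allowed,
// just as for a one-element array (it must not be dereferenced).
bool hasOnePastEnd(Node *n) { return (n + 1) - n == 1; }
```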

The tests also underpin this assumption, as only a few were broken by
this change; and actually fixed a FIXME along the way.

About the second example, which does the same copy operation - but on
the heap - it will be fixed by the next patch.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D132142
2022-09-13 08:58:46 +02:00
isuckatcs
a11e51e91f [analyzer] Track trivial copy/move constructors and initializer lists in the BugReporter
If an object has a trivial copy/move constructor, it's not inlined
on invocation but a trivial copy is performed instead. This patch
handles trivial copies in the bug reporter by matching the field
regions of the 2 objects involved in the copy/move construction,
and tracking the appropriate region further. This patch also
introduces some support for tracking values in initializer lists.

Differential Revision: https://reviews.llvm.org/D131262
2022-09-05 17:06:27 +02:00
isuckatcs
a46154cb1c [analyzer] Warn if the size of the array in new[] is undefined
This patch introduces a new checker, called NewArraySize checker,
which detects if the expression that yields the element count of
the array in new[], results in an Undefined value.

Differential Revision: https://reviews.llvm.org/D131299
2022-09-04 23:06:58 +02:00
Kazu Hirata
b7a7aeee90 [clang] Qualify auto in range-based for loops (NFC) 2022-09-03 23:27:27 -07:00
isuckatcs
b5147937b2 [analyzer] Add more information to the Exploded Graph
This patch dumps every state trait in the egraph. Also
the empty state traits are no longer dumped, instead
they are treated as null by the egraph rewriter script,
which solves backward compatibility issues.

Differential Revision: https://reviews.llvm.org/D131187
2022-09-03 00:21:05 +02:00
Balázs Kéri
d56a1c6824 [clang][analyzer] Errno modeling code refactor (NFC).
Some of the code used in StdLibraryFunctionsChecker is applicable to
other checkers, this is put into common functions. Errno related
parts of the checker are simplified and renamed. Documentations in
errno_modeling functions are updated.

This change makes it possible to have more checkers that perform
modeling of some standard functions. These can set the errno state
with common functions and the bug report messages (note tags) can
look similar.

Reviewed By: steakhal, martong

Differential Revision: https://reviews.llvm.org/D131879
2022-09-01 09:05:59 +02:00
Martin Storsjö
efc76a1ac5 [analyzer] Silence GCC warnings about unused variables. NFC.
Use `isa<T>()` instead of `Type *Var = dyn_cast<T>()`
when the result of the cast isn't used.
2022-08-29 13:26:13 +03:00
ziqingluo-90
a5e354ec4d [analyzer] Fixing a bug raising false positives of stack block object
leaking in ARC mode

When ARC (automatic reference count) is enabled, (objective-c) block
objects are automatically retained and released thus they do not leak.
Without ARC, they still can leak from an expiring stack frame like
other stack variables.
With this commit, the static analyzer now puts a block object in an
"unknown" region if ARC is enabled because it is up to the
implementation to choose whether to put the object on stack initially
(then move to heap when needed) or in heap directly under ARC.
Therefore, the `StackAddrEscapeChecker` has no need to know
specifically about ARC at all and it will not report errors on objects
in "unknown" regions.

Reviewed By: NoQ (Artem Dergachev)

Differential Revision: https://reviews.llvm.org/D131009
2022-08-26 12:19:32 -07:00
isuckatcs
e3e9082b01 [analyzer] Fix for incorrect handling of 0 length non-POD array construction
Prior to this patch when the analyzer encountered a non-POD 0 length array,
it still invoked the constructor for 1 element, which led to false positives.
This patch makes sure that we no longer construct any elements when we see a
0 length array.
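
The run-time behaviour the analyzer should match can be checked with a small sketch (illustrative only; `Counted` and `constructorsRunFor` are hypothetical names, and a heap array is used because a zero-length local array is not standard C++):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical non-POD type that counts its constructor calls.
struct Counted {
  static int Constructed;
  Counted() { ++Constructed; }
  ~Counted() {}
};
int Counted::Constructed = 0;

// Allocates an array of `n` elements, frees it, and returns how many
// constructors ran for this allocation.
int constructorsRunFor(std::size_t n) {
  int before = Counted::Constructed;
  Counted *arr = new Counted[n];
  delete[] arr;
  return Counted::Constructed - before;
}
```

A call with `n == 0` runs no constructor at all, so any report that depends on a constructor body having run for such an array would be a false positive.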

Differential Revision: https://reviews.llvm.org/D131501
2022-08-25 12:42:02 +02:00
isuckatcs
aac73a31ad [analyzer] Process non-POD array element destructors
The constructors of non-POD array elements are evaluated under
certain conditions. This patch makes sure that in such cases
we also evaluate the destructors.
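
For context, a minimal sketch of the language behaviour being modelled (illustrative only; the names are hypothetical): every element of a non-POD array gets its destructor run when the array dies, with the last element destroyed first.

```cpp
#include <cassert>
#include <vector>

// Records the id of every destroyed element, in destruction order.
std::vector<int> Destroyed;

struct Tracked {
  int Id = 0;
  ~Tracked() { Destroyed.push_back(Id); }
};

// Creates a local array of three non-POD elements and lets it go out of scope.
void destroyLocalArray() {
  Tracked arr[3];
  for (int i = 0; i < 3; ++i)
    arr[i].Id = i;
} // elements are destroyed here, last element first
```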

Differential Revision: https://reviews.llvm.org/D130737
2022-08-24 01:28:21 +02:00
Fred Tingaud
16cb3be626 [analyzer] Deadstore static analysis: Fix false positive on C++17 assignments
Dead store detection automatically checks that an expression is a
CXXConstructor and skips it because of potential side effects. In C++17,
with guaranteed copy elision, this check can fail because we actually
receive the implicit cast of a CXXConstructor.
Most checks in the dead store analysis were already stripping all casts
and parenthesis and those that weren't were either forgotten (like the
constructor) or would not suffer from it, so this patch proposes to
factorize the stripping.
It has an impact on where the dead store warning is reported in the case
of an explicit cast, from

  auto a = static_cast<B>(A());
           ^~~~~~~~~~~~~~~~~~~

to

  auto a = static_cast<B>(A());
                          ^~~

which we think is an improvement.

Patch By: frederic-tingaud-sonarsource

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D126534
2022-08-23 18:33:26 +02:00
isuckatcs
c81bf940c7 [analyzer] Handling non-POD multidimensional arrays in ArrayInitLoopExpr
This patch makes it possible for lambdas, implicit copy/move ctors
and structured bindings to handle non-POD multidimensional arrays.
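
A small sketch of two of these constructs (illustrative only; the type and function names are hypothetical). Both copy a 2x3 array of a non-POD type element by element, so six copy constructors run in each case. Structured bindings are left out here because they need C++17.

```cpp
#include <cassert>

// Hypothetical non-POD type that counts copy constructions.
struct Elem {
  static int Copies;
  Elem() = default;
  Elem(const Elem &) { ++Copies; }
};
int Elem::Copies = 0;

struct Grid {
  Elem Cells[2][3]; // copied element by element by the implicit copy ctor
};

// Copies made by the implicit copy constructor of Grid.
int copiesViaImplicitCtor() {
  Grid g;
  int before = Elem::Copies;
  Grid copy = g;
  (void)copy;
  return Elem::Copies - before;
}

// Copies made when a lambda captures a 2D array by value.
int copiesViaLambdaCapture() {
  Elem cells[2][3];
  int before = Elem::Copies;
  // Bind the closure by reference so the closure object itself is not copied.
  auto &&l = [cells] { return &cells[1][2]; };
  (void)l;
  return Elem::Copies - before;
}
```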

Differential Revision: https://reviews.llvm.org/D131840
2022-08-22 13:53:53 +02:00
isuckatcs
3c482632e6 [analyzer] Remove pattern matching of lambda capture initializers
Prior to this patch we handled lambda captures based on their
initializer expression, which resulted in pattern matching. With
C++17 copy elision the initializer expression can be anything,
and this approach proved to be fragile and a source of crashes.
This patch removes pattern matching and only checks whether the
object is under construction or not.

Differential Revision: https://reviews.llvm.org/D131944
2022-08-22 13:00:31 +02:00
isuckatcs
a47ec1b797 [analyzer][NFC] Be more descriptive when we replay without inlining
This patch adds a ProgramPointTag to the EpsilonPoint created
before we replay a call without inlining.

Differential Revision: https://reviews.llvm.org/D132246
2022-08-19 18:05:52 +02:00
isuckatcs
b4e3e3a3eb [analyzer] Fix a crash on copy elided initialized lambda captures
Inside `ExprEngine::VisitLambdaExpr()` we weren't prepared for a
copy elided initialized capture's `InitExpr`. This patch teaches
the analyzer how to handle such a situation.

Differential Revision: https://reviews.llvm.org/D131784
2022-08-13 00:22:01 +02:00
Denys Petrov
adcd4b1c0b [analyzer] [NFC] Fix comments into more regular form. 2022-08-11 21:28:23 +03:00
malavikasamak
c74a204826 [analyzer] Fix false positive in use-after-move checker
Differential Revision: https://reviews.llvm.org/D131525
2022-08-09 17:26:30 -07:00
Fangrui Song
32197830ef [clang][clang-tools-extra] LLVM_NODISCARD => [[nodiscard]]. NFC 2022-08-09 07:11:18 +00:00
Fangrui Song
3f18f7c007 [clang] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
With C++17 there is no Clang pedantic warning or MSVC C5051.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D131346
2022-08-08 09:12:46 -07:00
Balázs Kéri
501faaa0d6 [clang][analyzer] Add more wide-character functions to CStringChecker
Support for functions wmempcpy, wmemmove, wmemcmp is added to the checker.
The same tests are copied that exist for the non-wide versions, with
non-wide functions and character types changed to the wide version.
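
For readers unfamiliar with the wide-character variants, a short usage sketch follows (illustrative only; the wrapper names are hypothetical). It uses the standard `wmemcpy`, `wmemmove` and `wmemcmp`; `wmempcpy` is left out because it is a GNU extension and may not be available everywhere. All counts are in wide characters, not bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Copies `n` wide characters between non-overlapping buffers.
void copyWide(wchar_t *dst, const wchar_t *src, std::size_t n) {
  std::wmemcpy(dst, src, n);
}

// Shifts the first `n` wide characters of `buf` right by one position;
// the ranges overlap, so wmemmove (not wmemcpy) is required.
void shiftRightByOne(wchar_t *buf, std::size_t n) {
  std::wmemmove(buf + 1, buf, n);
}

// Returns true if the first `n` wide characters of both buffers match.
bool sameWide(const wchar_t *a, const wchar_t *b, std::size_t n) {
  return std::wmemcmp(a, b, n) == 0;
}
```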

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D130470
2022-08-05 10:32:53 +02:00
Corentin Jabot
127bf44385 [Clang][C++20] Support capturing structured bindings in lambdas
This completes the implementation of P1091R3 and P1381R1.

This patch allows the capture of structured bindings
both for C++20+ and C++17, with extension/compat warning.

In addition, capturing an anonymous union member,
a bitfield, or a structured binding thereof now has a
better diagnostic.

We only support structured bindings - as opposed to other kinds
of structured statements/blocks. We still emit an error for those.

In addition, support for structured bindings capture is entirely disabled in
OpenMP mode as this needs more investigation - a specific diagnostic indicates the feature is not yet supported there.

Note that the rest of P1091R3 (static/thread_local structured bindings) was already implemented.

At the request of @shafik, I can confirm the correct behavior of lldb with this change.

Fixes https://github.com/llvm/llvm-project/issues/54300
Fixes https://github.com/llvm/llvm-project/issues/52720

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D122768
2022-08-04 10:12:53 +02:00
Corentin Jabot
a274219600 Revert "[Clang][C++20] Support capturing structured bindings in lambdas"
This reverts commit 44f2baa380.

Breaks self builds and seems to have conformance issues.
2022-08-03 21:00:29 +02:00
Corentin Jabot
44f2baa380 [Clang][C++20] Support capturing structured bindings in lambdas
This completes the implementation of P1091R3 and P1381R1.

This patch allows the capture of structured bindings
both for C++20+ and C++17, with extension/compat warning.

In addition, capturing an anonymous union member,
a bitfield, or a structured binding thereof now has a
better diagnostic.

We only support structured bindings - as opposed to other kinds
of structured statements/blocks. We still emit an error for those.

In addition, support for structured bindings capture is entirely disabled in
OpenMP mode as this needs more investigation - a specific diagnostic indicates the feature is not yet supported there.

Note that the rest of P1091R3 (static/thread_local structured bindings) was already implemented.

At the request of @shafik, I can confirm the correct behavior of lldb with this change.

Fixes https://github.com/llvm/llvm-project/issues/54300
Fixes https://github.com/llvm/llvm-project/issues/52720

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D122768
2022-08-03 20:00:01 +02:00
isuckatcs
10a7ee0bac [analyzer] Fix for the crash in #56873
In ExprEngine::bindReturnValue() we cast an SVal to DefinedOrUnknownSVal,
however this SVal can also be Undefined, which leads to an assertion failure.

Fixes: #56873

Differential Revision: https://reviews.llvm.org/D130974
2022-08-03 19:25:02 +02:00
Gabriel Ravier
5674a3c880 Fixed a number of typos
I went over the output of the following mess of a command:

(ulimit -m 2000000; ulimit -v 2000000; git ls-files -z |
 parallel --xargs -0 cat | aspell list --mode=none --ignore-case |
 grep -E '^[A-Za-z][a-z]*$' | sort | uniq -c | sort -n |
 grep -vE '.{25}' | aspell pipe -W3 | grep : | cut -d' ' -f2 | less)

and proceeded to spend a few days looking at it to find probable typos
and fixed a few hundred of them in all of the llvm project (note, the
ones I found are not anywhere near all of them, but it seems like a
good start).

Differential Revision: https://reviews.llvm.org/D130827
2022-08-01 13:13:18 -04:00
Matheus Izvekov
15f3cd6bfc [clang] Implement ElaboratedType sugaring for types written bare
Without this patch, clang will not wrap in an ElaboratedType node types written
without a keyword and nested name qualifier, which goes against the intent that
we should produce an AST which retains enough details to recover how things are
written.

The lack of this sugar is incompatible with the intent of the type printer
default policy, which is to print types as written, but to fall back and print
them fully qualified when they are desugared.

An ElaboratedTypeLoc without keyword / NNS uses no storage by itself, but still
requires pointer alignment due to pre-existing bug in the TypeLoc buffer
handling.

---

Troubleshooting list to deal with any breakage seen with this patch:

1) The most likely effect one would see by this patch is a change in how
   a type is printed. The type printer will, by design and default,
   print types as written. There are customization options there, but
   not that many, and they mainly apply to how to print a type that we
   somehow failed to track how it was written. This patch fixes a
   problem where we failed to distinguish between a type
   that was written without any elaborated-type qualifiers,
   such as 'struct'/'class' tags and name specifiers such as 'std::',
   and one that has been stripped of any 'metadata' that identifies such,
   the so called canonical types.
   Example:
   ```
   namespace foo {
     struct A {};
     A a;
   };
   ```
   If one were to print the type of `foo::a`, prior to this patch, this
   would result in `foo::A`. This is how the type printer would have,
   by default, printed the canonical type of A as well.
   As soon as you add any name qualifiers to A, the type printer would
   suddenly start accurately printing the type as written. This patch
   will make it print it accurately even when written without
   qualifiers, so we will just print `A` for the initial example, as
   the user did not really write that `foo::` namespace qualifier.

2) This patch could expose a bug in some AST matcher. Matching types
   is harder to get right when there is sugar involved. For example,
   if you want to match a type against being a pointer to some type A,
   then you have to account for getting a type that is sugar for a
   pointer to A, or being a pointer to sugar to A, or both! Usually
   you would get the second part wrong, and this would work for a
   very simple test where you don't use any name qualifiers, but
   you would discover it is broken when you do. The usual fix is to
   either use the matcher which strips sugar, which is annoying
   to use as for example if you match an N level pointer, you have
   to put N+1 such matchers in there, beginning to end and between
   all those levels. But in a lot of cases, if the property you want
   to match is present in the canonical type, it's easier and faster
   to just match on that... This goes with what is said in 1), if
   you want to match against the name of a type, and you want
   the name string to be something stable, perhaps matching on
   the name of the canonical type is the better choice.

3) This patch could expose a bug in how you get the source range of some
   TypeLoc. For some reason, a lot of code is using getLocalSourceRange(),
   which only looks at the given TypeLoc node. This patch introduces a new,
   and more common TypeLoc node which contains no source locations on itself.
   This is not an innovation here, and some other, more rare TypeLoc nodes could
   also have this property, but if you use getLocalSourceRange on them, it's not
   going to return any valid locations, because it doesn't have any. The right fix
   here is to always use getSourceRange() or getBeginLoc/getEndLoc which will dive
   into the inner TypeLoc to get the source range if it doesn't find it on the
   top level one. You can use getLocalSourceRange if you are really into
   micro-optimizations and you have some outside knowledge that the TypeLocs you are
   dealing with will always include some source location.

4) Exposed a bug somewhere in the use of the normal clang type class API, where you
   have some type, you want to see if that type is some particular kind, you try a
   `dyn_cast` such as `dyn_cast<TypedefType>` and that fails because now you have an
   ElaboratedType which has a TypedefType inside of it, which is what you wanted to match.
   Again, like 2), this would usually have been tested poorly with some simple tests with
   no qualifications, and would have been broken had there been any other kind of type sugar,
   be it an ElaboratedType or a TemplateSpecializationType or a SubstTemplateParmType.
   The usual fix here is to use `getAs` instead of `dyn_cast`, which will look deeper
   into the type. Or use `getAsAdjusted` when dealing with TypeLocs.
   For some reason the API is inconsistent there and on TypeLocs getAs behaves like a dyn_cast.

5) It could be a bug in this patch perhaps.
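
The difference described in 4) can be sketched with a toy model. Everything below (`ToyType`, `toyDynCast`, `toyGetAs`) is a hypothetical stand-in written for illustration and is not the clang API; it only assumes that a sugar node wraps an inner type, that a `dyn_cast`-style check inspects the outermost node, and that a `getAs`-style lookup walks through the sugar.

```cpp
#include <cassert>

// Toy stand-ins for clang's type nodes; none of this is the real clang API.
struct ToyType {
  enum Kind { Typedef, Elaborated } kind;
  const ToyType *inner; // non-null only for sugar nodes that wrap another type
};

// Mimics a plain dyn_cast: inspects only the outermost node.
inline const ToyType *toyDynCast(const ToyType *t, ToyType::Kind k) {
  return t->kind == k ? t : nullptr;
}

// Mimics getAs: peels sugar layers until the requested kind is found.
inline const ToyType *toyGetAs(const ToyType *t, ToyType::Kind k) {
  for (; t != nullptr; t = t->inner)
    if (t->kind == k)
      return t;
  return nullptr;
}

// Builds "Elaborated(Typedef)" and checks both lookup styles against it.
inline bool sugarDemo() {
  ToyType typedefNode{ToyType::Typedef, nullptr};
  ToyType elaborated{ToyType::Elaborated, &typedefNode};
  bool castMisses = toyDynCast(&elaborated, ToyType::Typedef) == nullptr;
  bool getAsFinds = toyGetAs(&elaborated, ToyType::Typedef) == &typedefNode;
  return castMisses && getAsFinds;
}
```

In this model the outermost-node check misses the wrapped typedef, while the sugar-walking lookup finds it, which is the failure mode the item describes.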

Let me know if you need any help!

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>

Differential Revision: https://reviews.llvm.org/D112374
2022-07-27 11:10:54 +02:00
Chuanqi Xu
5588985212 [NFC] Convert a dyn_cast<> to an isa<> 2022-07-27 13:56:38 +08:00
Balazs Benics
a80418eec0 [analyzer] Improve loads from reinterpret-cast fields
Consider this example:

```lang=C++
struct header {
  unsigned a : 1;
  unsigned b : 1;
};
struct parse_t {
  unsigned bits0 : 1;
  unsigned bits2 : 2; // <-- header
  unsigned bits4 : 4;
};
int parse(parse_t *p) {
  unsigned copy = p->bits2;
  clang_analyzer_dump(copy);
  // expected-warning@-1 {{reg_$1<unsigned int SymRegion{reg_$0<struct Bug_55934::parse_t * p>}.bits2>}}

  header *bits = (header *)&copy;
  clang_analyzer_dump(bits->b); // <--- Was UndefinedVal previously.
  // expected-warning@-1 {{derived_$2{reg_$1<unsigned int SymRegion{reg_$0<struct Bug_55934::parse_t * p>}.bits2>,Element{copy,0 S64b,struct Bug_55934::header}.b}}}
  return bits->b; // no-warning: it's not UndefinedVal
}
```

`bits->b` should have the same content as the second bit of `p->bits2`
(assuming that the bitfields are in spelling order).

---

The `Store` has the correct bindings. The problem is with the load of `bits->b`.
It will eventually reach `RegionStoreManager::getBindingForField()` with
`Element{copy,0 S64b,struct header}.b`, which is a `FieldRegion`.
It did not find any direct bindings, so the `getBindingForFieldOrElementCommon()`
gets called. That won't find any bindings, but it sees that the variable
is on the //stack//, thus it must be an uninitialized local variable;
thus it returns `UndefinedVal`.

Instead of doing this, it should have created a //derived symbol//
representing the slice of the region corresponding to the member.
So, if the value of `copy` is `reg1`, then the value of `bits->b` should
be `derived{reg1, elem{copy,0, header}.b}`.

Actually, the `getBindingForElement()` already does exactly this for
reinterpret-casts, so I decided to hoist that and reuse the logic.

Fixes #55934

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D128535
2022-07-26 12:31:21 +02:00
Benjamin Kramer
ad17e69923 [analyzer] Fix unused variable warning in release builds. NFC. 2022-07-26 11:29:38 +02:00
David Spickett
f3fbbe1cf3 [clang][analyzer][NFC] Use value_or instead of ValueOr
The latter is deprecated.
2022-07-26 09:16:45 +00:00
isuckatcs
a618d5e0dd [analyzer] Structured binding to tuple-like types
Introducing support for creating structured binding
to tuple-like types.

Differential Revision: https://reviews.llvm.org/D128837
2022-07-26 10:24:29 +02:00
isuckatcs
996b092c5e [analyzer] Lambda capture non-POD type array
This patch introduces a new `ConstructionContext` for
lambda capture. This `ConstructionContext` allows the
analyzer to construct the captured object directly into
its final region, and makes it possible to capture
non-POD arrays.

Differential Revision: https://reviews.llvm.org/D129967
2022-07-26 09:40:25 +02:00
isuckatcs
8a13326d18 [analyzer] ArrayInitLoopExpr with array of non-POD type
This patch introduces the evaluation of ArrayInitLoopExpr
in case of structured bindings and implicit copy/move
constructor. The idea is to call the copy constructor for
every element in the array. The parameter of the copy
constructor is also manually selected, as it is not a part
of the CFG.
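
As a rough illustration of the language behavior being modeled (not of the analyzer code itself), the sketch below counts copy-constructor calls when a class with an array member of non-POD type is copied implicitly. The names `Tracked`, `Holder` and `trackedCopies` are made up for this example.

```cpp
#include <cassert>

// Counts copy-constructor calls; the counter and the types are illustrative only.
static int trackedCopies = 0;

// Non-POD element type with a user-provided copy constructor.
struct Tracked {
  Tracked() = default;
  Tracked(const Tracked &) { ++trackedCopies; }
};

struct Holder {
  Tracked items[3]; // the implicit copy constructor copies this element by element
};

inline int countCopiesOnImplicitCopy() {
  trackedCopies = 0;
  Holder original;
  Holder duplicate = original; // implicit copy constructor
  (void)duplicate;
  return trackedCopies;
}
```

The copy constructor runs once per array element, which is the per-element loop the patch evaluates.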

Differential Revision: https://reviews.llvm.org/D129496
2022-07-26 09:07:22 +02:00
Kazu Hirata
3f3930a451 Remove redundant virtual specifiers (NFC)
Identified with tidy-modernize-use-override.
2022-07-25 23:00:59 -07:00
Kazu Hirata
ae002f8bca Use isa instead of dyn_cast (NFC) 2022-07-25 23:00:58 -07:00
Balázs Kéri
94ca2beccc [clang][analyzer] Added partial wide character support to CStringChecker
Support for functions wmemcpy, wcslen, wcsnlen is added to the checker.
Documentation and tests are updated and extended with the new functions.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D130091
2022-07-25 09:23:14 +02:00
Kazu Hirata
95a932fb15 Remove redundant override specifiers (NFC)
Identified with modernize-use-override.
2022-07-24 22:28:11 -07:00
Kazu Hirata
a210f404da [clang] Remove redundant virtual specifiers (NFC)
Identified with modernize-use-override.
2022-07-24 22:02:58 -07:00
Kazu Hirata
9e88cbcc40 Use any_of (NFC) 2022-07-24 14:48:11 -07:00
Denys Petrov
a364987368 [analyzer][NFC] Use SValVisitor instead of explicit helper functions
Summary: Get rid of explicit function splitting in favor of specifically designed Visitor. Move logic from a family of `evalCastKind` and `evalCastSubKind` helper functions to `SValVisitor`.

Differential Revision: https://reviews.llvm.org/D130029
2022-07-19 23:10:00 +03:00
serge-sans-paille
f764dc99b3 [clang] Introduce -fstrict-flex-arrays=<n> for stricter handling of flexible arrays
Some code [0] considers trailing arrays to be flexible, whatever their size.
Support for this legacy code was introduced in
f8f6324983 but it prevents evaluation of
__builtin_object_size and __builtin_dynamic_object_size in some legit cases.

Introduce -fstrict-flex-arrays=<n> to have stricter conformance when it is
desirable.

n = 0: current behavior, any trailing array member is a flexible array. The default.
n = 1: any trailing array member of undefined, 0 or 1 size is a flexible array member
n = 2: any trailing array member of undefined or 0 size is a flexible array member
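
The level table above can be restated as a small predicate. This is a sketch of the table only, under the assumption that the array is the trailing member of its struct; `treatedAsFlexibleArray` is a hypothetical helper, not a function in clang, and it does not model the macro or layout caveats.

```cpp
#include <cassert>
#include <optional>

// Hypothetical helper: models the level table above, not clang's implementation.
// `size` is the declared bound of a trailing array member; std::nullopt
// stands for an array declared without a bound, e.g. `char data[];`.
inline bool treatedAsFlexibleArray(int level, std::optional<unsigned> size) {
  switch (level) {
  case 0:
    return true; // any trailing array
  case 1:
    return !size || *size <= 1; // [], [0], [1]
  case 2:
    return !size || *size == 0; // [], [0]
  default:
    return false; // levels outside the table are not modeled here
  }
}
```

For example, a trailing `char data[1]` counts as flexible at levels 0 and 1 but not at level 2.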

This takes into account two clang-specific behaviors: an array bound spelled
as a macro identifier disqualifies a FAM, as does a non-standard layout.

A similar patch for gcc is discussed here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836

[0] https://docs.freebsd.org/en/books/developers-handbook/sockets/#sockets-essential-functions
2022-07-18 12:45:52 +02:00
Denys Petrov
bc08c3cb7f [analyzer] Add new function clang_analyzer_value to ExprInspectionChecker
Summary: Introduce a new function 'clang_analyzer_value'. It emits a report that in turn prints a RangeSet or APSInt associated with SVal. If there is no associated value, prints "n/a".
2022-07-15 20:07:04 +03:00
Denys Petrov
82f76c0477 [analyzer][NFC] Tidy up handler-functions in SymbolicRangeInferrer
Summary: Sorted some handler-functions into more appropriate visitor functions of the SymbolicRangeInferrer.
- Spread `getRangeForNegatedSub` body over several visitor functions: `VisitSymExpr`, `VisitSymIntExpr`, `VisitSymSymExpr`.
- Moved `getRangeForComparisonSymbol` from `infer` to `VisitSymSymExpr`.

Differential Revision: https://reviews.llvm.org/D129678
2022-07-15 19:24:57 +03:00
Fangrui Song
3c849d0aef Modernize Optional::{getValueOr,hasValue} 2022-07-15 01:20:39 -07:00
Jonas Devlieghere
888673b6e3
Revert "[clang] Implement ElaboratedType sugaring for types written bare"
This reverts commit 7c51f02eff because it
still breaks the LLDB tests. This was re-landed without addressing the
issue or even agreement on how to address the issue. More details and
discussion in https://reviews.llvm.org/D112374.
2022-07-14 21:17:48 -07:00
Matheus Izvekov
7c51f02eff
[clang] Implement ElaboratedType sugaring for types written bare
Without this patch, clang will not wrap in an ElaboratedType node types written
without a keyword and nested name qualifier, which goes against the intent that
we should produce an AST which retains enough details to recover how things are
written.

The lack of this sugar is incompatible with the intent of the type printer
default policy, which is to print types as written, but to fall back and print
them fully qualified when they are desugared.

An ElaboratedTypeLoc without keyword / NNS uses no storage by itself, but still
requires pointer alignment due to pre-existing bug in the TypeLoc buffer
handling.

---

Troubleshooting list to deal with any breakage seen with this patch:

1) The most likely effect one would see by this patch is a change in how
   a type is printed. The type printer will, by design and default,
   print types as written. There are customization options there, but
   not that many, and they mainly apply to how to print a type that we
   somehow failed to track how it was written. This patch fixes a
   problem where we failed to distinguish between a type
   that was written without any elaborated-type qualifiers,
   such as 'struct'/'class' tags and name specifiers such as 'std::',
   and one that has been stripped of any 'metadata' that identifies such,
   the so-called canonical types.
   Example:
   ```
   namespace foo {
     struct A {};
     A a;
   };
   ```
   If one were to print the type of `foo::a`, prior to this patch, this
   would result in `foo::A`. This is how the type printer would have,
   by default, printed the canonical type of A as well.
   As soon as you add any name qualifiers to A, the type printer would
   suddenly start accurately printing the type as written. This patch
   will make it print it accurately even when written without
   qualifiers, so we will just print `A` for the initial example, as
   the user did not really write that `foo::` namespace qualifier.

2) This patch could expose a bug in some AST matcher. Matching types
   is harder to get right when there is sugar involved. For example,
   if you want to match a type against being a pointer to some type A,
   then you have to account for getting a type that is sugar for a
   pointer to A, or being a pointer to sugar to A, or both! Usually
   you would get the second part wrong, and this would work for a
   very simple test where you don't use any name qualifiers, but
   you would discover it is broken when you do. The usual fix is to
   either use the matcher which strips sugar, which is annoying
   to use as for example if you match an N level pointer, you have
   to put N+1 such matchers in there, beginning to end and between
   all those levels. But in a lot of cases, if the property you want
   to match is present in the canonical type, it's easier and faster
   to just match on that... This goes with what is said in 1), if
   you want to match against the name of a type, and you want
   the name string to be something stable, perhaps matching on
   the name of the canonical type is the better choice.

3) This patch could expose a bug in how you get the source range of some
   TypeLoc. For some reason, a lot of code is using getLocalSourceRange(),
   which only looks at the given TypeLoc node. This patch introduces a new,
   and more common TypeLoc node which contains no source locations on itself.
   This is not an innovation here, and some other, more rare TypeLoc nodes could
   also have this property, but if you use getLocalSourceRange on them, it's not
   going to return any valid locations, because it doesn't have any. The right fix
   here is to always use getSourceRange() or getBeginLoc/getEndLoc which will dive
   into the inner TypeLoc to get the source range if it doesn't find it on the
   top level one. You can use getLocalSourceRange if you are really into
   micro-optimizations and you have some outside knowledge that the TypeLocs you are
   dealing with will always include some source location.

4) Exposed a bug somewhere in the use of the normal clang type class API, where you
   have some type, you want to see if that type is some particular kind, you try a
   `dyn_cast` such as `dyn_cast<TypedefType>` and that fails because now you have an
   ElaboratedType which has a TypedefType inside of it, which is what you wanted to match.
   Again, like 2), this would usually have been tested poorly with some simple tests with
   no qualifications, and would have been broken had there been any other kind of type sugar,
   be it an ElaboratedType or a TemplateSpecializationType or a SubstTemplateParmType.
   The usual fix here is to use `getAs` instead of `dyn_cast`, which will look deeper
   into the type. Or use `getAsAdjusted` when dealing with TypeLocs.
   For some reason the API is inconsistent there and on TypeLocs getAs behaves like a dyn_cast.

5) It could be a bug in this patch perhaps.

Let me know if you need any help!

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>

Differential Revision: https://reviews.llvm.org/D112374
2022-07-15 04:16:55 +02:00
isuckatcs
b032e3ff61 [analyzer] Evaluate construction of non-POD type arrays
Introducing the support for evaluating the constructor
of every element in an array. The idea is to record the
index of the current array member being constructed and
create a loop during the analysis. We loop over the
same CXXConstructExpr as many times as the array has
elements.

Differential Revision: https://reviews.llvm.org/D127973
2022-07-14 23:30:21 +02:00
Ella Ma
32fe1a4be9 [analyzer] Fixing SVal::getType returns Null Type for NonLoc::ConcreteInt in boolean type
In method `TypeRetrievingVisitor::VisitConcreteInt`, `ASTContext::getIntTypeForBitwidth` is used to get the type for `ConcreteInt`s.
However, the getter in ASTContext cannot handle the boolean type with the bit width of 1, which will make method `SVal::getType` return a Null `Type`.
In this patch, a check for this case is added to fix this problem by returning the bool type directly when the bit width is 1.

Differential Revision: https://reviews.llvm.org/D129737
2022-07-14 22:00:38 +08:00
Kazu Hirata
cb2c8f694d [clang] Use value instead of getValue (NFC) 2022-07-13 23:39:33 -07:00
einvbri
1d7e58cfad [analyzer] Fix use of length in CStringChecker
CStringChecker is using getByteLength to get the length of a string
literal. For targets where a "char" is 8-bits, getByteLength() and
getLength() will be equal for a C string, but for targets where a "char"
is 16-bits getByteLength() returns the size in octets.

This is verified in our downstream target, but we have no way to add a
test case for this case since there is no target supporting 16-bit
"char" upstream. Since this cannot have a test case, I'm asserting that
this change is "correct by construction", and it was visually inspected
to be correct by way of the following example where this was found.

The case showing that this fails on a target with 16-bit chars is below.
getByteLength() for the string literal returns 4, which fails when
checked against "char x[4]". With the change, the string literal is
evaluated to a size of 2, which is the correct number of "char"s for a
16-bit target.

```
void strcpy_no_overflow_2(char *y) {
  char x[4];
  strcpy(x, "12"); // with getByteLength(), returns 4 using 16-bit chars
}
```
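
The arithmetic behind this can be sketched with a toy model. `ToyStringLiteral` and `fitsInBuffer` are illustrative names only and are not clang's `StringLiteral` or the checker's code; the model simply assumes that the byte length is the character count scaled by the width of `char` in octets.

```cpp
#include <cassert>
#include <cstddef>

// Toy model of a narrow string literal on a target whose `char` may be wider
// than 8 bits. The accessor names echo the ones mentioned above, but this
// struct is illustrative only and is not clang's StringLiteral.
struct ToyStringLiteral {
  std::size_t numChars;      // characters, excluding the terminator
  std::size_t charWidthBits; // width of `char` on the target

  std::size_t getLength() const { return numChars; }
  // Size in 8-bit octets, which over-counts characters when char is 16 bits.
  std::size_t getByteLength() const { return numChars * (charWidthBits / 8); }
};

// strcpy needs room for every character plus the terminating null.
inline bool fitsInBuffer(std::size_t lengthInChars, std::size_t bufferChars) {
  return lengthInChars + 1 <= bufferChars;
}
```

With a 16-bit `char`, the literal "12" has a length of 2 but a byte length of 4, so only the length-based check accepts `char x[4]`.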

This change exposed that embedded nulls within the string are not
handled. This is documented as a FIXME for a future fix.

```
void strcpy_no_overflow_3(char *y) {
  char x[3];
  strcpy(x, "12\0");
}
```

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D129269
2022-07-13 19:19:23 -05:00
Jonas Devlieghere
3968936b92
Revert "[clang] Implement ElaboratedType sugaring for types written bare"
This reverts commit bdc6974f92 because it
breaks all the LLDB tests that import the std module.

  import-std-module/array.TestArrayFromStdModule.py
  import-std-module/deque-basic.TestDequeFromStdModule.py
  import-std-module/deque-dbg-info-content.TestDbgInfoContentDequeFromStdModule.py
  import-std-module/forward_list.TestForwardListFromStdModule.py
  import-std-module/forward_list-dbg-info-content.TestDbgInfoContentForwardListFromStdModule.py
  import-std-module/list.TestListFromStdModule.py
  import-std-module/list-dbg-info-content.TestDbgInfoContentListFromStdModule.py
  import-std-module/queue.TestQueueFromStdModule.py
  import-std-module/stack.TestStackFromStdModule.py
  import-std-module/vector.TestVectorFromStdModule.py
  import-std-module/vector-bool.TestVectorBoolFromStdModule.py
  import-std-module/vector-dbg-info-content.TestDbgInfoContentVectorFromStdModule.py
  import-std-module/vector-of-vectors.TestVectorOfVectorsFromStdModule.py

https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/45301/
2022-07-13 09:20:30 -07:00
Kazu Hirata
53daa177f8 [clang, clang-tools-extra] Use has_value instead of hasValue (NFC) 2022-07-12 22:47:41 -07:00
Matheus Izvekov
bdc6974f92
[clang] Implement ElaboratedType sugaring for types written bare
Without this patch, clang will not wrap in an ElaboratedType node types written
without a keyword and nested name qualifier, which goes against the intent that
we should produce an AST which retains enough details to recover how things are
written.

The lack of this sugar is incompatible with the intent of the type printer
default policy, which is to print types as written, but to fall back and print
them fully qualified when they are desugared.

An ElaboratedTypeLoc without keyword / NNS uses no storage by itself, but still
requires pointer alignment due to pre-existing bug in the TypeLoc buffer
handling.

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>

Differential Revision: https://reviews.llvm.org/D112374
2022-07-13 02:10:09 +02:00
Gabor Marton
2df120784a [analyzer] Fix assertion in simplifySymbolCast
Depends on D128068.
Added a new test code that fails an assertion in the baseline.
That is because `getAPSIntType` works only with integral types.

Differential Revision: https://reviews.llvm.org/D126779
2022-07-05 19:00:23 +02:00
Gabor Marton
5d7fa481cf [analyzer] Do not emit redundant SymbolCasts
In `RegionStore::getBinding` we call `evalCast` unconditionally to align
the stored value's type to the one that is being queried. However, the
stored type might be the same, so we may end up having redundant
`SymbolCasts` emitted.

The solution is to check whether the `to` and `from` type are the same
in `makeNonLoc`.

Note, we can't just do a type equivalence check at the beginning of `evalCast`
because when `evalCast` is called from `getBinding` then the original type
(`OriginalTy`) is not set, so one operand is missing for the comparison. In
`evalCastSubKind(nonloc::SymbolVal)` when the original type is not set,
we get the `from` type via `SymbolVal::getType()`.

Differential Revision: https://reviews.llvm.org/D128068
2022-07-05 18:42:34 +02:00
Fazlay Rabbi
38bcd483dd [OpenMP] Initial parsing and semantic support for 'parallel masked taskloop simd' construct
This patch gives basic parsing and semantic support for
"parallel masked taskloop simd" construct introduced in
OpenMP 5.1 (section 2.16.10)

Differential Revision: https://reviews.llvm.org/D128946
2022-07-01 08:57:15 -07:00
Fazlay Rabbi
d64ba896d3 [OpenMP] Initial parsing and sema support for 'parallel masked taskloop' construct
This patch gives basic parsing and semantic support for
"parallel masked taskloop" construct introduced in
OpenMP 5.1 (section 2.16.9)

Differential Revision: https://reviews.llvm.org/D128834
2022-06-30 11:44:17 -07:00
Corentin Jabot
64ab2b1dcc Improve handling of static assert messages.
Instead of dumping the string literal (which
quotes it and escapes every non-ASCII symbol),
we can use the content of the string when it is an
8 byte string.

Wide, UTF-8/UTF-16/32 strings are still completely
escaped, until we clarify how these entities should
behave (cf https://wg21.link/p2361).

`FormatDiagnostic` is modified to escape
non-printable characters and invalid UTF-8.

This ensures that unicode characters, spaces and new
lines are properly rendered in static messages.
This makes clang more consistent with other implementations
and fixes this tweet
https://twitter.com/jfbastien/status/1298307325443231744 :)

Of note, `PaddingChecker` did print out new lines that were
later removed by the diagnostic printing code.
To be consistent with its tests, the new lines are removed
from the diagnostic.

Unicode tables updated to both use the Unicode definitions
and the Unicode 14.0 data.

U+00AD SOFT HYPHEN is still considered a print character
to match existing practices in terminals, in addition to
being considered a formatting character as per Unicode.

Reviewed By: aaron.ballman, #clang-language-wg

Differential Revision: https://reviews.llvm.org/D108469
2022-06-29 14:57:35 +02:00
isuckatcs
9d2e830737 [analyzer] Fix BindingDecl evaluation for reference types
The case when the bound variable is a reference type in a
BindingDecl wasn't handled, which led to false positives.

Differential Revision: https://reviews.llvm.org/D128716
2022-06-29 13:01:19 +02:00
Fazlay Rabbi
73e5d7bdff [OpenMP] Initial parsing and sema support for 'masked taskloop simd' construct
This patch gives basic parsing and semantic support for
"masked taskloop simd" construct introduced in OpenMP 5.1 (section 2.16.8)

Differential Revision: https://reviews.llvm.org/D128693
2022-06-28 15:27:49 -07:00
Corentin Jabot
a774ba7f60 Revert "Improve handling of static assert messages."
This reverts commit 870b6d2183.

This seems to break some libc++ tests, reverting while investigating
2022-06-29 00:03:23 +02:00
Corentin Jabot
870b6d2183 Improve handling of static assert messages.
Instead of dumping the string literal (which
quotes it and escapes every non-ASCII symbol),
we can use the content of the string when it is an
8 byte string.

Wide, UTF-8/UTF-16/32 strings are still completely
escaped, until we clarify how these entities should
behave (cf https://wg21.link/p2361).

`FormatDiagnostic` is modified to escape
non-printable characters and invalid UTF-8.

This ensures that unicode characters, spaces and new
lines are properly rendered in static messages.
This makes clang more consistent with other implementations
and fixes this tweet
https://twitter.com/jfbastien/status/1298307325443231744 :)

Of note, `PaddingChecker` did print out new lines that were
later removed by the diagnostic printing code.
To be consistent with its tests, the new lines are removed
from the diagnostic.

Unicode tables updated to both use the Unicode definitions
and the Unicode 14.0 data.

U+00AD SOFT HYPHEN is still considered a print character
to match existing practices in terminals, in addition to
being considered a formatting character as per Unicode.

Reviewed By: aaron.ballman, #clang-language-wg

Differential Revision: https://reviews.llvm.org/D108469
2022-06-28 22:26:00 +02:00
Vitaly Buka
cdfa15da94 Revert "[clang] Introduce -fstrict-flex-arrays=<n> for stricter handling of flexible arrays"
This reverts D126864 and related fixes.

This reverts commit 572b08790a.
This reverts commit 886715af96.
2022-06-27 14:03:09 -07:00
Kazu Hirata
97afce08cb [clang] Don't use Optional::hasValue (NFC)
This patch replaces Optional::hasValue with the implicit cast to bool
in conditionals only.
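
A minimal sketch of the pattern, shown here with `std::optional`, which has the same contextual conversion to bool. `findPort` and `portOrDefault` are hypothetical functions invented for the example.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical lookup used only for illustration.
inline std::optional<int> findPort(const std::string &service) {
  if (service == "http")
    return 80;
  return std::nullopt;
}

inline int portOrDefault(const std::string &service) {
  // The optional converts to bool in the condition, so no explicit
  // hasValue()/has_value() call is needed.
  if (std::optional<int> port = findPort(service))
    return *port;
  return -1;
}
```

The conversion is explicit, so it applies in conditions such as `if` and `while` but not in ordinary assignments to `bool`, which may be why the patch limits itself to conditionals.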
2022-06-25 22:26:24 -07:00
Kazu Hirata
3b7c3a654c Revert "Don't use Optional::hasValue (NFC)"
This reverts commit aa8feeefd3.
2022-06-25 11:56:50 -07:00
Kazu Hirata
aa8feeefd3 Don't use Optional::hasValue (NFC) 2022-06-25 11:55:57 -07:00
Fazlay Rabbi
42bb88e2aa [OpenMP] Initial parsing and sema support for 'masked taskloop' construct
This patch gives basic parsing and semantic support for "masked taskloop"
construct introduced in OpenMP 5.1 (section 2.16.7)

Differential Revision: https://reviews.llvm.org/D128478
2022-06-24 10:00:08 -07:00
serge-sans-paille
886715af96 [clang] Introduce -fstrict-flex-arrays=<n> for stricter handling of flexible arrays
Some code [0] considers trailing arrays to be flexible, whatever their size.
Support for this legacy code was introduced in
f8f6324983 but it prevents evaluation of
__builtin_object_size and __builtin_dynamic_object_size in some legit cases.

Introduce -fstrict-flex-arrays=<n> to have stricter conformance when it is
desirable.

n = 0: current behavior, any trailing array member is a flexible array. The default.
n = 1: any trailing array member of undefined, 0 or 1 size is a flexible array member
n = 2: any trailing array member of undefined or 0 size is a flexible array member
n = 3: any trailing array member of undefined size is a flexible array member (strict c99 conformance)

A similar patch for gcc is discussed here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836

[0] https://docs.freebsd.org/en/books/developers-handbook/sockets/#sockets-essential-functions
2022-06-24 16:13:29 +02:00
isuckatcs
8ef628088b [analyzer] Structured binding to arrays
Introducing structured binding to data members and more.
To handle binding to arrays, ArrayInitLoopExpr is also
evaluated, which enables the analyzer to store information
in two more cases. These are:
  - when a lambda-expression captures an array by value
  - in the implicit copy/move constructor for a class
    with an array member
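
For reference, the language-level behavior behind the array cases is that the whole array is copied element by element. The sketch below shows this for a by-value structured binding and a by-value lambda capture; it illustrates the C++17 semantics only, not the analyzer's modeling, and the function names are made up.

```cpp
#include <cassert>

// Binding by value copies the array, so the names refer to the copy's elements.
inline int bindArrayByValue() {
  int values[2] = {1, 2};
  auto [first, second] = values; // copies `values` element by element
  first = 10;                    // modifies the hidden copy only
  return values[0] + second;     // 1 + 2
}

// A by-value lambda capture also copies the whole array into the closure.
inline int captureArrayByValue() {
  int values[2] = {1, 2};
  auto sum = [values] { return values[0] + values[1]; };
  values[0] = 100; // not visible to the closure's copy
  return sum();    // 1 + 2
}
```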

Differential Revision: https://reviews.llvm.org/D126613
2022-06-23 11:38:21 +02:00
Balázs Kéri
7dc81c6244 [clang][analyzer] Fix StdLibraryFunctionsChecker 'mkdir' return value.
The functions 'mkdir', 'mknod', 'mkdirat', 'mknodat' return 0 on success
and -1 on failure. The checker modeled these functions with a >= 0
return value on success which is changed to 0 only. This fix makes
ErrnoChecker work better for these functions.
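
A caller-side sketch of the return convention, assuming a POSIX system. `makeDirectory` is a hypothetical wrapper written for this example and is not part of the checker.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/stat.h>
#include <sys/types.h>

// Returns 0 on success, otherwise the errno value reported by mkdir.
// POSIX-only; the helper name is made up for this example.
inline int makeDirectory(const char *path) {
  errno = 0;
  if (mkdir(path, 0700) == 0) // success is exactly 0, never a positive value
    return 0;
  return errno; // mkdir returned -1 and set errno
}
```

Code that tests `mkdir(...) >= 0` happens to work, but comparing against 0 matches the documented contract and is the condition the corrected model uses for the success branch.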

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D127277
2022-06-23 11:27:26 +02:00
Balázs Kéri
957014da2d [clang][Analyzer] Add errno state to standard functions modeling.
This updates StdLibraryFunctionsChecker to set the state of 'errno'
by using the new errno_modeling functionality.
The errno value is set in the PostCall callback. Setting it in call::Eval
did not work for some reason and then every function should be
EvalCallAsPure which may be bad to do. Now the errno value and state
is not allowed to be checked in any PostCall checker callback because
it is unspecified if the errno was set already or will be set later
by this checker.

Reviewed By: martong, steakhal

Differential Revision: https://reviews.llvm.org/D125400
2022-06-21 08:56:41 +02:00
Kazu Hirata
ca4af13e48 [clang] Don't use Optional::getValue (NFC) 2022-06-20 22:59:26 -07:00
Kazu Hirata
0916d96d12 Don't use Optional::hasValue (NFC) 2022-06-20 20:17:57 -07:00
Kazu Hirata
064a08cd95 Don't use Optional::hasValue (NFC) 2022-06-20 20:05:16 -07:00
Kazu Hirata
5413bf1bac Don't use Optional::hasValue (NFC) 2022-06-20 11:33:56 -07:00
Kazu Hirata
452db157c9 [clang] Don't use Optional::hasValue (NFC) 2022-06-20 10:51:34 -07:00
Balázs Kéri
60f3b07118 [clang][analyzer] Add checker for bad use of 'errno'.
Extend checker 'ErrnoModeling' with a state of 'errno' to indicate
the importance of the 'errno' value and how it should be used.
Add a new checker 'ErrnoChecker' that observes use of 'errno' and
finds possible wrong uses, based on the "errno state".
The "errno state" should be set (together with value of 'errno')
by other checkers (that perform modeling of the given function)
in the future. Currently only a test function can set this value.
The new checker has no user-observable effect yet.

Reviewed By: martong, steakhal

Differential Revision: https://reviews.llvm.org/D122150
2022-06-20 10:07:31 +02:00
Kazu Hirata
06decd0b41 [clang] Use value_or instead of getValueOr (NFC) 2022-06-18 23:21:34 -07:00
isuckatcs
e77ac66b8c [Static Analyzer] Structured binding to data members
Introducing structured binding to data members.

Differential Revision: https://reviews.llvm.org/D127643
2022-06-17 19:50:10 +02:00
isuckatcs
92bf652d40 [Static Analyzer] Small array binding policy
If a lazyCompoundVal to a struct is bound to the store, there is a policy which decides
whether a copy gets created instead.

This patch introduces a similar policy for arrays, which is required to model structured
binding to arrays without false negatives.

Differential Revision: https://reviews.llvm.org/D128064
2022-06-17 18:56:13 +02:00
Jennifer Yu
bb83f8e70b [OpenMP] Initial parsing and sema for 'parallel masked' construct
Differential Revision: https://reviews.llvm.org/D127454
2022-06-16 18:01:15 -07:00
Balazs Benics
929e60b6bd [analyzer] Relax constraints on const qualified regions
The arithmetic restriction seems to be artificial.
The comment below seems to be stale.
Thus, we remove both.

Depends on D127306.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D127763
2022-06-15 17:08:27 +02:00
Balazs Benics
f4fc3f6ba3 [analyzer] Treat system globals as mutable if they are not const
Previously, system globals were treated as immutable regions, unless it
was the `errno` which is known to be frequently modified.

D124244 wants to add a check for stores to immutable regions.
It would basically turn all stores to system globals into an error even
though we have no reason to believe that those mutable sys globals
should be treated as if they were immutable. And this leads to
false-positives if we apply D124244.

In this patch, I'm proposing to treat mutable sys globals as actually
mutable, hence allocate them into the `GlobalSystemSpaceRegion`, UNLESS
they were declared as `const` (and a primitive arithmetic type), in
which case, we should use `GlobalImmutableSpaceRegion`.

In any other cases, I'm using the `GlobalInternalSpaceRegion`, which is
no different than the previous behavior.
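The region choice described above can be sketched roughly as follows. This is only an illustration: `GlobalSpace`, `GlobalInfo` and `classifyGlobal` are hypothetical names, not the analyzer's API, and it assumes that "any other cases" means globals that are not declared in a system header.

```cpp
#include <cassert>

// Hypothetical stand-ins for the three memory spaces named above.
enum class GlobalSpace { System, Immutable, Internal };

// Hypothetical summary of the properties the decision depends on.
struct GlobalInfo {
  bool InSystemHeader; // declared in a system header?
  bool IsConst;        // declared `const`?
  bool IsArithmetic;   // primitive arithmetic type?
};

GlobalSpace classifyGlobal(const GlobalInfo &G) {
  if (!G.InSystemHeader)
    return GlobalSpace::Internal;  // assumed: non-system globals are unchanged
  if (G.IsConst && G.IsArithmetic)
    return GlobalSpace::Immutable; // const system globals stay immutable
  return GlobalSpace::System;      // mutable system globals can be invalidated
}
```

Under this reading, only the last branch differs from the old behavior, which matches the two changed `expected-warning` lines shown below.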

---

In the tests I added, only the last `expected-warning` was different, compared to the baseline.
Which is this:
```lang=C++
void test_my_mutable_system_global_constraint() {
  assert(my_mutable_system_global > 2);
  clang_analyzer_eval(my_mutable_system_global > 2); // expected-warning {{TRUE}}
  invalidate_globals();
  clang_analyzer_eval(my_mutable_system_global > 2); // expected-warning {{UNKNOWN}} It was previously TRUE.
}
void test_my_mutable_system_global_assign(int x) {
  my_mutable_system_global = x;
  clang_analyzer_eval(my_mutable_system_global == x); // expected-warning {{TRUE}}
  invalidate_globals();
  clang_analyzer_eval(my_mutable_system_global == x); // expected-warning {{UNKNOWN}} It was previously TRUE.
}
```

---

Unfortunately, the taint checker will be also affected.
The `stdin` global variable is a pointer, which is assumed to be a taint
source, and the rest of the taint propagation rules will propagate from
it.
However, since mutable variables are no longer treated as immutable, they
also get invalidated, when an opaque function call happens, such as the
first `scanf(stdin, ...)`. This would effectively remove taint from the
pointer, consequently disable all the rest of the taint propagations
down the line from the `stdin` variable.

All that said, I decided to look through `DerivedSymbol`s as well, to
acquire the memregion in that case as well. This should preserve the
previously existing taint reports.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D127306
2022-06-15 17:08:27 +02:00
Balazs Benics
96ccb690a0 [analyzer][NFC] Prefer using isa<> instead getAs<> in conditions
Depends on D125709

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D127742
2022-06-15 16:58:13 +02:00
Balazs Benics
481f860324 [analyzer][NFC] Remove dead field of UnixAPICheckers
Initially, I thought there is some fundamental bug here by not using the
bool fields, but it turns out D55425 split this checker into two
separate ones; making these fields dead.

Depends on D127836, which uncovered this issue.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D127838
2022-06-15 16:50:12 +02:00
Balazs Benics
6c4f9998ae [analyzer] Fix StreamErrorState hash bug
The `Profile` function was incorrectly implemented.
The `StreamErrorState` has an implicit `bool` conversion operator, which
will result in a different hash than faithfully hashing the raw value of
the enum.
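A minimal sketch of this kind of bug, assuming a small flag struct with an implicit `bool` conversion. `ErrorState`, `profileKeyBuggy` and `profileKeyFixed` are hypothetical names and do not mirror the real `StreamErrorState` or `Profile` code.

```cpp
#include <cassert>

// Hypothetical miniature of an error-flag set.
struct ErrorState {
  bool NoError;
  bool FEof;
  bool FError;

  // Implicit conversion answering "is any flag set?".
  operator bool() const { return NoError || FEof || FError; }

  // Faithful encoding of all three flags.
  unsigned raw() const {
    return (NoError ? 1u : 0u) | (FEof ? 2u : 0u) | (FError ? 4u : 0u);
  }
};

// Buggy: the object silently converts through `bool`, so every non-empty
// state contributes the same key (1) to the hash.
unsigned profileKeyBuggy(const ErrorState &S) { return S; }

// Fixed: distinct states contribute distinct keys.
unsigned profileKeyFixed(const ErrorState &S) { return S.raw(); }
```

With the buggy key, an EOF-only state and an error-only state are indistinguishable to the hasher, while the fixed key keeps them apart.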

I don't have a test for it, since it seems difficult to find one.
Even if we would have one, any change in the hashing algorithm would
have a chance of breaking it, so I don't think it would justify the
effort.

Depends on D127836, which uncovered this issue by marking the related
`Profile` function dead.

Reviewed By: martong, balazske

Differential Revision: https://reviews.llvm.org/D127839
2022-06-15 16:50:12 +02:00
Balazs Benics
f1b18a79b7 [analyzer][NFC] Remove dead code and modernize surroundings
Thanks @kazu for helping me clean these parts in D127799.

I'm leaving the dump methods, along with the unused visitor handlers and
the forwarding methods.

The dead parts actually helped to uncover two bugs, to which I'm going
to post separate patches.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D127836
2022-06-15 16:50:12 +02:00
Balazs Benics
40940fb2a6 [analyzer][NFC] Substitute the SVal::evalMinus and evalComplement functions
Depends on D126127

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D127734
2022-06-14 18:56:43 +02:00
Balazs Benics
cfc915149c [analyzer][NFC] Relocate unary transfer functions
This is an initial step of removing the SimpleSValBuilder abstraction. The SValBuilder alone should be enough.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D126127
2022-06-14 18:56:43 +02:00
Balazs Benics
de6ba9704d [analyzer][Casting] Support isa, cast, dyn_cast of SVals
This change specializes the LLVM RTTI mechanism for SVals.
After this change, we can use the well-known `isa`, `cast`, `dyn_cast`.

Examples:

  // SVal V = ...;
  // Loc MyLoc = ...;

  bool IsInteresting = isa<loc::MemRegionVal, loc::GotoLabel>(MyLoc);
  auto MRV = cast<loc::MemRegionVal>(MyLoc);
  Optional<loc::MemRegionVal> MaybeMRV = dyn_cast<loc::MemRegionVal>(V);

The current `SVal::getAs` and `castAs` member functions are redundant at
this point, but I believe that they are still handy.

The member function version is terse and reads left-to-right, which IMO
is a great plus. However, we should probably add a variadic `isa` member
function version to have the same casting API in both cases.

Thanks for the extensive TMP help @bzcheeseman!

Reviewed By: bzcheeseman

Differential Revision: https://reviews.llvm.org/D125709
2022-06-14 13:43:04 +02:00
Balazs Benics
ffe7950ebc Reland "[analyzer] Deprecate -analyzer-store region flag"
I'm trying to remove unused options from the `Analyses.def` file, then
merge the rest of the useful options into the `AnalyzerOptions.def`.
Then make sure one can set these by an `-analyzer-config XXX=YYY` style
flag.
Then surface the `-analyzer-config` to the `clang` frontend.

After all of this, we can pursue the tablegen approach described at
https://discourse.llvm.org/t/rfc-tablegen-clang-static-analyzer-engine-options-for-better-documentation/61488

In this patch, I'm proposing flag deprecations.
We should support deprecated analyzer flags for exactly one release. In
this case I'm planning to drop this flag in `clang-16`.

In the clang frontend, we now won't pass this option to the cc1
frontend, but rather emit a warning diagnostic reminding the users about
this deprecated flag, which will be turned into an error in clang-16.

Unfortunately, I had to remove all the tests referring to this flag,
causing a mass change. I've also added a test for checking this warning.

I've seen that `scan-build` also uses this flag, but I think we should
remove that part only after we turn this into a hard error.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D126215
2022-06-14 09:20:41 +02:00
Kazu Hirata
f13019f836 [clang] Use any_of and none_of (NFC) 2022-06-12 10:17:12 -07:00
Kazu Hirata
f5ef2c5838 [clang] Convert for_each to range-based for loops (NFC) 2022-06-10 22:39:45 -07:00
Nico Weber
8406839d19 Revert "[analyzer] Deprecate -analyzer-store region flag"
This reverts commit d50d9946d1.
Broke check-clang, see comments on https://reviews.llvm.org/D126067

Also revert dependent change "[analyzer] Deprecate the unused 'analyzer-opt-analyze-nested-blocks' cc1 flag"
This reverts commit 07b4a6d046.

Also revert "[analyzer] Fix buildbots after introducing a new frontend warning"
This reverts commit 90374df15d.
(See https://reviews.llvm.org/rG90374df15ddc58d823ca42326a76f58e748f20eb)
2022-06-10 08:50:13 -04:00
Balazs Benics
b73c2280f5 [analyzer][NFC] Remove unused RegionStoreFeatures
Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D126216
2022-06-10 13:02:26 +02:00
Balazs Benics
d50d9946d1 [analyzer] Deprecate -analyzer-store region flag
I'm trying to remove unused options from the `Analyses.def` file, then
merge the rest of the useful options into the `AnalyzerOptions.def`.
Then make sure one can set these by an `-analyzer-config XXX=YYY` style
flag.
Then surface the `-analyzer-config` to the `clang` frontend.

After all of this, we can pursue the tablegen approach described at
https://discourse.llvm.org/t/rfc-tablegen-clang-static-analyzer-engine-options-for-better-documentation/61488

In this patch, I'm proposing flag deprecations.
We should support deprecated analyzer flags for exactly one release. In
this case I'm planning to drop this flag in `clang-16`.

In the clang frontend, we now won't pass this option to the cc1
frontend, but rather emit a warning diagnostic reminding the users about
this deprecated flag, which will be turned into an error in clang-16.

Unfortunately, I had to remove all the tests referring to this flag,
causing a mass change. I've also added a test for checking this warning.

I've seen that `scan-build` also uses this flag, but I think we should
remove that part only after we turn this into a hard error.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D126215
2022-06-10 12:57:15 +02:00
Balazs Benics
07a7fd314a [analyzer] Print the offending function at EndAnalysis crash
I've faced crashes in the past multiple times when some
`check::EndAnalysis` callback caused a crash.
It's really annoying that it doesn't tell which function triggered this
callback.

This patch adds the well-known trace for that situation as well.
Example:
  1.      <eof> parser at end of file
  2.      While analyzing stack:
          #0 Calling test11

Note that this does not have tests.
I've considered `unittests` for this purpose, using `ASSERT_DEATH()`
similarly to how we check double-eval called functions in
`ConflictingEvalCallsTest.cpp`; however, that testsuite won't invoke
the custom handlers. Only the message of the `llvm_unreachable()` will
be printed. Consequently, it's not applicable for testing this
feature.

I've also considered using an end-to-end LIT test for this.
For that, we would need to somehow overload the `clang_analyzer_crash()`
`ExprInspection` handler, to get triggered by other events than the
`EvalCall`. I'm not saying that we could not come up with a generic way
of causing a crash in a specific checker callback, but I'm not sure if
that would be worth the effort.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D127389
2022-06-10 12:21:17 +02:00
Gabor Marton
bc2c759aee [analyzer] Fix assertion failure after getKnownValue call
Depends on D126560. `getKnownValue` has been changed by the parent patch
in a way that simplification was removed. This is not correct when the
function is called by the Checkers. Thus, a new internal function is
introduced, `getConstValue`, which simply queries the constraint manager.
This `getConstValue` is used internally in the `SimpleSValBuilder` when a
binop is evaluated, this way we avoid the recursion into the `Simplifier`.

Differential Revision: https://reviews.llvm.org/D127285
2022-06-09 16:13:57 +02:00
Vince Bridgers
c7fa4e8a8b [analyzer] Fix null pointer deref in CastValueChecker
A crash was seen in CastValueChecker due to a null pointer dereference.

The fix uses QualType::getAsString to avoid the null dereference
when a CXXRecordDecl cannot be obtained. A small reproducer is added,
and cast value notes LITs are updated for the new debug messages.

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D127105
2022-06-07 13:34:06 -04:00
Gabor Marton
8131ee4c43 [analyzer] Remove NotifyAssumeClients
Depends on D126560.

Differential Revision: https://reviews.llvm.org/D126878
2022-06-07 13:02:03 +02:00
Gabor Marton
17e9ea6138 [analyzer][NFC] Add LLVM_UNLIKELY to assumeDualImpl
Aligned with the measures we had in D124674, this condition seems to be
unlikely.

Nevertheless, I've made some new measurements with stats just for this,
and data confirms this is indeed unlikely.

Differential Revision: https://reviews.llvm.org/D127190
2022-06-07 12:48:48 +02:00
Gabor Marton
f66f4d3b07 [analyzer] Track assume call stack to detect fixpoint
Assume functions might recurse (see `reAssume` or `tryRearrange`).
During the recursion, the State might not change anymore, that means we
reached a fixpoint. In this patch, we avoid infinite recursion of assume
calls by checking already visited States on the stack of assume function
calls. This patch renders the previous "workaround" solution (D47155)
unnecessary. Note that this is not an NFC patch. If we were to limit the
maximum stack depth of the assume calls to 1, then it would be equivalent
to the previous solution in D47155.
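The fixpoint check described above can be sketched on a toy model. Everything here is a hypothetical stand-in (`State` is just an `int`, and `simplifyOnce` plays the role of whatever step makes assume recurse); the only point is the "already on the call stack" test.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using State = int; // hypothetical: a real state is far richer than this

// Hypothetical simplification step; it stops changing the state at 0.
State simplifyOnce(State S) { return S > 0 ? S / 2 : S; }

struct AssumeContext {
  std::vector<State> Stack; // states of the assume calls currently active
  int Calls = 0;            // counts calls, for illustration only
};

State assumeRec(State S, AssumeContext &Ctx) {
  ++Ctx.Calls;
  // Fixpoint: this state is already being processed further up the stack.
  if (std::find(Ctx.Stack.begin(), Ctx.Stack.end(), S) != Ctx.Stack.end())
    return S;
  Ctx.Stack.push_back(S);
  State Result = assumeRec(simplifyOnce(S), Ctx);
  Ctx.Stack.pop_back();
  return Result;
}
```

Without the stack lookup, the recursion in this toy would never terminate once the state reaches 0, because `simplifyOnce(0)` returns 0 again.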

Additionally, in D113753, we simplify the symbols right at the beginning
of evalBinOpNN. So, a call to `simplifySVal` in `getKnownValue` (added
in D51252) is no longer needed.

Fixes https://github.com/llvm/llvm-project/issues/55851

Differential Revision: https://reviews.llvm.org/D126560
2022-06-07 08:36:11 +02:00
Kazu Hirata
e0039b8d6a Use llvm::less_second (NFC) 2022-06-04 22:48:32 -07:00
Kazu Hirata
4969a6924d Use llvm::less_first (NFC) 2022-06-04 21:23:18 -07:00
Balazs Benics
7d24641f89 [llvm][analyzer][NFC] Introduce SFINAE for specializing FoldingSetTraits
Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D126803
2022-06-02 19:46:38 +02:00
Balazs Benics
cf1f1b7240 [analyzer][NFC] Uplift checkers after D126801
Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D126802
2022-06-02 19:46:38 +02:00
Balazs Benics
33ca5a447e [analyzer][NFC] Add partial specializations for ProgramStateTraits
I'm also hoisting common code from the existing specializations into a
common trait impl to reduce code duplication.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D126801
2022-06-02 19:46:38 +02:00
Gabor Marton
81e44414aa [analyzer][NFC] Move overconstrained check from reAssume to assumeDualImpl
Depends on D126406. Checking of the overconstrained property is much
better suited here.

Differential Revision: https://reviews.llvm.org/D126707
2022-06-02 11:41:19 +02:00
Gabor Marton
160798ab9b [analyzer] Handle SymbolCast in SValBuilder
Make the SimpleSValBuilder able to look up and use a constraint
for an operand of a SymbolCast, when the operand is constrained to a
const value.
This part of the SValBuilder is responsible for constant folding. We
need this constant folding so that the engine can work with fewer symbols,
which makes it more efficient. Whenever a symbol is constrained with
a constant then we substitute the symbol with the corresponding integer.
If a symbol is constrained with a range, then the symbol is kept and we
fall-back to use the range based constraint manager, which is not that
efficient. This patch is the natural extension of the existing constant
folding machinery with the support of SymbolCast symbols.
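As a rough illustration of the folding rule above, here is a toy model. `Range`, `Constraints`, `getConstValue` and `foldCastToUChar` are hypothetical names written for this sketch, and the cast is fixed to `unsigned char` purely to keep the example small.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical constraint store: each symbol maps to an inclusive range.
struct Range {
  long Lo, Hi;
};
using Constraints = std::map<std::string, Range>;

// A symbol has a known constant value only if its range is a single point.
std::optional<long> getConstValue(const Constraints &C,
                                  const std::string &Sym) {
  auto It = C.find(Sym);
  if (It != C.end() && It->second.Lo == It->second.Hi)
    return It->second.Lo;
  return std::nullopt;
}

// Folds `(unsigned char)Sym` when the operand is constrained to a constant.
// An empty result means "keep the cast symbolic and let the range-based
// solver deal with it".
std::optional<unsigned char> foldCastToUChar(const Constraints &C,
                                             const std::string &Sym) {
  if (std::optional<long> V = getConstValue(C, Sym))
    return static_cast<unsigned char>(*V);
  return std::nullopt;
}
```

A symbol constrained to a range such as `[0, 10]` is left untouched by this sketch, mirroring the fall-back to the range-based constraint manager.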

Differential Revision: https://reviews.llvm.org/D126481
2022-06-01 08:42:04 +02:00
Balazs Benics
a73b50ad06 Revert "[llvm][clang][bolt][NFC] Use llvm::less_first() when applicable"
This reverts commit 3988bd1398.

Did not build on this bot:
https://lab.llvm.org/buildbot#builders/215/builds/6372

/usr/include/c++/9/bits/predefined_ops.h:177:11: error: no match for call to
‘(llvm::less_first) (std::pair<long unsigned int, llvm::bolt::BinaryBasicBlock*>&, const std::pair<long unsigned int, std::nullptr_t>&)’
  177 |  { return bool(_M_comp(*__it, __val)); }
2022-05-27 11:19:18 +02:00
Balazs Benics
3988bd1398 [llvm][clang][bolt][NFC] Use llvm::less_first() when applicable
One could reuse this functor instead of rolling one's own version.
There were a couple other cases where the code was similar, but not
quite the same, such as it might have an assertion in the lambda or other
constructs. Thus, I've not touched any of those, as it might change the
behavior in some way.

As per https://discourse.llvm.org/t/submitting-simple-nfc-patches/62640/3?u=steakhal
Chris Lattner
> LLVM intentionally has a “yes, you can apply common sense judgement to
> things” policy when it comes to code review. If you are doing mechanical
> patches (e.g. adopting less_first) that apply to the entire monorepo,
> then you don’t need everyone in the monorepo to sign off on it. Having
> some +1 validation from someone is useful, but you don’t need everyone
> whose code you touch to weigh in.

Differential Revision: https://reviews.llvm.org/D126068
2022-05-27 11:15:23 +02:00
Balazs Benics
f13050eca3 [analyzer][NFCi] Annotate major nonnull returning functions
This patch annotates the most important analyzer function APIs.
Also adds a couple of assertions for uncovering any potential issues
earlier in the constructor; in those cases, the member functions were
already dereferencing the members unconditionally anyway.

Measurements showed no performance impact, nor crashes.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D126198
2022-05-27 11:05:50 +02:00
Gabor Marton
6ab69efe61 [analyzer][NFC] Rename GREngine->CoreEngine, GRExprEngine->ExprEngine in comments and txt files
fixes #115
2022-05-27 11:04:35 +02:00
Balazs Benics
3a666dd37a [analyzer][NFC] Use MemRegion::getRegion()'s return value unconditionally
Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D126123
2022-05-27 10:07:06 +02:00
Balazs Benics
813acb1297 [analyzer][NFC] Remove unused SVal::hasConjuredSymbol
Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D126130
2022-05-27 10:07:06 +02:00
Balazs Benics
81066603a8 [analyzer][NFC] Remove unused nonloc::ConcreteInt::evalBinOp
Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D126129
2022-05-27 10:07:06 +02:00
Balazs Benics
f6eab43764 [analyzer][NFC] Inline loc::ConcreteInt::evalBinOp
This patch also refactored some of the enclosing parts.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D126128
2022-05-27 10:07:06 +02:00
Balazs Benics
ee8987d585 [analyzer][NFC] Inline ExprEngine::evalMinus
Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D126125
2022-05-27 10:07:06 +02:00
Balazs Benics
7a2d6dea73 [analyzer][NFC] Inline ExprEngine::evalComplement
Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D126124
2022-05-27 10:07:06 +02:00
Gabor Marton
88abc50398 [analyzer][solver] Handle UnarySymExpr in RangeConstraintSolver
Fixes https://github.com/llvm/llvm-project/issues/55241

Differential Revision: https://reviews.llvm.org/D125395
2022-05-26 14:09:46 +02:00
Gabor Marton
b5b2aec1ff [analyzer] Add UnarySymExpr
This patch adds a new descendant to the SymExpr hierarchy. This way, now
we can assign constraints to symbolic unary expressions. Only the unary
minus and bitwise negation are handled.

Differential Revision: https://reviews.llvm.org/D125318
2022-05-26 14:00:27 +02:00
Gabor Marton
ca3d962548 [analyzer] Return from reAssume if State is posteriorly overconstrained
Depends on D124758. That patch introduced a serious run-time regression in
some special cases. This fixes that.

Differential Revision: https://reviews.llvm.org/D126406
2022-05-26 13:50:40 +02:00
Gabor Marton
f75bc5bfc8 [analyzer] Fix symbol simplification assertion failure
Fixes https://github.com/llvm/llvm-project/issues/55546

The assertion mentioned in the issue is triggered because an
inconsistency is formed in the Sym->Class and Class->Sym relations. A
simpler but similar inconsistency is demonstrated here:
https://reviews.llvm.org/D114887 .

Previously in `removeMember`, we didn't remove the old symbol's
Sym->Class relation. Back then, we explained it with the following two
bullet points:
> 1) This way constraints for the old symbol can still be found via its
> equivalence class that it used to be the member of.
> 2) Performance and resource reasons. We can spare one removal and thus one
> additional tree in the forest of `ClassMap`.

This patch does remove the old symbol's Sym->Class relation in order to
keep the Sym->Class relation consistent with the Class->Sym relations.
Point 2) above has negligible performance impact, empirical measurements
do not show any noticeable difference in the run-time. Point 1) above
seems to be a not well justified statement. This is because we cannot
create a new symbol that would be equal to the old symbol after the
simplification had happened. The reason for this is that the SValBuilder
uses the available constant constraints for each sub-symbol.
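The consistency requirement between the two relations can be sketched with two plain maps. `ClassMaps` and its members are hypothetical and far simpler than the real equivalence-class machinery; the flag on `removeMember` exists only to contrast the old and new behavior.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical two-way bookkeeping of symbols and equivalence classes.
struct ClassMaps {
  std::map<std::string, int> SymToClass;
  std::map<int, std::set<std::string>> ClassToSyms;

  void addMember(int Class, const std::string &Sym) {
    SymToClass[Sym] = Class;
    ClassToSyms[Class].insert(Sym);
  }

  // EraseSymToClass == false models the old behavior described above.
  void removeMember(const std::string &Sym, bool EraseSymToClass) {
    auto It = SymToClass.find(Sym);
    if (It == SymToClass.end())
      return;
    auto ClsIt = ClassToSyms.find(It->second);
    if (ClsIt != ClassToSyms.end()) {
      ClsIt->second.erase(Sym);
      if (ClsIt->second.empty())
        ClassToSyms.erase(ClsIt);
    }
    if (EraseSymToClass)
      SymToClass.erase(It); // the removal this patch adds
  }

  // Both relations must describe exactly the same membership.
  bool consistent() const {
    for (const auto &KV : SymToClass) {
      auto It = ClassToSyms.find(KV.second);
      if (It == ClassToSyms.end() || It->second.count(KV.first) == 0)
        return false;
    }
    for (const auto &KV : ClassToSyms)
      for (const auto &Sym : KV.second) {
        auto It = SymToClass.find(Sym);
        if (It == SymToClass.end() || It->second != KV.first)
          return false;
      }
    return true;
  }
};
```

Leaving the stale Sym->Class entry behind makes `consistent()` fail in this model, which is the kind of mismatch the assertion in the issue detects.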

Differential Revision: https://reviews.llvm.org/D126281
2022-05-25 10:55:50 +02:00
Gabor Marton
96fba640cf [analyzer][NFC] Factor out the copy-paste code repetition of assumeDual and assumeInclusiveRangeDual
Depends on D125892. There might be efficiency and performance
implications by using a lambda. Thus, I am going to conduct measurements
to see if there is any noticeable impact.
I've been thinking about two more alternatives:
1) Make `assumeDualImpl` a variadic template and (perfect) forward the
   arguments for the used `assume` function.
2) Use a macro.
I have concerns though, whether these alternatives would deteriorate the
readability of the code.

Differential Revision: https://reviews.llvm.org/D125954
2022-05-23 09:32:44 +02:00
Gabor Marton
32f189b0d9 [analyzer] Implement assumeInclusiveRange in terms of assumeInclusiveRangeDual
Depends on D124758. This is the very same thing we have done for
assumeDual, but this time we do it for assumeInclusiveRange. This patch
is basically a no-brainer copy of that previous patch.

Differential Revision:
https://reviews.llvm.org/D125892
2022-05-23 09:32:44 +02:00
Jay Foad
6bec3e9303 [APInt] Remove all uses of zextOrSelf, sextOrSelf and truncOrSelf
Most clients only used these methods because they wanted to be able to
extend or truncate to the same bit width (which is a no-op). Now that
the standard zext, sext and trunc allow this, there is no reason to use
the OrSelf versions.

The OrSelf versions additionally have the strange behaviour of allowing
extending to a *smaller* width, or truncating to a *larger* width, which
are also treated as no-ops. A small amount of client code relied on this
(ConstantRange::castOp and MicrosoftCXXNameMangler::mangleNumber) and
needed rewriting.

Differential Revision: https://reviews.llvm.org/D125557
2022-05-19 11:23:13 +01:00
Usama Hameed
dd7233bc67 [Analyzer] Remove extra space from NSErrorChecker message.
Differential Revision: https://reviews.llvm.org/D125840
2022-05-18 14:35:12 -07:00
Gabor Marton
56b9b97c1e [clang][analyzer][ctu] Make CTU a two phase analysis
This new CTU implementation is the natural extension of the normal single TU
analysis. The approach consists of two analysis phases. During the first phase,
we do a normal single TU analysis. During this phase, if we find a foreign
function (that could be inlined from another TU) then we don’t inline that
immediately, we rather mark that to be analysed later.
When the first phase is finished then we start the second phase, the CTU phase.
In this phase, we continue the analysis from that point (exploded node)
which had been enqueued during the first phase. We gradually extend the
exploded graph of the single TU analysis with the new node that was
created by the inlining of the foreign function.

We count the number of analysis steps of the first phase and we limit the
second (ctu) phase with this number.
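A much simplified sketch of the two phases and the step budget follows. `CallSite`, `Summary`, `analyzeOneGo` and the per-call `InlineCost` are hypothetical and invented for this illustration; the real implementation works on exploded nodes and worklists.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical call site met during the analysis.
struct CallSite {
  int Id;
  bool IsForeign;         // defined in another TU?
  std::size_t InlineCost; // assumed number of steps needed to inline it
};

struct Summary {
  std::size_t Phase1Steps;
  std::vector<int> Deferred;        // foreign calls marked for later
  std::vector<int> InlinedInPhase2; // foreign calls inlined in the CTU phase
};

Summary analyzeOneGo(const std::vector<CallSite> &Sites) {
  Summary S{0, {}, {}};
  std::deque<CallSite> CTUWorklist;

  // Phase 1: normal single-TU analysis; foreign calls are only enqueued.
  for (const CallSite &C : Sites) {
    ++S.Phase1Steps;
    if (C.IsForeign) {
      S.Deferred.push_back(C.Id);
      CTUWorklist.push_back(C);
    }
  }

  // Phase 2: continue from the enqueued points, limited by phase 1's steps.
  std::size_t Budget = S.Phase1Steps;
  while (!CTUWorklist.empty()) {
    const CallSite C = CTUWorklist.front();
    CTUWorklist.pop_front();
    if (C.InlineCost > Budget)
      break; // budget exhausted
    Budget -= C.InlineCost;
    S.InlinedInPhase2.push_back(C.Id);
  }
  return S;
}
```

Tying the second phase's budget to the first phase's step count keeps the extra CTU work proportional to the single-TU analysis in this model.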

This new implementation makes it convenient for the users to run the
single-TU and the CTU analysis in one go; they don't need to run the two
analyses separately. Thus, we name this new implementation "onego" CTU.

Discussion:
https://discourse.llvm.org/t/rfc-much-faster-cross-translation-unit-ctu-analysis-implementation/61728

Differential Revision: https://reviews.llvm.org/D123773
2022-05-18 10:35:52 +02:00
Balazs Benics
a1025e6ffe [analyzer] Introduce clang_analyzer_dumpSvalType introspection function
In some rare cases the type of an SVal might be interesting.
This introspection function exposes this information in tests.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D125532
2022-05-13 17:07:58 +02:00
Balazs Benics
d5ffc1ed8b [analyzer][NFC] Tighten some of the SValBuilder return types
This is purely a cosmetic change.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D125463
2022-05-13 17:04:34 +02:00
Endre Fülöp
094fb13b88 [analyzer] Add taint to the BoolAssignmentChecker
BoolAssignment checker is now taint-aware and warns if a tainted value is
assigned.

Original author: steakhal

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D125360
2022-05-13 09:27:28 +02:00
Tomasz Kamiński
14742443a2 Reland "[analyzer] Canonicalize SymIntExpr so the RHS is positive when possible"
This PR changes the `SymIntExpr` so the expression that uses a
negative value as `RHS`, for example: `x +/- (-N)`, is modeled as
`x -/+ N` instead.

This avoids producing a very large `RHS` when the symbol is cast to
an unsigned number, and as a consequence makes the value more robust in
the presence of casts.

Note that this change is not applied if `N` is the lowest negative
value for which negation would not be representable.
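The canonicalization rule can be sketched on a toy expression type. `Op`, `SymIntExpr` (as a plain struct) and `canonicalize` are hypothetical simplifications that fix the integer type to 32 bits; they are not the analyzer's classes.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

enum class Op { Add, Sub };

// Hypothetical miniature of `Sym <op> Int`; the symbol itself is omitted.
struct SymIntExpr {
  Op Opcode;
  std::int32_t RHS;
};

// Rewrites `x + (-N)` as `x - N` and `x - (-N)` as `x + N`.
// The lowest negative value is left alone, since `-INT32_MIN` is not
// representable.
SymIntExpr canonicalize(SymIntExpr E) {
  if (E.RHS < 0 && E.RHS != std::numeric_limits<std::int32_t>::min()) {
    E.RHS = -E.RHS;
    E.Opcode = (E.Opcode == Op::Add) ? Op::Sub : Op::Add;
  }
  return E;
}
```

Keeping the `RHS` small and positive avoids the huge constants that appear when a negative value is reinterpreted as unsigned.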

Reviewed By: steakhal

Patch By: tomasz-kaminski-sonarsource!

Differential Revision: https://reviews.llvm.org/D124658
2022-05-12 15:40:11 +02:00
Gabor Marton
34ac048aef [analyzer] Replace adjacent assumeInBound calls to assumeInBoundDual
This is to minimize superfluous assume calls.

Depends on D124758

Differential Revision: https://reviews.llvm.org/D124761
2022-05-10 10:16:55 +02:00
Gabor Marton
1c1c1e25f9 [analyzer] Implement assume in terms of assumeDual
Summary:
By evaluating both children states, we are now capable of discovering
infeasible parent states. In this patch, `assume` is implemented in terms
of `assumeDual`. This might be suboptimal (e.g. where there are adjacent
assume(true) and assume(false) calls; the next patches address that). This patch
fixes a real CRASH.
Fixes https://github.com/llvm/llvm-project/issues/54272

Differential Revision:
https://reviews.llvm.org/D124758
2022-05-10 10:16:55 +02:00
Gabor Marton
c4fa05f5f7 [analyzer] Indicate if a parent state is infeasible
In some cases a parent State is already infeasible, but we recognize
this only if an additional constraint is added. This patch is the first
of a series to address this issue. In this patch `assumeDual` is changed
to clone the parent State but with an `Infeasible` flag set, and this
infeasible-parent is returned both for the true and false case. Then
when we add a new transition in the exploded graph and the destination
is marked as infeasible, the node will be a sink node.
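A toy version of this idea, assuming a state that tracks one integer interval. `State`, `assumeGreater`, `assumeDual` and `DualResult` here are hypothetical simplifications written for this sketch, not the analyzer's signatures.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <utility>

// Hypothetical state: an interval for a single symbol `x`, plus the flag.
struct State {
  int Lo, Hi;
  bool Infeasible;
};

// Adds the constraint `x > K` (or its negation); empty if it cannot hold.
std::optional<State> assumeGreater(const State &S, int K, bool Assumption) {
  State R = S;
  if (Assumption)
    R.Lo = std::max(R.Lo, K + 1);
  else
    R.Hi = std::min(R.Hi, K);
  if (R.Lo > R.Hi)
    return std::nullopt;
  return R;
}

using DualResult = std::pair<std::optional<State>, std::optional<State>>;

// If neither branch is feasible, the parent itself must be infeasible:
// return a flagged clone of it for both the true and the false case.
DualResult assumeDual(const State &S, int K) {
  std::optional<State> T = assumeGreater(S, K, true);
  std::optional<State> F = assumeGreater(S, K, false);
  if (!T && !F) {
    State Parent = S;
    Parent.Infeasible = true;
    return DualResult(Parent, Parent);
  }
  return DualResult(T, F);
}
```

In this model a caller that sees the `Infeasible` flag on the destination state would turn the new node into a sink.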

Related bug:
https://github.com/llvm/llvm-project/issues/50883
Actually, this patch does not solve that bug in the solver, rather with
this patch we can handle the general parent-infeasible cases.

Next step would be to change the State API and require all checkers to
use the `assume*Dual` API and deprecate the simple `assume` calls.

Hopefully, the next patch will introduce `assumeInBoundDual` and will
solve the CRASH we have here:
https://github.com/llvm/llvm-project/issues/54272

Differential Revision: https://reviews.llvm.org/D124674
2022-05-10 10:16:55 +02:00
Fred Tingaud
1ec1cdcfb4 [analyzer] Inline operator delete when MayInlineCXXAllocator is set.
This patch restores the symmetry between how operator new and operator delete
are handled by also inlining the content of operator delete when possible.

Patch by Fred Tingaud.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D124845
2022-05-09 15:44:33 +02:00
Balazs Benics
da5b5ae852 Revert "[analyzer] Canonicalize SymIntExpr so the RHS is positive when possible"
It seems like multiple users are affected by a crash introduced by this
commit, thus I'm reverting it for the time being.
Read more about the found reproducers at Phabricator.

Differential Revision: https://reviews.llvm.org/D124658

This reverts commit f0d6cb4a5c.
2022-05-06 12:13:51 +02:00
Brian Tracy
87a55137e2 Fix "the the" typo in documentation and user facing strings
There are many more instances of this pattern, but I chose to limit this change to .rst files (docs), anything in libcxx/include, and string literals. These have the highest chance of being seen by end users.

Reviewed By: #libc, Mordante, martong, ldionne

Differential Revision: https://reviews.llvm.org/D124708
2022-05-05 17:52:08 +02:00
Tomasz Kamiński
f0d6cb4a5c [analyzer] Canonicalize SymIntExpr so the RHS is positive when possible
This PR changes the `SymIntExpr` so the expression that uses a
negative value as `RHS`, for example: `x +/- (-N)`, is modeled as
`x -/+ N` instead.

This avoids producing a very large `RHS` when the symbol is cast to
an unsigned number, and as a consequence makes the value more robust in
the presence of casts.

Note that this change is not applied if `N` is the lowest negative
value for which negation would not be representable.

Reviewed By: steakhal

Patch By: tomasz-kaminski-sonarsource!

Differential Revision: https://reviews.llvm.org/D124658
2022-05-05 17:48:49 +02:00
einvbri
df5801806d [analyzer] Get direct binding for specific punned case
Region store was not able to see through this case to the actual
initialized value of STRUCT ff. This change addresses this case by
getting the direct binding. This was found and debugged in a downstream
compiler, with debug guidance from @steakhal. A positive and negative
test case is added.

The specific case where this issue was exposed.

  typedef struct {
    int a:1;
    int b[2];
  } STRUCT;

  int main() {
    STRUCT ff = {0};
    STRUCT* pff = &ff;
    int a = *((int *)pff + 1);
    return a;
  }

Reviewed By: steakhal, martong

Differential Revision: https://reviews.llvm.org/D124349
2022-05-05 04:53:45 -05:00
Ali Shuja Siddiqui
cf7cd664f3 [analyzer] Check for std::__addressof for inner pointer checker
This is an extension to diff D99260. This adds an additional exception
for `std::__addressof` in `InnerPointerChecker`.

Patch By alishuja (Ali Shuja Siddiqui)!

Reviewed By: martong, alishuja

Differential Revision: https://reviews.llvm.org/D109467
2022-05-03 14:05:19 +02:00
Marco Antognini
68ee5ec07d [Analyzer] Fix assumptions about const field with member-initializer
Essentially, having a default member initializer for a constant member
does not necessarily imply the member will have the given default value.
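A small example of why the default value cannot be assumed; `WithDefault` is a made-up type for illustration.

```cpp
#include <cassert>

struct WithDefault {
  const int Value = 1; // default member initializer

  // Uses the default member initializer, so Value == 1.
  WithDefault() = default;

  // The mem-initializer takes precedence over the default, so Value == V.
  explicit WithDefault(int V) : Value(V) {}
};
```

Both objects have a `const` member with the same default member initializer, yet only the default-constructed one actually holds the default value.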

Remove part of a2e053638b ([analyzer] Treat more const variables and
fields as known contants., 2018-05-04).

Fix #47878

Reviewed By: r.stahl, steakhal

Differential Revision: https://reviews.llvm.org/D124621
2022-05-03 11:27:45 +02:00
Marco Antognini
f34639828f [Analyzer] Minor cleanups in StreamChecker
Remove unnecessary conversion to Optional<> and incorrect assumption
that BindExpr can return a null state.

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D124681
2022-05-02 17:50:10 +02:00
Marco Antognini
5a47accda8 [Analyzer] Fix clang::ento::taint::dumpTaint definition
Ensure the definition is in the "taint" namespace, like its declaration.

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D124462
2022-05-02 17:44:06 +02:00
Balazs Benics
5ce7050f70 [analyzer] Allow exploded graph dumps in release builds
Historically, exploded graph dumps were disabled in non-debug builds.
It was done so probably because a regular user should not dump the
internal representation of the analyzer anyway and the dump methods
might introduce unnecessary binary size overhead.

It turns out some of the users actually want to dump this.

Note that e.g. `LiveExpressionsDumper`, `LiveVariablesDumper`,
`ControlDependencyTreeDumper` etc. worked previously, and they are
unaffected by this change.
However, `CFGViewer` and `CFGDumper` still won't work for a similar
reason. AFAIK only these two won't work after this change.

Addresses #53873

---

**baseline**

| binary | size | size after strip |
| clang | 103M | 83M |
| clang-tidy | 67M | 54M |

**after this change**

| binary | size | size after strip |
| clang | 103M | 84M |
| clang-tidy | 67M | 54M |

CMake configuration:
```
cmake -S llvm -GNinja -DBUILD_SHARED_LIBS=OFF -DCMAKE_BUILD_TYPE=Release
-DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_COMPILER=clang
-DLLVM_ENABLE_ASSERTIONS=OFF -DLLVM_USE_LINKER=lld
-DLLVM_ENABLE_DUMP=OFF -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra"
-DLLVM_ENABLE_Z3_SOLVER=ON -DLLVM_TARGETS_TO_BUILD="X86"
```
Built by `clang-14.0.0`.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D124442
2022-05-02 11:42:08 +02:00
Balazs Benics
fd7efe33f1 [analyzer] Fix cast evaluation on scoped enums in ExprEngine
We ignored the cast if the enum was scoped.
This is bad since there is no implicit conversion from the scoped enum to the corresponding underlying type.

The fix is basically: isIntegralOrEnumerationType() -> isIntegralOr**Unscoped**EnumerationType()

This manifested as crashes when analyzing LLVM itself with Z3 refutation enabled.
Refutation synthesized the given Z3 Binary expression (`BO_And` of `unsigned char` aka. 8 bits
and an `int` 32 bits) with the wrong bitwidth in the end, which triggered an assert.

Now, we evaluate the cast according to the standard.
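
As a standalone illustration of the language rule behind this fix (plain C++, not analyzer code; the enum and function names are made up for this sketch): an unscoped enum converts implicitly to an integer type, whereas a scoped enum only converts through an explicit cast.

```cpp
#include <type_traits>

// Hypothetical enums, for illustration only.
enum Unscoped : unsigned char { U0 = 0, U1 = 1 };
enum class Scoped : unsigned char { S0 = 0, S1 = 1 };

// Unscoped enums convert implicitly to integers...
static_assert(std::is_convertible<Unscoped, int>::value,
              "unscoped enum converts implicitly");
// ...scoped enums do not, so the cast has to be evaluated, not skipped.
static_assert(!std::is_convertible<Scoped, int>::value,
              "scoped enum needs an explicit cast");

// The explicit cast is the only way to reach the underlying value.
constexpr int toInt(Scoped S) { return static_cast<int>(S); }
```

Here `toInt(Scoped::S1)` yields 1, while `int X = Scoped::S1;` would not compile.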

This bug could have been triggered using the Z3 CM according to
https://bugs.llvm.org/show_bug.cgi?id=44030

Fixes #47570 #43375

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D85528
2022-05-02 10:54:26 +02:00
Balazs Benics
5a2e595eb8 [analyzer] Fix Static Analyzer g_memdup false-positive
`g_memdup()` allocates and copies memory, thus we should not assume that
the returned memory region is uninitialized because it might not be the
case.

PS: It would be even better to copy the bindings to mimic the actual
content of the buffer, but this works too.
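
A minimal sketch of why the returned region should not be treated as uninitialized. `memdupSketch` is a hypothetical stand-in written for this note, not the GLib implementation: it allocates and then copies, so every byte of the result has a defined value.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for g_memdup(); not the GLib implementation.
void *memdupSketch(const void *Src, std::size_t Size) {
  if (Src == nullptr || Size == 0)
    return nullptr;
  void *Dst = std::malloc(Size);
  if (Dst != nullptr)
    std::memcpy(Dst, Src, Size); // The new buffer now holds a copy of Src.
  return Dst;
}
```

Reading from the returned buffer is therefore not an uninitialized read, provided the source bytes were themselves initialized.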

Fixes #53617

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D124436
2022-05-02 10:35:51 +02:00
Andrew Ng
57c55165eb [analyzer] Fix return of llvm::StringRef to destroyed std::string
This issue was discovered whilst testing with ASAN.
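
The general shape of this kind of bug can be sketched in standard C++, with `std::string_view` playing the role of `llvm::StringRef`. The function names are hypothetical, and returning an owning string is only one common remedy; the actual change may differ.

```cpp
#include <string>

// Hypothetical helper that builds a temporary std::string.
std::string makeLabel(int Id) { return "node-" + std::to_string(Id); }

// Buggy shape (kept as a comment, since using the result is undefined
// behaviour):
//   std::string_view label(int Id) { return makeLabel(Id); }
// The temporary std::string is destroyed at the end of the return statement,
// so the returned view points at freed memory.

// Safe shape: return an owning std::string instead of a non-owning view.
std::string label(int Id) { return makeLabel(Id); }
```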

Differential Revision: https://reviews.llvm.org/D124683
2022-05-01 12:24:32 +01:00
Artem Dergachev
f68c0a2f58 [analyzer] Add path note tags to standard library function summaries.
The patch is straightforward except the tiny fix in BugReporterVisitors.cpp
that suppresses a default note for "Assuming pointer value is null" when
a note tag from the checker is present. This is probably the right thing to do
but also definitely not a complete solution to the problem of different sources
of path notes being unaware of each other, which is a large and annoying issue
that we have to deal with. Note tags really help there because they're nicely
introspectable. The problem is demonstrated by the newly added getenv() test.

Differential Revision: https://reviews.llvm.org/D122285
2022-04-28 17:17:05 -07:00
Balazs Benics
be744da01f [analyzer] Fix ValistChecker false-positive involving symbolic pointers
In the following example:

  int va_list_get_int(va_list *va) {
    return va_arg(*va, int); // FP
  }

The `*va` expression will be something like `Element{SymRegion{va}, 0, va_list}`.
We use `ElementRegions` for representing the result of the dereference.
In this case, the `IsSymbolic` was set to `false` in the
`getVAListAsRegion()`.

Hence, before checking if the memregion is a SymRegion, we should take
the base of that region.

Analogously to the previous example, one can craft other cases:

  struct MyVaList {
    va_list l;
  };
  int va_list_get_int(struct MyVaList va) {
    return va_arg(va.l, int); // FP
  }

The same would also apply if the `va_list` were in the base or derived
part of a class. `ObjCIvarRegions` are likely also susceptible.
I'm not explicitly demonstrating these cases.

PS: Check the `MemRegion::getBaseRegion()` definition.
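
For reference, the first example can be exercised in a complete program. This is a sketch only; `sumInts` is a hypothetical caller added here to show that reading through a `va_list *` is a legitimate pattern and not something that should be reported.

```cpp
#include <cstdarg>

// Reads one int through a pointer to the caller's va_list, as in the example.
int va_list_get_int(va_list *va) { return va_arg(*va, int); }

// Hypothetical variadic caller: sums Count int arguments.
int sumInts(int Count, ...) {
  va_list Args;
  va_start(Args, Count);
  int Sum = 0;
  for (int I = 0; I < Count; ++I)
    Sum += va_list_get_int(&Args);
  va_end(Args);
  return Sum;
}
```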

Fixes #55009

Reviewed By: xazax.hun

Differential Revision: https://reviews.llvm.org/D124239
2022-04-26 08:49:05 +02:00
Vince Bridgers
3566bbe62f [analyzer] Add option for AddrSpace in core.NullDereference check
This change adds an option to detect all null dereferences for
non-default address spaces, except for address spaces 256, 257 and 258.
Those address spaces are special since null dereferences are not errors.

All address spaces can be considered (except for 256, 257, and 258) by
using -analyzer-config
core.NullDereference:DetectAllNullDereferences=true. This option is
false by default, retaining the original behavior.

A LIT test was enhanced to cover this case, and the rst documentation
was updated to describe this behavior.

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D122841
2022-04-24 03:51:49 -05:00
Vince Bridgers
5114db933d [analyzer] Clean checker options from bool to DefaultBool (NFC)
A recent review emphasized the preference to use DefaultBool instead of
bool for checker options. This change is a NFC and cleans up some of the
instances where bool was used, and could be changed to DefaultBool.
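
The motivation can be sketched with a small wrapper. `DefaultFalse` and `ExampleChecker` are hypothetical names for this note, modelled loosely on the idea rather than copied from the analyzer's `DefaultBool`: the option starts out `false` without the checker needing a constructor to initialize it.

```cpp
// Hypothetical bool wrapper that always starts out false.
struct DefaultFalse {
  bool Val = false;
  DefaultFalse() = default;
  DefaultFalse &operator=(bool B) {
    Val = B;
    return *this;
  }
  operator bool() const { return Val; }
};

// Hypothetical checker with a single option.
struct ExampleChecker {
  DefaultFalse CheckExtra; // false even without a constructor initializer
};
```

A plain `bool` member in the same position would be left indeterminate when the checker object is default-initialized.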

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D123464
2022-04-23 14:47:29 -05:00
Nathan James
cfb8169059 [clang] Add a raw_ostream operator<< overload for QualType
Under the hood this prints the same as `QualType::getAsString()` but cuts out the middle-man when that string is sent to another raw_ostream.

Also cleaned up all the call sites where this occurs.
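
The pattern can be sketched with standard streams. `TypeName` and `describe` are hypothetical stand-ins, and `std::ostream` takes the place of `raw_ostream`: the overload prints directly, so call sites no longer build an intermediate string.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a type with a getAsString()-style accessor.
struct TypeName {
  std::string Name;
  std::string getAsString() const { return Name; }
};

// Stream overload: callers write `OS << T` instead of
// `OS << T.getAsString()`, and no temporary string is created.
std::ostream &operator<<(std::ostream &OS, const TypeName &T) {
  return OS << T.Name;
}

// Hypothetical call site after the cleanup.
std::string describe(const TypeName &T) {
  std::ostringstream OS;
  OS << "type: " << T;
  return OS.str();
}
```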

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D123926
2022-04-20 22:09:05 +01:00
Aaron Ballman
9955f14aaf [C2x] Disallow functions without prototypes/functions with identifier lists
WG14 has elected to remove support for K&R C functions in C2x. The
feature was introduced into C89 already deprecated, so after this long
of a deprecation period, the committee has made an empty parameter list
mean the same thing in C as it means in C++: the function accepts no
arguments exactly as if the function were written with (void) as the
parameter list.
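
The C++ behavior that C2x adopts can be checked directly. This is a small sketch in C++ (the function name is hypothetical), showing that `()` and `(void)` already name the same function type there:

```cpp
#include <type_traits>

// In C++, an empty parameter list and (void) declare the same function type.
static_assert(std::is_same<int(), int(void)>::value,
              "() and (void) are the same function type in C++");

int answer() { return 42; } // declared with ()
int answer(void);           // redeclares the same function with (void)
```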

This patch implements WG14 N2841 No function declarators without
prototypes (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2841.htm)
and WG14 N2432 Remove support for function definitions with identifier
lists (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2432.pdf).

It also adds the -fno-knr-functions command line option to opt into
this behavior in other language modes.

Differential Revision: https://reviews.llvm.org/D123955
2022-04-20 13:28:15 -04:00
Denys Petrov
e37726beb2 [analyzer] Implemented RangeSet::Factory::castTo function to perform promotions, truncations and conversions.
Summary: Handle casts for ranges working similarly to APSIntType::apply function but for the whole range set. Support promotions, truncations and conversions.
Example:
promotion: char [0, 42] -> short [0, 42] -> int [0, 42] -> llong [0, 42]
truncation: llong [4295033088, 4295033130] -> int [65792, 65834] -> short [256, 298] -> char [0, 42]
conversion: char [-42, 42] -> uint [0, 42]U[4294967254, 4294967295] -> short[-42, 42]
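
The arithmetic in these examples can be reproduced with ordinary integral casts. The sketch below is not the `RangeSet::Factory::castTo` implementation; `URange`, `castBound` and `charRangeToUInt` are hypothetical names, and it only covers the signed 8-bit to unsigned 32-bit conversion shown above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical closed range over uint32_t; not the analyzer's Range class.
struct URange {
  std::uint32_t From, To;
};

// Truncation or promotion of a single bound is a plain integral cast.
// Narrowing to a signed type is modular on mainstream compilers and is
// guaranteed to be modular since C++20.
template <typename To, typename From> To castBound(From V) {
  return static_cast<To>(V);
}

// Converts the signed 8-bit range [Lo, Hi] to uint32_t. A range that
// straddles zero splits in two because negative values wrap to the top.
std::vector<URange> charRangeToUInt(std::int8_t Lo, std::int8_t Hi) {
  assert(Lo <= Hi && "expected a non-empty range");
  const auto ULo = static_cast<std::uint32_t>(Lo);
  const auto UHi = static_cast<std::uint32_t>(Hi);
  if (Lo >= 0 || Hi < 0)
    return {URange{ULo, UHi}}; // Order is preserved, one range is enough.
  return {URange{0u, UHi}, URange{ULo, UINT32_MAX}};
}
```

With these helpers, `charRangeToUInt(-42, 42)` produces the two ranges [0, 42] and [4294967254, 4294967295] from the conversion example.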

Differential Revision: https://reviews.llvm.org/D103094
2022-04-19 22:34:03 +03:00
Tom Ritter
82f3ed9904 [analyzer] Expose Taint.h to plugins
Reviewed By: NoQ, xazax.hun, steakhal

Differential Revision: https://reviews.llvm.org/D123155
2022-04-19 16:55:01 +02:00
Kristóf Umann
fd8e5762f8 [analyzer] Don't track function calls as control dependencies
I recently evaluated ~150 bug reports on open source projects relating to my
GSoC'19 project, which was about tracking control dependencies that were
relevant to a bug report.

Here is what I found: when the condition is a function call, the extra notes
were almost always unimportant, and often intrusive:

  void f(int *x) {
    x = nullptr;
    if (alwaysTrue()) // We don't need a whole lot of explanation
                      // here, the function name is good enough.
      *x = 5;
  }

It almost always boiled down to a few "Returning null pointer, which participates
in a condition later", or similar notes. I struggled to find a single case
where the notes revealed anything interesting or some previously hidden
correlation, which is kind of the point of condition tracking.

This patch checks whether the condition is a function call, and if so, bails
out.

The argument against the patch is the popular feedback we hear from some of our
users, namely that they can never have too much information. I was specifically
fishing for examples that display best that my contribution did more good than
harm, so admittedly I set the bar high, and one can argue that there can be
non-trivial trickery inside functions, and function names may not be that
descriptive.

My argument for the patch is all those reports that got longer without any
notable improvement in the report intelligibility. I think the few exceptional
cases where this patch would remove notable information are an acceptable
sacrifice in favor of more reports being leaner.

Differential Revision: https://reviews.llvm.org/D116597
2022-04-08 10:16:58 +02:00
Gabor Marton
e63b81d10e [analyzer][ctu] Only import const and trivial VarDecls
Only import the definition of an object from a foreign translation unit if its type is const and trivial.
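
What "const and trivial" means for a type can be approximated with standard type traits. This is a rough sketch under that assumption: `isConstAndTrivial` and `Point` are hypothetical, and the analyzer uses its own AST queries rather than `<type_traits>`.

```cpp
#include <string>
#include <type_traits>

// Hypothetical predicate approximating "const and trivial" with standard
// type traits; the analyzer uses its own AST queries instead.
template <typename T> constexpr bool isConstAndTrivial() {
  return std::is_const<T>::value &&
         std::is_trivial<typename std::remove_const<T>::type>::value;
}

// Hypothetical aggregate with no user-provided special members.
struct Point {
  int X, Y;
};
```

Under this approximation `const int` and `const Point` qualify, while `int` (not const) and `const std::string` (not trivial) do not.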

Differential Revision: https://reviews.llvm.org/D122805
2022-04-01 13:49:39 +02:00
Vince Bridgers
4d5b824e3d [analyzer] Avoid checking addrspace pointers in cstring checker
This change fixes an assert that occurs in the SMT layer when refuting a
finding that uses pointers of two different sizes. This was found in a
downstream build that supports two different pointer sizes. The CString
Checker was attempting to compute an overlap for the 'to' and 'from'
pointers, where the pointers were of different sizes.

In the downstream case where this was found, a specialized memcpy
routine patterned after memcpy_special is used. The analyzer core hits
on this builtin because it matches the 'memcpy' portion of that builtin.
This cannot be duplicated in the upstream test since there are no
specialized builtins that match that pattern, but the case does
reproduce in the accompanying LIT test case. The amdgcn target was used
for this reproducer. See the documentation for AMDGPU address spaces here
https://llvm.org/docs/AMDGPUUsage.html#address-spaces.

The assert seen is:

`*Solver->getSort(LHS) == *Solver->getSort(RHS) && "AST's must have the same sort!"'

Ack to steakhal for reviewing the fix, and creating the test case.

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D118050
2022-03-31 17:34:56 +02:00
Vince Bridgers
fe8b2236ef [analyzer] Fix "RhsLoc and LhsLoc bitwidth must be same"
clang: <root>/clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp:727:
void assertEqualBitWidths(clang::ento::ProgramStateRef,
  clang::ento::Loc, clang::ento::Loc): Assertion `RhsBitwidth ==
  LhsBitwidth && "RhsLoc and LhsLoc bitwidth must be same!"'

This change adjusts the bitwidth of the smaller operand for an evalBinOp
as a result of a comparison operation. This can occur in the specific
case represented by the test cases for a target with different pointer
sizes.

Reviewed By: NoQ

Differential Revision: https://reviews.llvm.org/D122513
2022-03-29 17:08:19 -05:00
Mike Rice
f82ec5532b [OpenMP] Initial parsing/sema for the 'omp target parallel loop' construct
Adds basic parsing/sema/serialization support for the
 #pragma omp target parallel loop directive.

Differential Revision: https://reviews.llvm.org/D122359
2022-03-24 09:19:00 -07:00
Vince Bridgers
9ef7ac51af [analyzer] Fix crash in RangedConstraintManager.cpp
This change fixes a crash in RangedConstraintManager.cpp:assumeSym due to an
unhandled BO_Div case.

clang: <root>clang/lib/StaticAnalyzer/Core/RangedConstraintManager.cpp:51:
  virtual clang::ento::ProgramStateRef
  clang::ento::RangedConstraintManager::assumeSym(clang::ento::ProgramStateRef,
    clang::ento::SymbolRef, bool):
  Assertion `BinaryOperator::isComparisonOp(Op)' failed.

Reviewed By: NoQ

Differential Revision: https://reviews.llvm.org/D122277
2022-03-23 08:26:40 -05:00
Vince Bridgers
5fdc4dd777 [analyzer] refactor makeIntValWithPtrWidth, remove getZeroWithPtrWidth (NFC)
This is an NFC refactoring that changes makeIntValWithPtrWidth
and removes getZeroWithPtrWidth so that types are used when forming values
to match pointer widths. Some targets may have different pointer widths
depending upon address space, so this needs to be taken into account.

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D120134
2022-03-23 08:26:37 -05:00
Mike Rice
2cedaee6f7 [OpenMP] Initial parsing/sema for the 'omp parallel loop' construct
Adds basic parsing/sema/serialization support for the
  #pragma omp parallel loop directive.

 Differential Revision: https://reviews.llvm.org/D122247
2022-03-22 13:55:47 -07:00
Vince Bridgers
985888411d [analyzer] Refactor makeNull to makeNullWithWidth (NFC)
Usages of makeNull need to be deprecated in favor of makeNullWithWidth
for architectures where the pointer size should not be assumed. This can
occur when pointers can be of different sizes, depending on address
space for example. See https://reviews.llvm.org/D118050 as an example.

This was uncovered initially in a downstream compiler project, and
tested through that project's system tests.

steakhal performed systems testing across a large set of open source
projects.

Co-authored-by: steakhal
Resolves: https://github.com/llvm/llvm-project/issues/53664

Reviewed By: NoQ, steakhal

Differential Revision: https://reviews.llvm.org/D119601
2022-03-22 07:35:13 -05:00
Mike Rice
6bd8dc91b8 [OpenMP] Initial parsing/sema for the 'omp target teams loop' construct
Adds basic parsing/sema/serialization support for the
 #pragma omp target teams loop directive.

Differential Revision: https://reviews.llvm.org/D122028
2022-03-18 13:48:32 -07:00
Mike Rice
79f661edc1 [OpenMP] Initial parsing/sema for the 'omp teams loop' construct
Adds basic parsing/sema/serialization support for the #pragma omp teams loop
directive.

Differential Revision: https://reviews.llvm.org/D121713
2022-03-16 14:39:18 -07:00
phyBrackets
90a6e35478 [analyzer][NFC] Merge similar conditional paths
Reviewed By: aaron.ballman, steakhal

Differential Revision: https://reviews.llvm.org/D121045
2022-03-07 22:05:27 +05:30
Endre Fülöp
4fd6c6e65a [analyzer] Add more propagations to Taint analysis
Add more functions as taint propagators to GenericTaintChecker.

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D120369
2022-03-07 13:18:54 +01:00
Shivam
56eaf869be [analyzer] Done some changes to detect Uninitialized read by the char array manipulation functions
A few weeks back I was experimenting with reading uninitialized values from src, which is actually a bug, but the CSA seems to give up at that point. I was curious about that and pinged @steakhal on Discord; according to him this is a genuine issue that needs to be fixed. So I went ahead with fixing this bug, with thanks to @steakhal, who helped me create this patch. This feature seems to break some tests, but the problem was genuine and the broken tests also need to be fixed accordingly. I added a test, but we need more tests; I'll try to add more. Thanks.

Reviewed By: steakhal, NoQ

Differential Revision: https://reviews.llvm.org/D120489
2022-03-04 00:21:06 +05:30
Shivam
bd1917c88a [analyzer] Done some changes to detect Uninitialized read by the char array manipulation functions
A few weeks back I was experimenting with reading uninitialized values from src, which is actually a bug, but the CSA seems to give up at that point. I was curious about that and pinged @steakhal on Discord; according to him this is a genuine issue that needs to be fixed. So I went ahead with fixing this bug, with thanks to @steakhal, who helped me create this patch. This feature seems to break some tests, but the problem was genuine and the broken tests also need to be fixed accordingly. I added a test, but we need more tests; I'll try to add more. Thanks.

Reviewed By: steakhal, NoQ

Differential Revision: https://reviews.llvm.org/D120489
2022-03-03 23:21:26 +05:30