mirror of
https://github.com/RPCS3/llvm-mirror.git
synced 2024-12-13 22:58:50 +00:00
docs: Add some information about Fuzzing LLVM itself
This splits some content out of the libFuzzer docs and adds a fair amount of detail about the fuzzers in LLVM. llvm-svn: 315544
parent
6925d8046a
commit
b9249fcac2
221
docs/FuzzingLLVM.rst
Normal file
@ -0,0 +1,221 @@
================================
Fuzzing LLVM libraries and tools
================================

.. contents::
   :local:
   :depth: 2

Introduction
============

The LLVM tree includes a number of fuzzers for various components. These are
built on top of :doc:`LibFuzzer <LibFuzzer>`.


Available Fuzzers
=================

clang-fuzzer
------------

A |generic fuzzer| that tries to compile textual input as C++ code. Some of the
bugs this fuzzer has reported are `on bugzilla <https://llvm.org/pr23057>`__
and `on OSS Fuzz's tracker
<https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+clang-fuzzer>`__.

clang-proto-fuzzer
------------------

A |protobuf fuzzer| that compiles valid C++ programs generated from a protobuf
class that describes a subset of the C++ language.

This fuzzer accepts clang command line options after
``-ignore_remaining_args=1``. For example, the following command will fuzz
clang with a higher optimization level:

.. code-block:: shell

  % bin/clang-proto-fuzzer <corpus-dir> -ignore_remaining_args=1 -O3

clang-format-fuzzer
-------------------

A |generic fuzzer| that runs clang-format_ on C++ text fragments. Some of the
bugs this fuzzer has reported are `on bugzilla <https://llvm.org/pr23052>`__
and `on OSS Fuzz's tracker
<https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+clang-format-fuzzer>`__.

.. _clang-format: https://clang.llvm.org/docs/ClangFormat.html

llvm-as-fuzzer
--------------

A |generic fuzzer| that tries to parse text as :doc:`LLVM assembly <LangRef>`.
Some of the bugs this fuzzer has reported are `on bugzilla
<https://llvm.org/pr24639>`__.

llvm-dwarfdump-fuzzer
---------------------

A |generic fuzzer| that interprets inputs as object files and runs
:doc:`llvm-dwarfdump <CommandGuide/llvm-dwarfdump>` on them. Some of the bugs
this fuzzer has reported are `on OSS Fuzz's tracker
<https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+llvm-dwarfdump-fuzzer>`__.

llvm-isel-fuzzer
----------------

A |LLVM IR fuzzer| aimed at finding bugs in instruction selection.

This fuzzer accepts flags after ``-ignore_remaining_args=1``. The flags match
those of :doc:`llc <CommandGuide/llc>` and the triple is required. For example,
the following command would fuzz AArch64 with :doc:`GlobalISel`:

.. code-block:: shell

  % bin/llvm-isel-fuzzer <corpus-dir> -ignore_remaining_args=1 -mtriple aarch64 -global-isel -O0
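
The split at ``-ignore_remaining_args=1`` can be pictured with a small sketch.
This is not the code the in-tree fuzzers use; ``toolArgs`` is a hypothetical
helper that only illustrates the idea that everything before the marker
belongs to libFuzzer and everything after it belongs to the tool being fuzzed.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of LLVM): collect the arguments that follow
// the -ignore_remaining_args=1 marker, i.e. the flags meant for the tool.
static std::vector<std::string> toolArgs(int ArgC, const char *const *ArgV) {
  assert(ArgC >= 1 && "ArgV[0] is expected to be the program name");
  const std::string Marker = "-ignore_remaining_args=1";
  std::vector<std::string> Result;
  bool SeenMarker = false;
  for (int I = 1; I < ArgC; ++I) {
    if (SeenMarker)
      Result.push_back(ArgV[I]); // Tool flag, e.g. -mtriple or -O0.
    else if (Marker == ArgV[I])
      SeenMarker = true; // Everything before this point is for libFuzzer.
  }
  return Result;
}
```

With the ``llvm-isel-fuzzer`` command line above, such a helper would return
``-mtriple``, ``aarch64``, ``-global-isel`` and ``-O0``, and an empty list when
the marker is absent.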

llvm-mc-assemble-fuzzer
-----------------------

A |generic fuzzer| that fuzzes the MC layer's assemblers by treating inputs as
target-specific assembly.

Note that this fuzzer has an unusual command line interface which is not fully
compatible with all of libFuzzer's features. Fuzzer arguments must be passed
after ``--fuzzer-args``, and any ``llc`` flags must use two dashes. For
example, to fuzz the AArch64 assembler you might use the following command:

.. code-block:: console

  llvm-mc-assemble-fuzzer --triple=aarch64-linux-gnu --fuzzer-args -max_len=4

This scheme will likely change in the future.

llvm-mc-disassemble-fuzzer
--------------------------

A |generic fuzzer| that fuzzes the MC layer's disassemblers by treating inputs
as assembled binary data.

Note that this fuzzer has an unusual command line interface which is not fully
compatible with all of libFuzzer's features. See the notes above about
``llvm-mc-assemble-fuzzer`` for details.


.. |generic fuzzer| replace:: :ref:`generic fuzzer <fuzzing-llvm-generic>`
.. |protobuf fuzzer|
   replace:: :ref:`libprotobuf-mutator based fuzzer <fuzzing-llvm-protobuf>`
.. |LLVM IR fuzzer|
   replace:: :ref:`structured LLVM IR fuzzer <fuzzing-llvm-ir>`


Mutators and Input Generators
=============================

The inputs for a fuzz target are generated via random mutations of a
:ref:`corpus <libfuzzer-corpus>`. There are a few options for the kinds of
mutations that a fuzzer in LLVM might want.

.. _fuzzing-llvm-generic:

Generic Random Fuzzing
----------------------

The most basic form of input mutation is to use the built-in mutators of
LibFuzzer. These simply treat the input corpus as a bag of bits and make random
mutations. This type of fuzzer is good at stressing the surface layers of a
program, such as lexers, parsers, or binary protocols.

Some of the in-tree fuzzers that use this type of mutator are `clang-fuzzer`_,
`clang-format-fuzzer`_, `llvm-as-fuzzer`_, `llvm-dwarfdump-fuzzer`_,
`llvm-mc-assemble-fuzzer`_, and `llvm-mc-disassemble-fuzzer`_.
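
As a rough illustration of what a generic fuzz target looks like, the sketch
below hands the raw bytes to a toy lexer and checks one invariant. Only the
``LLVMFuzzerTestOneInput`` entry point is libFuzzer's real interface;
``lexWords`` is a hypothetical stand-in for the code under test and is far
simpler than anything the in-tree fuzzers exercise.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical code under test: a toy lexer that splits text on spaces.
static std::vector<std::string> lexWords(const std::string &Text) {
  std::vector<std::string> Words;
  std::string Current;
  for (char C : Text) {
    if (C == ' ') {
      if (!Current.empty())
        Words.push_back(Current);
      Current.clear();
    } else {
      Current.push_back(C);
    }
  }
  if (!Current.empty())
    Words.push_back(Current);
  return Words;
}

// libFuzzer calls this once per mutated input. The bytes are arbitrary, so
// the target has to tolerate any content, including embedded NUL bytes.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  std::string Text(reinterpret_cast<const char *>(Data), Size);
  size_t Total = 0;
  for (const std::string &Word : lexWords(Text))
    Total += Word.size();
  // Invariant: the tokens never hold more bytes than the input did. A failed
  // assertion aborts the process, which is how the fuzzer notices the bug.
  assert(Total <= Size);
  return 0;
}
```

Because the mutator knows nothing about the input format, most generated
inputs are rejected early, which is why this style mainly exercises the outer
layers of a program.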

.. _fuzzing-llvm-protobuf:

Structured Fuzzing using ``libprotobuf-mutator``
------------------------------------------------

We can use libprotobuf-mutator_ in order to perform structured fuzzing and
stress deeper layers of programs. This works by defining a protobuf class that
translates arbitrary data into structurally interesting input. Specifically, we
use this to work with a subset of the C++ language and perform mutations that
produce valid C++ programs in order to exercise parts of clang that are more
interesting than parser error handling.

To build this kind of fuzzer you need `protobuf`_ and its dependencies
installed, and you need to specify some extra flags when configuring the build
with :doc:`CMake <CMake>`. For example, `clang-proto-fuzzer`_ can be enabled by
adding ``-DCLANG_ENABLE_PROTO_FUZZER=ON`` to the flags described in
:ref:`building-fuzzers`.

The only in-tree fuzzer that uses ``libprotobuf-mutator`` today is
`clang-proto-fuzzer`_.
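
The idea of turning structured data into a valid program can be sketched
without protobuf at all. The example below is only an analogy: ``Expr`` is a
plain struct standing in for a protobuf message, and ``render`` and
``renderProgram`` are hypothetical names, not the schema or converter that
``clang-proto-fuzzer`` actually uses. The point is that any value of the
structure, however it was mutated, renders to syntactically valid C++.

```cpp
#include <memory>
#include <string>

// Stand-in for a protobuf message: a literal, or a binary operation when
// both operands are present.
struct Expr {
  int Literal = 0;
  unsigned Op = 0; // Arbitrary value; mapped onto a valid operator below.
  std::unique_ptr<Expr> LHS, RHS;
};

// Render an expression tree as C++ source text.
static std::string render(const Expr &E) {
  if (!E.LHS || !E.RHS)
    return std::to_string(E.Literal); // A leaf, or an incomplete node.
  // Division is left out on purpose so the generated code cannot divide by
  // zero.
  static const char Ops[] = {'+', '-', '*'};
  return "(" + render(*E.LHS) + " " + Ops[E.Op % 3] + " " + render(*E.RHS) +
         ")";
}

// Wrap the expression in a function so the result is a complete program
// fragment that a compiler front end accepts.
static std::string renderProgram(const Expr &E) {
  return "int f() { return " + render(E) + "; }";
}
```

A mutator working on ``Expr`` values changes fields and subtrees rather than
bytes, so the compiler sees well-formed input and the fuzzer reaches code
beyond the parser's error paths.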

.. _libprotobuf-mutator: https://github.com/google/libprotobuf-mutator
.. _protobuf: https://github.com/google/protobuf

.. _fuzzing-llvm-ir:

Structured Fuzzing of LLVM IR
-----------------------------

We also use a more direct form of structured fuzzing for fuzzers that take
:doc:`LLVM IR <LangRef>` as input. This is achieved through the ``FuzzMutate``
library, which was `discussed at EuroLLVM 2017`_.

The ``FuzzMutate`` library is used to structurally fuzz backends in
`llvm-isel-fuzzer`_.

.. _discussed at EuroLLVM 2017: https://www.youtube.com/watch?v=UBbQ_s6hNgg


Building and Running
====================

.. _building-fuzzers:

Configuring LLVM to Build Fuzzers
---------------------------------

Fuzzers will be built and linked to libFuzzer by default as long as you build
LLVM with sanitizer coverage enabled. You would typically also enable at least
one sanitizer so that the fuzzers are more likely to find bugs, so the most
common way to build the fuzzers is by adding the following two flags to your
CMake invocation:
``-DLLVM_USE_SANITIZER=Address -DLLVM_USE_SANITIZE_COVERAGE=On``.

.. note:: If you have ``compiler-rt`` checked out in an LLVM tree when building
          with sanitizers, you'll want to specify ``-DLLVM_BUILD_RUNTIME=Off``
          to avoid building the sanitizers themselves with sanitizers enabled.

Continuously Running and Finding Bugs
-------------------------------------

There used to be a public buildbot running LLVM fuzzers continuously, and while
this did find issues, it didn't have a very good way to report problems in an
actionable way. Because of this, we're moving towards using `OSS Fuzz`_ more
instead.

https://github.com/google/oss-fuzz/blob/master/projects/llvm/project.yaml
https://bugs.chromium.org/p/oss-fuzz/issues/list?q=Proj-llvm

.. _OSS Fuzz: https://github.com/google/oss-fuzz


Utilities for Writing Fuzzers
=============================

There are some utilities available for writing fuzzers in LLVM.

Some helpers for handling the command line interface are available in
``include/llvm/FuzzMutate/FuzzerCLI.h``, including functions to parse command
line options in a consistent way and to implement standalone main functions so
your fuzzer can be built and tested when not built against libFuzzer.
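
To make the idea of a standalone driver concrete, here is a minimal sketch of
replaying saved inputs through a fuzz target without libFuzzer. It does not
reproduce the functions declared in ``FuzzerCLI.h``; ``replayInputs`` and the
``FuzzerTestFun`` alias as written here are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Same shape as libFuzzer's LLVMFuzzerTestOneInput entry point.
using FuzzerTestFun = int (*)(const uint8_t *Data, size_t Size);

// Hypothetical replay driver: feed each readable file to the fuzz target and
// return how many inputs were run.
static int replayInputs(const std::vector<std::string> &Paths,
                        FuzzerTestFun TestOne) {
  assert(TestOne && "a fuzz target is required");
  int Ran = 0;
  for (const std::string &Path : Paths) {
    std::ifstream File(Path, std::ios::binary);
    if (!File)
      continue; // Skip unreadable files instead of aborting the replay.
    std::vector<char> Bytes((std::istreambuf_iterator<char>(File)),
                            std::istreambuf_iterator<char>());
    TestOne(reinterpret_cast<const uint8_t *>(Bytes.data()), Bytes.size());
    ++Ran;
  }
  return Ran;
}
```

A ``main`` function built on a driver like this could pass the file names from
its command line, which is enough to reproduce a crash from a saved input in a
build that has no fuzzing instrumentation.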

There is also some handling of the CMake config for fuzzers, where you should
use the ``add_llvm_fuzzer`` function to set up fuzzer targets. This function
works similarly to functions such as ``add_llvm_tool``, but it takes care of
linking to LibFuzzer when appropriate and can be passed the ``DUMMY_MAIN``
argument to enable standalone testing.
@ -42,10 +42,10 @@ This installs the Clang binary as
``./third_party/llvm-build/Release+Asserts/bin/clang``)

The libFuzzer code resides in the LLVM repository, and requires a recent Clang
compiler to build (and is used to `fuzz various parts of LLVM itself`_).
However the fuzzer itself does not (and should not) depend on any part of LLVM
infrastructure and can be used for other projects without requiring the rest
of LLVM.
compiler to build (and is used to :doc:`fuzz various parts of LLVM itself
<FuzzingLLVM>`). However the fuzzer itself does not (and should not) depend on
any part of LLVM infrastructure and can be used for other projects without
requiring the rest of LLVM.


Getting Started
@ -137,6 +137,8 @@ Finally, link with ``libFuzzer.a``::

  clang -fsanitize-coverage=trace-pc-guard -fsanitize=address your_lib.cc fuzz_target.cc libFuzzer.a -o my_fuzzer

.. _libfuzzer-corpus:

Corpus
------

@ -627,66 +629,6 @@ which was configured with ``-DLIBFUZZER_ENABLE_TESTS=ON`` flag.
    ninja check-fuzzer


Fuzzing components of LLVM
==========================
.. contents::
   :local:
   :depth: 1

To build any of the LLVM fuzz targets use the build instructions above.

clang-format-fuzzer
-------------------
The inputs are random pieces of C++-like text.

.. code-block:: console

    ninja clang-format-fuzzer
    mkdir CORPUS_DIR
    ./bin/clang-format-fuzzer CORPUS_DIR

Optionally build other kinds of binaries (ASan+Debug, MSan, UBSan, etc).

Tracking bug: https://llvm.org/bugs/show_bug.cgi?id=23052

clang-fuzzer
------------

The behavior is very similar to ``clang-format-fuzzer``.

Tracking bug: https://llvm.org/bugs/show_bug.cgi?id=23057

llvm-as-fuzzer
--------------

Tracking bug: https://llvm.org/bugs/show_bug.cgi?id=24639

llvm-mc-fuzzer
--------------

This tool fuzzes the MC layer. Currently it is only able to fuzz the
disassembler, but it is hoped that assembly and round-trip verification will be
added in the future.

When run in disassembly mode, the inputs are opcodes to be disassembled. The
fuzzer will consume as many instructions as possible and will stop when it
finds an invalid instruction or runs out of data.

Please note that the command line interface differs slightly from that of other
fuzzers. The fuzzer arguments should follow ``--fuzzer-args`` and should have
a single dash, while other arguments control the operation mode and target in a
similar manner to ``llvm-mc`` and should have two dashes. For example:

.. code-block:: console

    llvm-mc-fuzzer --triple=aarch64-linux-gnu --disassemble --fuzzer-args -max_len=4 -jobs=10

Buildbot
--------

A buildbot continuously runs the above fuzzers for LLVM components, with results
shown at http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fuzzer .

FAQ
=========================

@ -808,4 +750,4 @@ Trophies
.. _`value profile`: #value-profile
.. _`caller-callee pairs`: http://clang.llvm.org/docs/SanitizerCoverage.html#caller-callee-coverage
.. _BoringSSL: https://boringssl.googlesource.com/boringssl/
.. _`fuzz various parts of LLVM itself`: `Fuzzing components of LLVM`_

@ -183,6 +183,7 @@ For developers of applications which use LLVM as a library.
   ProgrammersManual
   Extensions
   LibFuzzer
   FuzzingLLVM
   ScudoHardenedAllocator
   OptBisect

@ -228,6 +229,9 @@ For developers of applications which use LLVM as a library.
:doc:`LibFuzzer`
  A library for writing in-process guided fuzzers.

:doc:`FuzzingLLVM`
  Information on writing and using Fuzzers to find bugs in LLVM.

:doc:`ScudoHardenedAllocator`
  A library that implements a security-hardened `malloc()`.
