llvm/docs/CommandGuide/llvm-profdata.rst
Vedant Kumar d91d378e3a Retry: [llvm-profdata] Speed up merging by using a thread pool
Add a "-j" option to llvm-profdata to control the number of threads used.
Auto-detect NumThreads when it isn't specified, and avoid spawning threads when
they wouldn't be beneficial.

I tested this patch using a raw profile produced by clang (147MB). Here is the
time taken to merge 4 copies together on my laptop:

  No thread pool: 112.87s user 5.92s system 97% cpu 2:01.08 total
  With 2 threads: 134.99s user 26.54s system 164% cpu 1:33.31 total

Changes since the initial commit:

  - When handling odd-length inputs, call ThreadPool::wait() before merging the
    last profile. Should fix a race/off-by-one (see r275937).

Differential Revision: https://reviews.llvm.org/D22438

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275938 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-19 01:17:20 +00:00


llvm-profdata - Profile data tool
=================================

SYNOPSIS
--------

:program:`llvm-profdata` *command* [*args...*]

DESCRIPTION
-----------

The :program:`llvm-profdata` tool is a small utility for working with profile
data files.

COMMANDS
--------

* :ref:`merge <profdata-merge>`
* :ref:`show <profdata-show>`

.. program:: llvm-profdata merge

.. _profdata-merge:

MERGE
-----

SYNOPSIS
^^^^^^^^

:program:`llvm-profdata merge` [*options*] [*filename...*]

DESCRIPTION
^^^^^^^^^^^

:program:`llvm-profdata merge` takes several profile data files
generated by PGO instrumentation and merges them together into a single
indexed profile data file.

By default profile data is merged without modification. This means that the
relative importance of each input file is proportional to the number of samples
or counts it contains. In general, the input from a longer training run will be
interpreted as relatively more important than a shorter run. Depending on the
nature of the training runs it may be useful to adjust the weight given to each
input file by using the ``-weighted-input`` option.

Profiles passed in via ``-weighted-input``, ``-input-files``, or via positional
arguments are processed once for each time they are seen.

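The effect of weighting can be illustrated with a toy model. The sketch below is not how :program:`llvm-profdata` is implemented. It assumes a profile is just a mapping from function name to a single count, and the names ``merge_profiles``, ``short_run``, and ``long_run`` are hypothetical. It only shows why a longer run dominates an unweighted merge and how a weight shifts that balance.

```python
from collections import Counter


def merge_profiles(weighted_profiles):
    """Merge toy profiles given as (weight, {function_name: count}) pairs.

    Each count is multiplied by its profile's weight before being summed.
    """
    merged = Counter()
    for weight, counts in weighted_profiles:
        if weight < 1:
            raise ValueError("weight must be an integer >= 1")
        for name, count in counts.items():
            merged[name] += weight * count
    return dict(merged)


# Hypothetical data: a short training run and a much longer one.
short_run = {"main": 1, "parse": 10}
long_run = {"main": 1, "parse": 1000, "emit": 500}

# Unweighted: the long run contributes almost all of the "parse" count.
unweighted = merge_profiles([(1, short_run), (1, long_run)])

# Weighted: the short run's counts are scaled by 100 before merging.
weighted = merge_profiles([(100, short_run), (1, long_run)])

print(unweighted)
print(weighted)
```

In this toy model, listing the same profile twice has the same effect as giving it a weight of 2, which matches the statement that profiles are processed once for each time they are seen.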
OPTIONS
^^^^^^^

.. option:: -help

 Print a summary of command line options.

.. option:: -output=output, -o=output

 Specify the output file name. *Output* cannot be ``-`` as the resulting
 indexed profile data can't be written to standard output.

.. option:: -weighted-input=weight,filename

 Specify an input file name along with a weight. The profile counts of the
 supplied ``filename`` will be scaled (multiplied) by the supplied
 ``weight``, where ``weight`` is a decimal integer >= 1.
 Input files specified without using this option are assigned a default
 weight of 1. Examples are shown below.

.. option:: -input-files=path, -f=path

 Specify a file which contains a list of files to merge. The entries in this
 file are newline-separated. Lines starting with '#' are skipped. Entries may
 be of the form <filename> or <weight>,<filename>.

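The list-file format described for ``-input-files`` can be sketched as a small parser. This is an illustration only, not the tool's own parsing code: the function name ``parse_input_list`` is hypothetical, and the handling of blank lines, surrounding whitespace, and file names that contain commas is an assumption that may differ from what :program:`llvm-profdata` does.

```python
def parse_input_list(text):
    """Parse list-file text into (weight, filename) pairs.

    Assumptions (not confirmed by the documentation): blank lines are
    ignored, surrounding whitespace is stripped, and an entry is treated
    as weighted only when the text before its first comma is a decimal
    integer.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        head, sep, tail = line.partition(",")
        if sep and head.isdigit():
            weight = int(head)
            if weight < 1:
                raise ValueError(f"weight must be >= 1: {line!r}")
            entries.append((weight, tail))
        else:
            entries.append((1, line))  # default weight of 1
    return entries


listing = """\
# training runs
10,foo.profdata
bar.profdata
baz.profdata
"""

entries = parse_input_list(listing)
print(entries)
```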
.. option:: -instr (default)

 Specify that the input profile is an instrumentation-based profile.

.. option:: -sample

 Specify that the input profile is a sample-based profile.

The output file can be emitted in one of three formats:

.. option:: -binary (default)

 Emit the profile using a binary encoding. For instrumentation-based profiles,
 the output format is the indexed binary format.

.. option:: -text

 Emit the profile in text mode. This option can be used with both
 sample-based and instrumentation-based profiles. When this option is used,
 the profile is written in the text format that the profile reader can parse.

.. option:: -gcc

 Emit the profile using GCC's gcov format (not yet supported).

.. option:: -sparse[=true|false]

 Do not emit function records with 0 execution count. Can only be used in
 conjunction with ``-instr``. Defaults to false, since it can inhibit compiler
 optimization during PGO.

.. option:: -num-threads=N, -j=N

 Use N threads to perform profile merging. When N=0, llvm-profdata auto-detects
 an appropriate number of threads to use. This is the default.

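The general idea of merging with a pool of workers can be sketched as follows. This is a structural illustration under stated assumptions, not the tool's implementation: profiles are modeled as plain mappings from function name to count, the names ``merge_counts`` and ``parallel_merge`` are hypothetical, and the auto-detection rule used for ``num_threads=0`` is an invented heuristic. Python threads will not reproduce the speedup of a native thread pool, so the sketch shows only how the work is split and recombined.

```python
import os
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def merge_counts(profiles):
    """Serially sum a list of {function_name: count} mappings."""
    merged = Counter()
    for counts in profiles:
        merged.update(counts)  # Counter.update adds counts; it does not overwrite
    return merged


def parallel_merge(profiles, num_threads=0):
    """Merge profiles by splitting them across workers, then combining."""
    if num_threads == 0:
        # Assumed heuristic: at most one worker per two inputs,
        # capped by the number of CPUs.
        num_threads = max(1, min(os.cpu_count() or 1, len(profiles) // 2))
    if num_threads <= 1:
        # Not worth spawning threads; merge on the calling thread.
        return dict(merge_counts(profiles))
    # Give every worker its own slice of the inputs.
    chunks = [profiles[i::num_threads] for i in range(num_threads)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        partials = list(pool.map(merge_counts, chunks))
    # Leaving the "with" block waits for all workers, so the partial
    # results are complete before the final serial merge.
    return dict(merge_counts(partials))


# Hypothetical data: five small profiles (an odd number of inputs).
profiles = [{"main": 1, "f": i} for i in range(1, 6)]
print(parallel_merge(profiles, num_threads=2))
```

The result does not depend on the number of workers because addition of counts is associative and commutative. Only the order in which partial sums are formed changes.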
EXAMPLES
^^^^^^^^

Basic Usage
+++++++++++

Merge three profiles:

::

    llvm-profdata merge foo.profdata bar.profdata baz.profdata -output merged.profdata

Weighted Input
++++++++++++++

The input file ``foo.profdata`` is especially important, so multiply its counts by 10:

::

    llvm-profdata merge -weighted-input=10,foo.profdata bar.profdata baz.profdata -output merged.profdata

Exactly equivalent to the previous invocation (explicit form; useful for programmatic invocation):

::

    llvm-profdata merge -weighted-input=10,foo.profdata -weighted-input=1,bar.profdata -weighted-input=1,baz.profdata -output merged.profdata

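For programmatic invocation, the explicit form is easy to generate. The sketch below builds the argument list for the explicit command shown above. The helper name ``build_merge_command`` is hypothetical, and the sketch only constructs the arguments without running the tool.

```python
def build_merge_command(inputs, output):
    """Build an argv list for `llvm-profdata merge` in the explicit form.

    `inputs` is a list of (filename, weight) pairs; every input gets its
    own -weighted-input argument, including those with weight 1.
    """
    argv = ["llvm-profdata", "merge"]
    for filename, weight in inputs:
        argv.append(f"-weighted-input={weight},{filename}")
    argv += ["-output", output]
    return argv


argv = build_merge_command(
    [("foo.profdata", 10), ("bar.profdata", 1), ("baz.profdata", 1)],
    "merged.profdata",
)
print(" ".join(argv))
```

If :program:`llvm-profdata` is installed, the resulting list could be passed to ``subprocess.run(argv, check=True)``. Passing a list avoids shell quoting problems with unusual file names.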
.. program:: llvm-profdata show

.. _profdata-show:

SHOW
----

SYNOPSIS
^^^^^^^^

:program:`llvm-profdata show` [*options*] [*filename*]

DESCRIPTION
^^^^^^^^^^^

:program:`llvm-profdata show` takes a profile data file and displays
information about the profile counters for the file as a whole and for any
specified functions.

If *filename* is omitted or is ``-``, then :program:`llvm-profdata show` reads
its input from standard input.

OPTIONS
^^^^^^^

.. option:: -all-functions

 Print details for every function.

.. option:: -counts

 Print the counter values for the displayed functions.

.. option:: -function=string

 Print details for a function if the function's name contains the given string.

.. option:: -help

 Print a summary of command line options.

.. option:: -output=output, -o=output

 Specify the output file name. If *output* is ``-`` or it isn't specified,
 then the output is sent to standard output.

.. option:: -instr (default)

 Specify that the input profile is an instrumentation-based profile.

.. option:: -text

 Instruct the profile dumper to show profile counts in the text format of the
 instrumentation-based profile data representation. By default, the profile
 information is dumped in a more human readable form (also in text) with
 annotations.

.. option:: -sample

 Specify that the input profile is a sample-based profile.

EXIT STATUS
-----------

:program:`llvm-profdata` returns 1 if the command is omitted or is invalid,
if it cannot read input files, or if there is a mismatch between their data.