Sample profiles - Update text profile documentation.

There's been some changes to the text encoding for sample profiles. This updates the documentation and an example. llvm-svn: 250310
2025-02-10 11:23:52 +00:00 · 2015-10-14 18:37:39 +00:00 · 2015-10-14 18:37:39 +00:00 · 33452761bb
commit 33452761bb
parent bb5605ca3a
1 changed files with 54 additions and 12 deletions
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@ -1355,15 +1355,18 @@ read by the backend. LLVM supports three different sample profile formats:

 1. ASCII text. This is the easiest one to generate. The file is divided into
   sections, which correspond to each of the functions with profile
-   information. The format is described below.
+   information. The format is described below. It can also be generated from
+   the binary or gcov formats using the ``llvm-profdata`` tool.

 2. Binary encoding. This uses a more efficient encoding that yields smaller
-   profile files, which may be useful when generating large profiles. It can be
-   generated from the text format using the ``llvm-profdata`` tool.
+   profile files. This is the format generated by the ``create_llvm_prof`` tool
+   in http://github.com/google/autofdo.

 3. GCC encoding. This is based on the gcov format, which is accepted by GCC. It
-   is only interesting in environments where GCC and Clang co-exist. Similarly
-   to the binary encoding, it can be generated using the ``llvm-profdata`` tool.
+   is only interesting in environments where GCC and Clang co-exist. This
+   encoding is only generated by the ``create_gcov`` tool in
+   http://github.com/google/autofdo. It can be read by LLVM and
+   ``llvm-profdata``, but it cannot be generated by either.

 If you are using Linux Perf to generate sampling profiles, you can use the
 conversion tool ``create_llvm_prof`` described in the previous section.
@ -1382,14 +1385,27 @@ of the other two, consult the ``ProfileData`` library in in LLVM's source tree
 .. code-block:: console

    function1:total_samples:total_head_samples
-    offset1[.discriminator]: number_of_samples [fn1:num fn2:num ... ]
-    offset2[.discriminator]: number_of_samples [fn3:num fn4:num ... ]
-    ...
-    offsetN[.discriminator]: number_of_samples [fn5:num fn6:num ... ]
+     offset1[.discriminator]: number_of_samples [fn1:num fn2:num ... ]
+     offset2[.discriminator]: number_of_samples [fn3:num fn4:num ... ]
+     ...
+     offsetN[.discriminator]: number_of_samples [fn5:num fn6:num ... ]
+     offsetA[.discriminator]: fnA:num_of_total_samples
+      offsetA1[.discriminator]: number_of_samples [fn7:num fn8:num ... ]
+      offsetA1[.discriminator]: number_of_samples [fn9:num fn10:num ... ]
+      offsetB[.discriminator]: fnB:num_of_total_samples
+       offsetB1[.discriminator]: number_of_samples [fn11:num fn12:num ... ]

-The file may contain blank lines between sections and within a
-section. However, the spacing within a single line is fixed. Additional
-spaces will result in an error while reading the file.
+This is a nested tree in which the identation represents the nesting level
+of the inline stack. There are no blank lines in the file. And the spacing
+within a single line is fixed. Additional spaces will result in an error
+while reading the file.
+
+Any line starting with the '#' character is completely ignored.
+
+Inlined calls are represented with indentation. The Inline stack is a
+stack of source locations in which the top of the stack represents the
+leaf function, and the bottom of the stack represents the actual
+symbol to which the instruction belongs.

 Function names must be mangled in order for the profile loader to
 match them in the current translation unit. The two numbers in the
@ -1398,6 +1414,14 @@ function (first number), and the total number of samples accumulated
 in the prologue of the function (second number). This head sample
 count provides an indicator of how frequently the function is invoked.

+There are two types of lines in the function body.
+
+-  Sampled line represents the profile information of a source location.
+   ``offsetN[.discriminator]: number_of_samples [fn5:num fn6:num ... ]``
+
+-  Callsite line represents the profile information of an inlined callsite.
+   ``offsetA[.discriminator]: fnA:num_of_total_samples``
+
 Each sampled line may contain several items. Some are optional (marked
 below):

@ -1451,6 +1475,24 @@ d. [OPTIONAL] Potential call targets and samples. If present, this
   instruction that calls one of ``foo()``, ``bar()`` and ``baz()``,
   with ``baz()`` being the relatively more frequently called target.

+As an example, consider a program with the call chain ``main -> foo -> bar``.
+When built with optimizations enabled, the compiler may inline the
+calls to ``bar`` and ``foo`` inside ``main``. The generated profile
+could then be something like this:
+
+.. code-block:: console
+
+    main:35504:0
+    1: _Z3foov:35504
+      2: _Z32bari:31977
+      1.1: 31977
+    2: 0
+
+This profile indicates that there were a total of 35,504 samples
+collected in main. All of those were at line 1 (the call to ``foo``).
+Of those, 31,977 were spent inside the body of ``bar``. The last line
+of the profile (``2: 0``) corresponds to line 2 inside ``main``. No
+samples were collected there.

 Profiling with Instrumentation
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^