For multi-dimensional array like below
int a[2][3];
the previous implementation generates BTF_KIND_ARRAY type
like below:
. element_type: int
. index_type: unsigned int
. number of elements: 6
This is not the best way to represent arrays, esp.,
when converting BTF back to headers and users will see
int a[6];
instead.
This patch generates proper support for multi-dimensional arrays.
For "int a[2][3]", the two BTF_KIND_ARRAY types will be
generated:
Type #n:
. element_type: int
. index_type: unsigned int
. number of elements: 3
Type #(n+1):
. element_type: #n
. index_type: unsigned int
. number of elements: 2
The linux kernel already supports such a multi-dimensional
array representation properly.
Signed-off-by: Yonghong Song <yhs@fb.com>
Differential Revision: https://reviews.llvm.org/D59943
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357215 91177308-0d34-0410-b5e6-96231b3b80d8
The .BTF.ext FuncInfoTable and LineInfoTable contain
information organized per ELF section. Current definition
of FuncInfoTable/LineInfoTable is:
std::unordered_map<uint32_t, std::vector<BTFFuncInfo>> FuncInfoTable
std::unordered_map<uint32_t, std::vector<BTFLineInfo>> LineInfoTable
where the key is the section name off in the string table.
The unordered_map may cause the order of section output
different for different platforms.
The same for unordered map definition of
std::unordered_map<std::string, std::unique_ptr<BTFKindDataSec>>
DataSecEntries
where BTF_KIND_DATASEC entries may have different ordering
for different platforms.
This patch fixed the issue by using std::map.
Test static-var-derived-type.ll is modified to generate two
DataSec's which will ensure the ordering is the same for all
supported platforms.
Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357077 91177308-0d34-0410-b5e6-96231b3b80d8
The DataSecEentries is defined as an unordered_map since
order does not really matter.
std::unordered_map<std::string, std::unique_ptr<BTFKindDataSec>>
DataSecEntries;
This seems causing the test static-var-derived-type.ll flaky
as two sections ".bss" and ".readonly" have undeterministic
ordering when performing map iterating, which decides the
output assembly code sequence of BTF_KIND_DATASEC entries.
Fix the test to have only one data section to remove
flakiness.
Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356731 91177308-0d34-0410-b5e6-96231b3b80d8
Currently, the type id for a derived type is computed incorrectly.
For example,
type #1: int
type #2: ptr to #1
For a global variable "int *a", type #1 will be attributed to variable "a".
This is due to a bug which assigns the type id of the basetype of
that derived type as the derived type's type id. This happens
to "const", "volatile", "restrict", "typedef" and "pointer" types.
This patch fixed this bug, fixed existing test cases and added
a new one focusing on pointers plus other derived types.
Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356727 91177308-0d34-0410-b5e6-96231b3b80d8
Two new kinds, BTF_KIND_VAR and BTF_KIND_DATASEC, are added.
BTF_KIND_VAR has the following specification:
btf_type.name: var name
btf_type.info: type kind
btf_type.type: var type
// btf_type is followed by one u32
u32: varinfo (currently, only 0 - static, 1 - global allocated in elf sections)
Not all globals are supported in this patch. The following globals are supported:
. static variables with or without section attributes
. global variables with section attributes
The inclusion of globals with section attributes
is for future potential extraction of key/value
type id's from map definition.
BTF_KIND_DATASEC has the following specification:
btf_type.name: section name associated with variable or
one of .data/.bss/.readonly
btf_type.info: type kind and vlen for # of variables
btf_type.size: 0
#vlen number of the following:
u32: id of corresponding BTF_KIND_VAR
u32: in-session offset of the var
u32: the size of memory var occupied
At the time of debug info emission, the data section
size is unknown, so the btf_type.size = 0 for
BTF_KIND_DATASEC. The loader can patch it during
loading time.
The in-session offseet of the var is only available
for static variables. For global variables, the
loader neeeds to assign the global variable symbol value in
symbol table to in-section offset.
The size of memory is used to specify the amount of the
memory a variable occupies. Typically, it equals to
the type size, but for certain structures, e.g.,
struct tt {
int a;
int b;
char c[];
};
static volatile struct tt s2 = {3, 4, "abcdefghi"};
The static variable s2 has size of 20.
Note that for BTF_KIND_DATASEC name, the section name
does not contain object name. The compiler does have
input module name. For example, two cases below:
. clang -target bpf -O2 -g -c test.c
The compiler knows the input file (module) is test.c
and can generate sec name like test.data/test.bss etc.
. clang -target bpf -O2 -g -emit-llvm -c test.c -o - |
llc -march=bpf -filetype=obj -o test.o
The llc compiler has the input file as stdin, and
would generate something like stdin.data/stdin.bss etc.
which does not really make sense.
For any user specificed section name, e.g.,
static volatile int a __attribute__((section("id1")));
static volatile const int b __attribute__((section("id2")));
The DataSec with name "id1" and "id2" does not contain
information whether the section is readonly or not.
The loader needs to check the corresponding elf section
flags for such information.
A simple example:
-bash-4.4$ cat t.c
int g1;
int g2 = 3;
const int g3 = 4;
static volatile int s1;
struct tt {
int a;
int b;
char c[];
};
static volatile struct tt s2 = {3, 4, "abcdefghi"};
static volatile const int s3 = 4;
int m __attribute__((section("maps"), used)) = 4;
int test() { return g1 + g2 + g3 + s1 + s2.a + s3 + m; }
-bash-4.4$ clang -target bpf -O2 -g -S t.c
Checking t.s, 4 BTF_KIND_VAR's are generated (s1, s2, s3 and m).
4 BTF_KIND_DATASEC's are generated with names
".data", ".bss", ".rodata" and "maps".
Signed-off-by: Yonghong Song <yhs@fb.com>
Differential Revision: https://reviews.llvm.org/D59441
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356326 91177308-0d34-0410-b5e6-96231b3b80d8
Previous commit 6bc58e6d3d ("[BPF] do not generate unused local/global types")
tried to exclude global variable from type generation. The condition is:
if (Global.hasExternalLinkage())
continue;
This is not right. It also excluded initialized globals.
The correct condition (from AssemblyWriter::printGlobal()) is:
if (!GV->hasInitializer() && GV->hasExternalLinkage())
Out << "external ";
Let us do the same in BTF type generation. Also added a test for it.
Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356279 91177308-0d34-0410-b5e6-96231b3b80d8
The kernel currently has a limit for # of types to be 64KB and
the size of string subsection to be 64KB. A simple bcc tool
runqlat.py generates:
. the size of ~33KB type section, roughly ~10K types
. the size of ~17KB string section
The majority type is from the types referenced by local
variables in the bpf program. For example, the kernel "task_struct"
itself recursively brings in ~900 other types.
This patch did the following optimization to avoid generating
unused types:
. do not generate types for local variables unless they are
function arguments.
. do not generate types for external globals.
If an external global is not used in the program, llvm
already removes it from IR, so global variable saving is
typical small. For runqlat.py, only one variable "llvm.used"
is the external global.
The types for locals and external globals can be added back
once there is a usage for them.
After the above optimization, the runqlat.py generates:
. the size of ~1.5KB type section, roughtly 500 types
. the size of ~0.7KB string section
UPDATE:
resubmitted the patch after previous revert with
the following fix:
use Global.hasExternalLinkage() to test "external"
linkage instead of using Global.getInitializer(),
which will assert on external variables.
Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356234 91177308-0d34-0410-b5e6-96231b3b80d8
The kernel currently has a limit for # of types to be 64KB and
the size of string subsection to be 64KB. A simple bcc tool
runqlat.py generates:
. the size of ~33KB type section, roughly ~10K types
. the size of ~17KB string section
The majority type is from the types referenced by local
variables in the bpf program. For example, the kernel "task_struct"
itself recursively brings in ~900 other types.
This patch did the following optimization to avoid generating
unused types:
. do not generate types for local variables unless they are
function arguments.
. do not generate types for external globals.
If an external global is not used in the program, llvm
already removes it from IR, so global variable saving is
typical small. For runqlat.py, only one variable "llvm.used"
is the external global.
The types for locals and external globals can be added back
once there is a usage for them.
After the above optimization, the runqlat.py generates:
. the size of ~1.5KB type section, roughtly 500 types
. the size of ~0.7KB string section
Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356232 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
A number of optimizations are inhibited by single-use TokenFactors not
being merged into the TokenFactor using it. This makes we consider if
we can do the merge immediately.
Most tests changes here are due to the change in visitation causing
minor reorderings and associated reassociation of paired memory
operations.
CodeGen tests with non-reordering changes:
X86/aligned-variadic.ll -- memory-based add folded into stored leaq
value.
X86/constant-combiners.ll -- Optimizes out overlap between stores.
X86/pr40631_deadstore_elision -- folds constant byte store into
preceding quad word constant store.
Reviewers: RKSimon, craig.topper, spatel, efriedma, courbet
Reviewed By: courbet
Subscribers: dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, eraman, hiraditya, kbarton, jrtc27, atanasyan, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59260
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356068 91177308-0d34-0410-b5e6-96231b3b80d8
Support sub-register code-gen for XADD is like supporting any other Load
and Store patterns.
No new instruction is introduced.
lock *(u32 *)(r1 + 0) += w2
has exactly the same underlying insn as:
lock *(u32 *)(r1 + 0) += r2
BPF_W width modifier has guaranteed they behave the same at runtime. This
patch merely teaches BPF back-end that BPF_W width modifier could work
GPR32 register class and that's all needed for sub-register code-gen
support for XADD.
test/CodeGen/BPF/xadd.ll updated to include sub-register code-gen tests.
A new testcase test/CodeGen/BPF/xadd_legal.ll is added to make sure the
legal case could pass on all code-gen modes. It could also test dead Def
check on GPR32. If there is no proper handling like what has been done
inside BPFMIChecking.cpp:hasLivingDefs, then this testcase will fail.
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355126 91177308-0d34-0410-b5e6-96231b3b80d8
Currently, the LLVM will print an error like
Unsupported relocation: try to compile with -O2 or above,
or check your static variable usage
if user defines more than one static variables in a single
ELF section (e.g., .bss or .data).
There is ongoing effort to support static and global
variables in libbpf and kernel. This patch removed the
assertion so user programs with static variables won't
fail compilation.
The static variable in-section offset is written to
the "imm" field of the corresponding to-be-relocated
bpf instruction. Below is an example to show how the
application (e.g., libbpf) can relate variable to relocations.
-bash-4.4$ cat g1.c
static volatile long a = 2;
static volatile int b = 3;
int test() { return a + b; }
-bash-4.4$ clang -target bpf -O2 -c g1.c
-bash-4.4$ llvm-readelf -r g1.o
Relocation section '.rel.text' at offset 0x158 contains 2 entries:
Offset Info Type Symbol's Value Symbol's Name
0000000000000000 0000000400000001 R_BPF_64_64 0000000000000000 .data
0000000000000018 0000000400000001 R_BPF_64_64 0000000000000000 .data
-bash-4.4$ llvm-readelf -s g1.o
Symbol table '.symtab' contains 6 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS g1.c
2: 0000000000000000 8 OBJECT LOCAL DEFAULT 4 a
3: 0000000000000008 4 OBJECT LOCAL DEFAULT 4 b
4: 0000000000000000 0 SECTION LOCAL DEFAULT 4
5: 0000000000000000 64 FUNC GLOBAL DEFAULT 2 test
-bash-4.4$ llvm-objdump -d g1.o
g1.o: file format ELF64-BPF
Disassembly of section .text:
0000000000000000 test:
0: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
2: 79 11 00 00 00 00 00 00 r1 = *(u64 *)(r1 + 0)
3: 18 02 00 00 08 00 00 00 00 00 00 00 00 00 00 00 r2 = 8 ll
5: 61 20 00 00 00 00 00 00 r0 = *(u32 *)(r2 + 0)
6: 0f 10 00 00 00 00 00 00 r0 += r1
7: 95 00 00 00 00 00 00 00 exit
-bash-4.4$
. from symbol table, static variable "a" is in section #4, offset 0.
. from symbol table, static variable "b" is in section #4, offset 8.
. the first relocation is against symbol #4:
4: 0000000000000000 0 SECTION LOCAL DEFAULT 4
and in-section offset 0 (see llvm-objdump result)
. the second relocation is against symbol #4:
4: 0000000000000000 0 SECTION LOCAL DEFAULT 4
and in-section offset 8 (see llvm-objdump result)
. therefore, the first relocation is for variable "a", and
the second relocation is for variable "b".
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354954 91177308-0d34-0410-b5e6-96231b3b80d8
The test case reloc-btf.ll is generated with an IR containing
spFlags introduced by https://reviews.llvm.org/rL347806.
In the case of BTF backporting, the old compiler may not
have this patch, so this test will fail during
validation.
This patch removed spFlags from IR in the test case
and used the old way for various flags.
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354409 91177308-0d34-0410-b5e6-96231b3b80d8
In IR, sometimes the following attributes for DIFile may be
generated:
filename: /home/yhs/test.c
directory: /tmp
The /tmp may represent the working directory of the compilation
process.
In such cases, since filename is with absolute path,
the directory should be ignored by BTF. The filename alone is
enough to get the source.
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352952 91177308-0d34-0410-b5e6-96231b3b80d8
In IR, sometimes the following attributes for DIFile may be
generated:
filename: /home/yhs/test.c
directory: /tmp
The /tmp may represent the working directory of the compilation
process.
In such cases, since filename is with absolute path,
the directory should be ignored by BTF. The filename alone is
enough to get the source.
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352939 91177308-0d34-0410-b5e6-96231b3b80d8
Commit f1db33c5c1 ("[BPF] Disable relocation for .BTF.ext section")
assigned relocation type R_BPF_NONE if the fixup type
is FK_Data_4 and the symbol is temporary.
The reason is we use FK_Data_4 as a fixup type
for insn offsets in .BTF.ext section.
Just checking whether the symbol is temporary is not enough.
For example, .debug_info may reference some strings whose
fixup is FK_Data_4 with a temporary symbol as well.
To truely reflect the case for .BTF.ext section,
this patch further checks that the section associateed with the symbol
must be SHF_ALLOC and SHF_EXECINSTR, i.e., in the text section.
This fixed the above-mentioned problem.
Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350637 91177308-0d34-0410-b5e6-96231b3b80d8
Currently, BPF has XADD (locked add) insn support and the
asm looks like:
lock *(u32 *)(r1 + 0) += r2
lock *(u64 *)(r1 + 0) += r2
The instruction itself does not have a return value.
At the source code level, users often use
__sync_fetch_and_add()
which eventually translates to XADD. The return value of
__sync_fetch_and_add() is supposed to be the old value
in the xadd memory location. Since BPF::XADD insn does not
support such a return value, this patch added a PreEmit
phase to check such a usage. If such an illegal usage
pattern is detected, a fatal error will be reported like
line 4: Invalid usage of the XADD return value
if compiled with -g, or
Invalid usage of the XADD return value
if compiled without -g.
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342692 91177308-0d34-0410-b5e6-96231b3b80d8
Some BPF JIT backends would want to optimize memcpy in their own
architecture specific way.
However, at the moment, there is no way for JIT backends to see memcpy
semantics in a reliable way. This is due to LLVM BPF backend is expanding
memcpy into load/store sequences and could possibly schedule them apart from
each other further. So, BPF JIT backends inside kernel can't reliably
recognize memcpy semantics by peephole BPF sequence.
This patch introduce new intrinsic expand infrastructure to memcpy.
To get stable in-order load/store sequence from memcpy, we first lower
memcpy into BPF::MEMCPY node which then expanded into in-order load/store
sequences in expandPostRAPseudo pass which will happen after instruction
scheduling. By this way, kernel JIT backends could reliably recognize
memcpy through scanning BPF sequence.
This new memcpy expand infrastructure is gated by a new option:
-bpf-expand-memcpy-in-order
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337977 91177308-0d34-0410-b5e6-96231b3b80d8
In order to set breakpoints on labels and list source code around
labels, we need collect debug information for labels, i.e., label
name, the function label belong, line number in the file, and the
address label located. In order to keep these information in LLVM
IR and to allow backend to generate debug information correctly.
We create a new kind of metadata for labels, DILabel. The format
of DILabel is
!DILabel(scope: !1, name: "foo", file: !2, line: 3)
We hope to keep debug information as much as possible even the
code is optimized. So, we create a new kind of intrinsic for label
metadata to avoid the metadata is eliminated with basic block.
The intrinsic will keep existing if we keep it from optimized out.
The format of the intrinsic is
llvm.dbg.label(metadata !1)
It has only one argument, that is the DILabel metadata. The
intrinsic will follow the label immediately. Backend could get the
label metadata through the intrinsic's parameter.
We also create DIBuilder API for labels to be used by Frontend.
Frontend could use createLabel() to allocate DILabel objects, and use
insertLabel() to insert llvm.dbg.label intrinsic in LLVM IR.
Differential Revision: https://reviews.llvm.org/D45024
Patch by Hsiangkai Wang.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331841 91177308-0d34-0410-b5e6-96231b3b80d8
Commit 37962a331c ("bpf: Improve expanding logic in LowerSELECT_CC")
intended to improve code quality for certain jmp conditions. The
commit, however, has a couple of issues:
(1). In code, just swap is not enough, ConditionalCode CC
should also be swapped, otherwise incorrect code will
be generated.
(2). The ConditionalCode swap should be subject to
getHasJmpExt(). If getHasJmpExt() is False, certain
conditional codes will not be supported and swap
may generate incorrect code.
The original goal for this patch is to optimize jmp operations
which does not have JmpExt turned on. If JmpExt is on,
better code could be generated. For example, the test
select_ri.ll is introduced to demonstrate the optimization.
The same result can be achieved with -mcpu=v2 flag.
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329043 91177308-0d34-0410-b5e6-96231b3b80d8
The current zero extension elimination was restricted to operands of
comparison. It actually could be extended to more cases.
For example:
int *inc_p (int *p, unsigned a)
{
return p + a;
}
'a' will be promoted to i64 during addition, and the zero extension could
be eliminated as well.
For the elimination optimization, it should be much better to start
recognizing the candidate sequence from the SRL instruction instead of J*
instructions.
This patch makes it an generic zero extension elimination pass instead of
one restricted with comparison.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327367 91177308-0d34-0410-b5e6-96231b3b80d8
There is a mistake in current code that we "break" out the optimization
when the first operand of J*_RR doesn't qualify the elimination. This
caused some elimination opportunities missed, for example the one in the
testcase.
The code should just fall through to handle the second operand.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327366 91177308-0d34-0410-b5e6-96231b3b80d8
The current subregister definition check stops after the MOV_32_64
instruction.
This means we are thinking all the following instruction sequences
are safe to be eliminated:
MOV_32_64 rB, wA
SLL_ri rB, rB, 32
SRL_ri rB, rB, 32
However, this is *not* true. The source subregister wA of MOV_32_64 could
come from a implicit truncation of 64-bit register in which case the high
bits of the 64-bit register is not zeroed, therefore we can't eliminate
above sequence.
For example, for i32_val, we shouldn't do the elimination:
long long bar ();
int foo (int b, int c)
{
unsigned int i32_val = (unsigned int) bar();
if (i32_val < 10)
return b;
else
return c;
}
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327365 91177308-0d34-0410-b5e6-96231b3b80d8
Improve the test accuracy by adding more check directives.
Shifts are expected to be eliminated for zero extension but not for signed
extension.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327364 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds some unit tests for 32-bit subregister support.
We want to make sure ALU32, subregister load/store and new peephole
optimization are truely enabled once -mattr=+alu32 specified.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325992 91177308-0d34-0410-b5e6-96231b3b80d8
In DWARF v5 the Line Number Program Header is extensible, allowing values with
new content types. In this extension a content type is added,
DW_LNCT_LLVM_source, which contains the embedded source code of the file.
Add new optional attribute for !DIFile IR metadata called source which contains
source text. Use this to output the source to the DWARF line table of code
objects. Analogously extend METADATA_FILE in Bitcode and .file directive in ASM
to support optional source.
Teach llvm-dwarfdump and llvm-objdump about the new values. Update the output
format of llvm-dwarfdump to make room for the new attribute on file_names
entries, and support embedded sources for the -source option in llvm-objdump.
Differential Revision: https://reviews.llvm.org/D42765
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325970 91177308-0d34-0410-b5e6-96231b3b80d8
The reference '&' is missing in the function parameter. If there are
back-to-back optimizations in terms of dag node list like below:
t29: i64,ch = load<LD4[bitcast (%struct.test_t* @test.t to i8*)+12](dereferenceable), zext from i32> t3, t43, undef:i64
t34: i64,ch = load<LD4[bitcast (%struct.test_t* @test.t to i8*)](dereferenceable), zext from i32> t3, t41, undef:i64
The bug will trigger a segfault for the added test case remove_truncate_5.ll:
LLVMSymbolizer: error reading file: No such file or directory
#0 0x000000000241c4d9 (llc+0x241c4d9)
#1 0x000000000241c56a (llc+0x241c56a)
#2 0x000000000241aa50 (llc+0x241aa50)
...
#22 0x0000000000fd5edf (llc+0xfd5edf)
#23 0x00007f0fe03bec05 __libc_start_main (/lib64/libc.so.6+0x21c05)
#24 0x0000000000fd3e69 (llc+0xfd3e69)
...
Segmentation fault
Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325267 91177308-0d34-0410-b5e6-96231b3b80d8
LowerSELECT_CC is not generating optimal Select_Ri pattern at the moment. It
is not guaranteed to place ConstantNode at RHS which would miss matching
Select_Ri.
A new testcase added into the existing select_ri.ll, also there is an
existing case in cmp.ll which would be improved to use Select_Ri after this
patch, it is adjusted accordingly.
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324560 91177308-0d34-0410-b5e6-96231b3b80d8
The major visible difference here is that in line-table dumps,
directory and file names are wrapped in double-quotes; previously,
directory names got single quotes and file names were not quoted at
all.
The improvement in this patch is that when a DWARF v5 line table
header has indirect strings, in a verbose dump these will all have
their section[offset] printed as well as the name itself. This
matches the format used for dumping strings in the .debug_info
section.
Differential Revision: https://reviews.llvm.org/D42802
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324270 91177308-0d34-0410-b5e6-96231b3b80d8
As commented on the existing code:
// The Reg operand should be a virtual register, which is defined
// outside the current basic block. DAG combiner has done a pretty
// good job in removing truncating inside a single basic block.
However, when the Reg operand comes from bpf_load_[byte | half | word]
intrinsics, the generic optimizer doesn't understand their results are
zero extended, so these single basic block elimination opportunities were
missed.
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322534 91177308-0d34-0410-b5e6-96231b3b80d8
Add support for 'objdump -print-imm-hex' for imm64, operand imm
and branch target. If user programs encode immediate values
as hex numbers, such an option will make it easy to correlate
asm insns with source code. This option also makes it easy
to correlate imm values with insn encoding.
There is one changed behavior in this patch. In old way, we
print the 64bit imm as u64:
O << (uint64_t)Op.getImm();
and the new way is:
O << formatImm(Op.getImm());
The formatImm is defined in llvm/MC/MCInstPrinter.h as
format_object<int64_t> formatImm(int64_t Value)
So the new way to print 64bit imm is i64 type.
If a 64bit value has the highest bit set, the old way
will print the value as a positive value and the
new way will print as a negative value. The new way
is consistent with x86_64.
For the code (see the test program):
...
if (a == 0xABCDABCDabcdabcdULL)
...
x86_64 objdump, with and without -print-imm-hex, looks like:
48 b8 cd ab cd ab cd ab cd ab movabsq $-6067004223159161907, %rax
48 b8 cd ab cd ab cd ab cd ab movabsq $-0x5432543254325433, %rax
Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321215 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Now that store-merge is only generates type-safe stores, do a second
pass just before instruction selection to allow lowered intrinsics to
be merged as well.
Reviewers: jyknight, hfinkel, RKSimon, efriedma, rnk, jmolloy
Subscribers: javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D33675
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319036 91177308-0d34-0410-b5e6-96231b3b80d8
Commit b5cbc7760a ("[bpf] allow direct and indirect calls")
allowed more than one function in the bpf program, and
commit 1143538844 ("bpf: fix a bug in trunc-op optimization")
fixed a bug in trunc-op optimization which only showed up
with more than one function in the bpf program.
This patch added a test case for trunc-op optimization
for bpf programs with two functions. Reverting commit
"bpf: fix a bug in trunc-op optimization" will cause
failure for this test case.
Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318695 91177308-0d34-0410-b5e6-96231b3b80d8
kernel verifier is becoming smarter and soon will support
direct and indirect function calls.
Remove obsolete error from BPF backend.
Make call to use PCRel_4 fixup.
'bpf to bpf' calls are distinguished from 'bpf to kernel' calls
by insn->src_reg == BPF_PSEUDO_CALL == 1 which is used as relocation
indicator similar to ld_imm64->src_reg == BPF_PSEUDO_MAP_FD == 1
The actual 'call' instruction remains the same for both
'bpf to kernel' and 'bpf to bpf' calls.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318614 91177308-0d34-0410-b5e6-96231b3b80d8
We came across an llvm bug when compiling some testcases that 64-bit
immediates are silently truncated into 32-bit and then packed into
BPF_JMP | BPF_K encoding. This caused comparison with wrong value.
This bug looks to be introduced by r308080. The Select_Ri pattern is
supposed to be lowered into J*_Ri while the latter only support 32-bit
immediate encoding, therefore Select_Ri should have similar immediate
predicate check as what J*_Ri are doing.
Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315889 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds new insn, "reg = be16/be32/be64 reg",
for bswap to little endian for big-endian target (bpfeb).
It also adds new insn for negation "reg = -reg".
Currently, for source code, e.g.,
b = -a
LLVM still prefers to generate:
b = 0 - a
But "reg = -reg" format can be used in assembly code.
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314376 91177308-0d34-0410-b5e6-96231b3b80d8