Commit Graph

1172 Commits

Author SHA1 Message Date
Alexei Starovoitov
035fd6aca0 libbpf: Cleanup the layering between CORE and bpf_program.
CO-RE processing functions don't need to know 'struct bpf_program' details.
Cleanup the layering to eventually be able to move CO-RE logic into a separate file.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210721000822.40958-2-alexei.starovoitov@gmail.com
2021-08-04 18:27:12 -07:00
Evgeniy Litvinenko
e44c8486c6 libbpf: Add bpf_map__pin_path function
Add bpf_map__pin_path, so that the inconsistently named
bpf_map__get_pin_path can be deprecated later. This is part of the
effort towards libbpf v1.0: https://github.com/libbpf/libbpf/issues/307

Also, add a selftest for the new function.

Signed-off-by: Evgeniy Litvinenko <evgeniyl@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210723221511.803683-1-evgeniyl@fb.com
2021-08-04 18:27:12 -07:00
Jiri Olsa
14f5433b2e libbpf: Export bpf_program__attach_kprobe_opts function
Export bpf_program__attach_kprobe_opts as a public API.

Rename bpf_program_attach_kprobe_opts to bpf_kprobe_opts and turn it into OPTS
struct.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20210721215810.889975-4-jolsa@kernel.org
2021-08-04 18:27:12 -07:00
Jiri Olsa
d7a2de020b libbpf: Allow decimal offset for kprobes
Allow to specify decimal offset in SEC macro, like:
  SEC("kprobe/bpf_fentry_test7+5")

Add selftest for that.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20210721215810.889975-3-jolsa@kernel.org
2021-08-04 18:27:12 -07:00
Jiri Olsa
bb92e7ab4d libbpf: Fix func leak in attach_kprobe
Add missing free() for func pointer in attach_kprobe function.

Fixes: a2488b5f483f ("libbpf: Allow specification of "kprobe/function+offset"")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20210721215810.889975-2-jolsa@kernel.org
2021-08-04 18:27:12 -07:00
Yucong Sun
3f22535d56 Fix text grouping issue on github actions
github action grouping is broken because we were outputing "::endgroup" where
it needs "::endgroup::". This patch also added some addtional grouping around
contianer setup phase, making output easier to read.
2021-08-04 12:18:41 -07:00
Andrii Nakryiko
f8ab8bde8e ci: bump nightly Clang/LLVM version to 14
CI started to fail with missing clang-13 warning. Bump the version to 14.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-03 12:44:00 -07:00
Sergei Iudin
506a544834 ci: improve CI UX a little by setting names a hiding debug 2021-07-29 17:35:23 -07:00
Andrii Nakryiko
ec2c78c034
README: remove Travis CI build badge
We stop using Travis CI from now on.
2021-07-29 14:22:49 -07:00
Andrii Nakryiko
030ff87857 ci: fix log folding in Github Actions
Backport kernel-patches fix for the same issue ([0] and [1]).

  [0] 17c596fe8b
  [1] a38de685c7

Cc: Sergei Iudin <siudin@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-07-27 19:26:06 -07:00
Andrii Nakryiko
0db006d28e ci: add cancel-in-progress behavior for main test CI workflow
Make sure that only the latest enqueued workflow is running for any given PR
and/or branch.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-07-22 23:22:42 -07:00
Andrii Nakryiko
6e6f18ac5d
ci: add Github Actions status badge
Add badge to show the Github Actions test.yml workflow status.
2021-07-21 11:20:34 -07:00
Andrii Nakryiko
deca7932c3 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   08f71a1e39a1f07a464ac782d9b612d6a74c7015
Checkpoint bpf-next commit: 807b8f0e24e6004984094e1bcbbd2b297011a085
Baseline bpf commit:        d6371c76e20d7d3f61b05fd67b596af4d14a8886
Checkpoint bpf commit:      d6371c76e20d7d3f61b05fd67b596af4d14a8886

Alan Maguire (2):
  libbpf: Avoid use of __int128 in typed dump display
  libbpf: Propagate errors when retrieving enum value for typed data
    display

 src/btf_dump.c | 103 ++++++++++++++++++++++++++++++++-----------------
 1 file changed, 68 insertions(+), 35 deletions(-)

--
2.30.2
2021-07-21 11:16:01 -07:00
Alan Maguire
ebcae72279 libbpf: Propagate errors when retrieving enum value for typed data display
When retrieving the enum value associated with typed data during
"is data zero?" checking in btf_dump_type_data_check_zero(), the
return value of btf_dump_get_enum_value() is not passed to the caller
if the function returns a non-zero (error) value.  Currently, 0
is returned if the function returns an error.  We should instead
propagate the error to the caller.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1626770993-11073-4-git-send-email-alan.maguire@oracle.com
2021-07-21 11:16:01 -07:00
Alan Maguire
64362b8896 libbpf: Avoid use of __int128 in typed dump display
__int128 is not supported for some 32-bit platforms (arm and i386).
__int128 was used in carrying out computations on bitfields which
aid display, but the same calculations could be done with __u64
with the small effect of not supporting 128-bit bitfields.

With these changes, a big-endian issue with casting 128-bit integers
to 64-bit for enum bitfields is solved also, as we now use 64-bit
integers for bitfield calculations.

Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1626770993-11073-2-git-send-email-alan.maguire@oracle.com
2021-07-21 11:16:01 -07:00
Michal Suchanek
df01b246df README: State the source origin more prominently.
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
2021-07-20 14:45:49 -07:00
Michal Suchanek
6eb5e25905 Makefile: Default LIBSUBDIR to lib64 on 64bit architectures.
commit a82a66e ("Extend build and add install rules to Makefile") adds
special handling for LIBSUBDIR on x86_64. Expand this to all
architectures with 64 in name which suggests a 32bit variant exists, and
s390x which is 64bit extension of s390.

Fixes: #337
Fixes: a82a66e ("Extend build and add install rules to Makefile")
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
2021-07-20 14:45:49 -07:00
Andrii Nakryiko
a603965dad sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   068dfc655b666b54e08fc3d7108b309d7f906d34
Checkpoint bpf-next commit: 08f71a1e39a1f07a464ac782d9b612d6a74c7015
Baseline bpf commit:        a6c39de76d709f30982d4b80a9b9537e1d388858
Checkpoint bpf commit:      d6371c76e20d7d3f61b05fd67b596af4d14a8886

Alan Maguire (3):
  libbpf: Clarify/fix unaligned data issues for btf typed dump
  libbpf: Fix compilation errors on ppc64le for btf dump typed data
  libbpf: Btf typed dump does not need to allocate dump data

Martynas Pumputis (1):
  libbpf: Fix removal of inner map in bpf_object__create_map

 src/btf_dump.c | 41 ++++++++++++++++++++++++++++++-----------
 src/libbpf.c   | 10 ++++------
 2 files changed, 34 insertions(+), 17 deletions(-)

--
2.30.2
2021-07-19 17:45:10 -07:00
Martynas Pumputis
f61c3b318b libbpf: Fix removal of inner map in bpf_object__create_map
If creating an outer map of a BTF-defined map-in-map fails (via
bpf_object__create_map()), then the previously created its inner map
won't be destroyed.

Fix this by ensuring that the destroy routines are not bypassed in the
case of a failure.

Fixes: 646f02ffdd49c ("libbpf: Add BTF-defined map-in-map support")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210719173838.423148-2-m@lambda.lt
2021-07-19 17:45:10 -07:00
Alan Maguire
8235032464 libbpf: Btf typed dump does not need to allocate dump data
By using the stack for this small structure, we avoid the need
for freeing memory in error paths.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1626475617-25984-4-git-send-email-alan.maguire@oracle.com
2021-07-19 17:45:10 -07:00
Alan Maguire
dc2c53b7f6 libbpf: Fix compilation errors on ppc64le for btf dump typed data
__s64 can be defined as either long or long long, depending on the
architecture. On ppc64le it's defined as long, giving this error:

 In file included from btf_dump.c:22:
btf_dump.c: In function 'btf_dump_type_data_check_overflow':
libbpf_internal.h:111:22: error: format '%lld' expects argument of
type 'long long int', but argument 3 has type '__s64' {aka 'long int'}
[-Werror=format=]
  111 |  libbpf_print(level, "libbpf: " fmt, ##__VA_ARGS__); \
      |                      ^~~~~~~~~~
libbpf_internal.h:114:27: note: in expansion of macro '__pr'
  114 | #define pr_warn(fmt, ...) __pr(LIBBPF_WARN, fmt, ##__VA_ARGS__)
      |                           ^~~~
btf_dump.c:1992:3: note: in expansion of macro 'pr_warn'
 1992 |   pr_warn("unexpected size [%lld] for id [%u]\n",
      |   ^~~~~~~
btf_dump.c:1992:32: note: format string is defined here
 1992 |   pr_warn("unexpected size [%lld] for id [%u]\n",
      |                             ~~~^
      |                                |
      |                                long long int
      |                             %ld

Cast to size_t and use %zu instead.

Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1626475617-25984-3-git-send-email-alan.maguire@oracle.com
2021-07-19 17:45:10 -07:00
Alan Maguire
fb3809e940 libbpf: Clarify/fix unaligned data issues for btf typed dump
If data is packed, data structures can store it outside of usual
boundaries.  For example a 4-byte int can be stored on a unaligned
boundary in a case like this:

struct s {
	char f1;
	int f2;
} __attribute((packed));

...the int is stored at an offset of one byte.  Some platforms have
problems dereferencing data that is not aligned with its size, and
code exists to handle most cases of this for BTF typed data display.
However pointer display was missed, and a simple function to test if
"ptr_is_aligned(data, data_sz)" would help clarify this code.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1626475617-25984-2-git-send-email-alan.maguire@oracle.com
2021-07-19 17:45:10 -07:00
Fejes Ferenc
74d3571880 Update README.md 2021-07-19 16:43:05 -07:00
Fejes Ferenc
be570b29c1 Update README.md
Manjaro is a popular and friendly Arch based distro. Recently they also enabled the BTF support: https://forum.manjaro.org/t/co-re-support-in-kernel/46134/19

I can confirm that:
[user@pc ~]$ uname -a
Linux pc 5.12.16-1-MANJARO #1 SMP PREEMPT Sun Jul 11 13:23:34 UTC 2021 x86_64 GNU/Linux
[user@pc ~]$ ls -la /sys/kernel/btf/vmlinux
-r--r--r-- 1 root root 4226769 jul   17 15.27 /sys/kernel/btf/vmlinux
2021-07-19 16:43:05 -07:00
Sergei Iudin
9aa71e1040 Run apt-get update as a first step for GH actions
otherwise container may contain stall repo metadata cached
2021-07-19 14:57:35 -07:00
Andrii Nakryiko
b3ffd258fc vmtest: blacklist 5.5 selftests
Add few new selftests to blacklist. They can't succeed on 5.5.
Also temporarily remove btf_dump for 4.9 due to newly added data dumping
subtests.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-07-19 11:36:37 -07:00
Andrii Nakryiko
4447ac82d4 ci: temporary work-around to get green CI builds back
Temporary disable tc_bpf tests that seem to have regressed.

Temporary and artificially bump pahole version from 1.21 to 1.22 to get
per-CPU BTF data built.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-07-19 11:36:37 -07:00
Andrii Nakryiko
8fa229c455 ci: disable -Wstringop-truncation for GCC10 configurations as well
We used to have it disabled for GCC8, but now GCC10 is false-report same
warnings, so disable stringop-truncation warnigs for GCC10 as well.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-07-19 11:36:37 -07:00
Andrii Nakryiko
8a670b7422 vmtest: regenerate latest vmlinux.h
This is necessary to make runqslower compile with task->__state field on old
kernels, for which we don't have an actual vmlinux.h.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-07-19 11:36:37 -07:00
Andrii Nakryiko
21f90f61b0 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   f42cfb469f9b4a1c002a03cce3d9329376800a6f
Checkpoint bpf-next commit: 068dfc655b666b54e08fc3d7108b309d7f906d34
Baseline bpf commit:        61e8aeda9398925f8c6fc290585bdd9727d154c4
Checkpoint bpf commit:      a6c39de76d709f30982d4b80a9b9537e1d388858

Alan Maguire (2):
  libbpf: Allow specification of "kprobe/function+offset"
  libbpf: BTF dumper support for typed data

Alexei Starovoitov (2):
  bpf: Sync tools/include/uapi/linux/bpf.h
  bpf: Introduce bpf timers.

Jiri Olsa (3):
  bpf: Add bpf_get_func_ip helper for tracing programs
  bpf: Add bpf_get_func_ip helper for kprobe programs
  libbpf: Add bpf_program__attach_kprobe_opts function

Jonathan Edwards (1):
  libbpf: Add extra BPF_PROG_TYPE check to bpf_object__probe_loading

Kumar Kartikeya Dwivedi (2):
  libbpf: Add request buffer type for netlink messages
  libbpf: Switch to void * casting in netlink helpers

Kuniyuki Iwashima (1):
  bpf: Fix a typo of reuseport map in bpf.h.

Martynas Pumputis (1):
  libbpf: Fix reuse of pinned map on older kernel

Shuyi Cheng (2):
  libbpf: Introduce 'btf_custom_path' to 'bpf_obj_open_opts'
  libbpf: Fix the possible memory leak on error

Toke Høiland-Jørgensen (1):
  libbpf: Restore errno return for functions that were already returning
    it

 include/uapi/linux/bpf.h |  85 +++-
 src/btf.h                |  19 +
 src/btf_dump.c           | 819 ++++++++++++++++++++++++++++++++++++++-
 src/libbpf.c             | 146 ++++++-
 src/libbpf.h             |   9 +-
 src/libbpf.map           |   1 +
 src/netlink.c            | 115 +++---
 src/nlattr.c             |   2 +-
 src/nlattr.h             |  38 +-
 9 files changed, 1117 insertions(+), 117 deletions(-)

--
2.30.2
2021-07-16 17:05:44 -07:00
Andrii Nakryiko
c8b1d14b03 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-07-16 17:05:44 -07:00
Alan Maguire
c0b2ceba1d libbpf: BTF dumper support for typed data
Add a BTF dumper for typed data, so that the user can dump a typed
version of the data provided.

The API is

int btf_dump__dump_type_data(struct btf_dump *d, __u32 id,
                             void *data, size_t data_sz,
                             const struct btf_dump_type_data_opts *opts);

...where the id is the BTF id of the data pointed to by the "void *"
argument; for example the BTF id of "struct sk_buff" for a
"struct skb *" data pointer.  Options supported are

 - a starting indent level (indent_lvl)
 - a user-specified indent string which will be printed once per
   indent level; if NULL, tab is chosen but any string <= 32 chars
   can be provided.
 - a set of boolean options to control dump display, similar to those
   used for BPF helper bpf_snprintf_btf().  Options are
        - compact : omit newlines and other indentation
        - skip_names: omit member names
        - emit_zeroes: show zero-value members

Default output format is identical to that dumped by bpf_snprintf_btf(),
for example a "struct sk_buff" representation would look like this:

struct sk_buff){
	(union){
		(struct){
			.next = (struct sk_buff *)0xffffffffffffffff,
			.prev = (struct sk_buff *)0xffffffffffffffff,
		(union){
			.dev = (struct net_device *)0xffffffffffffffff,
			.dev_scratch = (long unsigned int)18446744073709551615,
		},
	},
...

If the data structure is larger than the *data_sz*
number of bytes that are available in *data*, as much
of the data as possible will be dumped and -E2BIG will
be returned.  This is useful as tracers will sometimes
not be able to capture all of the data associated with
a type; for example a "struct task_struct" is ~16k.
Being able to specify that only a subset is available is
important for such cases.  On success, the amount of data
dumped is returned.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1626362126-27775-2-git-send-email-alan.maguire@oracle.com
2021-07-16 17:05:44 -07:00
Shuyi Cheng
bd25fc7df1 libbpf: Fix the possible memory leak on error
If the strdup() fails then we need to call bpf_object__close(obj) to
avoid a resource leak.

Fixes: 166750bc1dd2 ("libbpf: Support libbpf-provided extern variables")
Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1626180159-112996-3-git-send-email-chengshuyi@linux.alibaba.com
2021-07-16 17:05:44 -07:00
Shuyi Cheng
4920031c88 libbpf: Introduce 'btf_custom_path' to 'bpf_obj_open_opts'
btf_custom_path allows developers to load custom BTF which libbpf will
subsequently use for CO-RE relocation instead of vmlinux BTF.

Having btf_custom_path in bpf_object_open_opts one can directly use the
skeleton's <objname>_bpf__open_opts() API to pass in the btf_custom_path
parameter, as opposed to using bpf_object__load_xattr() which is slated to be
deprecated ([0]).

This work continues previous work started by another developer ([1]).

  [0] https://lore.kernel.org/bpf/CAEf4BzbJZLjNoiK8_VfeVg_Vrg=9iYFv+po-38SMe=UzwDKJ=Q@mail.gmail.com/#t
  [1] https://yhbt.net/lore/all/CAEf4Bzbgw49w2PtowsrzKQNcxD4fZRE6AKByX-5-dMo-+oWHHA@mail.gmail.com/

Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1626180159-112996-2-git-send-email-chengshuyi@linux.alibaba.com
2021-07-16 17:05:44 -07:00
Alan Maguire
8fa50e86c1 libbpf: Allow specification of "kprobe/function+offset"
kprobes can be placed on most instructions in a function, not
just entry, and ftrace and bpftrace support the function+offset
notification for probe placement.  Adding parsing of func_name
into func+offset to bpf_program__attach_kprobe() allows the
user to specify

SEC("kprobe/bpf_fentry_test5+0x6")

...for example, and the offset can be passed to perf_event_open_probe()
to support kprobe attachment.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210714094400.396467-8-jolsa@kernel.org
2021-07-16 17:05:44 -07:00
Jiri Olsa
330a158982 libbpf: Add bpf_program__attach_kprobe_opts function
Adding bpf_program__attach_kprobe_opts that does the same
as bpf_program__attach_kprobe, but takes opts argument.

Currently opts struct holds just retprobe bool, but we will
add new field in following patch.

The function is not exported, so there's no need to add
size to the struct bpf_program_attach_kprobe_opts for now.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210714094400.396467-7-jolsa@kernel.org
2021-07-16 17:05:44 -07:00
Jiri Olsa
a524ae0bbf bpf: Add bpf_get_func_ip helper for kprobe programs
Adding bpf_get_func_ip helper for BPF_PROG_TYPE_KPROBE programs,
so it's now possible to call bpf_get_func_ip from both kprobe and
kretprobe programs.

Taking the caller's address from 'struct kprobe::addr', which is
defined for both kprobe and kretprobe.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/bpf/20210714094400.396467-5-jolsa@kernel.org
2021-07-16 17:05:44 -07:00
Jiri Olsa
97e2a9c9a1 bpf: Add bpf_get_func_ip helper for tracing programs
Adding bpf_get_func_ip helper for BPF_PROG_TYPE_TRACING programs,
specifically for all trampoline attach types.

The trampoline's caller IP address is stored in (ctx - 8) address.
so there's no reason to actually call the helper, but rather fixup
the call instruction and return [ctx - 8] value directly.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210714094400.396467-4-jolsa@kernel.org
2021-07-16 17:05:44 -07:00
Alexei Starovoitov
bef77595ca bpf: Introduce bpf timers.
Introduce 'struct bpf_timer { __u64 :64; __u64 :64; };' that can be embedded
in hash/array/lru maps as a regular field and helpers to operate on it:

// Initialize the timer.
// First 4 bits of 'flags' specify clockid.
// Only CLOCK_MONOTONIC, CLOCK_REALTIME, CLOCK_BOOTTIME are allowed.
long bpf_timer_init(struct bpf_timer *timer, struct bpf_map *map, int flags);

// Configure the timer to call 'callback_fn' static function.
long bpf_timer_set_callback(struct bpf_timer *timer, void *callback_fn);

// Arm the timer to expire 'nsec' nanoseconds from the current time.
long bpf_timer_start(struct bpf_timer *timer, u64 nsec, u64 flags);

// Cancel the timer and wait for callback_fn to finish if it was running.
long bpf_timer_cancel(struct bpf_timer *timer);

Here is how BPF program might look like:
struct map_elem {
    int counter;
    struct bpf_timer timer;
};

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 1000);
    __type(key, int);
    __type(value, struct map_elem);
} hmap SEC(".maps");

static int timer_cb(void *map, int *key, struct map_elem *val);
/* val points to particular map element that contains bpf_timer. */

SEC("fentry/bpf_fentry_test1")
int BPF_PROG(test1, int a)
{
    struct map_elem *val;
    int key = 0;

    val = bpf_map_lookup_elem(&hmap, &key);
    if (val) {
        bpf_timer_init(&val->timer, &hmap, CLOCK_REALTIME);
        bpf_timer_set_callback(&val->timer, timer_cb);
        bpf_timer_start(&val->timer, 1000 /* call timer_cb2 in 1 usec */, 0);
    }
}

This patch adds helper implementations that rely on hrtimers
to call bpf functions as timers expire.
The following patches add necessary safety checks.

Only programs with CAP_BPF are allowed to use bpf_timer.

The amount of timers used by the program is constrained by
the memcg recorded at map creation time.

The bpf_timer_init() helper needs explicit 'map' argument because inner maps
are dynamic and not known at load time. While the bpf_timer_set_callback() is
receiving hidden 'aux->prog' argument supplied by the verifier.

The prog pointer is needed to do refcnting of bpf program to make sure that
program doesn't get freed while the timer is armed. This approach relies on
"user refcnt" scheme used in prog_array that stores bpf programs for
bpf_tail_call. The bpf_timer_set_callback() will increment the prog refcnt which is
paired with bpf_timer_cancel() that will drop the prog refcnt. The
ops->map_release_uref is responsible for cancelling the timers and dropping
prog refcnt when user space reference to a map reaches zero.
This uref approach is done to make sure that Ctrl-C of user space process will
not leave timers running forever unless the user space explicitly pinned a map
that contained timers in bpffs.

bpf_timer_init() and bpf_timer_set_callback() will return -EPERM if map doesn't
have user references (is not held by open file descriptor from user space and
not pinned in bpffs).

The bpf_map_delete_elem() and bpf_map_update_elem() operations cancel
and free the timer if given map element had it allocated.
"bpftool map update" command can be used to cancel timers.

The 'struct bpf_timer' is explicitly __attribute__((aligned(8))) because
'__u64 :64' has 1 byte alignment of 8 byte padding.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210715005417.78572-4-alexei.starovoitov@gmail.com
2021-07-16 17:05:44 -07:00
Kuniyuki Iwashima
6f7839f477 bpf: Fix a typo of reuseport map in bpf.h.
Fix s/BPF_MAP_TYPE_REUSEPORT_ARRAY/BPF_MAP_TYPE_REUSEPORT_SOCKARRAY/ typo
in bpf.h.

Fixes: 2dbb9b9e6df6 ("bpf: Introduce BPF_PROG_TYPE_SK_REUSEPORT")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210714124317.67526-1-kuniyu@amazon.co.jp
2021-07-16 17:05:44 -07:00
Alexei Starovoitov
90aba5e582 bpf: Sync tools/include/uapi/linux/bpf.h
Commit 47316f4a3053 missed updating tools/.../bpf.h.
Sync it.

Fixes: 47316f4a3053 ("bpf: Support input xdp_md context in BPF_PROG_TEST_RUN")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-07-16 17:05:44 -07:00
Martynas Pumputis
4dc3aeb072 libbpf: Fix reuse of pinned map on older kernel
When loading a BPF program with a pinned map, the loader checks whether
the pinned map can be reused, i.e. their properties match. To derive
such of the pinned map, the loader invokes BPF_OBJ_GET_INFO_BY_FD and
then does the comparison.

Unfortunately, on < 4.12 kernels the BPF_OBJ_GET_INFO_BY_FD is not
available, so loading the program fails with the following error:

	libbpf: failed to get map info for map FD 5: Invalid argument
	libbpf: couldn't reuse pinned map at
		'/sys/fs/bpf/tc/globals/cilium_call_policy': parameter
		mismatch"
	libbpf: map 'cilium_call_policy': error reusing pinned map
	libbpf: map 'cilium_call_policy': failed to create:
		Invalid argument(-22)
	libbpf: failed to load object 'bpf_overlay.o'

To fix this, fallback to derivation of the map properties via
/proc/$PID/fdinfo/$MAP_FD if BPF_OBJ_GET_INFO_BY_FD fails with EINVAL,
which can be used as an indicator that the kernel doesn't support
the latter.

Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210712125552.58705-1-m@lambda.lt
2021-07-16 17:05:44 -07:00
Toke Høiland-Jørgensen
4ce0551ee5 libbpf: Restore errno return for functions that were already returning it
The update to streamline libbpf error reporting intended to change all
functions to return the errno as a negative return value if
LIBBPF_STRICT_DIRECT_ERRS is set. However, if the flag is *not* set, the
return value changes for the two functions that were already returning a
negative errno unconditionally: bpf_link__unpin() and perf_buffer__poll().

This is a user-visible API change that breaks applications; so let's revert
these two functions back to unconditionally returning a negative errno
value.

Fixes: e9fc3ce99b34 ("libbpf: Streamline error reporting for high-level APIs")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210706122355.236082-1-toke@redhat.com
2021-07-16 17:05:44 -07:00
Kumar Kartikeya Dwivedi
f8411901c4 libbpf: Switch to void * casting in netlink helpers
Netlink helpers I added in 8bbb77b7c7a2 ("libbpf: Add various netlink
helpers") used char * casts everywhere, and there were a few more that
existed from before.

Convert all of them to void * cast, as it is treated equivalently by
clang/gcc for the purposes of pointer arithmetic and to follow the
convention elsewhere in the kernel/libbpf.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210619041454.417577-2-memxor@gmail.com
2021-07-16 17:05:44 -07:00
Kumar Kartikeya Dwivedi
9ff2b76693 libbpf: Add request buffer type for netlink messages
Coverity complains about OOB writes to nlmsghdr. There is no OOB as we
write to the trailing buffer, but static analyzers and compilers may
rightfully be confused as the nlmsghdr pointer has subobject provenance
(and hence subobject bounds).

Fix this by using an explicit request structure containing the nlmsghdr,
struct tcmsg/ifinfomsg, and attribute buffer.

Also switch nh_tail (renamed to req_tail) to cast req * to char * so
that it can be understood as arithmetic on pointer to the representation
array (hence having same bound as request structure), which should
further appease analyzers.

As a bonus, callers don't have to pass sizeof(req) all the time now, as
size is implicitly obtained using the pointer. While at it, also reduce
the size of attribute buffer to 128 bytes (132 for ifinfomsg using
functions due to the padding).

Summary of problem:

  Even though C standard allows interconvertibility of pointer to first
  member and pointer to struct, for the purposes of alias analysis it
  would still consider the first as having pointer value "pointer to T"
  where T is type of first member hence having subobject bounds,
  allowing analyzers within reason to complain when object is accessed
  beyond the size of pointed to object.

  The only exception to this rule may be when a char * is formed to a
  member subobject. It is not possible for the compiler to be able to
  tell the intent of the programmer that it is a pointer to member
  object or the underlying representation array of the containing
  object, so such diagnosis is suppressed.

Fixes: 715c5ce454a6 ("libbpf: Add low level TC-BPF management API")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210619041454.417577-1-memxor@gmail.com
2021-07-16 17:05:44 -07:00
Jonathan Edwards
df023f5cfc libbpf: Add extra BPF_PROG_TYPE check to bpf_object__probe_loading
eBPF has been backported for RHEL 7 w/ kernel 3.10-940+ [0]. However only
the following program types are supported [1]:

  BPF_PROG_TYPE_KPROBE
  BPF_PROG_TYPE_TRACEPOINT
  BPF_PROG_TYPE_PERF_EVENT

For libbpf this causes an EINVAL return during the bpf_object__probe_loading
call which only checks to see if programs of type BPF_PROG_TYPE_SOCKET_FILTER
can load.

The following will try BPF_PROG_TYPE_TRACEPOINT as a fallback attempt before
erroring out. BPF_PROG_TYPE_KPROBE was not a good candidate because on some
kernels it requires knowledge of the LINUX_VERSION_CODE.

  [0] https://www.redhat.com/en/blog/introduction-ebpf-red-hat-enterprise-linux-7
  [1] https://access.redhat.com/articles/3550581

Signed-off-by: Jonathan Edwards <jonathan.edwards@165gc.onmicrosoft.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210619151007.GA6963@165gc.onmicrosoft.com
2021-07-16 17:05:44 -07:00
Andrii Nakryiko
ae62c159ec include: initial sync of pkt_cls.h and pkt_sched.h
Add pkt_cls.h and pkt_sched.h to include/uapi/linux.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-07-16 14:22:07 -07:00
Yonghong Song
8bf016110e sync uapi headers linux/pkt_cls.h and linux/pkt_sched.h
Let us sync linux/{pkt_cls.h,pkt_sched.h} to libbpf repo.
Otherwise, on ubuntu 16.04, system headers will be picked up
and this will result in compilation error like:
  .../netlink.c:416:23: error: ‘TC_H_CLSACT’ undeclared (first use in this function)
     *parent = TC_H_MAKE(TC_H_CLSACT,
                         ^
  .../netlink.c:418:9: error: ‘TC_H_MIN_INGRESS’ undeclared (first use in this function)
           TC_H_MIN_INGRESS : TC_H_MIN_EGRESS);
           ^
  .../netlink.c:418:28: error: ‘TC_H_MIN_EGRESS’ undeclared (first use in this function)
           TC_H_MIN_INGRESS : TC_H_MIN_EGRESS);
                              ^
  .../netlink.c: In function ‘__get_tc_info’:
  .../netlink.c:522:11: error: ‘TCA_BPF_ID’ undeclared (first use in this function)
    if (!tbb[TCA_BPF_ID])
             ^

Signed-off-by: Yonghong Song <yhs@fb.com>
2021-07-12 14:01:21 -07:00
Yucong Sun
d3e4039a0a create ondemand vmtest workflow 2021-07-09 14:09:51 -07:00
Jussi Mäki
dd34504b43 vmtest: Set CONFIG_BONDING=y in latest.config
This is preparation for the XDP bonding patch set [1] to avoid having to mangle the kernel configuration from vmtest.sh.
 
[1]: https://lore.kernel.org/bpf/202106221509.kwNvAAZg-lkp@intel.com/T/#m4635dc0003944f38a54059b11147ab46abeffa13

Signed-off-by: Jussi Maki <joamaki@gmail.com>
2021-07-08 15:18:32 -07:00