Troubleshooting eBPF

This document should help end users troubleshoot their eBPF programs, with a primary focus on programs under the kernel's samples/bpf directory.

Memory ulimits

eBPF maps use locked memory, and the default limit for it is very low. Your program likely needs to raise the RLIMIT_MEMLOCK resource limit; see the setrlimit(2) system call.

The bpf_create_map call will return errno EPERM (Operation not permitted) when the RLIMIT_MEMLOCK memory size limit is exceeded.
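For programs launched from a shell, the limit can be inspected and raised with the ulimit builtin, which is the shell's interface to RLIMIT_MEMLOCK. The following is a minimal sketch, assuming a bash-compatible shell; the function names are illustrative and not part of samples/bpf.

```shell
# Print the soft and hard locked-memory limits (in KiB, or "unlimited").
show_memlock_limits() {
    echo "soft=$(ulimit -S -l) hard=$(ulimit -H -l)"
}

# Raise the soft limit to the hard limit for this shell and its children.
# Raising the hard limit itself normally requires root privileges.
raise_memlock_soft_limit() {
    ulimit -S -l "$(ulimit -H -l)"
}

show_memlock_limits
```

Call raise_memlock_soft_limit in the same shell before starting the loader program, since the limit is inherited by child processes.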

Enable bpf JIT

Symptom: you are not seeing the expected performance, and perf top shows __bpf_prog_run() as the top CPU consumer.

Did you remember to enable JIT compilation of the BPF code? For example:

$ sysctl net/core/bpf_jit_enable=1
net.core.bpf_jit_enable = 1
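To check the current setting before changing it, the value can be read back and interpreted. This is a small sketch; the helper name describe_bpf_jit is hypothetical. The meanings of 1 and 2 follow this document, and 0 is assumed to mean the JIT is disabled.

```shell
# Describe a bpf_jit_enable value (0, 1 or 2).
describe_bpf_jit() {
    case "$1" in
        0) echo "JIT disabled" ;;
        1) echo "JIT enabled" ;;
        2) echo "JIT enabled, with debug dump of the generated code" ;;
        *) echo "unknown value: $1" ;;
    esac
}

# procfs location of the sysctl; it may be absent on kernels
# built without BPF JIT support.
jit_file=/proc/sys/net/core/bpf_jit_enable
if [ -r "$jit_file" ]; then
    describe_bpf_jit "$(cat "$jit_file")"
else
    echo "$jit_file not readable"
fi
```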

Note that the kernel implements JIT compilers for both eBPF and cBPF (classic BPF), per architecture. You can list the cBPF and eBPF JITs supported by a kernel source tree via:

$ git grep BPF_JIT | grep select
arch/arm/Kconfig:      select HAVE_CBPF_JIT
arch/arm64/Kconfig:    select HAVE_EBPF_JIT
arch/mips/Kconfig:     select HAVE_CBPF_JIT if !CPU_MICROMIPS
arch/powerpc/Kconfig:  select HAVE_CBPF_JIT                    if !PPC64
arch/powerpc/Kconfig:  select HAVE_EBPF_JIT                    if PPC64
arch/s390/Kconfig:     select HAVE_EBPF_JIT if PACK_STACK && HAVE_MARCH_Z196_FEATURES
arch/sparc/Kconfig:    select HAVE_CBPF_JIT if SPARC32
arch/sparc/Kconfig:    select HAVE_EBPF_JIT if SPARC64
arch/x86/Kconfig:      select HAVE_EBPF_JIT                    if X86_64

Also see Cilium JIT section and BPF sysctl section.

ELF binary

The binary containing the eBPF program, generated by the LLVM compiler, is a normal ELF binary. For samples/bpf/ this is the file named xxx_kern.o. You can inspect this ELF file with tools like readelf or llvm-objdump.

$ llvm-objdump -h xdp_ddos01_blacklist_kern.o

xdp_ddos01_blacklist_kern.o:   file format ELF64-unknown

Sections:
Idx Name          Size      Address          Type
 0               00000000 0000000000000000
 1 .strtab       00000072 0000000000000000
 2 .text         00000000 0000000000000000 TEXT DATA
 3 xdp_prog      000001b8 0000000000000000 TEXT DATA
 4 .relxdp_prog  00000020 0000000000000000
 5 maps          00000028 0000000000000000 DATA
 6 license       00000004 0000000000000000 DATA
 7 .symtab       000000d8 0000000000000000

Some basic information can be extracted from the above output. This is an XDP program, as the name of the program section (Idx 3) starts with the letters “xdp”. On the same line, the Size column shows the program size in hex: 0x1b8 equals 440 bytes, or 55 BPF instructions, as each instruction is 8 bytes (see struct bpf_insn). A shell trick for the conversion: echo $((0x1b8)) insns=$((0x1b8 / 8)). Note that this is not the size of the JIT-compiled program.
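The size-to-instruction-count conversion can also be scripted against the section listing. The sketch below assumes the column layout shown in the llvm-objdump -h output above (index, name, size); the function name bpf_insn_count is illustrative.

```shell
# Print the number of BPF instructions in section $1, reading
# "llvm-objdump -h" output on stdin. Each instruction is 8 bytes.
bpf_insn_count() {
    size_hex=$(awk -v sec="$1" '$2 == sec { print $3; exit }')
    [ -n "$size_hex" ] || return 1
    echo $(( 0x$size_hex / 8 ))
}

# Usage (file and section names taken from the example above):
#   llvm-objdump -h xdp_ddos01_blacklist_kern.o | bpf_insn_count xdp_prog
```

The function returns a non-zero status when the named section is not found, so it can be used in scripts that check several object files.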

The loader code in samples/bpf/bpf_load.c parses this ELF file, extracts the needed program sections, and uses the maps section and the relocation section (here .relxdp_prog) to patch each BPF_PSEUDO_MAP_FD instruction so that it points to the correct map. The maps themselves are created while the maps section is parsed, via the standard bpf syscall wrapper bpf_create_map.

LLVM disassemble support

Todo

Document in which LLVM version the “-S” option was added

In newer versions of LLVM, the llvm-objdump tool supports showing section names, asm code and the original C code, if the object file was compiled with -g.

llvm-objdump -S prog_kern.o

Todo

What does the option -no-show-raw-insn do?

See Cilium Toolchain LLVM section for more details.

Extracting eBPF-JIT code

Also see Cilium JIT Debugging.

To debug or inspect the generated JIT code, it is possible to change this proc sysctl:

sysctl net.core.bpf_jit_enable=2

The output, which is written to the kernel log (see dmesg), looks like:

flen=55 proglen=335 pass=4 image=ffffffffa0006820 from=xdp_ddos01_blac pid=13333
JIT code: 00000000: 55 48 89 e5 48 81 ec 28 02 00 00 48 89 9d d8 fd
JIT code: 00000010: ff ff 4c 89 ad e0 fd ff ff 4c 89 b5 e8 fd ff ff
JIT code: 00000020: 4c 89 bd f0 fd ff ff 31 c0 48 89 85 f8 fd ff ff
JIT code: 00000030: bb 02 00 00 00 48 8b 77 08 48 8b 7f 00 48 89 fa
JIT code: 00000040: 48 83 c2 0e 48 39 f2 0f 87 e1 00 00 00 48 0f b6
JIT code: 00000050: 4f 0c 48 0f b6 57 0d 48 c1 e2 08 48 09 ca 48 89
JIT code: 00000060: d1 48 81 e1 ff 00 00 00 41 b8 06 00 00 00 49 39
JIT code: 00000070: c8 0f 87 b7 00 00 00 48 81 fa 88 a8 00 00 74 0e
JIT code: 00000080: b9 0e 00 00 00 48 81 fa 81 00 00 00 75 1a 48 89
JIT code: 00000090: fa 48 83 c2 12 48 39 f2 0f 87 90 00 00 00 b9 12
JIT code: 000000a0: 00 00 00 48 0f b7 57 10 bb 02 00 00 00 48 81 e2
JIT code: 000000b0: ff ff 00 00 48 83 fa 08 75 49 48 01 cf 31 db 48
JIT code: 000000c0: 89 fa 48 83 c2 14 48 39 f2 77 38 8b 7f 0c 89 7d
JIT code: 000000d0: fc 48 89 ee 48 83 c6 fc 48 bf 00 9c 24 5f 07 88
JIT code: 000000e0: ff ff e8 29 cd 13 e1 bb 02 00 00 00 48 83 f8 00
JIT code: 000000f0: 74 11 48 8b 78 00 48 83 c7 01 48 89 78 00 bb 01
JIT code: 00000100: 00 00 00 89 5d f8 48 89 ee 48 83 c6 f8 48 bf c0
JIT code: 00000110: 76 12 13 04 88 ff ff e8 f4 cc 13 e1 48 83 f8 00
JIT code: 00000120: 74 0c 48 8b 78 00 48 83 c7 01 48 89 78 00 48 89
JIT code: 00000130: d8 48 8b 9d d8 fd ff ff 4c 8b ad e0 fd ff ff 4c
JIT code: 00000140: 8b b5 e8 fd ff ff 4c 8b bd f0 fd ff ff c9 c3

Here proglen is the length in bytes of the generated opcode sequence, and flen is the number of BPF instructions. You can use tools/net/bpf_jit_disasm.c to disassemble that output. bpf_jit_disasm -o will dump the related opcodes as well.
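As a quick sanity check, the header line of a dump can be summarized to see how much native code each BPF instruction produced on average. This is a sketch that assumes the header format shown above; the function name jit_dump_summary is hypothetical.

```shell
# Read JIT dump header lines on stdin and print flen, proglen and the
# average number of native bytes per BPF instruction (integer division).
jit_dump_summary() {
    sed -n 's/.*flen=\([0-9][0-9]*\) proglen=\([0-9][0-9]*\) .*/\1 \2/p' |
    while read -r flen proglen; do
        [ "$flen" -gt 0 ] || continue   # avoid division by zero
        echo "flen=$flen proglen=$proglen bytes_per_insn=$(( proglen / flen ))"
    done
}

# Usage: dmesg | jit_dump_summary
```

For the dump above, flen=55 matches the 55 instructions computed from the ELF section size, and proglen=335 matches the 0x140 + 15 bytes in the hex dump.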

Perf tool symbols

For JITed programs, you can set sysctl net/core/bpf_jit_kallsyms=1 and then use, for example, perf script --kallsyms=/proc/kallsyms to show their symbols, which are based on the program tag:

sysctl net/core/bpf_jit_kallsyms=1

For details, see commit: https://git.kernel.org/torvalds/c/74451e66d516c55e3

Remember to use the perf command-line option --kallsyms=/proc/kallsyms to get the symbols resolved, like:

# perf report --no-children --kallsyms=/proc/kallsyms
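To confirm that JITed programs are actually exported before running perf, the symbol list can be filtered. The sketch below assumes the symbols are named with a bpf_prog_ prefix followed by the program tag, and that the input uses the /proc/kallsyms column layout (address, type, name); the function name is illustrative.

```shell
# Filter kallsyms-formatted input down to the names of BPF JIT symbols.
list_bpf_jit_symbols() {
    awk '$3 ~ /^bpf_prog_/ { print $3 }'
}

# Usage (reading addresses may require root):
#   list_bpf_jit_symbols < /proc/kallsyms
```

If the list is empty while programs are loaded, check that both bpf_jit_enable and bpf_jit_kallsyms are set.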