Extended Berkeley Packet Filter (eBPF) is a programming technology that enables developers to write efficient, safe and non-intrusive programs that run directly in the Linux operating system (OS) kernel space.
Because eBPF can run sandboxed programs in a privileged context, such as the OS kernel, it can extend the features of existing software at runtime without modifying the kernel source code, loading kernel modules or disrupting overall program execution. eBPF is an evolution of the original Berkeley Packet Filter (BPF), which provided a simple way to select and analyze network packets for a user space program. But beyond packet filtering, BPF programs lacked the flexibility to handle more complex tasks within the kernel.
Recognizing the need for a more versatile technology, the Linux community developed eBPF, which built on the foundations of BPF but extended its in-kernel programmability. The added functionality of eBPF programs allows developers to implement enhanced packet filtering, conduct detailed performance analysis, and build firewalls and debugging tools in both on-site data centers and cloud-native environments.
Advances in eBPF have prompted software developers to extend it to other operating systems, so that non-Linux platforms can take advantage of eBPF's tracing, networking and monitoring capabilities. [1] In fact, the eBPF Foundation (an extension of the Linux Foundation whose members include Google, Meta, Netflix, Microsoft, Intel and Isovalent, among others) has invested heavily in expanding OS compatibility for eBPF programs, hoping to eventually broaden the usefulness of eBPF programming. [2]
The main components of an eBPF program are:
eBPF programs are initially written in a restricted subset of C and then compiled into eBPF bytecode with a toolchain such as Clang (the C front end) and LLVM (whose BPF back end emits the bytecode). The bytecode is a restricted set of instructions that adhere to the eBPF instruction set architecture, which helps the kernel check programs for errors before they run.
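To make "bytecode" concrete, the sketch below hand-encodes a two-instruction program (`r0 = 0; exit`) in the fixed 8-byte instruction format: an 8-bit opcode, two 4-bit register fields, a 16-bit offset and a 32-bit immediate. The opcode constants follow the values in the kernel's `linux/bpf_common.h` and `linux/bpf.h` headers; the `encode` function is a hypothetical helper written for this illustration, and the layout assumes a little-endian host. In practice the compiler produces these bytes, so this is only a way to look inside them.

```python
import struct

# Opcode building blocks (values as in linux/bpf_common.h and linux/bpf.h).
BPF_ALU64 = 0x07  # instruction class: 64-bit arithmetic
BPF_JMP = 0x05    # instruction class: jumps, calls and exit
BPF_K = 0x00      # source operand is the 32-bit immediate
BPF_MOV = 0xB0    # operation: move
BPF_EXIT = 0x90   # operation: return from the program


def encode(opcode, dst=0, src=0, off=0, imm=0):
    """Pack one instruction into its 8-byte little-endian form (illustrative helper)."""
    regs = (src << 4) | dst  # the two 4-bit register fields share one byte
    return struct.pack("<BBhi", opcode, regs, off, imm)


# "r0 = 0; exit": a program that does nothing except return 0.
program = encode(BPF_ALU64 | BPF_MOV | BPF_K, dst=0, imm=0) + encode(BPF_JMP | BPF_EXIT)

# Print the bytecode as two 8-byte instructions in hex.
print(program.hex(" ", 8))
```

Each instruction occupies exactly 8 bytes, which is part of what makes the format easy for the kernel to scan before running it.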
The Linux kernel can execute eBPF bytecode with a built-in interpreter, but just-in-time (JIT) compilers offer better performance. A JIT compiler translates the bytecode into native machine code for the specific hardware platform, so the program runs as native instructions rather than being decoded one instruction at a time.
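The difference is easier to see with a toy interpreter. The sketch below decodes and executes one 8-byte instruction at a time, which is the per-instruction overhead a JIT compiler avoids by translating the whole program to machine code up front. It understands only three opcodes (64-bit move-immediate, 64-bit add-immediate and exit); the `interpret` function is a hypothetical illustration and not how the kernel's interpreter is written.

```python
import struct

# Opcodes for the three instructions this toy understands.
MOV64_IMM = 0xB7  # dst = imm
ADD64_IMM = 0x07  # dst += imm
EXIT = 0x95       # return r0

MASK64 = 0xFFFFFFFFFFFFFFFF


def interpret(bytecode):
    """Decode and execute one instruction at a time, as an interpreter does."""
    regs = [0] * 11  # eBPF defines registers r0 through r10
    for pc in range(0, len(bytecode), 8):
        opcode, reg_byte, _off, imm = struct.unpack_from("<BBhi", bytecode, pc)
        dst = reg_byte & 0x0F
        if opcode == MOV64_IMM:
            regs[dst] = imm & MASK64
        elif opcode == ADD64_IMM:
            regs[dst] = (regs[dst] + imm) & MASK64
        elif opcode == EXIT:
            return regs[0]
        else:
            raise ValueError(f"unsupported opcode {opcode:#x}")
    raise ValueError("program ended without an exit instruction")


# r0 = 40; r0 += 2; exit
prog = (
    struct.pack("<BBhi", MOV64_IMM, 0, 0, 40)
    + struct.pack("<BBhi", ADD64_IMM, 0, 0, 2)
    + struct.pack("<BBhi", EXIT, 0, 0, 0)
)
print(interpret(prog))  # prints 42
```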
User space loaders are programs in the user space that load the eBPF bytecode into the kernel, attaching it to the appropriate hooks and managing any associated eBPF maps. Examples of user space loaders include tools like BPF Compiler Collection (BCC) and bpftrace.
eBPF maps are data structures with key-value pairs and read-write access that provide shared storage space and facilitate interaction between eBPF kernel programs and user space applications. Created and managed using system calls, eBPF maps can also be used to maintain state between different invocations of an eBPF program.
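As a rough mental model, a hash-type map behaves like a dictionary with fixed key and value sizes and a fixed capacity. The sketch below is a user-space stand-in, not the kernel implementation: `ToyHashMap` and `on_event` are hypothetical names, and the update flags and error codes are modeled on the `BPF_ANY`, `BPF_NOEXIST` and `BPF_EXIST` flags from `linux/bpf.h`. One side of the code plays the kernel program counting events per process ID; the other plays the user space application reading the count.

```python
import errno

BPF_ANY, BPF_NOEXIST, BPF_EXIST = 0, 1, 2  # update flags, as in linux/bpf.h


class ToyHashMap:
    """User-space stand-in for a fixed-size hash map of byte-string keys and values."""

    def __init__(self, key_size, value_size, max_entries):
        self.key_size = key_size
        self.value_size = value_size
        self.max_entries = max_entries
        self._data = {}

    def update(self, key, value, flags=BPF_ANY):
        if len(key) != self.key_size or len(value) != self.value_size:
            raise OSError(errno.EINVAL, "wrong key or value size")
        exists = key in self._data
        if flags == BPF_NOEXIST and exists:
            raise OSError(errno.EEXIST, "key already exists")
        if flags == BPF_EXIST and not exists:
            raise OSError(errno.ENOENT, "no such key")
        if not exists and len(self._data) >= self.max_entries:
            raise OSError(errno.E2BIG, "map is full")
        self._data[key] = value

    def lookup(self, key):
        return self._data.get(key)  # None when the key is absent

    def delete(self, key):
        if key not in self._data:
            raise OSError(errno.ENOENT, "no such key")
        del self._data[key]


counts = ToyHashMap(key_size=4, value_size=8, max_entries=2)


def on_event(pid):
    """What the kernel-side program would do: bump a per-PID counter."""
    key = pid.to_bytes(4, "little")
    old = counts.lookup(key)
    n = int.from_bytes(old, "little") + 1 if old else 1
    counts.update(key, n.to_bytes(8, "little"))


for pid in (100, 100, 200):
    on_event(pid)

# What the user-space application would do: read the counter back.
print(int.from_bytes(counts.lookup((100).to_bytes(4, "little")), "little"))  # prints 2
```

Because the counter lives in the map rather than in the program, it survives from one invocation to the next, which is the state-keeping role described above.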
The verifier, a critical component of eBPF systems, checks the bytecode before it's loaded into the kernel to make sure the program doesn't contain any harmful operations, like infinite loops, illegal instructions or out-of-bounds memory access. It also ensures that every execution path through the program terminates.
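A drastically simplified sketch of this kind of static check follows. It works on a made-up instruction list rather than real bytecode, and it enforces only three rules: every instruction must be known, every jump must land inside the program, and control flow may only move forward, which is enough to guarantee that every path reaches `exit`. The `Insn` type and `verify` function are hypothetical; the real verifier is far more thorough (it also tracks register contents and memory bounds, and newer kernels accept some bounded loops).

```python
from typing import NamedTuple


class Insn(NamedTuple):
    op: str       # "alu", "jeq" (conditional jump), "ja" (jump always) or "exit"
    off: int = 0  # jump offset, relative to the next instruction


def verify(prog):
    """Return None if the toy rules pass, otherwise a rejection message."""
    if not prog:
        return "empty program"
    last = len(prog) - 1
    for pc, insn in enumerate(prog):
        if insn.op not in ("alu", "jeq", "ja", "exit"):
            return f"insn {pc}: illegal instruction {insn.op!r}"
        if insn.op in ("jeq", "ja"):
            if insn.off < 0:
                return f"insn {pc}: backward jump (possible loop)"
            if pc + 1 + insn.off > last:
                return f"insn {pc}: jump out of range"
        # "alu" and "jeq" can fall through to the next instruction,
        # so neither may be the final instruction.
        if insn.op in ("alu", "jeq") and pc == last:
            return f"insn {pc}: falls off the end without exit"
    return None


ok = [Insn("alu"), Insn("jeq", 1), Insn("alu"), Insn("exit")]
looping = [Insn("alu"), Insn("ja", -2), Insn("exit")]

print(verify(ok))       # None: accepted
print(verify(looping))  # rejected: backward jump
```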
Hooks are points in the kernel code where eBPF programs can be attached. When the kernel reaches a hook, it executes the attached eBPF program.
Different types of hooks like tracepoints, kprobes, and network packet receive queues give eBPF programs broad data access and allow them to perform various operations. Tracepoints, for example, allow programs to inspect and collect data about the kernel or other processes, while traffic control hooks can be used to inspect and modify network packets.
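The attach-and-fire relationship between hooks and programs can be modeled in a few lines. In the sketch below, `ToyKernel`, `attach` and `fire` are hypothetical names invented for illustration, the hook names are plain strings styled after tracepoint names, and the "programs" are ordinary Python functions that receive a context dictionary. The point is only that a program runs when, and only when, execution reaches a hook it is attached to.

```python
from collections import defaultdict


class ToyKernel:
    """Minimal model of hooks: named points that run whatever is attached to them."""

    def __init__(self):
        self._hooks = defaultdict(list)  # hook name -> list of attached programs

    def attach(self, hook, program):
        self._hooks[hook].append(program)

    def detach(self, hook, program):
        self._hooks[hook].remove(program)

    def fire(self, hook, ctx):
        """Called when execution reaches a hook: run every attached program."""
        return [program(ctx) for program in self._hooks[hook]]


kernel = ToyKernel()
opened = []


def trace_open(ctx):
    """A 'program' that records which files are being opened."""
    opened.append(ctx["filename"])
    return 0


kernel.attach("tracepoint/syscalls/sys_enter_openat", trace_open)

# The first event hits the hook with a program attached; the second hits an empty hook.
kernel.fire("tracepoint/syscalls/sys_enter_openat", {"filename": "/etc/hosts"})
kernel.fire("tracepoint/sched/sched_switch", {"prev_pid": 1, "next_pid": 2})

print(opened)  # prints ['/etc/hosts']
```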
Because eBPF programs cannot call arbitrary kernel functions (doing so would tie them to a particular kernel version), the basic eBPF instruction set is sometimes not enough to perform advanced operations. Helper functions bridge this gap.
Helper functions are predefined kernel functions, exposed through a stable API, that eBPF programs can call. They give eBPF programs a way to perform more complex operations (like reading the current time or generating random numbers) that aren't directly supported by the instruction set.
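One way to picture helpers is as a fixed table of functions that a program selects by number. In the sketch below, the two IDs match the numbering of `bpf_ktime_get_ns` and `bpf_get_prandom_u32` in `linux/bpf.h`, but the Python bodies are stand-ins and `call_helper` and `sample_one_in` are hypothetical names. In a real system an unknown helper ID would be rejected when the program is loaded, not when it is called.

```python
import random
import time

# Helper IDs as numbered in linux/bpf.h; the Python bodies below are stand-ins.
BPF_FUNC_ktime_get_ns = 5
BPF_FUNC_get_prandom_u32 = 7

HELPERS = {
    BPF_FUNC_ktime_get_ns: time.monotonic_ns,                  # a monotonic timestamp in ns
    BPF_FUNC_get_prandom_u32: lambda: random.getrandbits(32),  # a pseudo-random 32-bit value
}


def call_helper(func_id, *args):
    """What a 'call' instruction does in this toy: look up the ID and run the helper."""
    try:
        helper = HELPERS[func_id]
    except KeyError:
        raise ValueError(f"unknown helper id {func_id}") from None
    return helper(*args)


def sample_one_in(n):
    """A 'program' that uses both helpers to timestamp roughly one in n events."""
    if call_helper(BPF_FUNC_get_prandom_u32) % n == 0:
        return call_helper(BPF_FUNC_ktime_get_ns)
    return None


print(sample_one_in(1) is not None)  # with n=1 every event is sampled
```

Routing every call through a numbered table is what lets the set of permitted operations stay fixed and checkable, while the kernel remains free to change how each helper is implemented.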
Generally, eBPF operates as a lightweight virtual machine (VM) inside the Linux kernel, working on a low-level instruction set architecture and executing eBPF bytecode. Running an eBPF program tends to follow a few major steps.
Developers first write the eBPF program and compile the bytecode. The program's purpose will dictate the appropriate type of code. For instance, if a team wants to monitor CPU usage, it will write code that includes functionality for capturing usage metrics.
Once the compiler converts the high-level C code into lower-level bytecode, a user space loader issues a bpf() system call to load the program into the kernel. The loader is also responsible for handling errors and setting up any eBPF maps the program needs.
With the program bytecode and maps in place, the kernel's verifier checks that the program is safe to execute in the kernel. If the program is deemed unsafe, the system call to load it fails and the loader receives an error message. If the program passes verification, it's allowed to run.
The kernel then either interprets the bytecode or uses a JIT compiler to translate it into native machine code. However, eBPF is event-driven, so a program runs only in response to specific hook points or events within the kernel (for example, system calls, network events, process initiation or CPU idling). When an event occurs, the kernel executes the attached eBPF program, allowing developers to inspect and manipulate various components of the system.
Once the eBPF program is running, developers can interact with it from the user space using eBPF maps. For example, the application might periodically check a map to collect data from the eBPF program, or it might update a map to change the program's behavior.
Unloading the program is the final step of most eBPF execution processes. When the program has done its job, the loader detaches it from its hooks and releases its references to it (for example, by closing the file descriptors returned by the bpf() system call); once no references remain, the kernel stops running the program and frees its associated resources. The team can also delete individual elements from any eBPF maps it no longer needs, using the delete-element command of the bpf() system call. A map itself is freed once nothing refers to it.
eBPF technologies have already become a cornerstone of modern Linux systems, enabling fine-grained control over the Linux kernel and empowering companies to build more innovative solutions within the Linux ecosystem.
eBPF has facilitated advancements in:
eBPF allows developers to implement faster, more tailored packet processing, load balancing, application profiling and network monitoring. Open-source platforms like Cilium leverage eBPF to provide secure, scalable networking for Kubernetes clusters and workloads, and other containerized microservices. Furthermore, by running packet forwarding logic at the kernel level, eBPF can streamline routing and enable faster overall network response.
eBPF lets developers instrument the kernel and user space applications to collect detailed performance data and metrics without significantly impacting the system's performance. These capabilities enable real-time monitoring and observability.
eBPF programs can monitor system calls, network traffic and system behavior to detect and respond to potential security threats in real time. Tools like Falco, for example, use eBPF to implement runtime security auditing and incident response, enhancing the overall security of the system.
By tracing system calls, monitoring CPU utilization and tracking resource utilization (like disk I/O), eBPF helps developers more easily probe for bottlenecks in system performance and identify opportunities for optimization.
[1] Foundation Proposes Advancing eBPF Adoption Across Multiple OSes, DevOps.com, 21 August 2021
[2] Latest eBPF Advances Are Harbingers of Major Changes to IT, DevOps.com, 13 September 2023