Ensuring Deployment Safety with eBPF: A Step-by-Step Guide

Introduction

Did you know that GitHub hosts all of its own source code on github.com? This practice lets the company act as its own biggest customer, testing internal changes before rolling them out to users. However, it introduces a critical risk: if github.com goes down, GitHub loses access to its own source code, a classic circular dependency. To mitigate this, GitHub maintains a mirrored copy of its code and prebuilt assets for rollback. Yet more subtle circular dependencies lurk within deployment scripts, such as those that rely on internal services or download binaries from GitHub. To tackle this, GitHub turned to eBPF (extended Berkeley Packet Filter), a Linux kernel technology that runs verified programs inside the kernel and can monitor, and in some cases block, system calls and network activity. This guide walks through how GitHub uses eBPF to enhance deployment safety, step by step.

Source: github.blog

What You Need

  • Basic understanding of Linux system calls and kernel concepts
  • Familiarity with eBPF fundamentals (e.g., kprobes, tracepoints, maps)
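  • Access to a Linux system with a reasonably recent kernel: kprobe-based eBPF programs work on 4.x kernels, but the blocking mechanisms discussed in Step 3 need 4.16 or later, and CO-RE needs a kernel built with BTF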
  • eBPF toolchain installed (e.g., bpftool, clang, llvm, libbpf-dev)
  • Sample deployment scripts or scenarios to test against
  • A test environment (e.g., VM or container) to avoid affecting production

Step-by-Step Guide

Step 1: Identify Circular Dependencies in Your Deployment Process

Before writing eBPF programs, analyze your deployment scripts to spot circular dependencies. In the GitHub scenario, three types exist:

  • Direct dependencies: A script attempts to fetch a binary from GitHub, which is unavailable during an outage.
  • Hidden dependencies: A local tool checks for updates online, hanging if GitHub is down.
  • Transient dependencies: A script calls an internal service that indirectly depends on GitHub.

Document each dependency: what system calls or network requests it makes, and whether it could create a loop. For example, a script run during a MySQL outage might try to wget a release from GitHub.
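
As a first pass at this inventory, a script can be scanned line by line for commands that usually imply a network fetch. The sketch below is only a heuristic under stated assumptions: the command list (FETCH_COMMANDS) and the function name find_network_fetch are hypothetical, and a text scan like this will miss hidden and transient dependencies, which is why the later steps observe real syscalls instead.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of commands that usually imply a network fetch. */
static const char *const FETCH_COMMANDS[] = {
    "wget", "curl", "git clone", "pip install"
};

/* Returns 1 if `cmd` occurs in `line` as a whole word, 0 otherwise. */
static int contains_command(const char *line, const char *cmd)
{
    size_t n = strlen(cmd);
    assert(n > 0);
    for (const char *p = strstr(line, cmd); p != NULL; p = strstr(p + 1, cmd)) {
        int start_ok = (p == line) || !isalnum((unsigned char)p[-1]);
        int end_ok = !isalnum((unsigned char)p[n]);
        if (start_ok && end_ok)
            return 1;
    }
    return 0;
}

/* Returns the first matching command name, or NULL if none is found. */
const char *find_network_fetch(const char *line)
{
    /* Skip indentation and ignore shell comment lines. */
    while (*line == ' ' || *line == '\t')
        line++;
    if (*line == '#')
        return NULL;

    size_t count = sizeof FETCH_COMMANDS / sizeof FETCH_COMMANDS[0];
    for (size_t i = 0; i < count; i++) {
        if (contains_command(line, FETCH_COMMANDS[i]))
            return FETCH_COMMANDS[i];
    }
    return NULL;
}
```

Calling find_network_fetch() on each line of a deployment script gives a quick list of direct dependencies to document; anything it does not flag still needs the runtime checks described below.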

Step 2: Design an eBPF Program to Monitor Relevant Syscalls

eBPF programs can attach to syscalls such as connect, openat, or execve. For deployment safety, focus on network-related syscalls (e.g., connect, to detect outgoing connections to GitHub). Create a map that stores the policy: allowed process IDs, or restricted IP addresses. Note that connect only sees IP addresses, so domain names must be resolved to addresses before they go into the map. In GitHub's case, the goal is to block connections to github.com from deployment scripts except for allowlisted processes. Use a kprobe or tracepoint to intercept these syscalls and log details (PID, destination IP, etc.).
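
To make the logged details concrete, here is a minimal user-space sketch of an event record and a formatter. The struct layout, field names, and format_event are assumptions made for illustration, not a format taken from GitHub's implementation; the address and port are kept in network byte order, as they appear in a sockaddr.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical event layout; the field names are illustrative only. */
struct conn_event {
    uint32_t pid;       /* process that called connect() */
    uint32_t daddr;     /* destination IPv4 address, network byte order */
    uint16_t dport;     /* destination port, network byte order */
    char     comm[16];  /* command name (the kernel's TASK_COMM_LEN is 16) */
};

/*
 * Formats an event as "pid=... comm=... dst=a.b.c.d:port".
 * Returns the snprintf() result, or -1 if the address cannot be formatted.
 */
int format_event(const struct conn_event *ev, char *buf, size_t len)
{
    assert(ev != NULL && buf != NULL);

    char ip[INET_ADDRSTRLEN];
    struct in_addr addr = { .s_addr = ev->daddr };
    if (inet_ntop(AF_INET, &addr, ip, sizeof ip) == NULL)
        return -1;

    /* %.16s guards against a comm field that is not NUL-terminated. */
    return snprintf(buf, len, "pid=%u comm=%.16s dst=%s:%u",
                    (unsigned)ev->pid, ev->comm, ip,
                    (unsigned)ntohs(ev->dport));
}
```

A fixed-size record like this is convenient because the same struct can be shared between the kernel-side program and the user-space reader.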

Step 3: Write the eBPF C Code

Using C with libbpf, write an eBPF program that:

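  • Defines a map (e.g., of type BPF_MAP_TYPE_HASH) to store allowed IPs or process IDs.
  • Attaches a kprobe to the connect syscall entry point (sys_connect on older kernels, __x64_sys_connect on recent x86-64 kernels).
  • In the probe, extracts the destination IP from the sockaddr struct.
  • Compares the IP against a blocklist/allowlist (e.g., GitHub’s IP range).
  • If blocked, logs the event. A kprobe's return value does not change the syscall's outcome, so to deny the connection either call bpf_override_return() (kernel 4.16+, built with CONFIG_BPF_KPROBE_OVERRIDE) to make the syscall fail with -EPERM, or move the check into a cgroup/connect4 program or an LSM hook.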

Sample snippet:

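#include <linux/bpf.h>
#include <linux/ptrace.h>
#include <bpf/bpf_helpers.h>

/* PIDs that are allowed to reach restricted destinations. */
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 1024);
    __type(key, __u32); // PID
    __type(value, __u32); // allowed flag
} allowed_processes SEC(".maps");

/*
 * "ksyscall/connect" needs libbpf 1.0 or later, which resolves the
 * arch-specific symbol (e.g. __x64_sys_connect). With an older libbpf,
 * name that symbol in a "kprobe/..." section instead.
 */
SEC("ksyscall/connect")
int kprobe__sys_connect(struct pt_regs *ctx)
{
    // ... implementation
    return 0; // does not affect the outcome of the probed syscall
}

/* GPL-compatible license, required by helpers such as bpf_trace_printk(). */
char LICENSE[] SEC("license") = "GPL";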

Note: Full implementation requires handling socket structures and byte ordering.
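
The socket-structure and byte-order handling mentioned in the note can be sketched in plain user-space C, which is easier to test than kernel code. This is a model of the comparison only, under stated assumptions: struct cidr, cidr_parse, and sockaddr_in_range are hypothetical names, only IPv4 is handled, and in a real probe the sockaddr would first have to be copied out of user memory with a helper such as bpf_probe_read_user().

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>

/* One IPv4 range in CIDR form, stored in host byte order. */
struct cidr {
    uint32_t network;
    uint8_t  prefix_len;  /* 0..32 */
};

/* Builds a netmask; a shift by 32 would be undefined, so /0 is special-cased. */
static uint32_t prefix_mask(unsigned prefix_len)
{
    return prefix_len == 0 ? 0 : 0xFFFFFFFFu << (32 - prefix_len);
}

/* Parses "a.b.c.d" plus a prefix length. Returns 0 on success, -1 on bad input. */
int cidr_parse(const char *ip, unsigned prefix_len, struct cidr *out)
{
    struct in_addr a;
    if (prefix_len > 32 || inet_pton(AF_INET, ip, &a) != 1)
        return -1;
    /* inet_pton() yields network byte order; convert before masking. */
    out->network = ntohl(a.s_addr) & prefix_mask(prefix_len);
    out->prefix_len = (uint8_t)prefix_len;
    return 0;
}

/* Returns 1 if the IPv4 destination in `sa` lies inside `range`, else 0. */
int sockaddr_in_range(const struct sockaddr *sa, const struct cidr *range)
{
    assert(sa != NULL && range != NULL);
    if (sa->sa_family != AF_INET)
        return 0;  /* IPv6 and other families are out of scope for this sketch */

    const struct sockaddr_in *sin = (const struct sockaddr_in *)sa;
    uint32_t dst = ntohl(sin->sin_addr.s_addr);
    return (dst & prefix_mask(range->prefix_len)) == range->network;
}
```

The key detail is that sin_addr and sin_port arrive in network byte order, so they must be converted (or the policy stored in the same order) before any comparison. The addresses used to exercise this should be placeholders such as the documentation range 198.51.100.0/24, not a guess at GitHub's real ranges.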

Step 4: Compile and Load the eBPF Program

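Compile the eBPF C code into a BPF object using clang with -target bpf. Add -g as well: maps declared in a ".maps" section, like the one above, need the BTF type information that debug info provides. Then, use a loader (e.g., bpftool or a small Python/C loader) to load the program into the kernel. For example: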

clang -O2 -g -target bpf -c deploy_safety.c -o deploy_safety.o
bpftool prog load deploy_safety.o /sys/fs/bpf/deploy_safety

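Verify it's loaded with bpftool prog list. Loading alone does not attach a kprobe, and bpftool perf only lists existing attachments. To attach the program, use a libbpf-based loader, or pass the autoattach option to bpftool prog load if your bpftool version supports it.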

Step 5: Test with Simulated Deployment Scenarios

Simulate the circular dependency scenarios from Step 1. Run a script that tries to connect to a restricted address (in a test environment, a stand-in for GitHub’s IP) while the eBPF program is active. Check logs (e.g., via /sys/kernel/debug/tracing/trace_pipe or bpftool map dump) to confirm the program blocks or logs the attempt. Test both allowed processes (e.g., a trusted updater) and blocked ones. Adjust the map entries to fine-tune behavior.
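
A small test client makes these checks repeatable. The sketch below attempts a TCP connection and classifies the result; try_connect, classify, and the outcome names are hypothetical, and it assumes that enforcement surfaces to the caller as EPERM (or EACCES), which depends on how the blocking hook is implemented.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

enum outcome { OUTCOME_ALLOWED, OUTCOME_BLOCKED, OUTCOME_OTHER };

/*
 * Tries a TCP connect to ip:port. Returns 0 on success or a negative errno.
 * -EINVAL means `ip` was not a valid IPv4 literal.
 */
int try_connect(const char *ip, unsigned short port, int timeout_sec)
{
    assert(timeout_sec > 0);

    struct sockaddr_in dst = { .sin_family = AF_INET, .sin_port = htons(port) };
    if (inet_pton(AF_INET, ip, &dst.sin_addr) != 1)
        return -EINVAL;

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -errno;

    /* Bound the wait so a silently dropped connection does not hang the test. */
    struct timeval tv = { .tv_sec = timeout_sec, .tv_usec = 0 };
    setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof tv);

    /* Capture errno before close(), which may overwrite it. */
    int rc = connect(fd, (struct sockaddr *)&dst, sizeof dst) == 0 ? 0 : -errno;
    close(fd);
    return rc;
}

/* Maps a try_connect() result to a coarse outcome for the test log. */
enum outcome classify(int rc)
{
    if (rc == 0)
        return OUTCOME_ALLOWED;
    if (rc == -EPERM || rc == -EACCES)
        return OUTCOME_BLOCKED;
    return OUTCOME_OTHER;  /* refused, timed out, unreachable, bad input, ... */
}
```

Running the client once from an allowlisted process and once from an ordinary one, against the same restricted address, should produce different outcomes if the policy is working. In monitoring-only mode both attempts succeed, and the difference shows up only in the logs.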

Step 6: Integrate into Your Deployment Pipeline

Once tested, incorporate the eBPF program into the deployment system. For GitHub, this means:

  • Loading the program on host startup or before deployment scripts run.
  • Using a user-space agent to manage the allowlist (e.g., updating allowed IPs or process IDs dynamically).
  • Ensuring rollback: if the eBPF program itself fails, the deployment should abort or use a fallback.

Consider running the eBPF program in a monitoring-only mode initially to audit dependencies before enforcing blocks.
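
The difference between monitoring-only and enforcing mode comes down to a single decision point. As a minimal sketch, assuming the probe has already determined whether the process is allowlisted and whether the destination is restricted, the logic could look like this (all names here are hypothetical):

```c
#include <assert.h>
#include <stdbool.h>

enum mode    { MODE_MONITOR, MODE_ENFORCE };
enum verdict { VERDICT_ALLOW, VERDICT_LOG_ONLY, VERDICT_DENY };

/* Facts the probe is assumed to have gathered about one connect() call. */
struct conn_facts {
    bool process_allowlisted;  /* PID was found in the allowlist map */
    bool dest_restricted;      /* destination matches a restricted range */
};

/* Decides what to do with one connection attempt under the current mode. */
enum verdict decide(enum mode mode, struct conn_facts f)
{
    assert(mode == MODE_MONITOR || mode == MODE_ENFORCE);

    /* Unrestricted destinations and trusted processes always pass. */
    if (!f.dest_restricted || f.process_allowlisted)
        return VERDICT_ALLOW;

    /* A violation: record it, and deny it only when enforcing. */
    return mode == MODE_ENFORCE ? VERDICT_DENY : VERDICT_LOG_ONLY;
}
```

Keeping the mode as a runtime value (for example, one entry in a BPF map) rather than a compile-time switch would let an operator move from auditing to enforcement without reloading the program.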

Tips for Success

  • Start with monitoring only – Use eBPF to log violations without blocking. This helps identify hidden dependencies and avoids breaking existing scripts.
  • Use BPF maps for dynamic policies – Update allowed lists at runtime without recompiling the eBPF program.
  • Handle kernel version differences – eBPF APIs change. Test on your target kernel and use CO-RE (Compile Once – Run Everywhere) if needed.
  • Account for DNS resolution – connect only ever sees an IP address, and the addresses behind a domain change over time. Refresh the IP list from user space, or also watch DNS lookups (for example, sendto calls to port 53) to see which names a script tries to resolve.
  • Monitor performance impact – eBPF is fast but logging excessive events can overwhelm buffers. Rate-limit logs.
  • Prepare for failures – Have a fallback mechanism (e.g., a simple iptables rule) in case the eBPF program unloads unexpectedly.

By following these steps, you can leverage eBPF to break circular dependencies in your own deployment pipelines, just like GitHub does. The key is to identify risky dependencies, design a targeted eBPF program, and test thoroughly before enforcing blocks.
