Skip to content

Parsing packets

In the previous chapter, our XDP application ran until Ctrl-C was hit and permitted all the traffic. Each time a packet was received, the eBPF program logged the string "received a packet". In this chapter we're going to show how to parse packets.

While we could go all out and parse data all the way up to L7, we'll constrain our example to L3, and to make things easier, IPv4 only.

Source code

Full code for the example in this chapter is available here

Using network types

We're going to log the source IP address of incoming packets. So we'll need to:

  • Read the Ethernet header to determine if we're dealing with an IPv4 packet, else terminate parsing.
  • Read the source IP Address from the IPv4 header.

We could read the specifications of those protocols and parse manually, but instead we're going to use the network-types crate which provides convenient type definitions for many of the common Internet protocols.

Let's add it to our eBPF crate by adding a dependency on network-types in our xdp-log-ebpf/Cargo.toml:

[package]
name = "xdp-log-ebpf"
version = "0.1.0"
edition = "2021"

[dependencies]
aya-ebpf = { git = "https://github.com/aya-rs/aya" }
aya-log-ebpf = { git = "https://github.com/aya-rs/aya" }
xdp-log-common = { path = "../xdp-log-common" }
network-types = "0.0.7"

[[bin]]
name = "xdp-log"
path = "src/main.rs"

Getting packet data from the context

XdpContext contains two fields that we're going to use: data and data_end, which are respectively a pointer to the beginning and to the end of the packet.

In order to access the data in the packet and to ensure that we do so in a way that keeps the eBPF verifier happy, we're going to introduce a helper function called ptr_at. The function ensures that before we access any packet data, we insert the bound checks which are required by the verifier.

Finally to access individual fields from the Ethernet and IPv4 headers, we're going to use the memoffset crate, let's add a dependency for it in xdp-log-ebpf/Cargo.toml.

Reading fields using offset_of!

As there is limited stack space, it's more memory efficient to use the offset_of! macro to read a single field from a struct, rather than reading the whole struct and accessing the field by name.

The resulting code looks like this:

xdp-log-ebpf/src/main.rs
#![no_std]
#![no_main]

use aya_ebpf::{bindings::xdp_action, macros::xdp, programs::XdpContext};
use aya_log_ebpf::info;

use core::mem;
use network_types::{
    eth::{EthHdr, EtherType},
    ip::{IpProto, Ipv4Hdr},
    tcp::TcpHdr,
    udp::UdpHdr,
};

#[cfg(not(test))]
#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
    loop {}
}

#[xdp]
pub fn xdp_firewall(ctx: XdpContext) -> u32 {
    match try_xdp_firewall(ctx) {
        Ok(ret) => ret,
        Err(_) => xdp_action::XDP_ABORTED,
    }
}

#[inline(always)] // (1)
fn ptr_at<T>(ctx: &XdpContext, offset: usize) -> Result<*const T, ()> {
    let start = ctx.data();
    let end = ctx.data_end();
    let len = mem::size_of::<T>();

    if start + offset + len > end {
        return Err(());
    }

    Ok((start + offset) as *const T)
}

fn try_xdp_firewall(ctx: XdpContext) -> Result<u32, ()> {
    let ethhdr: *const EthHdr = ptr_at(&ctx, 0)?; // (2)
    match unsafe { (*ethhdr).ether_type } {
        EtherType::Ipv4 => {}
        _ => return Ok(xdp_action::XDP_PASS),
    }

    let ipv4hdr: *const Ipv4Hdr = ptr_at(&ctx, EthHdr::LEN)?;
    let source_addr = u32::from_be(unsafe { (*ipv4hdr).src_addr });

    let source_port = match unsafe { (*ipv4hdr).proto } {
        IpProto::Tcp => {
            let tcphdr: *const TcpHdr =
                ptr_at(&ctx, EthHdr::LEN + Ipv4Hdr::LEN)?;
            u16::from_be(unsafe { (*tcphdr).source })
        }
        IpProto::Udp => {
            let udphdr: *const UdpHdr =
                ptr_at(&ctx, EthHdr::LEN + Ipv4Hdr::LEN)?;
            u16::from_be(unsafe { (*udphdr).source })
        }
        _ => return Err(()),
    };

    // (3)
    info!(&ctx, "SRC IP: {:i}, SRC PORT: {}", source_addr, source_port);

    Ok(xdp_action::XDP_PASS)
}
  1. Here we define ptr_at to ensure that packet access is always bound checked.
  2. Use ptr_at to read our ethernet header.
  3. Here we log IP and port.

Don't forget to rebuild your eBPF program!

User-space component

Our user-space code doesn't really differ from the previous chapter, but for the reference, here's the code:

xdp-log/src/main.rs
use anyhow::Context;
use aya::{
    include_bytes_aligned,
    programs::{Xdp, XdpFlags},
    Ebpf,
};
use aya_log::EbpfLogger;
use clap::Parser;
use log::{info, warn};
use tokio::signal;

#[derive(Debug, Parser)]
struct Opt {
    #[clap(short, long, default_value = "eth0")]
    iface: String,
}

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    let opt = Opt::parse();

    env_logger::init();

    // This will include your eBPF object file as raw bytes at compile-time and load it at
    // runtime. This approach is recommended for most real-world use cases. If you would
    // like to specify the eBPF program at runtime rather than at compile-time, you can
    // reach for `Ebpf::load_file` instead.
    #[cfg(debug_assertions)]
    let mut bpf = Ebpf::load(include_bytes_aligned!(
        "../../target/bpfel-unknown-none/debug/xdp-log"
    ))?;
    #[cfg(not(debug_assertions))]
    let mut bpf = Ebpf::load(include_bytes_aligned!(
        "../../target/bpfel-unknown-none/release/xdp-log"
    ))?;
    if let Err(e) = EbpfLogger::init(&mut bpf) {
        // This can happen if you remove all log statements from your eBPF program.
        warn!("failed to initialize eBPF logger: {}", e);
    }
    let program: &mut Xdp =
        bpf.program_mut("xdp_firewall").unwrap().try_into()?;
    program.load()?;
    program.attach(&opt.iface, XdpFlags::default())
        .context("failed to attach the XDP program with default flags - try changing XdpFlags::default() to XdpFlags::SKB_MODE")?;

    info!("Waiting for Ctrl-C...");
    signal::ctrl_c().await?;
    info!("Exiting...");

    Ok(())
}

Running the program

As before, the interface can be overwritten by providing the interface name as a parameter, for example, RUST_LOG=info cargo xtask run -- --iface wlp2s0.

$ RUST_LOG=info cargo xtask run
[2022-12-22T11:32:21Z INFO  xdp_log] SRC IP: 172.52.22.104, SRC PORT: 443
[2022-12-22T11:32:21Z INFO  xdp_log] SRC IP: 172.52.22.104, SRC PORT: 443
[2022-12-22T11:32:21Z INFO  xdp_log] SRC IP: 172.52.22.104, SRC PORT: 443
[2022-12-22T11:32:21Z INFO  xdp_log] SRC IP: 172.52.22.104, SRC PORT: 443
[2022-12-22T11:32:21Z INFO  xdp_log] SRC IP: 234.130.159.162, SRC PORT: 443