Add new program 'firehose' #557
README:

    Usage: firehose [OPTION]... <callback.so>

    Firehose: Execute C callback functions directly on live traffic.

    The callback function is provided as a shared library (.so file) and is
    called with each packet received from one or more 10-Gigabit (Intel 82599)
    network interfaces.

    Receive overhead is very low. firehose can process tens of millions of
    packets per second with one CPU core.

    Print the example (-e option) and header file (-H option) for instructions
    on how to write a callback library.

    Options:
      -H, --print-header   Print a copy of firehose.h.
                           You need this to compile your callback library.
      -e, --example        Print instructions for compiling an example library.
      -i PCI, --input PCI  Receive packets from port with PCI address.
      -t SECONDS, --time SECONDS

Example usage:

    Instructions for creating an example program for firehose.

    This example assumes that you have the 'firehose' executable in your
    path. If you are using the 'snabb' executable then use a syntax like
    'snabb firehose' or './snabb firehose' instead as appropriate.

    Step 1: Create firehose.h

    Create firehose.h with this command:

        $ firehose -H > firehose.h

    Step 2: Create firehose_example.c

    Create the file firehose_example.c with these contents:

        #include <stdio.h>
        #include "firehose.h" // generated by 'firehose -H'

        static int packets;

        void firehose_start() { printf("Starting\n"); }
        void firehose_stop()  { printf("Stopping after %d packets\n", packets); }
        void firehose_packet(const char *pci, char *data, int length) {
          packets++;
        }

    Step 3: Compile the callback library

    Compile firehose_example.so callback shared library:

        $ gcc -O2 -fPIC -shared -o firehose_example.so firehose_example.c

    Step 4: Run the example on one or more 10G ports

    Identify the PCI addresses of the (Intel 82599) network ports that you
    want to test with and run firehose:

        $ sudo firehose -i 0000:01:00.0 -i 0000:01:00.1 \
                        -i 0000:02:00.0 -i 0000:02:00.1 \
                        -t 1 ./firehose_example.so
        Loading shared object: ./firehose_example.so
        Initializing NIC: 0000:01:00.0
        Initializing NIC: 0000:01:00.1
        Initializing NIC: 0000:02:00.0
        Initializing NIC: 0000:02:00.1
        Initializing callback library
        Starting
        Processing traffic...

        Stopping after 41136315 packets

    which shows 41 million packets being processed in one second.
    (This is not the performance limit: that was the total traffic
    being received on the links for this example.)
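For a callback that does slightly more than bump a counter, here is a minimal sketch that classifies frames by EtherType. It uses the same three entry points as the example above. Everything else is an assumption made for illustration: that `data` points at the start of an untagged Ethernet frame, the counter names, and the `classify` helper. It is not the fastnetmon parsing code.

```c
#include <stdio.h>
#include <stdint.h>

/* In a real build, also add: #include "firehose.h" (from 'firehose -H'). */

/* Hypothetical counters, reset in firehose_start(). */
static uint64_t ipv4_packets, ipv6_packets, other_packets, total_bytes;

/* Classify one frame by EtherType.
 * ASSUMPTION: data points at the start of an Ethernet frame with no VLAN
 * tag, so the EtherType is the big-endian 16-bit value at offset 12. */
static void classify(const unsigned char *data, int length) {
  if (length < 14) {            /* too short to hold an Ethernet header */
    other_packets++;
    return;
  }
  unsigned ethertype = ((unsigned)data[12] << 8) | data[13];
  if (ethertype == 0x0800)      ipv4_packets++;
  else if (ethertype == 0x86DD) ipv6_packets++;
  else                          other_packets++;
}

void firehose_start() {
  ipv4_packets = ipv6_packets = other_packets = total_bytes = 0;
  printf("Starting\n");
}

void firehose_stop() {
  printf("IPv4 %llu, IPv6 %llu, other %llu, bytes %llu\n",
         (unsigned long long)ipv4_packets,
         (unsigned long long)ipv6_packets,
         (unsigned long long)other_packets,
         (unsigned long long)total_bytes);
}

void firehose_packet(const char *pci, char *data, int length) {
  (void)pci;                    /* per-port statistics are left out here */
  total_bytes += (uint64_t)length;
  classify((const unsigned char *)data, length);
}
```

It should build with the same `gcc -O2 -fPIC -shared` command as in Step 3. The bounds check comes first so that a short frame is never read past its end.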
The whole inner loop for the example program is only 12 instructions, with no branch except for the loop exit condition. That covers looping over the receive descriptor queue, passing each new packet to the callback, having the callback bump a counter, resetting the descriptor, and continuing until no packets are left. I like the way this design made it possible to inline all of that C code into one place, like what LuaJIT does automatically with its tracing JIT :-). It's cool to browse the disassembly.

Function entry and loop setup:
Inner loop:
Function return:
This is a program that @pavel-odintsov and I have been working on this week. It is a specially optimized program that picks packets up off the wire and passes them to a simple C callback function. We are exploring it as a possibly faster way to get packets into fastnetmon. Hopefully it is also, more generally, the easiest way in the universe to quickly attach a C function to an arbitrarily large amount of traffic.
On Interlaken I have set up a 40G / 41 Mpps load-testing rig (96-byte SYN flood). For simple packet counting I see full line rate with one 2.4 GHz core, and I suspect there is a lot more capacity than this. For basic packet header parsing/analysis based on fastnetmon code I see 34 Mpps. It's really fast.
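To put those rates in per-packet terms, a small sketch of the arithmetic: one core's clock rate divided by the packet rate gives the average number of cycles available per packet. The function names are made up for illustration; the inputs are the 2.4 GHz, 41 Mpps and 34 Mpps figures quoted above.

```c
#include <stdio.h>

/* Average CPU cycles available per packet for one core
 * running at clock_hz and keeping up with packets_per_sec. */
double cycles_per_packet(double clock_hz, double packets_per_sec) {
  return clock_hz / packets_per_sec;
}

/* Print the budget for the two rates quoted above (2.4 GHz core). */
void print_budgets(void) {
  printf("counting: %.1f cycles/packet\n", cycles_per_packet(2.4e9, 41e6));
  printf("parsing:  %.1f cycles/packet\n", cycles_per_packet(2.4e9, 34e6));
}
```

That works out to roughly 58 cycles per packet at 41 Mpps and roughly 71 at 34 Mpps.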
The trick is that we statically initialize all packet buffers and DMA descriptors so that new packets can be received with just a few instructions.
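As a rough illustration of that idea, here is a software-only model of a receive ring whose buffers and descriptors are set up once and then only polled and reset. This is a sketch under stated assumptions, not the firehose source: the `rx_desc` layout is hypothetical and much simpler than a real 82599 descriptor, `fake_nic_receive` stands in for the NIC's DMA write, and the ring and buffer sizes are arbitrary.

```c
#include <stdint.h>
#include <string.h>

#define RING_SIZE 8     /* power of two so wrap-around is a mask (arbitrary) */
#define BUF_SIZE  2048  /* arbitrary buffer size for this model */

/* HYPOTHETICAL descriptor: a real 82599 descriptor is laid out differently. */
struct rx_desc {
  char *buffer;           /* fixed buffer address, written once at init */
  uint16_t length;        /* filled in by the "NIC" on receive */
  volatile uint8_t done;  /* set by the "NIC", cleared by the poll loop */
};

static char buffers[RING_SIZE][BUF_SIZE];
static struct rx_desc ring[RING_SIZE];
static unsigned next_rx;   /* next descriptor software will check */
static unsigned nic_next;  /* next descriptor the "NIC" will fill */
static int packets;        /* bumped by the example callback */

typedef void (*packet_cb)(const char *pci, char *data, int length);

/* Example callback: just count, like the README example. */
static void count_packet(const char *pci, char *data, int length) {
  (void)pci; (void)data; (void)length;
  packets++;
}

/* One-time setup: every descriptor gets its buffer and keeps it forever. */
void ring_init(void) {
  for (unsigned i = 0; i < RING_SIZE; i++) {
    ring[i].buffer = buffers[i];
    ring[i].length = 0;
    ring[i].done = 0;
  }
  next_rx = nic_next = 0;
  packets = 0;
}

/* Stand-in for the NIC's DMA write. Returns 0 if the ring is full. */
int fake_nic_receive(const void *payload, uint16_t length) {
  struct rx_desc *d = &ring[nic_next];
  if (d->done || length > BUF_SIZE) return 0;
  memcpy(d->buffer, payload, length);
  d->length = length;
  d->done = 1;
  nic_next = (nic_next + 1) & (RING_SIZE - 1);
  return 1;
}

/* Poll loop: hand each completed packet to the callback, then reset only
 * the status flag. No buffer is allocated, freed, or swapped per packet. */
int ring_poll(const char *pci, packet_cb cb) {
  int n = 0;
  while (ring[next_rx].done) {
    struct rx_desc *d = &ring[next_rx];
    cb(pci, d->buffer, d->length);
    d->done = 0;
    next_rx = (next_rx + 1) & (RING_SIZE - 1);
    n++;
  }
  return n;
}
```

Calling `ring_init()` once, then alternating `fake_nic_receive(...)` and `ring_poll("0000:01:00.0", count_packet)`, exercises the model. Because the buffer pointers never change after setup, the per-packet work in `ring_poll` is the callback, one flag reset, and an index increment.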