Add new program 'firehose' #557

Merged
merged 2 commits into from Aug 4, 2015

@lukego lukego commented Jul 10, 2015

This is a program that @pavel-odintsov and I have been working on this week. It is a specially optimized program that picks packets up off the wire and passes them to a simple C callback function. We are exploring it as a possibly faster way to get packets into fastnetmon. Hopefully it is also, in general, the easiest way in the universe to quickly attach a C function to an arbitrarily large amount of traffic.

On Interlaken I have set up a 40G / 41 Mpps load test (96-byte SYN flood). For simple packet counting I see full line rate on one 2.4 GHz core, and I suspect there is a lot more capacity than this. For basic packet header parsing/analysis based on fastnetmon code I see 34 Mpps. It's really fast.
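
As a rough sanity check on the 41 Mpps figure, here is a small sketch of the line-rate arithmetic. It assumes that the 96 bytes exclude the 4-byte Ethernet FCS and that the load is spread over four 10G ports; neither detail is stated above, and the function names are made up for illustration.

```c
#include <stdio.h>

/* Bytes that occupy the wire for every frame but are not part of the
 * frame handed to software (standard Ethernet framing). */
enum {
    ETH_FCS_BYTES      = 4,  /* frame check sequence */
    ETH_PREAMBLE_BYTES = 8,  /* preamble + start-of-frame delimiter */
    ETH_IFG_BYTES      = 12  /* minimum inter-frame gap */
};

/* Maximum packets per second on one link.
 * ASSUMPTION: frame_bytes does not include the FCS. */
double line_rate_pps(double link_bits_per_sec, int frame_bytes)
{
    int wire_bytes = frame_bytes + ETH_FCS_BYTES
                   + ETH_PREAMBLE_BYTES + ETH_IFG_BYTES;
    return link_bits_per_sec / (wire_bytes * 8.0);
}

/* Print the aggregate rate for `ports` links of 10 Gbit/s each. */
void print_aggregate_rate(int ports, int frame_bytes)
{
    double pps = ports * line_rate_pps(10e9, frame_bytes);
    printf("%d x 10G, %d-byte frames: %.2f Mpps\n",
           ports, frame_bytes, pps / 1e6);
}
```

Under those assumptions each frame takes 120 bytes (960 bits) on the wire, so `print_aggregate_rate(4, 96)` reports about 41.67 Mpps, which is in the same ballpark as the load described above.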

The trick is that we statically initialize all packet buffers and DMA descriptors so that new packets can be received with just a few instructions.
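
To make the idea concrete, here is a minimal sketch of what "statically initialized" could look like. It is not the firehose source: the descriptor layout, the names `rx_desc`, `ring_init` and `ring_rearm`, and the sizes are all hypothetical, and ordinary process memory stands in for DMA memory.

```c
#include <stdint.h>
#include <string.h>

#define RING_SIZE   8      /* hypothetical; a power of two */
#define BUFFER_SIZE 2048   /* hypothetical per-packet buffer size */

/* Hypothetical 16-byte receive descriptor. */
struct rx_desc {
    uint64_t address;   /* where the NIC writes the packet */
    uint16_t length;    /* filled in by the NIC */
    uint16_t checksum;
    uint8_t  status;    /* bit 0: a packet is ready */
    uint8_t  errors;
    uint16_t vlan;
};

static char buffers[RING_SIZE][BUFFER_SIZE];
static struct rx_desc ring[RING_SIZE];

/* One-time setup: bind buffer i to descriptor i for the whole run.
 * A real driver would store a DMA (physical) address here; a plain
 * pointer is used only to keep the sketch self-contained. */
void ring_init(void)
{
    memset(ring, 0, sizeof ring);
    for (int i = 0; i < RING_SIZE; i++)
        ring[i].address = (uint64_t)(uintptr_t)buffers[i];
}

/* Per-packet work once the callback has seen the packet: clear the
 * status byte. No buffer is allocated, freed, or swapped. */
void ring_rearm(int i)
{
    ring[i].status = 0;
}
```

Because the buffer address never changes after `ring_init`, handing a slot back to the NIC costs a single store, which is what keeps the per-packet instruction count so small.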

README:

  Usage: firehose [OPTION]... <callback.so>

  Firehose: Execute C callback functions directly on live traffic.

  The callback function is provided as a shared library (.so file) and
  is called with each packet received from one or more 10-Gigabit (Intel
  82599) network interfaces.

  Receive overhead is very low. firehose can process tens of millions of
  packets per second with one CPU core.

  Print the example (-e option) and header file (-H option) for
  instructions on how to write a callback library.

  Options:

    -H, --print-header   Print a copy of firehose.h.
                         You need this to compile your callback library.
    -e, --example        Print instructions for compiling an example library.
    -i PCI, --input PCI  Receive packets from port with PCI address.
    -t SECONDS, --time SECONDS

Example usage:

  Instructions for creating an example program for firehose.

  This example assumes that you have the 'firehose' executable in your
  path. If you are using the 'snabb' executable then use a syntax like
  'snabb firehose' or './snabb firehose' instead as appropriate.

  Step 1: Create firehose.h

    Create firehose.h with this command:
    $ firehose -H > firehose.h

  Step 2: Create firehose_example.c

    Create the file firehose_example.c with these contents:

      #include <stdio.h>
      #include "firehose.h" // generated by 'firehose -H'

      static int packets;
      void firehose_start() { printf("Starting\n"); }
      void firehose_stop()  { printf("Stopping after %d packets\n", packets); }
      void firehose_packet(const char *pci, char *data, int length) { packets++; }

  Step 3: Compile the callback library

    Compile the firehose_example.so callback shared library:
    $ gcc -O2 -fPIC -shared -o firehose_example.so firehose_example.c

  Step 4: Run the example on one or more 10G ports

    Identify the PCI addresses of the (Intel 82599) network ports that
    you want to test with and run firehose:

    $ sudo firehose -i 0000:01:00.0 -i 0000:01:00.1 \
                    -i 0000:02:00.0 -i 0000:02:00.1 \
                    -t 1 ./firehose_example.so
    Loading shared object: ./firehose_example.so
    Initializing NIC: 0000:01:00.0
    Initializing NIC: 0000:01:00.1
    Initializing NIC: 0000:02:00.0
    Initializing NIC: 0000:02:00.1
    Initializing callback library
    Starting
    Processing traffic...
    Stopping after 41136315 packets

    which shows 41 million packets being processed in one
    second. (This is not the performance limit: that was the total
    traffic being received on the links for this example.)
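
As a slightly richer illustration than the counter above, here is a sketch of a callback library that also tallies bytes and IPv4 frames. It uses the same three entry points as the example. It assumes that `data` points at the start of an untagged Ethernet frame, which the README does not state, and the counter names are made up.

```c
#include <stdio.h>
#include <stdint.h>
/* When building the real library, also: #include "firehose.h" */

static uint64_t total_packets, total_bytes, ipv4_packets;

void firehose_start(void)
{
    total_packets = total_bytes = ipv4_packets = 0;
    printf("Starting\n");
}

void firehose_packet(const char *pci, char *data, int length)
{
    (void)pci;  /* per-port statistics are left out of this sketch */
    total_packets++;
    total_bytes += (uint64_t)length;

    /* ASSUMPTION: data starts at the Ethernet header, so the
     * EtherType is the big-endian 16-bit value at bytes 12-13. */
    if (length >= 14) {
        const unsigned char *p = (const unsigned char *)data;
        unsigned ethertype = ((unsigned)p[12] << 8) | p[13];
        if (ethertype == 0x0800)
            ipv4_packets++;
    }
}

void firehose_stop(void)
{
    printf("Stopping after %llu packets, %llu bytes, %llu IPv4\n",
           (unsigned long long)total_packets,
           (unsigned long long)total_bytes,
           (unsigned long long)ipv4_packets);
}
```

It should build with the same gcc command as in Step 3. The length check keeps the callback from reading past the end of a runt frame.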

lukego commented Jul 10, 2015

The whole inner loop for the example program is only 12 instructions, with no branch except for the loop exit condition. The loop walks the receive descriptor queue: it passes each new packet to the callback, the callback bumps a counter, the descriptor is reset, and this continues until no packets are left.

I like the way this design made it possible to inline all of that C code into one place. Like what LuaJIT does automatically with its tracing JIT :-).

Cool to browse the disassembly.

Function entry and loop setup:

00000000000007d0 <firehose_callback_v1>:
 7d0:   49 63 c0                movslq %r8d,%rax
 7d3:   48 c1 e0 04             shl    $0x4,%rax
 7d7:   48 8d 3c 02             lea    (%rdx,%rax,1),%rdi
 7db:   f6 47 0c 01             testb  $0x1,0xc(%rdi)
 7df:   74 41                   je     822 <firehose_callback_v1+0x52>
 7e1:   8b 05 5d 08 20 00       mov    0x20085d(%rip),%eax        # 201044 <packets>
 7e7:   83 e9 01                sub    $0x1,%ecx
 7ea:   44 8d 48 01             lea    0x1(%rax),%r9d
 7ee:   66 90                   xchg   %ax,%ax

Inner loop:

 7f0:   41 83 c0 01             add    $0x1,%r8d
 7f4:   41 21 c8                and    %ecx,%r8d
 7f7:   49 63 c0                movslq %r8d,%rax
 7fa:   4c 8b 14 c6             mov    (%rsi,%rax,8),%r10
 7fe:   48 c1 e0 04             shl    $0x4,%rax
 802:   41 0f 18 0a             prefetcht0 (%r10)
 806:   c6 47 0c 00             movb   $0x0,0xc(%rdi)
 80a:   48 8d 3c 02             lea    (%rdx,%rax,1),%rdi
 80e:   45 89 ca                mov    %r9d,%r10d
 811:   41 83 c1 01             add    $0x1,%r9d
 815:   f6 47 0c 01             testb  $0x1,0xc(%rdi)
 819:   75 d5                   jne    7f0 <firehose_callback_v1+0x20>

Function return:

 81b:   44 89 15 22 08 20 00    mov    %r10d,0x200822(%rip)        # 201044 <packets>
 822:   44 89 c0                mov    %r8d,%eax
 825:   c3                      retq   
 826:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
 82d:   00 00 00 
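
For readers who prefer C to assembly, here is a sketch of a loop that would plausibly compile to something like the listing above. It is a reconstruction from the disassembly, not the actual source: the descriptor layout (16 bytes, status byte at offset 12, bit 0 meaning "packet ready") and the names `rx_desc`, `poll_ring` and `on_packet` are assumptions. `__builtin_prefetch` is a GCC/Clang builtin.

```c
#include <stdint.h>

/* ASSUMED layout: 16 bytes per descriptor, status at offset 12. */
struct rx_desc {
    uint64_t address;
    uint16_t length;
    uint16_t checksum;
    uint8_t  status;   /* bit 0 set when a packet has arrived */
    uint8_t  errors;
    uint16_t vlan;
};

static int packets;

/* Stand-in for the user callback; small enough to be inlined. */
static inline void on_packet(const char *pci, char *data, int length)
{
    (void)pci; (void)data; (void)length;
    packets++;
}

/* Consume every ready descriptor starting at `index` and return the
 * index of the first descriptor that is not ready yet.
 * ring_size must be a power of two so the mask wraps the index. */
int poll_ring(const char *pci, char **buffers,
              struct rx_desc *ring, int ring_size, int index)
{
    while (ring[index].status & 1) {
        int next = (index + 1) & (ring_size - 1);
        __builtin_prefetch(buffers[next]);  /* warm the next packet */
        on_packet(pci, buffers[index], ring[index].length);
        ring[index].status = 0;             /* hand the slot back */
        index = next;
    }
    return index;
}
```

The pieces line up with the listing: the add/and pair is the masked index increment, `prefetcht0` is the prefetch of the next buffer, `movb $0x0,0xc(%rdi)` clears the status byte, and the counter lives in a register until the store at function return.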

lukego added a commit to lukego/snabb that referenced this pull request Jul 10, 2015
lukego added a commit to lukego/snabb that referenced this pull request Jul 26, 2015
@eugeneia eugeneia merged commit ca6d988 into snabbco:master Aug 4, 2015
@lukego lukego deleted the firehose branch February 24, 2016 12:45
wingo added a commit that referenced this pull request Nov 18, 2016
Migrate snabbvmx lwaftr to new configuration, more cleanups