snabbmark: Add "mp-ring" multiprocess benchmark #804

lukego · 2016-03-06T06:24:11Z

Add a new mp-ring benchmark for measuring the performance of basic multi-process link operations. The benchmark forks worker processes and cycles packets through them via a series of links.

This particular benchmark "just works" in multiprocess mode without any changes to the Snabb core, for two reasons: packets and links are already allocated in shared memory before the children are forked, and freelists are not used because the same packets keep circulating from link to link.
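To illustrate why allocating before the fork matters, here is a minimal C sketch, not the Snabb code itself. It assumes Linux-style anonymous shared mappings; `shared_slot` and `roundtrip_through_child` are hypothetical names standing in for a link and a worker. Memory mapped with `MAP_SHARED` before `fork()` is visible to both parent and child at the same address, so no further setup is needed after forking.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a link: one slot that both processes can see. */
struct shared_slot {
    volatile uint64_t value;
};

/* Map the slot as shared memory BEFORE forking, let the child write a
 * token into it, and return what the parent reads back afterwards.
 * Returns 0 if the mapping or the fork fails. */
uint64_t roundtrip_through_child(uint64_t token)
{
    struct shared_slot *slot = mmap(NULL, sizeof *slot,
                                    PROT_READ | PROT_WRITE,
                                    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (slot == MAP_FAILED) return 0;
    slot->value = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(slot, sizeof *slot);
        return 0;
    }
    if (pid == 0) {
        /* Child: the mapping was inherited, so this store is visible
         * to the parent as well. */
        slot->value = token;
        _exit(0);
    }
    assert(pid > 0);

    /* Parent: wait for the child to exit, then read the shared slot. */
    waitpid(pid, NULL, 0);
    uint64_t seen = slot->value;
    munmap(slot, sizeof *slot);
    return seen;
}
```

Calling `roundtrip_through_child(42)` should return 42 in the parent. With a private mapping (`MAP_PRIVATE`) the child's store would land in its own copy and the parent would still read 0.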

The intention of this benchmark is to be a framework for investigating any fundamental performance limits of inter-process traffic (see #801) and for reproducing specific issues like inter-core conflicts at the cache-coherence level. The code uses a simple and naive Lua implementation ("basic") but can accommodate more sophisticated implementations (in the spirit of the asm code in #603). This could make it a useful tool for prototyping code like 100G mux/demux (#691).

(I have made no attempt to optimize this benchmark. That is another activity entirely.)

cc @xrme @kbara

Examples

[luke@lugano-1:~/git/snabbswitch/src]$ make && sudo ./snabb snabbmark mp-ring --processes 1
make: 'snabb' is up to date.
Benchmark configuration:
       burst: 100
  writebytes: 0
   processes: 1
   readbytes: 0
     packets: 100000000
        mode: basic
   pmuevents: false
  65.28 Mpps ring throughput per process

[luke@lugano-1:~/git/snabbswitch/src]$ make && sudo ./snabb snabbmark mp-ring --processes 2
make: 'snabb' is up to date.
Benchmark configuration:
       burst: 100
  writebytes: 0
   processes: 2
   readbytes: 0
     packets: 100000000
        mode: basic
   pmuevents: false
   5.44 Mpps ring throughput per process

[luke@lugano-1:~/git/snabbswitch/src]$ make && sudo ./snabb snabbmark mp-ring --processes 3
make: 'snabb' is up to date.
Benchmark configuration:
       burst: 100
  writebytes: 0
   processes: 3
   readbytes: 0
     packets: 100000000
        mode: basic
   pmuevents: false
   4.39 Mpps ring throughput per process

Usage

Usage:
  snabbmark mp-ring [OPTIONS]

  -m MODE, --mode MODE
                             Mode of operation. Determines which code
                             is used for the worker processes.
                             Currently supported values:
                               basic -- idiomatic Lua code [default]
  -n PROCESSES, --processes PROCESSES
                             Number of worker processes.
                             Default: 2
  -p PACKETS, --packets PACKETS
                             Number of packets processed by each worker.
                             Default: 100e6 (one hundred million)
  -b BURST, --burst BURST
                             Initial number of packets per link.
                             Default: 100
  -e EVENTS, --events EVENTS
                             Comma-separated list of PMU events to count.
  -r BYTES, --read BYTES
                             Number of bytes to read from each packet.
  -w BYTES, --write BYTES
                             Number of bytes to write to each packet.
  -h, --help
                             Print this usage message.

This benchmark measures the throughput of <N> Snabb processes that are circularly connected together in a ring.

Process <N> uses core <N>.
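As a rough illustration of the pinning described above, the following C sketch uses the Linux `sched_setaffinity` call. It is an assumption about the mechanism, not a copy of the benchmark's code, and `pin_to_core` is a hypothetical helper name. Whether the benchmark numbers cores from 0 or 1 is not stated here, so the sketch simply takes the core index as given.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Restrict the calling process to a single CPU core.
 * Returns 0 on success and -1 on failure (for example when the core
 * index is out of range or not available to this process). */
int pin_to_core(int core)
{
    if (core < 0 || core >= CPU_SETSIZE) return -1;

    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    assert(CPU_ISSET(core, &set));

    /* pid 0 means "the calling process". */
    return sched_setaffinity(0, sizeof set, &set);
}
```

A worker would call `pin_to_core(n)` right after being forked, so that each process in the ring stays on its own core for the duration of the run.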

mention-bot · 2016-03-06T06:24:12Z

By analyzing the blame information on this pull request, we identified @eugeneia and @hb9cwp to be potential reviewers

lukego · 2016-03-06T06:28:48Z

The upstream branch for this change is multiproc and I have already merged it there. No action from other maintainers required here.

The whole multiproc branch will be submitted upstream when it makes sense to merge towards master.

xrme · 2016-03-15T23:05:44Z

I'm not sure if it's kosher to keep commenting here, but this little snippet of the benchmark also causes cache line ping-ponging. If you add a crude C.usleep(100) in there, --mode ff throughput goes up 40%.

   -- Spin until enough packets have been processed
   while counters[0] < c.packets do
      core.lib.compiler_barrier()
   end

I think we could make the main process wait on a semaphore rather than spinning like this.
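A minimal C sketch of that idea, under stated assumptions: it uses a POSIX unnamed semaphore placed in shared memory (`sem_init` with `pshared` set to 1), and the names `shared_state`, `run_workers` and `MAX_WORKERS` are hypothetical. The parent blocks in `sem_wait` instead of polling a counter, so it does not keep pulling the workers' cache lines back to its own core while they run.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_WORKERS 8  /* hypothetical limit for this sketch */

struct shared_state {
    sem_t done;                     /* posted once by each finished worker */
    uint64_t counters[MAX_WORKERS]; /* written once per worker, at the end */
};

/* Fork `workers` children. Each counts `packets` iterations locally,
 * publishes its count once, and posts the semaphore. The parent sleeps
 * in sem_wait() rather than spinning on the counters.
 * Returns the total count across workers, or -1 on error. */
int64_t run_workers(int workers, uint64_t packets)
{
    if (workers < 1 || workers > MAX_WORKERS) return -1;

    struct shared_state *st = mmap(NULL, sizeof *st, PROT_READ | PROT_WRITE,
                                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (st == MAP_FAILED) return -1;
    if (sem_init(&st->done, 1, 0) != 0) {  /* 1 = shared between processes */
        munmap(st, sizeof *st);
        return -1;
    }

    pid_t pids[MAX_WORKERS];
    int started = 0;
    for (int i = 0; i < workers; i++) {
        pid_t pid = fork();
        if (pid < 0) break;
        if (pid == 0) {
            uint64_t local = 0;
            for (uint64_t n = 0; n < packets; n++) local++;
            st->counters[i] = local;  /* single store to shared memory */
            sem_post(&st->done);
            _exit(0);
        }
        pids[started++] = pid;
    }
    assert(started <= MAX_WORKERS);

    /* Block until every started worker has posted; retry on EINTR. */
    for (int i = 0; i < started; i++) {
        while (sem_wait(&st->done) != 0 && errno == EINTR) { }
    }

    int64_t total = 0;
    for (int i = 0; i < started; i++) {
        waitpid(pids[i], NULL, 0);
        total += (int64_t)st->counters[i];
    }

    sem_destroy(&st->done);
    munmap(st, sizeof *st);
    return started == workers ? total : -1;
}
```

For example, `run_workers(3, 1000)` should return 3000. On older glibc versions this needs to be linked with `-pthread`. Each worker touches shared memory only once at the end, which also sidesteps the false sharing that adjacent per-worker counters would otherwise cause.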

lukego · 2016-03-16T05:26:11Z

Interesting! Please push that crude sleep somewhere e.g. your mp-ring branch so that it will show up over on #813. Seems simple if we keep using that branch and PR for discussion. I will merge it up to multiproc whenever that makes sense.

I wonder if movnt would also avoid this ping-pong overhead. Curious to look at the PMU counters for this when I have a chance to understand the exact MESIF interaction.

xrme · 2016-03-16T20:17:27Z

I pushed 65abcd3 to my mp-ring branch, but my commit isn't showing up over on #813.

lukego · 2016-03-17T04:27:50Z

I think it's because there is no open Pull Request from your branch at the moment. GitHub automatically closed #813 when I completed the merge. You should be able to start a new pull request from the same branch to send further changes.

lukego added 3 commits March 6, 2016 04:30

snabbmark: Add 'mp-ring' multiprocess benchmark

25007d8

This benchmark measures the throughput of <N> Snabb processes that are circularly connected together in a ring.

snabbmark: Add CPU affinity to mp-ring

e4ec563

Process <N> uses core <N>.

snabbmark: Add proper command-line syntax to mp-ring

d45f808

lukego self-assigned this Mar 6, 2016

lukego added a commit to lukego/snabb that referenced this pull request Mar 6, 2016

Merge snabbco#804 (snabbmark mp-ring) into multiproc

a5517e0

lukego added the merged label Mar 6, 2016

lukego mentioned this pull request Mar 8, 2016

PMU musings #808

Open

dpino pushed a commit to dpino/snabb that referenced this pull request May 3, 2017

Merge pull request snabbco#804 from Igalia/temporarily-disable-snabbv…

a38707c

…mx-test Temporarily disable snabbvmx selftest

eugeneia closed this Nov 18, 2021
