
Upgrade to more capable, unified Intel driver #960

Closed
wingo opened this issue Sep 21, 2017 · 4 comments


wingo commented Sep 21, 2017

Feature summary

The 2017.03 "Camu" Snabb release came with a new driver for Intel cards, intel_mp. It supported the i210, i350, and 82599 chipsets, with functionality mostly overlapping that of the older, 82599-only driver. Although intel_mp did not yet support some kinds of receive queue assignment that the older driver does, it does support the receive-side scaling (RSS) hash functionality, which the older driver does not. The plan was for Snabb to deprecate and remove the old driver and use only intel_mp, once all existing use cases were accounted for.

We had not tested the performance of the new intel_mp 82599 driver against the lwAFTR or Snabb NFV. This project qualifies the new driver against the lwAFTR. It adds support for RSS in the lwAFTR itself, allowing operators to dedicate more than one CPU core per port to handling lwAFTR traffic.

Deliverable steps

  1. Replace the existing 82599 driver in Snabb with intel_mp.

    a. Implement VLAN insertion, removal, and MAC-based queue assignment for the 82599.

    Done; merged upstream via Implement VMDq, VLAN insert/remove for intel_mp snabbco/snabb#1229. See https://github.com/snabbco/snabb/blob/wingo-next/src/apps/intel_mp/README.md for documentation. Note that the "VMDq" feature has to be on to enable these features; see Support stripping VLAN tags on non-virtualized 82599 NICs snabbco/snabb#749.

    b. Implement 5-tuple receive queue assignment for the intel_mp driver and the 82599 NIC.

    Done; see documentation link above. Multiple workers can be run on one interface as long as their rxq and txq values are different and their poolnum values are the same. Unfortunately, with the VMDq feature enabled, only two queues are available!

    c. Qualify intel_mp driver for 82599 for use with the lwAFTR, reaching the same performance as the older driver.

    Done. In our tests, we see no regression relative to intel10g. It took a few patches to get here, of course!

    d. Remove older driver and update all applications and documentation.

    Done: Use intel_mp driver instead of intel10g driver snabbco/snabb#1237. This branch is upstream-bound but not yet merged upstream; it is merged on the lwaftr branch, of course.

  2. Horizontal scale-out via a hash function over incoming packets, allowing multiple cores per port to be allocated to forwarding lwAFTR traffic. We anticipate the speedup to be near-linear in the number of processes dedicated to incoming traffic.

    Done. See Expand lwAFTR to support multiple worker processes #953 for the general discussion of multiple workers and https://github.com/Igalia/snabb/blob/lwaftr/src/program/lwaftr/doc/configuration.md#multiple-devices for a specific discussion of RSS.
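The VLAN handling in step 1a can be illustrated concretely. The sketch below is not Snabb's Lua implementation, just a Python illustration of what 802.1Q insertion and stripping do to a frame (the 82599 performs this in hardware on the driver's behalf):

```python
import struct

TPID_DOT1Q = 0x8100  # 802.1Q Tag Protocol Identifier

def vlan_insert(frame, vid, pcp=0):
    """Insert an 802.1Q tag between the MAC addresses and the EtherType."""
    tci = (pcp << 13) | (vid & 0x0FFF)  # priority bits + 12-bit VLAN ID
    return frame[:12] + struct.pack("!HH", TPID_DOT1Q, tci) + frame[12:]

def vlan_strip(frame):
    """Remove an 802.1Q tag if present; return (frame, vid or None)."""
    if len(frame) >= 16 and struct.unpack("!H", frame[12:14])[0] == TPID_DOT1Q:
        vid = struct.unpack("!H", frame[14:16])[0] & 0x0FFF
        return frame[:12] + frame[16:], vid
    return frame, None
```

Stripping on receive and re-inserting on transmit is exactly the pair of operations the driver asks the NIC to do per pool when VMDq is enabled.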
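The 5-tuple assignment in step 1b is, conceptually, an exact-match steering table consulted per packet. Here is a toy model of that behavior (hypothetical names; the real table lives in 82599 hardware, not in a Python class):

```python
class FlowDirector:
    """Toy model of exact-match 5-tuple receive queue steering."""

    def __init__(self, default_queue=0):
        self.rules = {}  # 5-tuple -> queue index
        self.default_queue = default_queue

    def add_rule(self, src_ip, dst_ip, src_port, dst_port, proto, queue):
        """Steer packets matching this exact 5-tuple to the given queue."""
        self.rules[(src_ip, dst_ip, src_port, dst_port, proto)] = queue

    def classify(self, src_ip, dst_ip, src_port, dst_port, proto):
        """Queue for a packet: the matched rule, else the default queue."""
        return self.rules.get((src_ip, dst_ip, src_port, dst_port, proto),
                              self.default_queue)
```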
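The scale-out in step 2 relies on the NIC's RSS hash over IP addresses and ports. The 82599 uses a Toeplitz hash; the sketch below substitutes CRC32 as a stand-in purely to show the two properties that matter for near-linear speedup: a given flow always lands on the same worker, and distinct flows spread across workers.

```python
import zlib

def rss_queue(src_ip, dst_ip, src_port, dst_port, n_queues):
    # Stand-in for the 82599's Toeplitz hash: any deterministic hash of
    # the flow tuple gives per-flow queue stickiness and a rough spread
    # of distinct flows across the n_queues RSS queues.
    flow = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    return zlib.crc32(flow) % n_queues
```

Because the mapping is per-flow rather than per-packet, each worker sees whole flows and no reordering is introduced within a flow.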

@wingo wingo added this to the v2017.08.04 milestone Sep 21, 2017

takikawa commented Sep 21, 2017

Here are some implementation thoughts on the horizontal scale-out item, in case they're useful. (Hopefully this is the most appropriate issue for this comment; I can move it elsewhere if not.)

To configure this, we need to set up both VMDq pools (for MAC and VLAN based queue assignment) and RSS queues (for serving multiple queues via multiple processes).

For each instance in snabb-softwire-v2 there are some number of queues. Since each queue can have a different external-interface, it seems necessary to collect the queues that share the same VLAN/MAC/etc. and assign them to a new VMDq pool. There are 64 possible pools (or 32 if we add this as a configuration option). It seems best to assign these pool numbers manually (by setting the poolnum config key) rather than relying on automatic selection, to avoid needing to know which number was selected automatically.

Then for each pool, each queue in the configuration should get assigned an RSS queue. This is just a matter of assigning 0-1 (or 0-3 for 32 pools) for both rxq and txq. (should this also set rxcounter/txcounter for per-queue statistics counters? There are only 16 of those though)

NB: also, when assigning RSS queues, each VMDq pool should have the same number of active queues. This is due to the issue mentioned in #808 about programming the RSS redirection table (the RETA register).
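A sketch of the assignment scheme described above (all names hypothetical, not the snabb-softwire-v2 schema): group queues that share a MAC/VLAN into one manually numbered VMDq pool, hand out rxq/txq indices within each pool, and enforce the equal-queues-per-pool constraint from #808:

```python
from collections import Counter

def assign_pools_and_queues(queues, max_pools=64, queues_per_pool=2):
    """queues: list of dicts with 'id', 'mac', 'vlan' (hypothetical schema).
    Returns one {'id', 'poolnum', 'rxq', 'txq'} plan entry per queue."""
    pools = {}  # (mac, vlan) -> explicitly assigned poolnum
    plan = []
    for q in queues:
        key = (q["mac"], q["vlan"])
        if key not in pools:
            if len(pools) >= max_pools:
                raise ValueError("out of VMDq pools")
            pools[key] = len(pools)  # set poolnum manually, not automatically
        poolnum = pools[key]
        rxq = sum(1 for p in plan if p["poolnum"] == poolnum)
        if rxq >= queues_per_pool:
            raise ValueError("pool %d has too many RSS queues" % poolnum)
        plan.append({"id": q["id"], "poolnum": poolnum, "rxq": rxq, "txq": rxq})
    # RETA constraint (#808): every pool must end up with the same
    # number of active queues.
    counts = Counter(p["poolnum"] for p in plan)
    if len(set(counts.values())) > 1:
        raise ValueError("pools must have equal numbers of active queues")
    return plan
```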


teknico commented Sep 25, 2017

There are 64 possible pools (or 32 if we add this as a configuration option).

The commit where we switched from 32 to 64 pools is ae707ce8f8e4c46fc5095ad9e9d6e1d8e5982e62.


wingo commented Sep 27, 2017

@takikawa @teknico Thanks for these comments! I see now that we should have been setting a VMDq pool manually. That's done in #977. At this point I think I will update the feature description with its documentation. Yay!


wingo commented Oct 2, 2017

Closing as done. Thanks @teknico and @takikawa !

@wingo wingo closed this as completed Oct 2, 2017