Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added vlan tag to the bridge cni plugin. #231

Merged
merged 2 commits into from
Apr 11, 2019

Conversation

SchSeba
Copy link
Contributor

@SchSeba SchSeba commented Nov 18, 2018

Allow the user to assign VLAN tag.

Updated netlink library to introduce the VlanFiltering variable into the bridge struct

@squeed
Copy link
Member

squeed commented Nov 19, 2018

Neat. Linux bridges never cease to surprise me :-). Some minor requests:

  • Can you split this in to two commits, one for the actual code and one for the vendor/ bump?
  • Can you add some documentation as to what this does? Especially, be clear that this sets the vlan on the container interface (I didn't immediately understand this)

And some larger questions:

  • Is there a use-case for setting a vlan and not enabling vlanFiltering? Otherwise, I'd rather just delete that knob.
  • Should we handle the case where an existing bridge doesn't have vlanFiltering enabled? Is that a setting that can be changed on-the-fly? ensureBridge should be as idempotent as reasonably possible.

@SchSeba SchSeba force-pushed the add-vlan-tag-to-bridge branch from d1d2fca to 9ff5e57 Compare November 19, 2018 13:02
@SchSeba
Copy link
Contributor Author

SchSeba commented Nov 19, 2018

@squeed

Can you split this in to two commits, one for the actual code and one for the vendor/ bump?

Done

Can you add some documentation as to what this does? Especially, be clear that this sets the vlan on the container interface (I didn't immediately understand this)

Done

Is there a use-case for setting a vlan and not enabling vlanFiltering? Otherwise, I'd rather just delete that knob.

I can't see a use-case like that, The tag will not work if the vlanFiltering is not configured.

Should we handle the case where an existing bridge doesn't have vlanFiltering enabled? Is that a setting that can be changed on-the-fly? ensureBridge should be as idempotent as reasonably possible.

I think we can always enable the vlanFiltering. untag traffic will work unrelated if the feature is enable or not.

@squeed
Copy link
Member

squeed commented Nov 21, 2018

So, we talked about this at the CNI maintainers meeting, and this came up: If a container is started in a vlan, how does it get connectivity? Presumably there's no equivalent vlan-100 "uplink" interface on the "host" side of the bridge (i.e. "br0.100").

What are you envisioning? Do you have a separate step that creates the "br0.100" interface? Or do you expect non-default vlans to be private / isolated?

It's generally expected, but by no means required, that a CNI plugin results in a container being "online." Of course, these are loose definitions, but they are helpful to guide decisions.

Should this plugin also create the "br0.100" interface if configured with a vlan?

@phoracek
Copy link

@squeed forgive me, I don't know much about setting of VLAN directly on a bridge, but my understanding was this:

If I connect two containers on the same host to bridge+vlan, they would be able to reach each other.

If I create a bridges on two nodes and connect them via a trunk interface. Then containers connected to this bridge+vlan can reach each other node-to-node.

Is that right or is more needed?

@dankenigsberg
Copy link

So, we talked about this at the CNI maintainers meeting, and this came up: If a container is started in a vlan, how does it get connectivity? Presumably there's no equivalent vlan-100 "uplink" interface on the "host" side of the bridge (i.e. "br0.100").

If properly set, the bridge has one trunk uplink (e.g. eth0) and multiple ports, one per Pod. Traffic arriving to eth0 tagged with vlanX is repeated (under usual bridge rules) to all ports belonging to this vlan, but with the tag stripped. The reverse happen to egress traffic.

So if I get what you're asking, the bridge port itself plays the role of br0.100 vlan device. They tag egress and strip ingress in the same way.

(/me slightly rephrasing @phoracek)

@dankenigsberg
Copy link

I don't know how it usually works on github, but @SchSeba maybe you can start you commit message by explaining what bridge vlan filtering is (include https://developers.redhat.com/blog/2017/09/14/vlan-filter-support-on-bridge/ ) and why it is useful here.

@SchSeba SchSeba force-pushed the add-vlan-tag-to-bridge branch 4 times, most recently from db0fcec to f81be1f Compare November 26, 2018 14:37
@SchSeba
Copy link
Contributor Author

SchSeba commented Nov 26, 2018

@squeed

Should this plugin also create the "br0.100" interface if configured with a vlan?

The code to allow the vlan to be tagged on the bridge interface. there is not need to create a br0.100 interface.

if n.Vlan != 0 {
		err = netlink.BridgeVlanAdd(br, uint16(n.Vlan), false, true, true, false)
		if err != nil {
			return nil, nil, fmt.Errorf("failed to setup vlan tag on the bridge interface %q: %v", br.Name, err)
		}
	}

@SchSeba
Copy link
Contributor Author

SchSeba commented Dec 4, 2018

@squeed there is any update on this PR?

@squeed
Copy link
Member

squeed commented Dec 12, 2018

Then I still don't understand it. Are you manually trunking in another physical interface? If I want to send a packet from the host to a container on vlan 100, how do I do it?

@SchSeba
Copy link
Contributor Author

SchSeba commented Dec 12, 2018

Hi @squeed Thanks for the comment.

So if the want so send a packet you need to allow the vlan on the bridge interface and configure ip address in the same subnet on that interface.

@dankenigsberg
Copy link

Maybe a little diagram would help explain what we'd like to achieve. The suggested change is aimed to allow multiple Pods, each one connected to a possibly-different vlan, share access to NIC that is connected to a trunk port of an external switch. If the node needs access to a vlan, it can connect to it just like any other namespace.

@squeed am I addressing your question?

    +---------------------------------------------------------------------+
    | node1                                                               |
    |                               +------------------+                  |        
    |    +---------+                |Pod3              |                  |        
    |    |Pod2     |                |                  |                  |        
    |    |         |     +----------+                  |                  |        
    |    |         |     |          |                  |                  |        
    |    |         |     |          +------------------+                  |        
    |    |         |     |                            +---------------+   |
    |    |         |     | vlan300                    | Pod1          |   |    
    |    +----+----+     |           +----------------+               |   |    
    |         |          |           |                |               |   |    
    |         |          |           | vlan200        |               |   |    
    |         |          |           |                +---------------+   |
    |         |          |           |                                    |        
    |         |     +----+-----------+--+                                 |        
    |         |     |                   |                                 |        
    |         |     |                   |                                 |        
    |         +-----+                   |                                 |        
    |    vlan200    |        br0        +---------+                       |        
    |               |                   |    vlan200                      |        
    |               +---------+---------+                                 |        
    |                         |                                           |
    |                         |eth7                                       |
    +---------------------------------------------------------------------+
                              |
                              |  trunk 200,300

if err != nil {
return nil, nil, fmt.Errorf("failed to create bridge %q: %v", n.BrName, err)
}

if n.Vlan != 0 {
err = netlink.BridgeVlanAdd(br, uint16(n.Vlan), false, true, true, false)
Copy link
Contributor

@rosenhouse rosenhouse Dec 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see this vlan filter entry being removed on DEL. Are there any downsides to leaving it in place on the bridge even after all veths tagged with the vid have been removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I don't remove it from the bridge I can't find any good reason to remove it just link we are not removing the bridge if there is no veths connected to it.

@@ -171,6 +180,7 @@ var counter uint
// arguments for a test case.
func (tc testCase) createCmdArgs(targetNS ns.NetNS, dataDir string) *skel.CmdArgs {
conf := tc.netConfJSON(dataDir)
fmt.Println(conf)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you intend to leave this in the testCase?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope my bad sorry

@SchSeba
Copy link
Contributor Author

SchSeba commented Jan 9, 2019

Hi @squeed there is something that prevent this PR from be merged?

Copy link
Member

@dcbw dcbw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mostly /lgtm
Just a couple README nits.

@@ -52,3 +52,8 @@ If the bridge is missing, the plugin will create one on first use and, if gatewa
* `hairpinMode` (boolean, optional): set hairpin mode for interfaces on the bridge. Defaults to false.
* `ipam` (dictionary, required): IPAM configuration to be used for this network. For L2-only network, create empty dictionary.
* `promiscMode` (boolean, optional): set promiscuous mode on the bridge. Defaults to false.
* `vlan` (int, optional): assign VLAN tag. Defaults to none.

*Note:* The VLAN parameter config the VLAN tag on the host end of the veth and also enable the vlan_filtering feature on the bridge interface.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"config" -> "configures"
"enable" -> "enables"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done Thanks!


*Note:* The VLAN parameter config the VLAN tag on the host end of the veth and also enable the vlan_filtering feature on the bridge interface.

*Note:* To configure up link for L2 network you need to allow the vlan on the up link interface by using the follow command ``` bridge vlan add vid VLAN_ID dev DEV```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"up link" -> "uplink"
"follow" -> "following"

And I assume DEV here is like eth0 or whatever the hosts physical NIC would usually be?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I assume DEV here is like eth0 or whatever the hosts physical NIC would usually be?

Correct. A better text would be

To configure up link for L2 network you need to allow the vlan on the up link interface DEV by using the follow command

or maybe use the blog example of bridge vlan add dev bond0 vid 2 master

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@dcbw
Copy link
Member

dcbw commented Jan 9, 2019

So if the want so send a packet you need to allow the vlan on the bridge interface and configure ip address in the same subnet on that interface.

This isn't quite clear enough to me... what would be required on the host side to send a packet to a container on a given VLAN? Would you need to create another veth and attach it to the bridge using the same VLAN ID as the container?

@SchSeba
Copy link
Contributor Author

SchSeba commented Jan 10, 2019

This isn't quite clear enough to me... what would be required on the host side to send a packet to a container on a given VLAN? Would you need to create another veth and attach it to the bridge using the same VLAN ID as the container?

you don't need to create any additional interface, the bridge interface will get the vlan ID.
The bridge also will have ip address if this is a L3 connection just like before.
Then the host can send packets using the bridge interface to the container because they both have the same vlan ID.

@SchSeba SchSeba force-pushed the add-vlan-tag-to-bridge branch from a1438fc to a9ab503 Compare January 10, 2019 08:25
@squeed
Copy link
Member

squeed commented Jan 16, 2019

I tested this with two configurations, same bridge:

{
        "cniVersion": "0.3.1",
        "name": "mybridge",
        "type": "bridge",
        "ipMasq": true,
        "ipam": {
                "type": "host-local",
                "subnet": "10.42.0.0/24"
        },
        "isDefaultGateway": true,
        "hairpinMode": true
}

and

{
        "cniVersion": "0.3.1",
        "name": "myvlanbridge",
        "type": "bridge",
        "ipMasq": true,
        "ipam": {
                "type": "host-local",
                "subnet": "10.52.0.0/24"
        },
        "isDefaultGateway": true,
        "hairpinMode": true,
        "vlan": 99
}

and got the following error:

$ sudo CNI_PATH=./bin ../cni/bin/cnitool add myvlanbridge /var/run/netns/vlan100 
failed to set bridge addr: "cni0" already has an IP address different from 10.52.0.1/24

@squeed
Copy link
Member

squeed commented Jan 23, 2019

Just to follow up on my previous comment: I would expect setting the vlan property on the bridge to work, even when users expect layer3 connectivity.

If I understand your use-case correctly, you only really care about layer2 connectivity, so that's not an issue. However, I would really like the plugins to continue to meet user expectations, and it's not at all clear to me as an end-user why adding vlan tag would require me to drop layer3.

So I'd like you to come up with a way for containers to have both a vlan tag and layer3 connectivity. If this means you have to create bridgename.100 on the host or not, I leave up to you.

@SchSeba SchSeba force-pushed the add-vlan-tag-to-bridge branch 3 times, most recently from 22df5b9 to 0502f71 Compare February 3, 2019 18:55
@squeed
Copy link
Member

squeed commented Feb 20, 2019

Hi @SchSeba,
You are right, and my understanding of the spec was wrong (to be fair, the other maintainers were of the same opinion). I think you're right, and we should not include the IP addresses of the host-side interface. I'm very sorry for the confusion.

@SchSeba
Copy link
Contributor Author

SchSeba commented Feb 21, 2019

Hi @squeed thanks for the comment just to be sure you want me to remove the cni0.99 interface from the result.interfaces list?

@SchSeba
Copy link
Contributor Author

SchSeba commented Feb 26, 2019

@squeed ping?

@dcbw
Copy link
Member

dcbw commented Mar 5, 2019

Hi @squeed thanks for the comment just to be sure you want me to remove the cni0.99 interface from the result.interfaces list?

@SchSeba not quite; leave the cni0.99 interface in the Interfaces array. But when there is an L3 address for cni0 or cni0.99 we don't need to include that in the IPs array because the existing bridge plugin does not include the cni0 address there either.

So I think what you have now is what Casey is asking for.

@SchSeba
Copy link
Contributor Author

SchSeba commented Mar 6, 2019

Thanks for the comment @dcbw so I think this PR is ready to get merge :)

@dcbw
Copy link
Member

dcbw commented Mar 8, 2019

@SchSeba Casey is on PTO today but will do a final check on Monday.

@dankenigsberg
Copy link

dankenigsberg commented Mar 13, 2019

@dcbw @squeed happy Wednesday!

@SchSeba
Copy link
Contributor Author

SchSeba commented Mar 21, 2019

Hi @dcbw @squeed any update on this PR?

@SchSeba
Copy link
Contributor Author

SchSeba commented Mar 21, 2019

Hi @rosenhouse maybe you can help me get this PR merge?

Copy link
Contributor

@bboreham bboreham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks OK; if Casey is ok with it I am too.

@squeed
Copy link
Member

squeed commented Apr 4, 2019

Code LGTM. Can you squash in to two commits (vendor and code), and I'll merge?

@SchSeba SchSeba force-pushed the add-vlan-tag-to-bridge branch from 19363f7 to 67c1036 Compare April 4, 2019 12:15
@SchSeba SchSeba force-pushed the add-vlan-tag-to-bridge branch from 67c1036 to 58a7125 Compare April 4, 2019 14:33
@SchSeba
Copy link
Contributor Author

SchSeba commented Apr 4, 2019

Hi @squeed please take a look on my last commit I was requested to add some lines to the test to fix them after the rebase.
If that's fine from your side I will squash it.

@SchSeba
Copy link
Contributor Author

SchSeba commented Apr 9, 2019

Hi @squeed any update?

@squeed
Copy link
Member

squeed commented Apr 10, 2019

@SchSeba Gotcha. looks good from my side. Squash this and I'll merge it :-)

@squeed squeed added this to the v0.8.0 milestone Apr 10, 2019
With the VLAN filter, the Linux bridge acts more like a real switch, Allow to tag and untag
vlan id's on every interface connected to the bridge.

This PR also creates a veth interface for the bridge vlan interface on L3 configuration.

Related to https://developers.redhat.com/blog/2017/09/14/vlan-filter-support-on-bridge/ post.

Note: This feature was introduced in Linux kernel 3.8 and was added to RHEL in version 7.0.
@SchSeba SchSeba force-pushed the add-vlan-tag-to-bridge branch from 58a7125 to b69a2dd Compare April 10, 2019 19:00
@SchSeba
Copy link
Contributor Author

SchSeba commented Apr 10, 2019

@squeed done and also the tests looks good :)

@dcbw dcbw merged commit e9e1d37 into containernetworking:master Apr 11, 2019
vlanFiltering := false
if n.Vlan != 0 {
vlanFiltering = true
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if vlan filtering is enabled in the begining (for those with Vlan IDs), and a new vlan, without vlan ID, is to be created. Wouldn't the new vlan disable vlan filtering?

@CJCShadowsan
Copy link

I know this sounds like an obvious question, but...

...How does this work?

I've tried creating a bridge interface but can't get pod-to-host networking working over VLANs.

@dankenigsberg
Copy link

I know this sounds like an obvious question, but...

...How does this work?

I've tried creating a bridge interface but can't get pod-to-host networking working over VLANs.

I believe you must first connect the host to a bridge port with the relevant Vlan ID enabled. You should be able to do so by running this CNI in the host networking namespace and set an IP address on the free veth.

@CJCShadowsan
Copy link

I know this sounds like an obvious question, but...
...How does this work?
I've tried creating a bridge interface but can't get pod-to-host networking working over VLANs.

I believe you must first connect the host to a bridge port with the relevant Vlan ID enabled. You should be able to do so by running this CNI in the host networking namespace and set an IP address on the free veth.

Ok - so my setup:

Host:

  • 1x bond device
  • 4x vlan devices (on 4 VLANS) attached to bond0 (bond0.101, bond0.102, etc)

Other hosts:

  • 1x bond device
  • VLAN device bond0.101
  • IPMI device on VLAN 102

Via multus:

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: br1-vl101
spec:
  config: '{
      "cniVersion": "0.4.0",
      "type": "bridge",
      "bridge": "br1",
      "vlan": 101,
      "ipam": {
        "type": "host-local",
        "subnet": "10.8.0.0/24",
        "rangeStart": "10.8.0.220",
        "rangeEnd": "10.8.0.220"
      }
    }'

When I attach this to a POD (incidentally a KubeVirt instance), the address comes up and I see the device show up on a bridge vlan command on the physical host... But can't ping a 10.8.0.5 address which is on the same VLAN, on a physical host (not the host this Pod is running on).

Yet... with the real bond0.101 device I can ping 10.8.0.5.

What am I missing? I was hoping there would be some sort of example to make this work considering this merge request enables a VLAN tag on a bridge plugin but i'm just coming up agonisingly short.

@dankenigsberg
Copy link

Kudos for using KubeVirt ;-) You're most welcome to share your use case on kubevirt-dev

Can you ping between two pods on the same node over their br1-vl101 network? You should!
I don't see how you connect br1 to the external world. I assume it is connected to bond0. IIRC Linux vlan devices such as your bond0.101 grab ethernet frames with the 101 tag before they reach to a bridge connected to bond0. To let the both host and bridge see bidirectional vlan 101 traffic, I suggest again that you create a veth pair, with one side attached to a bridge port tagged with 101 and the other with ip address, both on the host network namespace.

If my guess is off, you'd need a bit of tcpdump sleuthing to find out where your packets are lost.

@CJCShadowsan
Copy link

I'm betting this is where i'm falling down here. So there's more that needs to be done in order to make this work? Lol!

So - dummies guide - If we just use the physical bond, bond0 (which carries all VLANS):

brctl addbr br1 bond0
ip link set br1 type bridge vlan_filtering 1
ip link set bond0 type bridge vlan_filtering 1
bridge vlan add dev bond0 vid 101 master
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: br1-vl101
spec:
  config: '{
      "cniVersion": "0.4.0",
      "type": "bridge",
      "bridge": "br1",
      "master": "bond0",
      "vlan": 101,
      "ipam": {
        "type": "host-local",
        "subnet": "10.8.0.0/24",
        "rangeStart": "10.8.0.220",
        "rangeEnd": "10.8.0.220"
      }
    }'

This creates a veth with VLAN 101, bond0 bridge carries 101... And br1 gets created that links to bond0 if it's created. Is that correct?

@gaby
Copy link

gaby commented Jul 20, 2021

Can someone provide an example of how is this suppose to work? We can use docker ipvlan and get vlan tagged traffic. When using this plugin we can't ping/reach anything using vlans.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.