.. Licensed to the Apache Software Foundation (ASF) under one
   or more contributor license agreements. See the NOTICE file
   distributed with this work for additional information
   regarding copyright ownership. The ASF licenses this file
   to you under the Apache License, Version 2.0 (the
   "License"); you may not use this file except in compliance
   with the License. You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing,
   software distributed under the License is distributed on an
   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND,
   either express or implied. See the License for the specific
   language governing permissions and limitations
   under the License.


The VXLAN Plugin
================

General
-------

CloudStack supports VXLAN technology to enhance scalability and flexibility
in networking designs. Using VXLAN (Virtual Extensible LAN) instead of
traditional VLAN (Virtual LAN) as the layer 2 isolation method offers several
key benefits, especially for modern data centers and cloud networking
environments that require high scalability and flexibility.

VXLAN overcomes the limitations of traditional VLANs by providing a highly
scalable, flexible, and efficient networking solution. It enables the creation
of a large number of isolated virtual networks over a common physical
infrastructure, supports better utilization of network resources through
Layer 3 routing capabilities, and simplifies network management and
provisioning.

When deploying a VXLAN-based network, there are two options to choose from:

- Multicast
- EVPN using BGP

While Multicast is the easiest way to set up VXLAN isolation, EVPN offers much
more control, scalability, and flexibility. Therefore, it is chosen for most
VXLAN network deployments.

.. warning::
   Deploying VXLAN, especially with EVPN, requires extensive networking
   knowledge, which isn't covered by this documentation or CloudStack in
   general. Make sure to familiarize yourself with VXLAN, BGP and EVPN before
   attempting to deploy this network technology.

System Requirements / Networking for VXLAN
------------------------------------------

The following table lists the requirements for using VXLAN in your deployment:

+---------------------+---------------------+----------------------------------------------------------------------+
| Item                | Requirement         | Note                                                                 |
+=====================+=====================+======================================================================+
| Hypervisor          | KVM                 | Only the BridgeVifDriver (default) is supported                      |
+---------------------+---------------------+----------------------------------------------------------------------+
| Network Card (NIC)  | VXLAN offloading    | A NIC with VXLAN-offloading support is recommended. For example      |
|                     |                     | Mellanox ConnectX-5 or Intel X710                                    |
+---------------------+---------------------+----------------------------------------------------------------------+
| IP Protocol         | IPv4 or IPv6        | CloudStack is agnostic to the IP-protocol being used as underlay.    |
|                     |                     | Both IPv4 and IPv6 are supported                                     |
+---------------------+---------------------+----------------------------------------------------------------------+
| MTU                 | >=1550              | VXLAN has an overhead of 50 bytes, therefore 1550 is the minimum.    |
|                     |                     | See the notes below                                                  |
+---------------------+---------------------+----------------------------------------------------------------------+
| BGP Routing         | FRRouting (>=10)    | BGP Routing Daemon only required for EVPN.                           |
|                     |                     | Version >=10 is recommended                                          |
+---------------------+---------------------+----------------------------------------------------------------------+

MTU size
~~~~~~~~

When a new VXLAN interface is created, the kernel obtains the current MTU of
the physical interface (ethX or the bridge) and creates the VXLAN
interface/bridge with an MTU that is exactly 50 bytes smaller than the MTU of
the physical interface/bridge. This means that in order to support the default
MTU of 1500 bytes inside an Instance, your VXLAN interface/bridge must also
have an MTU of 1500 bytes, which in turn means that your physical
interface/bridge must have an MTU of at least 1550 bytes. To configure
"jumbo frames" you can, for example, give the physical interface/bridge an MTU
of 9000 bytes; all VXLAN interfaces will then be created with an MTU of
8950 bytes, and the MTU inside the Instance can be set to 8950 bytes.

In general it is recommended to use an MTU of at least 9000 bytes. Most
VXLAN-capable network cards and switches support an MTU of up to 9216 bytes.
Using an MTU of 9216 bytes allows Jumbo Frames (9000 bytes) to be used within
guest networks.

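
For example, assuming ``eth0`` is the physical interface and ``cloudbr1`` the
guest bridge (both names are only examples), the MTU can be raised at runtime
with ``ip link``. This change is not persistent, so also set the MTU in your
distribution's network configuration:

::

   $ sudo ip link set dev eth0 mtu 9216
   $ sudo ip link set dev cloudbr1 mtu 9216
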

VXLAN using Multicast
---------------------

The default mode for using VXLAN is Multicast. The required configuration is
described below.

Important note on max number of multicast groups
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The default value of ``net.ipv4.igmp_max_memberships``
(``cat /proc/sys/net/ipv4/igmp_max_memberships``) is 20, which means that a
host can join at most 20 multicast groups (attach at most 20 multicast IPs on
the host).

Since all VXLAN (VTEP) interfaces provisioned on a host are multicast-based
(each belongs to a certain multicast group and thus has its own multicast IP
that is used as VTEP), this means that you cannot provision more than 20
working VXLAN interfaces per host.

Under Linux you can NOT, by default, provision (start) more than 20 VXLAN
interfaces; the error message "No buffer space available" will appear in the
CloudStack Agent logs after provisioning the required bridges and VXLAN
interfaces.

Increase the parameter to an appropriate value (e.g. 100 or 200) as required,
as shown below. If you need to operate more than 20 Instances from different
client networks, this change is required.

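
A minimal sketch of raising this limit; the value 200 and the file name
``/etc/sysctl.d/99-cloudstack-vxlan.conf`` are only examples, pick values that
fit your environment:

::

   $ sudo sysctl -w net.ipv4.igmp_max_memberships=200
   # make the setting persistent across reboots
   $ echo "net.ipv4.igmp_max_memberships = 200" | sudo tee /etc/sysctl.d/99-cloudstack-vxlan.conf
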

Configure hypervisor: KVM
^^^^^^^^^^^^^^^^^^^^^^^^^

In addition to "KVM Hypervisor Host Installation" in the "CloudStack
Installation Guide", you have to configure the following items on the host.

Create bridge interface with IPv4 address
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This plugin requires an IPv4 address on the KVM host to terminate and
originate VXLAN traffic. The address should be assigned to a physical
interface or to a bridge interface bound to a physical interface. Either a
private or a public address is fine for the purpose. The addresses do not have
to be in the same subnet for all hypervisors in a zone, but the hypervisors
must be able to reach each other via IP multicast on UDP port 8472.

The name of a physical interface, or the name of a bridge interface bound to a
physical interface, can be used as a traffic label. A physical interface name
fits almost all cases, but if the physical interface name differs per host,
you can use a bridge to get the same name on every host. If you would like to
use a bridge name as a traffic label, you can create the bridge as follows.

Let ``cloudbr1`` be the bridge interface for the Instances' private Network.

Configure in RHEL or CentOS
'''''''''''''''''''''''''''

When you configured the ``cloudbr1`` interface as below,

::

   $ sudo vi /etc/sysconfig/network-scripts/ifcfg-cloudbr1

::

   DEVICE=cloudbr1
   TYPE=Bridge
   ONBOOT=yes
   BOOTPROTO=none
   IPV6INIT=no
   IPV6_AUTOCONF=no
   DELAY=5
   STP=yes

you would change the configuration to something similar to the following.

::

   DEVICE=cloudbr1
   TYPE=Bridge
   ONBOOT=yes
   BOOTPROTO=static
   IPADDR=192.0.2.X
   NETMASK=255.255.255.0
   IPV6INIT=no
   IPV6_AUTOCONF=no
   DELAY=5
   STP=yes

Configure in Ubuntu
'''''''''''''''''''

When you configured ``cloudbr1`` as below,

::

   $ sudo vi /etc/network/interfaces

::

   auto lo
   iface lo inet loopback

   # The primary network interface
   auto eth0.100
   iface eth0.100 inet static
       address 192.168.42.11
       netmask 255.255.255.240
       gateway 192.168.42.1
       dns-nameservers 9.9.9.9
       dns-domain lab.example.org

   # Public network
   auto cloudbr0
   iface cloudbr0 inet manual
       bridge_ports eth0.200
       bridge_fd 5
       bridge_stp off
       bridge_maxwait 1

   # Private network
   auto cloudbr1
   iface cloudbr1 inet manual
       bridge_ports eth0.300
       bridge_fd 5
       bridge_stp off
       bridge_maxwait 1

you would change the configuration to something similar to the following.

::

   auto lo
   iface lo inet loopback

   # The primary network interface
   auto eth0.100
   iface eth0.100 inet static
       address 192.168.42.11
       netmask 255.255.255.240
       gateway 192.168.42.1
       dns-nameservers 9.9.9.9
       dns-domain lab.example.org

   # Public network
   auto cloudbr0
   iface cloudbr0 inet manual
       bridge_ports eth0.200
       bridge_fd 5
       bridge_stp off
       bridge_maxwait 1

   # Private network
   auto cloudbr1
   iface cloudbr1 inet static
       address 192.0.2.X
       netmask 255.255.255.0
       bridge_ports eth0.300
       bridge_fd 5
       bridge_stp off
       bridge_maxwait 1

Configure iptables to pass VXLAN packets
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Since VXLAN uses UDP packets to forward the encapsulated L2 frames, UDP port
8472 must be opened. Make sure that your firewall (firewalld, ufw, ...) allows
UDP packets on port 8472, for example:

::

   $ sudo firewall-cmd --zone=public --permanent --add-port=8472/udp
   $ sudo firewall-cmd --reload
   $ sudo ufw allow proto udp from any to any port 8472

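
On hosts that manage their firewall with plain iptables rules instead, a rule
along these lines (a sketch; adjust the chain to your rule set) achieves the
same:

::

   $ sudo iptables -A INPUT -p udp --dport 8472 -j ACCEPT

Note that the EVPN variant described below uses UDP port 4789 instead of 8472,
so open that port if you deploy EVPN.
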

VXLAN using EVPN
----------------

Using VXLAN with BGP+EVPN as the underlay is more complex to set up, but it
scales better and provides much more flexibility.

This documentation cannot cover all elements of deploying BGP+EVPN in your
environment. It is recommended to read `this blogpost `_ before you continue.

The main items for using EVPN are:

- A BGP routing daemon on the hypervisor
- No LACP/Bonding will be used
- The modified script (``modifyvxlan-evpn.sh``) is required, which might
  require tailoring to your situation
- A BGP+EVPN capable and enabled network environment

EVPN Bash script
~~~~~~~~~~~~~~~~

The default ``modifyvxlan.sh`` script installed by CloudStack uses Multicast
for VXLAN. A different version of this script, which uses EVPN instead of
Multicast, also ships with CloudStack by default.

In order to use this script, create a symlink on **each** KVM hypervisor:

::

   $ cd /usr/share
   $ ln -s cloudstack-common/scripts/vm/network/vnet/modifyvxlan-evpn.sh modifyvxlan.sh

This script is also available in the CloudStack `GIT repository `_.

View the contents of the script to understand its inner workings. Some key
items:

- VXLAN (VTEP) devices are created with 'nolearning', disabling the use of
  multicast
- UDP port 4789 (RFC 7348) is used
- IPv4 is used as the underlay
- It assumes an IPv4 (/32) address is configured on the loopback interface,
  which will be used as the VTEP source address

BGP routing daemon
~~~~~~~~~~~~~~~~~~

Using `FRRouting `_ as the routing daemon is recommended, but not required.
FRR is a routing daemon with extensive BGP and EVPN support. Refer to the
FRRouting documentation on how to install the proper packages and get started
with FRR.

A minimal configuration for FRR could look like this:

.. code-block:: bash

   frr defaults traditional
   hostname hypervisor01
   log syslog informational
   no ipv6 forwarding
   service integrated-vtysh-config
   !
   interface ens2f0np0
    no ipv6 nd suppress-ra
   !
   interface ens2f1np1
    no ipv6 nd suppress-ra
   !
   interface lo
    ip address 10.255.192.12/32
    ipv6 address 2001:db8:100::1/128
   !
   router bgp 4200800212
    bgp router-id 10.255.192.12
    no bgp ebgp-requires-policy
    no bgp default ipv4-unicast
    no bgp network import-check
    neighbor uplinks peer-group
    neighbor uplinks remote-as external
    neighbor uplinks ebgp-multihop 255
    neighbor ens2f0np0 interface peer-group uplinks
    neighbor ens2f1np1 interface peer-group uplinks
    !
    address-family ipv4 unicast
     network 10.255.192.12/32
     neighbor uplinks activate
     neighbor uplinks next-hop-self
     neighbor uplinks soft-reconfiguration inbound
     neighbor uplinks route-map upstream-v4-in in
     neighbor uplinks route-map upstream-v4-out out
    exit-address-family
    !
    address-family ipv6 unicast
     network 2001:db8:100::1/128
     neighbor uplinks activate
     neighbor uplinks soft-reconfiguration inbound
     neighbor uplinks route-map upstream-v6-in in
     neighbor uplinks route-map upstream-v6-out out
    exit-address-family
    !
    address-family l2vpn evpn
     neighbor uplinks activate
     advertise-all-vni
     advertise-svi-ip
    exit-address-family

This configuration will:

- Establish two BGP sessions using BGP Unnumbered over the two uplinks
  (ens2f0np0 and ens2f1np1)
- These BGP sessions are usually established with two Top-of-Rack (ToR)
  switches/routers which are BGP+EVPN capable
- Enable the address families ipv4, ipv6 and evpn
- Announce the IPv4 (10.255.192.12/32) and IPv6 (2001:db8:100::1/128) loopback
  addresses
- Advertise all VXLAN networks (VNI) detected locally on the hypervisor
  (vxlan network devices)
- Use ASN 4200800212 for this hypervisor (each node has its own unique ASN)

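
Once FRR is running with a configuration similar to the one above, the BGP and
EVPN state can be inspected from the hypervisor with ``vtysh``; a sketch of
useful read-only commands:

::

   $ sudo vtysh -c "show bgp summary"
   $ sudo vtysh -c "show bgp l2vpn evpn summary"
   $ sudo vtysh -c "show evpn vni"

The first two commands show the state of the BGP sessions towards the uplinks,
the last one lists the VNIs currently known on this hypervisor.
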

BGP and EVPN in the upstream network
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This documentation does not cover configuring BGP and EVPN in the upstream
network. This will differ per network and is therefore difficult to capture in
this documentation. A couple of key items though:

- Each hypervisor will establish eBGP session(s) with the Top-of-Rack
  router(s) in its rack
- These Top-of-Rack devices will connect to (a) Spine router(s)
- On the Spine router(s) the VNIs will terminate and they will act as
  IPv4/IPv6 gateways

The exact BGP and EVPN configuration will differ per networking vendor and
thus differs per deployment.

Setup zone using VXLAN
----------------------

In almost all parts of zone setup, you can just follow the advanced zone setup
instructions in the "CloudStack Installation Guide" to use this plugin. It is
not required to add a Network element nor to reconfigure the Network offering.
The only thing you have to do is configure the physical Network to use VXLAN
as the isolation method for the Guest Network.

Configure the physical Network
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. figure:: /_static/images/vxlan-physicalnetwork.png

CloudStack needs to have one physical Network for Guest Traffic with the
isolation method set to "VXLAN".

.. figure:: /_static/images/vxlan-trafficlabel.png

The Guest Network traffic label should be the name of the physical interface
or the name of the bridge interface, and that interface should have an IPv4
address. See `Create bridge interface with IPv4 address`_ above for details.

Configure the guest traffic
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. figure:: /_static/images/vxlan-vniconfig.png

Specify a range of VNIs you would like to use for carrying guest Network
traffic.

.. warning::
   VNIs must be unique per zone and no duplicate VNIs can exist in the zone.
   Exercise care when designing your VNI allocation policy.

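
Once guest Networks using these VNIs are implemented, you can verify on the
KVM hosts that the expected VXLAN devices and bridges were created; a sketch
using standard iproute2 commands:

::

   $ sudo ip -d link show type vxlan
   $ sudo bridge link show
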