Sunday, June 12, 2016

Playing Openvswitch And Namespace: Veth-pairs, Internal Port, Bridge, Vlan, Vxlan, DHCP, and L3 Routing Tutorial

OpenVswitch And Namespace Playing

Thanks to:

https://read01.com/GQRaP2.html

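This article walks through setting up the following by hand with Open vSwitch and network namespaces:

  1. How veth pairs, internal ports, and a vswitch bridge connect
  2. VLAN setup
  3. VXLAN setup
  4. DHCP service
  5. L3 routing service

After reading this, you should have a clearer picture of DHCP and L3 routing in OpenStack Neutron, and of how vswitches, namespaces, and devices fit together in OpenStack. My approach differs from what Neutron actually does, but my main goal is that this tutorial gives you a deeper feel for Neutron: when you wonder how Neutron works, you can relate it back to the steps here.

This article is important, but it is messily written, because I have been really busy lately.

Perhaps I can find some time to properly describe how OpenStack Neutron does it.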

install vswitch

apt-get install openvswitch-switch

Two Namespaces Connected by a Vswitch Bridge

  1. two namespaces (foonet, bobnet)
  2. one virtual bridge (ovsbr)
  3. connect both namespaces to the bridge and let them communicate with each other
ip netns add foonet
ip netns add bobnet
ovs-vsctl add-br ovsbr

Create a wire (a veth pair). It behaves like a real cable whose two ends are called eth0-foo and veth-foo. Later we will use a vswitch internal port instead of a wire. In this walkthrough the two approaches behave differently: namespaces attached through internal ports can reach each other, while the veth-pair setup below gets no ping reply.

ip link add eth0-foo type veth peer name veth-foo

Here the peer veth-foo is the end that will be plugged into the bridge/switch.

Connect the other end of the wire to foonet:

ip link set eth0-foo netns foonet

Connect the wire to the switch with ovs-vsctl, since it is a vswitch:

ovs-vsctl add-port ovsbr veth-foo

Do the same for bobnet:

ip link add eth0-bob type veth peer name veth-bob
ip link set eth0-bob netns bobnet
ovs-vsctl add-port ovsbr veth-bob

Configure foonet

Check the initial state of foonet, then exit it:

root@openvswitch:~# ip netns exec foonet bash
root@openvswitch:~# ifconfig
root@openvswitch:~#
exit

Configure the lo device first:

root@openvswitch:~# ip netns exec foonet ip link set dev lo up
root@openvswitch:~# ip netns exec foonet ifconfig
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

Bring up eth0-foo. The device name must be the same as the one you defined earlier, or the command fails.

ip netns exec foonet ip link set dev eth0-foo up

Check eth0-foo. In fact, eth0-foo plays the same role as the tap-xx and qvo-xx devices that OpenStack defines, so it gives a clear way to picture the tap device in OpenStack.

root@openvswitch:~# ip netns exec foonet ifconfig
eth0-foo  Link encap:Ethernet  HWaddr 6a:49:62:e1:5b:47
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
 .
 .

Assign an IP address to it and check the result.

root@openvswitch:~# ip netns exec foonet ip address add 10.0.0.10/24 dev eth0-foo
root@openvswitch:~# ip netns exec foonet ifconfig
eth0-foo  Link encap:Ethernet  HWaddr 6a:49:62:e1:5b:47
          inet addr:10.0.0.10  Bcast:0.0.0.0  Mask:255.255.255.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host

Configure bobnet the same way:

ip netns exec bobnet ip link set dev lo up
ip netns exec bobnet ip link set dev eth0-bob up
ip netns exec bobnet ip address add 10.0.0.11/24 dev eth0-bob
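
The veth steps above can be collected into one function. This is only a minimal sketch: the function name attach_ns_veth and the DRY_RUN switch are illustrative, not part of ip or ovs-vsctl. It assumes the bridge already exists and the namespace does not. It also brings up the host-side end of the pair, which the walkthrough above does not do. By default it only prints the commands; set DRY_RUN=0 and run as root to execute them.

```shell
#!/usr/bin/env bash
# Sketch: attach a namespace to an OVS bridge through a veth pair.
# attach_ns_veth and DRY_RUN are illustrative names (assumptions).

# Print each command by default; execute it only when DRY_RUN=0 (needs root).
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "$*"; fi
}

# usage: attach_ns_veth <netns> <bridge> <ns-side dev> <bridge-side dev> <cidr>
attach_ns_veth() {
  local ns=$1 br=$2 inner=$3 outer=$4 cidr=$5
  run ip netns add "$ns"
  run ip link add "$inner" type veth peer name "$outer"
  run ip link set "$inner" netns "$ns"
  run ovs-vsctl --may-exist add-port "$br" "$outer"
  # Host-side end of the wire; this step is not in the walkthrough.
  run ip link set dev "$outer" up
  run ip netns exec "$ns" ip link set dev lo up
  run ip netns exec "$ns" ip link set dev "$inner" up
  run ip netns exec "$ns" ip address add "$cidr" dev "$inner"
}

attach_ns_veth foonet ovsbr eth0-foo veth-foo 10.0.0.10/24
attach_ns_veth bobnet ovsbr eth0-bob veth-bob 10.0.0.11/24
```

Keeping the commands behind a dry-run wrapper makes it easy to review the sequence before touching the host.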

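Ping each other

Now log into bobnet and ping foonet. In this setup the ping gets no reply. Note that the host-side ends veth-foo and veth-bob were never brought up (the ifconfig output above shows eth0-foo as UP but not RUNNING), so that may be the cause rather than the namespaces themselves; it is worth trying ip link set veth-foo up and ip link set veth-bob up on the host.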

ip netns exec bobnet bash
root@openvswitch:~# ping 10.0.0.10
PING 10.0.0.10 (10.0.0.10) 56(84) bytes of data.
^C

Reference: http://www.rendoumi.com/yong-open-vswitch-de-nei-bu-duan-kou-lian-jie-liang-ge-namespace/

With an internal port there is no need for a wire. Just declare an interface, for example nnneth1, add it to the bridge as an internal port, and move that same interface into the namespace. The namespaces can then ping each other.

ovs-vsctl add-port br0 nnneth1 -- set Interface nnneth1 type=internal
ip link set nnneth1 netns ns1

Here nnneth1 can be thought of as a port, not a wire. This is called internal mode.
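
The internal-port approach can be sketched as a single function. The names attach_ns_internal and DRY_RUN are illustrative, and the address 10.0.0.21/24 in the example call is an assumption, since the text does not assign one to ns1. The optional fifth argument adds a VLAN tag. By default the commands are only printed; set DRY_RUN=0 and run as root to execute them.

```shell
#!/usr/bin/env bash
# Sketch: attach a namespace to an OVS bridge through an internal port.
# attach_ns_internal and DRY_RUN are illustrative names (assumptions).

# Print each command by default; execute it only when DRY_RUN=0 (needs root).
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "$*"; fi
}

# usage: attach_ns_internal <netns> <bridge> <port> <cidr> [vlan tag]
attach_ns_internal() {
  local ns=$1 br=$2 port=$3 cidr=$4 tag=${5:-}
  run ip netns add "$ns"
  # ${tag:+...} expands to nothing when no tag was given.
  run ovs-vsctl add-port "$br" "$port" ${tag:+tag="$tag"} \
      -- set interface "$port" type=internal
  run ip link set "$port" netns "$ns"
  run ip netns exec "$ns" ip link set dev lo up
  run ip netns exec "$ns" ip link set dev "$port" up
  run ip netns exec "$ns" ip address add "$cidr" dev "$port"
}

# The address below is an assumed example value.
attach_ns_internal ns1 br0 nnneth1 10.0.0.21/24
```

Passing a tag, for example `attach_ns_internal vlan100net vlanbr vlan100 10.0.0.10/24 100`, prints the tagged form of the add-port command.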

Directly connect two Namespaces

create 2 namespace, dir1net and dir2net

root@openvswitch:~# ip netns add dir1net
root@openvswitch:~# ip netns add dir2net

Create a wire (veth pair):

ip link add eth0-dir type veth peer name veth-dir

Link them

root@openvswitch:~# ip link set eth0-dir netns dir1net
root@openvswitch:~# ip link set veth-dir netns dir2net

With one end of the wire in each namespace, bring both ends up and assign IP addresses.

root@openvswitch:~# ip netns exec dir1net ip link set dev eth0-dir up
root@openvswitch:~# ip netns exec dir2net ip link set dev veth-dir up

root@openvswitch:~# ip netns exec dir1net ip address add 10.0.0.10/24 dev eth0-dir
root@openvswitch:~# ip netns exec dir2net ip address add 10.0.0.11/24 dev veth-dir

Test the connection. It should work. Note that we can reuse 10.0.0.11, which bobnet already uses in the previous setup, because each network namespace has its own network stack.

root@openvswitch:~# ip netns exec dir1net bash
root@openvswitch:~# ping 10.0.0.11
PING 10.0.0.11 (10.0.0.11) 56(84) bytes of data.
64 bytes from 10.0.0.11: icmp_seq=1 ttl=64 time=0.046 ms
root@openvswitch:~# ip netns
dir2net
dir1net
bobnet
foonet

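Trying VLAN

So far we have used veth pairs. Now let's try internal ports. With this approach there is no wire to create: we create a port directly, such as vlan100 in the example below, and put it into both the bridge and the namespace, like plugging in a port, instead of a veth pair with two ends. The two approaches behaved differently in this walkthrough:
1. With veth pairs, namespaces attached to the same bridge did not reach each other in the test above.
2. With internal ports, namespaces attached to the same bridge do reach each other, and VLAN tags are used to isolate them.
OpenStack mainly uses internal ports, so that is what we try next.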

ovs-vsctl add-br vlanbr
ip netns add vlan100net
ip netns add vlan200net
ovs-vsctl add-port vlanbr vlan100 tag=100 -- set interface vlan100 type=internal
ip link set vlan100 netns vlan100net
ovs-vsctl add-port vlanbr vlan200 tag=200 -- set interface vlan200 type=internal
ip link set vlan200 netns vlan200net
root@openvswitch:~# ip netns exec vlan100net ip address add 10.0.0.10/24 dev vlan100
root@openvswitch:~# ip netns exec vlan200net ip address add 10.0.0.11/24 dev vlan200
root@openvswitch:~# ip netns exec vlan100net ip link set dev vlan100 up
root@openvswitch:~# ip netns exec vlan200net ip link set dev vlan200 up
ip netns exec vlan100net ip link set dev lo up
ip netns exec vlan200net ip link set dev lo up

Now we construct vlan110net with VLAN tag 100 as well, so that we can check connectivity both within the same VLAN and across different VLANs. This is the "internal port" approach used by OpenStack.

ip netns add vlan110net
ovs-vsctl add-port vlanbr vlan110 tag=100 -- set interface vlan110 type=internal
ip link set vlan110 netns vlan110net
ip netns exec vlan110net ip address add 10.0.0.12/24 dev vlan110
ip netns exec vlan110net ip link set dev lo up
ip netns exec vlan110net ip link set vlan110 up

Don't forget to bring vlan100 up.

Test connectivity from vlan110net (tag 100) to the tag 100 and tag 200 ports:

root@openvswitch:~# ip netns exec vlan110net bash
root@openvswitch:~# ping 10.0.0.10
PING 10.0.0.10 (10.0.0.10) 56(84) bytes of data.
64 bytes from 10.0.0.10: icmp_seq=1 ttl=64 time=0.918 ms

root@openvswitch:~# ping 10.0.0.11
PING 10.0.0.11 (10.0.0.11) 56(84) bytes of data.
^C

The result: ports with the same VLAN tag can reach each other, but ports with different VLAN tags cannot.
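
The rule behind this result can be written down as a small table-driven check. The sketch below does not touch the network; it only derives which pings are expected to get a reply from the tags configured in this section. The names PORTS and expect_reach are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: derive the expected ping results from the VLAN tags used above.
# PORTS and expect_reach are illustrative names (assumptions).

# namespace:tag:address, as configured in this section
PORTS="vlan100net:100:10.0.0.10 vlan200net:200:10.0.0.11 vlan110net:100:10.0.0.12"

# usage: expect_reach <source namespace>
# Prints one line per other port; a reply is expected only when the tags match.
expect_reach() {
  local src=$1 src_tag="" entry ns tag addr
  # First pass: look up the tag of the source namespace.
  for entry in $PORTS; do
    IFS=: read -r ns tag addr <<<"$entry"
    if [ "$ns" = "$src" ]; then src_tag=$tag; fi
  done
  # Second pass: compare every other port's tag with the source tag.
  for entry in $PORTS; do
    IFS=: read -r ns tag addr <<<"$entry"
    if [ "$ns" = "$src" ]; then continue; fi
    if [ "$tag" = "$src_tag" ]; then
      echo "$src -> $addr ($ns, tag $tag): reply expected"
    else
      echo "$src -> $addr ($ns, tag $tag): no reply expected"
    fi
  done
}

expect_reach vlan110net
```

To compare against the real setup, each line can be checked by hand with a command such as `ip netns exec vlan110net ping -c 1 -W 1 10.0.0.10`.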

Reboot Problem

To show the full command history, which helps in understanding OpenStack Neutron:

ovsdb-tool -mm show-log /etc/openvswitch/conf.db

Note that it contains only the ovs-vsctl commands.

Rebooting will clear all of these settings!

Trying Vxlan

Environment

host1:172.16.235.128 host2:172.16.235.168

VXLAN is used to connect to other nodes. Within a local node we use VLAN.

  1. test vlan tag 10 in host1
  2. test vlan tag 10 in host2
  3. use vxlan inter-connection with tag10
  4. ping in n1 to n2
  5. change host2 to tag 20
  6. use vxlan inter-connection
  7. ping in n1 to n2
  8. n3 with tag10, as n1, and connect to n2

Repeat the process above to build VLAN 100 on both hosts, then add the VXLAN commands that follow.

In host1

ovs-vsctl add-br vlanbr
ip netns add vlan100-1net
ovs-vsctl add-port vlanbr vlan100 tag=100 -- set interface vlan100 type=internal
ip link set vlan100 netns vlan100-1net
ip netns exec vlan100-1net ip address add 10.0.0.10/24 dev vlan100
ip netns exec vlan100-1net ip link set dev lo up
ip netns exec vlan100-1net ip link set dev vlan100 up

In host2

ovs-vsctl add-br vlanbr
ip netns add vlan100-2net
ovs-vsctl add-port vlanbr vlan100 tag=100 -- set interface vlan100 type=internal
ip link set vlan100 netns vlan100-2net
ip netns exec vlan100-2net ip address add 10.0.0.11/24 dev vlan100
ip netns exec vlan100-2net ip link set dev lo up
ip netns exec vlan100-2net ip link set dev vlan100 up

On host1, enter vlan100-1net and ping vlan100-2net:

ping 10.0.0.11

Not connected.

Now add a VXLAN port to enable the connection between the hosts. On host1:

ovs-vsctl add-port vlanbr vxlan0 -- set interface vxlan0 type=vxlan options:remote_ip=172.16.235.168

In host2:

ovs-vsctl add-port vlanbr vxlan0 -- set interface vxlan0 type=vxlan options:remote_ip=172.16.235.128

Back on host1, ping vlan100-2net again:

root@ovsvxlan1:~# ip netns exec vlan100-1net bash
root@ovsvxlan1:~# ping 10.0.0.11
PING 10.0.0.11 (10.0.0.11) 56(84) bytes of data.
64 bytes from 10.0.0.11: icmp_seq=1 ttl=64 time=1.79 ms
64 bytes from 10.0.0.11: icmp_seq=2 ttl=64 time=0.557 ms

It works.
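
The tunnel setup is symmetric: each host adds a vxlan0 port whose remote_ip is the other host's address. The sketch below only prints the command for each side; vxlan_peer_cmd, HOST1 and HOST2 are illustrative names, while the bridge name, port name and addresses are the ones used in this section.

```shell
#!/usr/bin/env bash
# Sketch: print the tunnel command each host needs, given both host addresses.
# vxlan_peer_cmd, HOST1 and HOST2 are illustrative names (assumptions).

# usage: vxlan_peer_cmd <bridge> <remote host ip>
vxlan_peer_cmd() {
  local br=$1 remote=$2
  echo "ovs-vsctl add-port $br vxlan0 -- set interface vxlan0 type=vxlan options:remote_ip=$remote"
}

HOST1=172.16.235.128
HOST2=172.16.235.168

# Each side points its tunnel at the other side's address.
echo "# on host1 ($HOST1):"
vxlan_peer_cmd vlanbr "$HOST2"
echo "# on host2 ($HOST2):"
vxlan_peer_cmd vlanbr "$HOST1"
```

After running the printed commands, `ovs-vsctl show` on each host should list the vxlan0 port under vlanbr.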

Add a port with VLAN tag 200 on host2 with address 10.0.0.21, as before. It should not be reachable.

root@ovsvxlan1:~# ping 10.0.0.21
PING 10.0.0.21 (10.0.0.21) 56(84) bytes of data.
^C

Add another tag 100 port on host2 with address 10.0.0.15. From host1, it is reachable with VLAN tag 100 across nodes via VXLAN:

root@ovsvxlan1:~# ping 10.0.0.15
PING 10.0.0.15 (10.0.0.15) 56(84) bytes of data.
64 bytes from 10.0.0.15: icmp_seq=1 ttl=64 time=2.10 ms
64 bytes from 10.0.0.15: icmp_seq=2 ttl=64 time=0.814 ms
64 bytes from 10.0.0.15: icmp_seq=3 ttl=64 time=0.477 ms

On host2, it is reachable with VLAN tag 100 within the local node:

root@ovsvxlan2:~# ip netns exec vlan100-2net bash
root@ovsvxlan2:~# ping 10.0.0.15
PING 10.0.0.15 (10.0.0.15) 56(84) bytes of data.
64 bytes from 10.0.0.15: icmp_seq=1 ttl=64 time=0.906 ms
64 bytes from 10.0.0.15: icmp_seq=2 ttl=64 time=0.061 ms

Trying DHCP

In host1

apt-get install dnsmasq

create dhcp namespace

ip netns add dhcpnet
ovs-vsctl add-port vlanbr dhcp100 tag=100 -- set interface dhcp100 type=internal
ip link set dhcp100 netns dhcpnet
ip netns exec dhcpnet ip address add 10.0.0.100/24 dev dhcp100
ip netns exec dhcpnet ip link set dev lo up
ip netns exec dhcpnet ip link set dev dhcp100 up

In OpenStack, the DHCP namespace typically gets 10.0.0.2 when the subnet is 10.0.0.0/24.

Launch dnsmasq:

ip netns exec dhcpnet dnsmasq --interface=dhcp100 --dhcp-range=10.0.0.5,10.0.0.150,12h

We will improve this setup later.

On host2, enter the namespace with tag 100:

ip netns exec vlan100-2net bash

release the original ip.

dhclient -r vlan100

Check network.

root@ovsvxlan2:~# ifconfig
vlan100   Link encap:Ethernet  HWaddr 02:6f:fd:6d:69:3e
          inet6 addr: fe80::6f:fdff:fe6d:693e/64 Scope:Link
          UP BROADCAST RUNNING  MTU:1500  Metric:1
          RX packets:77 errors:0 dropped:0 overruns:0 frame:0
          TX packets:166 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:8865 (8.8 KB)  TX bytes:47100 (47.1 KB)

get IP from dnsmasq

dhclient vlan100

The result is:

root@ovsvxlan2:~# ifconfig
.
.
vlan100   Link encap:Ethernet  HWaddr 02:6f:fd:6d:69:3e
          inet addr:10.0.0.41  Bcast:10.0.0.255  Mask:255.255.255.0
          inet6 addr: fe80::6f:fdff:fe6d:693e/64 Scope:Link
          UP BROADCAST RUNNING  MTU:1500  Metric:1
          RX packets:79 errors:0 dropped:0 overruns:0 frame:0
          TX packets:168 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:9552 (9.5 KB)  TX bytes:47784 (47.7 KB)

We get the new IP 10.0.0.41 from dnsmasq. You can also test the other namespaces with tag 100 on host1.

We have talked a lot about namespaces. In this simulation we do not have real VMs and use namespaces instead. Namespaces are great for simulation.

Trying Routing

Create an external bridge to the internet for the routing namespace. The idea is that every namespace/VM sets its gateway to the routing namespace, 10.0.0.1. Traffic flows to the routing namespace at 10.0.0.1, which then forwards it with NAT through another NIC to the internet.

We cannot use eth0 directly, because the namespace device needs a bridge to connect to. So eth0 has to be attached to a bridge that the namespace device can also connect to.
That is why you see br-ex in OpenStack. We put the internet-facing device eth0 on br-ex and connect the namespace device to br-ex too. Since they share br-ex, and as we saw with internal ports, the namespace can now reach the internet.

ovs-vsctl add-br br-ex
ovs-vsctl add-port br-ex eth0

ovs-vsctl add-port br-ex tap0 tag=100
ifconfig eth0 0
ifconfig br-ex 172.16.235.128 netmask 255.255.255.0 up
route add default gw 172.16.235.2 dev br-ex metric 100

where 172.16.235.2 is the host gateway, which you can read from the host's routing table:

root@ovsvxlan1:~# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         172.16.235.2    0.0.0.0         UG    100    0        0 br-ex

Now we create a routing namespace with two NICs: one on vlanbr, shared with the namespaces of the same tenant, and another on br-ex, which leads to the internet. We set the IP of the routing namespace to 10.0.0.1 and its default gateway to 172.16.235.2.

ip netns add routernet
ovs-vsctl add-port vlanbr tapex tag=100 -- set interface tapex type=internal
ip link set tapex netns routernet
ip netns exec routernet ip address add 10.0.0.1/24 dev tapex
ip netns exec routernet ip link set dev tapex up

ovs-vsctl add-port br-ex tapexex -- set interface tapexex type=internal
ip link set tapexex netns routernet
ip netns exec routernet ip link set dev tapexex up

Inside routernet

Assign an external IP, 172.16.235.3, that can reach the 172.16.235.2 gateway, and set up the default gateway.

Check route -n to confirm that the default route exists; it sends non-subnet packets to the gateway via the given device.

ip netns exec routernet bash


ifconfig tapexex 172.16.235.3
route add default gw 172.16.235.2 dev tapexex

Specifying the outgoing device is necessary, since there are two NICs. The routing table in the routernet namespace shows:

root@ovsvxlan1:~# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         172.16.235.2    0.0.0.0         UG    0      0        0 tapexex
10.0.0.0        *               255.255.255.0   U     0      0        0 tapex
172.16.235.0    *               255.255.255.0   U     0      0        0 tapexex

Set up SNAT routing in routernet. Flushing all iptables rules first is necessary, so that we start from a clean initial state.

iptables --flush
iptables --table nat --flush
iptables --delete-chain
iptables --table nat --delete-chain
echo "1" > /proc/sys/net/ipv4/ip_forward
iptables --table nat --append POSTROUTING --out-interface tapexex -j MASQUERADE
iptables --append FORWARD --in-interface tapex -j ACCEPT

where --out-interface tapexex is the internet-facing device and --in-interface tapex is the internal network device.
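
The forwarding and NAT rules can also be applied from the host without opening a shell in the namespace. This is a minimal sketch: setup_snat and DRY_RUN are illustrative names, and it uses sysctl -w in place of the echo into /proc, which has the same effect. It does not flush existing rules. The last command lists the POSTROUTING chain so the MASQUERADE rule can be confirmed. By default the commands are only printed; set DRY_RUN=0 and run as root to execute them.

```shell
#!/usr/bin/env bash
# Sketch: enable forwarding and source NAT inside a router namespace.
# setup_snat and DRY_RUN are illustrative names (assumptions).

# Print each command by default; execute it only when DRY_RUN=0 (needs root).
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "$*"; fi
}

# usage: setup_snat <router netns> <internal dev> <external dev>
setup_snat() {
  local ns=$1 inside=$2 outside=$3
  # ip_forward is set per network namespace, so set it inside the router.
  run ip netns exec "$ns" sysctl -w net.ipv4.ip_forward=1
  run ip netns exec "$ns" iptables --table nat --append POSTROUTING \
      --out-interface "$outside" -j MASQUERADE
  run ip netns exec "$ns" iptables --append FORWARD \
      --in-interface "$inside" -j ACCEPT
  # Show the resulting NAT rule with packet counters.
  run ip netns exec "$ns" iptables --table nat --list POSTROUTING -n -v
}

setup_snat routernet tapex tapexex
```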

Set up resolv.conf (routernet)

root@ovsvxlan1:~# cat /etc/resolv.conf
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 8.8.8.8

You can ping the 172.16.235.2 gateway first, and then the internet by pinging 8.8.8.8, 168.95.1.1, or www.google.com.

root@ovsvxlan1:~# ping 172.16.235.2
PING 172.16.235.2 (172.16.235.2) 56(84) bytes of data.
64 bytes from 172.16.235.2: icmp_seq=1 ttl=128 time=0.723 ms
^C
--- 172.16.235.2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.723/0.723/0.723/0.000 ms
root@ovsvxlan1:~# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=128 time=52.2 ms
^C
--- 8.8.8.8 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 52.216/52.216/52.216/0.000 ms
root@ovsvxlan1:~# ping www.google.com
PING www.google.com (74.125.203.99) 56(84) bytes of data.
64 bytes from th-in-f99.1e100.net (74.125.203.99): icmp_seq=1 ttl=128 time=45.8 ms
^C
--- www.google.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 45.819/45.819/45.819/0.000 ms

In vlan100-3net

ip netns exec vlan100-3net bash

Add a default route so that all non-subnet packets go to the default gateway 10.0.0.1:

route add default gw 10.0.0.1

The routing table shows:

root@ovsvxlan2:~# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.0.0.1        0.0.0.0         UG    0      0        0 vlan100

You may need to clean up previous routes with:

route del -net xxxxx netmask xxxx

Modify resolv.conf (vlan100-3net). If you start dnsmasq with a nameserver parameter, you do not need to modify this file; that is discussed later.

root@ovsvxlan2:~# cat /etc/resolv.conf
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 8.8.8.8
nameserver 10.0.0.100
search localdomain

Ping www.google.com.tw:

root@ovsvxlan2:~# ping www.google.com.tw
PING www.google.com.tw (74.125.204.94) 56(84) bytes of data.
64 bytes from ti-in-f94.1e100.net (74.125.204.94): icmp_seq=1 ttl=127 time=67.8 ms
64 bytes from ti-in-f94.1e100.net (74.125.204.94): icmp_seq=2 ttl=127 time=51.3 ms

With DHCP done earlier, L3 routing is now done as well.

We have now completed DHCP and L3 routing. It is a very simple simulation compared to what OpenStack does, but the point is that we get a feel for how OpenStack works.

More about DHCP

We briefly introduced how to set up DHCP, but it took a lot of manual steps to get traffic to the internet, including changing the nameserver and the default gateway. You can start dnsmasq with the following parameters so that nothing needs to be configured by hand in the VM/namespace and it still reaches the internet.

In dhcpnet, kill the old dnsmasq process and start a new one:

dnsmasq --interface=dhcp100 --dhcp-range=10.0.0.5,10.0.0.150,12h --server=8.8.8.8 --dhcp-option=option:router,10.0.0.1
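
To make later restarts easier, dnsmasq can be started with a pid file. This is a sketch under assumptions: start_dhcp, stop_dhcp, DRY_RUN and the pid file path are illustrative choices, not something the setup above defines. The instance launched earlier was started without a pid file, so it still has to be found and killed by hand once. By default the commands are only printed; set DRY_RUN=0 and run as root to execute them.

```shell
#!/usr/bin/env bash
# Sketch: start dnsmasq in the DHCP namespace with router and DNS options.
# start_dhcp, stop_dhcp, DRY_RUN and PIDFILE are illustrative (assumptions).

# Print each command by default; execute it only when DRY_RUN=0 (needs root).
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "$*"; fi
}

PIDFILE=/run/dnsmasq-dhcpnet.pid   # assumed location

# Stop an instance previously started by start_dhcp, if its pid file exists.
stop_dhcp() {
  if [ -f "$PIDFILE" ]; then
    run kill "$(cat "$PIDFILE")"
  fi
}

# usage: start_dhcp <netns> <dev> <dhcp range> <router ip> <dns server>
start_dhcp() {
  local ns=$1 dev=$2 range=$3 router=$4 dns=$5
  run ip netns exec "$ns" dnsmasq --interface="$dev" \
      --dhcp-range="$range" --server="$dns" \
      --dhcp-option=option:router,"$router" --pid-file="$PIDFILE"
}

stop_dhcp
start_dhcp dhcpnet dhcp100 10.0.0.5,10.0.0.150,12h 10.0.0.1 8.8.8.8
```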

In namespace vlan100-3net

ip netns exec vlan100-3net bash

dhclient -r vlan100-3
dhclient vlan100-3

Show the default gateway:

root@ovsvxlan2:~# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.0.0.1        0.0.0.0         UG    0      0        0 vlan100-3
10.0.0.0        0.0.0.0         255.255.255.0   U     0      0        0 vlan100-3

Result

root@ovsvxlan2:~# ping www.google.com
PING www.google.com (64.233.187.103) 56(84) bytes of data.
64 bytes from tj-in-f103.1e100.net (64.233.187.103): icmp_seq=1 ttl=127 time=52.8 ms

Everything works once dnsmasq hands out the gateway and the nameserver.

Conclusion

  1. Use namespaces instead of VMs.
  2. Internal ports are preferable to veth pairs here, since OpenStack uses internal ports.
  3. With default settings, internal ports on a bridge let two namespaces reach each other, but the veth-pair setup in this walkthrough did not.
  4. Internal ports can be separated by VLAN tag, as OpenStack does.
  5. How to set up DHCP.
  6. The dnsmasq daemon can be launched inside a namespace. Namespaces are powerful.
  7. A dnsmasq running in a namespace does not conflict with the host's dnsmasq, since it lives in its own namespace.
  8. How to set up L3 routing.
  9. L3 routing needs two NICs and a NAT rule. Take care with the route to the internet.
  10. More detailed DHCP settings, including nameserver and gateway, make the environment easier to use for VMs/namespaces.
  11. The namespace and Open vSwitch settings here are not persistent. Everything must be reconfigured, for example from a localrc-style script, after the host reboots, as with OpenStack.
