- [IPVS](#ipvs)
  - [What is IPVS](#what-is-ipvs)
  - [IPVS vs. IPTABLES](#ipvs-vs-iptables)
    - [When IPVS falls back to IPTABLES](#when-ipvs-falls-back-to-iptables)
  - [Run kube-proxy in IPVS mode](#run-kube-proxy-in-ipvs-mode)
    - [Prerequisite](#prerequisite)
    - [Local UP Cluster](#local-up-cluster)
    - [GCE Cluster](#gce-cluster)
    - [Cluster Created by Kubeadm](#cluster-created-by-kubeadm)
  - [Debug](#debug)
    - [Check IPVS proxy rules](#check-ipvs-proxy-rules)
    - [Why kube-proxy can't start IPVS mode](#why-kube-proxy-cant-start-ipvs-mode)

# IPVS

This document explains:
- what IPVS is
- the differences between IPVS and IPTABLES
- how to run kube-proxy in IPVS mode, and how to debug it

## What is IPVS

**IPVS (IP Virtual Server)** implements transport-layer load balancing, usually called Layer 4 LAN switching, as part of
the Linux kernel.

IPVS runs on a host and acts as a load balancer in front of a cluster of real servers. IPVS can direct requests for TCP
and UDP-based services to the real servers, and make the services of the real servers appear as virtual services on a single IP address.
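
To make the virtual service and real server vocabulary concrete, here is a minimal sketch of how one virtual service with two real servers could be described to the `ipvsadm` tool. The function name `print_ipvs_setup` and all addresses are hypothetical. The function only prints the commands and does not run them, because applying them needs root privileges and changes the kernel's IPVS table.

```shell
# Print (but do not run) ipvsadm commands for one TCP virtual service that
# uses round-robin scheduling and NAT (masquerade) forwarding.
# Usage: print_ipvs_setup <virtual ip:port> <real ip:port>...
print_ipvs_setup() (
    vip="$1"
    shift
    # -A adds a virtual service, -t selects TCP, -s sets the scheduler.
    echo "ipvsadm -A -t $vip -s rr"
    for rs in "$@"; do
        # -a adds a real server, -r names it, -m selects masquerading.
        echo "ipvsadm -a -t $vip -r $rs -m"
    done
)
```

For example, `print_ipvs_setup 10.0.0.1:80 192.168.0.2:80 192.168.0.3:80` prints one `-A` line followed by two `-a` lines. In IPVS mode, kube-proxy maintains equivalent entries itself, so this is for illustration only.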

## IPVS vs. IPTABLES
IPVS mode was introduced in Kubernetes v1.8, went beta in v1.9, and reached GA in v1.11. IPTABLES mode was added in v1.1 and has been the default operating mode since v1.2. Both IPVS and IPTABLES are based on `netfilter`.
The differences between IPVS mode and IPTABLES mode are as follows:

1. IPVS provides better scalability and performance for large clusters.

2. IPVS supports more sophisticated load balancing algorithms than IPTABLES (least load, least connections, locality, weighted, etc.).

3. IPVS supports server health checking and connection retries, etc.

### When IPVS falls back to IPTABLES
The IPVS proxier uses IPTABLES for packet filtering, SNAT, and masquerading.
Specifically, the IPVS proxier uses ipset to store the source or destination addresses of traffic that must be dropped or masqueraded, so that the number of IPTABLES rules stays constant no matter how many services exist.

The IPVS proxier uses the following ipset sets.

| set name | members | usage |
| :----------------------------- | ---------------------------------------- | ---------------------------------------- |
| KUBE-CLUSTER-IP | All service IP + port | mark-masq when `masquerade-all=true` or `clusterCIDR` is specified |
| KUBE-LOOP-BACK | All service IP + port + IP | masquerade to solve the hairpin problem |
| KUBE-EXTERNAL-IP | service external IP + port | masquerade for packets to external IPs |
| KUBE-LOAD-BALANCER | load balancer ingress IP + port | masquerade for packets to load balancer type service |
| KUBE-LOAD-BALANCER-LOCAL | LB ingress IP + port with `externalTrafficPolicy=local` | accept packets to load balancer with `externalTrafficPolicy=local` |
| KUBE-LOAD-BALANCER-FW | load balancer ingress IP + port with `loadBalancerSourceRanges` | packet filter for load balancer with `loadBalancerSourceRanges` specified |
| KUBE-LOAD-BALANCER-SOURCE-CIDR | load balancer ingress IP + port + source CIDR | packet filter for load balancer with `loadBalancerSourceRanges` specified |
| KUBE-NODE-PORT-TCP | nodeport type service TCP port | masquerade for packets to nodePort(TCP) |
| KUBE-NODE-PORT-LOCAL-TCP | nodeport type service TCP port with `externalTrafficPolicy=local` | accept packets to nodeport service with `externalTrafficPolicy=local` |
| KUBE-NODE-PORT-UDP | nodeport type service UDP port | masquerade for packets to nodePort(UDP) |
| KUBE-NODE-PORT-LOCAL-UDP | nodeport type service UDP port with `externalTrafficPolicy=local` | accept packets to nodeport service with `externalTrafficPolicy=local` |
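
The contents of any set in the table can be inspected on a node with `ipset list <set name>`. The helper below is a small sketch for pulling out just the members. The function name `ipset_members` is hypothetical, and the sketch assumes the usual plain-text layout of `ipset list`, in which the entries follow a `Members:` line.

```shell
# Read `ipset list <set>` output on stdin and print only the member entries,
# that is, every non-empty line that follows the "Members:" line.
ipset_members() {
    awk '/^Members:/ { in_members = 1; next } in_members && NF'
}
```

On a node this could be used as `ipset list KUBE-CLUSTER-IP | ipset_members`, which typically requires root privileges.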

The IPVS proxier falls back on IPTABLES in the following scenarios.

**1. kube-proxy starts with --masquerade-all=true**

If kube-proxy starts with `--masquerade-all=true`, the IPVS proxier masquerades all traffic that accesses a service cluster IP, which matches the behavior of the IPTABLES proxier. With this flag set, the IPTABLES rules installed by the IPVS proxier should look like the following.

```shell
# iptables -t nat -nL

Chain PREROUTING (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */

Chain OUTPUT (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */

Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
KUBE-POSTROUTING all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes postrouting rules */

Chain KUBE-MARK-MASQ (2 references)
target prot opt source destination
MARK all -- 0.0.0.0/0 0.0.0.0/0 MARK or 0x4000

Chain KUBE-POSTROUTING (1 references)
target prot opt source destination
MASQUERADE all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
MASQUERADE all -- 0.0.0.0/0 0.0.0.0/0 match-set KUBE-LOOP-BACK dst,dst,src

Chain KUBE-SERVICES (2 references)
target prot opt source destination
KUBE-MARK-MASQ all -- 0.0.0.0/0 0.0.0.0/0 match-set KUBE-CLUSTER-IP dst,dst
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 match-set KUBE-CLUSTER-IP dst,dst
```
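
The `MARK or 0x4000` target and the `mark match 0x4000/0x4000` condition in the listing above cooperate through a single bit of the packet mark: `KUBE-MARK-MASQ` sets the bit, and `KUBE-POSTROUTING` masquerades packets that carry it. As a rough illustration only (the function name `has_masq_mark` is hypothetical and not part of kube-proxy), the bit test can be sketched in shell arithmetic:

```shell
# Succeed if the given packet mark has the masquerade bit (0x4000) set.
# This mirrors the condition "mark match 0x4000/0x4000": mask the mark with
# 0x4000 and check that the bit survives.
has_masq_mark() {
    [ $(( $1 & 0x4000 )) -ne 0 ]
}
```

For example, `has_masq_mark 0xc000` succeeds because the 0x4000 bit is set alongside another bit, whereas `has_masq_mark 0x8000` fails.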

**2. Specify cluster CIDR in kube-proxy startup**

If kube-proxy starts with `--cluster-cidr=<cidr>`, the IPVS proxier masquerades off-cluster traffic that accesses a service cluster IP, which matches the behavior of the IPTABLES proxier. If kube-proxy is given the cluster CIDR `10.244.16.0/24`, the IPTABLES rules installed by the IPVS proxier should look like the following.

```shell
# iptables -t nat -nL

Chain PREROUTING (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */

Chain OUTPUT (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */

Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
KUBE-POSTROUTING all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes postrouting rules */

Chain KUBE-MARK-MASQ (3 references)
target prot opt source destination
MARK all -- 0.0.0.0/0 0.0.0.0/0 MARK or 0x4000

Chain KUBE-POSTROUTING (1 references)
target prot opt source destination
MASQUERADE all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
MASQUERADE all -- 0.0.0.0/0 0.0.0.0/0 match-set KUBE-LOOP-BACK dst,dst,src

Chain KUBE-SERVICES (2 references)
target prot opt source destination
KUBE-MARK-MASQ all -- !10.244.16.0/24 0.0.0.0/0 match-set KUBE-CLUSTER-IP dst,dst
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 match-set KUBE-CLUSTER-IP dst,dst
```
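
The negated source match `!10.244.16.0/24` on the `KUBE-MARK-MASQ` rule is what restricts masquerading to off-cluster traffic. The sketch below models only that source test, not the accompanying `KUBE-CLUSTER-IP` destination match. The function names are hypothetical, and the code handles IPv4 only.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed if address $1 lies inside CIDR $2 (for example 10.244.16.0/24).
in_cidr() (
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    mask=$(( (0xffffffff << (32 - bits)) & 0xffffffff ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
)

# Print "yes" if a packet from source $1 would pass the negated source
# match for cluster CIDR $2, and "no" otherwise.
would_masquerade() {
    if in_cidr "$1" "$2"; then echo no; else echo yes; fi
}
```

With the cluster CIDR from the listing, `would_masquerade 10.244.16.7 10.244.16.0/24` prints `no` and `would_masquerade 192.168.0.5 10.244.16.0/24` prints `yes`.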

**3. Load Balancer type service**

For a service of type LoadBalancer, the IPVS proxier installs IPTABLES rules that match the ipset `KUBE-LOAD-BALANCER`.
In particular, when the service specifies `loadBalancerSourceRanges` or `externalTrafficPolicy=local`,
the IPVS proxier creates the ipset sets `KUBE-LOAD-BALANCER-LOCAL`/`KUBE-LOAD-BALANCER-FW`/`KUBE-LOAD-BALANCER-SOURCE-CIDR`
and installs IPTABLES rules accordingly, which should look like the following.

```shell
# iptables -t nat -nL

Chain PREROUTING (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */

Chain OUTPUT (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */

Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
KUBE-POSTROUTING all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes postrouting rules */

Chain KUBE-FIREWALL (1 references)
target prot opt source destination
RETURN all -- 0.0.0.0/0 0.0.0.0/0 match-set KUBE-LOAD-BALANCER-SOURCE-CIDR dst,dst,src
KUBE-MARK-DROP all -- 0.0.0.0/0 0.0.0.0/0

Chain KUBE-LOAD-BALANCER (1 references)
target prot opt source destination
KUBE-FIREWALL all -- 0.0.0.0/0 0.0.0.0/0 match-set KUBE-LOAD-BALANCER-FW dst,dst
RETURN all -- 0.0.0.0/0 0.0.0.0/0 match-set KUBE-LOAD-BALANCER-LOCAL dst,dst
KUBE-MARK-MASQ all -- 0.0.0.0/0 0.0.0.0/0

Chain KUBE-MARK-DROP (1 references)
target prot opt source destination
MARK all -- 0.0.0.0/0 0.0.0.0/0 MARK or 0x8000

Chain KUBE-MARK-MASQ (2 references)
target prot opt source destination
MARK all -- 0.0.0.0/0 0.0.0.0/0 MARK or 0x4000

Chain KUBE-POSTROUTING (1 references)
target prot opt source destination
MASQUERADE all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
MASQUERADE all -- 0.0.0.0/0 0.0.0.0/0 match-set KUBE-LOOP-BACK dst,dst,src

Chain KUBE-SERVICES (2 references)
target prot opt source destination
KUBE-LOAD-BALANCER all -- 0.0.0.0/0 0.0.0.0/0 match-set KUBE-LOAD-BALANCER dst,dst
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 match-set KUBE-LOAD-BALANCER dst,dst
```

**4. NodePort type service**

For a service of type NodePort, the IPVS proxier installs IPTABLES rules that match the ipset `KUBE-NODE-PORT-TCP`/`KUBE-NODE-PORT-UDP`.
When `externalTrafficPolicy=local` is specified, the IPVS proxier creates the ipset sets `KUBE-NODE-PORT-LOCAL-TCP`/`KUBE-NODE-PORT-LOCAL-UDP`
and installs IPTABLES rules accordingly.

For a service with a TCP nodePort, the rules should look like the following.

```shell
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */

Chain OUTPUT (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */

Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
KUBE-POSTROUTING all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes postrouting rules */

Chain KUBE-MARK-MASQ (2 references)
target prot opt source destination
MARK all -- 0.0.0.0/0 0.0.0.0/0 MARK or 0x4000

Chain KUBE-NODE-PORT (1 references)
target prot opt source destination
RETURN all -- 0.0.0.0/0 0.0.0.0/0 match-set KUBE-NODE-PORT-LOCAL-TCP dst
KUBE-MARK-MASQ all -- 0.0.0.0/0 0.0.0.0/0

Chain KUBE-POSTROUTING (1 references)
target prot opt source destination
MASQUERADE all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
MASQUERADE all -- 0.0.0.0/0 0.0.0.0/0 match-set KUBE-LOOP-BACK dst,dst,src

Chain KUBE-SERVICES (2 references)
target prot opt source destination
KUBE-NODE-PORT all -- 0.0.0.0/0 0.0.0.0/0 match-set KUBE-NODE-PORT-TCP dst
```

**5. Service with externalIPs specified**

For a service with `externalIPs` specified, the IPVS proxier installs IPTABLES rules that match the ipset `KUBE-EXTERNAL-IP`.
For such a service, the IPTABLES rules should look like the following.

```shell
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */

Chain OUTPUT (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */

Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
KUBE-POSTROUTING all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes postrouting rules */

Chain KUBE-MARK-MASQ (2 references)
target prot opt source destination
MARK all -- 0.0.0.0/0 0.0.0.0/0 MARK or 0x4000

Chain KUBE-POSTROUTING (1 references)
target prot opt source destination
MASQUERADE all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
MASQUERADE all -- 0.0.0.0/0 0.0.0.0/0 match-set KUBE-LOOP-BACK dst,dst,src

Chain KUBE-SERVICES (2 references)
target prot opt source destination
KUBE-MARK-MASQ all -- 0.0.0.0/0 0.0.0.0/0 match-set KUBE-EXTERNAL-IP dst,dst
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 match-set KUBE-EXTERNAL-IP dst,dst PHYSDEV match ! --physdev-is-in ADDRTYPE match src-type !LOCAL
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 match-set KUBE-EXTERNAL-IP dst,dst ADDRTYPE match dst-type LOCAL
```

## Run kube-proxy in IPVS mode

Currently, the local-up scripts, GCE scripts, and kubeadm support switching to IPVS proxy mode by exporting environment variables or specifying flags.

### Prerequisite
IPVS requires the following kernel modules (**Note**: use `nf_conntrack` instead of `nf_conntrack_ipv4` for Linux kernel 4.19 and later):
```shell
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack_ipv4
```
Ensure that these modules either

1. have been compiled into the node kernel. Use

`grep -e ipvs -e nf_conntrack_ipv4 /lib/modules/$(uname -r)/modules.builtin`

and expect results like the following if they are compiled into the kernel.
```
kernel/net/ipv4/netfilter/nf_conntrack_ipv4.ko
kernel/net/netfilter/ipvs/ip_vs.ko
kernel/net/netfilter/ipvs/ip_vs_rr.ko
kernel/net/netfilter/ipvs/ip_vs_wrr.ko
kernel/net/netfilter/ipvs/ip_vs_lc.ko
kernel/net/netfilter/ipvs/ip_vs_wlc.ko
kernel/net/netfilter/ipvs/ip_vs_fo.ko
kernel/net/netfilter/ipvs/ip_vs_ovf.ko
kernel/net/netfilter/ipvs/ip_vs_lblc.ko
kernel/net/netfilter/ipvs/ip_vs_lblcr.ko
kernel/net/netfilter/ipvs/ip_vs_dh.ko
kernel/net/netfilter/ipvs/ip_vs_sh.ko
kernel/net/netfilter/ipvs/ip_vs_sed.ko
kernel/net/netfilter/ipvs/ip_vs_nq.ko
kernel/net/netfilter/ipvs/ip_vs_ftp.ko
```

OR

2. have been loaded.
```shell
# load module <module_name>
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4

# to check loaded modules, use
lsmod | grep -e ip_vs -e nf_conntrack_ipv4
# or
cut -f1 -d " " /proc/modules | grep -e ip_vs -e nf_conntrack_ipv4
```

Packages such as `ipset` should also be installed on the node before using IPVS mode.

Kube-proxy falls back to IPTABLES mode if these requirements are not met.
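
The two checks above (built in or loaded) and the kernel-version note about `nf_conntrack` can be combined into one script. The following is a sketch under stated assumptions: the function names `conntrack_module` and `missing_ipvs_modules` are hypothetical, and the sketch assumes that `/proc/modules` and `modules.builtin` are readable on the node.

```shell
# Pick the conntrack module name for a kernel release string such as
# "4.19.0-25-amd64": nf_conntrack for 4.19 and later, nf_conntrack_ipv4 before.
conntrack_module() (
    release="$1"
    major="${release%%.*}"
    rest="${release#*.}"
    minor="${rest%%[!0-9]*}"
    if [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 19 ]; }; then
        echo nf_conntrack
    else
        echo nf_conntrack_ipv4
    fi
)

# Print every required module that is neither loaded nor built in.
# No output means all required modules are available.
missing_ipvs_modules() (
    release="$(uname -r)"
    builtin_file="/lib/modules/$release/modules.builtin"
    for mod in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh "$(conntrack_module "$release")"; do
        if grep -q "^$mod " /proc/modules 2>/dev/null; then continue; fi
        if grep -q "/$mod\.ko" "$builtin_file" 2>/dev/null; then continue; fi
        echo "$mod"
    done
)
```

Running `missing_ipvs_modules` on a node lists the modules that still need `modprobe`. The script does not check for the `ipset` package, which must be verified separately.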

### Local UP Cluster

Kube-proxy runs in IPTABLES mode by default in a [local-up cluster](https://github.com/kubernetes/community/blob/master/contributors/devel/running-locally.md).

To use IPVS mode, export the environment variable `KUBE_PROXY_MODE=ipvs` before [starting the cluster](https://github.com/kubernetes/community/blob/master/contributors/devel/running-locally.md#starting-the-cluster):
```shell
# before running `hack/local-up-cluster.sh`
export KUBE_PROXY_MODE=ipvs
```

### GCE Cluster

As in a local-up cluster, kube-proxy in [clusters running on GCE](https://kubernetes.io/docs/getting-started-guides/gce/) runs in IPTABLES mode by default. Export the environment variable `KUBE_PROXY_MODE=ipvs` before [starting a cluster](https://kubernetes.io/docs/getting-started-guides/gce/#starting-a-cluster):
```shell
# before running one of the commands chosen to start a cluster:
# curl -sS https://get.k8s.io | bash
# wget -q -O - https://get.k8s.io | bash
# cluster/kube-up.sh
export KUBE_PROXY_MODE=ipvs
```

### Cluster Created by Kubeadm

If you are using kubeadm with a [configuration file](https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-init/#config-file), add `mode: ipvs` to a `KubeProxyConfiguration` document (separated by `---`) in the file that is passed to `kubeadm init`.

```yaml
...
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
...
```

Then run

`kubeadm init --config <path_to_configuration_file>`

to deploy the cluster in IPVS mode.
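
As a convenience, the `KubeProxyConfiguration` document can be appended to an existing kubeadm configuration file from the shell. This is only a sketch: the function name `append_ipvs_proxy_config` is hypothetical, and the `apiVersion` is copied from the snippet above and may need adjusting for your Kubernetes version.

```shell
# Append a KubeProxyConfiguration document that selects IPVS mode to the
# kubeadm configuration file given as $1. The leading "---" line separates
# it from the YAML documents already in the file.
append_ipvs_proxy_config() {
    cat >> "$1" <<'EOF'
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
EOF
}
```

For example, `append_ipvs_proxy_config kubeadm.yaml` followed by `kubeadm init --config kubeadm.yaml`, where `kubeadm.yaml` stands for your own configuration file.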

**Notes**
If IPVS mode is enabled successfully, you should see IPVS proxy rules (use `ipvsadm`) like
```shell
# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.0.0.1:443 rr persistent 10800
  -> 192.168.0.1:6443 Masq 1 1 0
```
or a log line like the following in the kube-proxy logs (for example, `/tmp/kube-proxy.log` for a local-up cluster) while the local cluster is running:
```
Using ipvs Proxier.
```

If there are no IPVS proxy rules, or if the following log lines appear, kube-proxy has failed to use IPVS mode:
```
Can't use ipvs proxier, trying iptables proxier
Using iptables Proxier.
```
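
The log check described above can be scripted. The sketch below greps a kube-proxy log file for the messages quoted in this section. The function name `proxier_from_log` is hypothetical, and the exact log wording may differ between Kubernetes versions.

```shell
# Inspect the kube-proxy log file given as $1 and print which proxier
# it reports using: "ipvs", "iptables", or "unknown".
proxier_from_log() {
    if grep -q "Using ipvs Proxier" "$1"; then
        echo ipvs
    elif grep -q "Using iptables Proxier" "$1"; then
        echo iptables
    else
        echo unknown
    fi
}
```

For a local-up cluster this could be run as `proxier_from_log /tmp/kube-proxy.log`.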
See the following section for more details on debugging.

## Debug

### Check IPVS proxy rules

Use the `ipvsadm` tool to check whether kube-proxy is maintaining IPVS rules correctly. For example, suppose the cluster has the following services:

```
# kubectl get svc --all-namespaces
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes ClusterIP 10.0.0.1 <none> 443/TCP 1d
kube-system kube-dns ClusterIP 10.0.0.10 <none> 53/UDP,53/TCP 1d
```
The IPVS proxy rules may then look like:

```shell
# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.0.0.1:443 rr persistent 10800
  -> 192.168.0.1:6443 Masq 1 1 0
TCP 10.0.0.10:53 rr
  -> 172.17.0.2:53 Masq 1 0 0
UDP 10.0.0.10:53 rr
  -> 172.17.0.2:53 Masq 1 0 0
```
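
When a cluster has many services, it can help to condense the `ipvsadm -ln` output to one line per virtual service so that it can be compared against `kubectl get svc`. The following is a small sketch. The function name `summarize_ipvs` is hypothetical, and it assumes the output layout shown above.

```shell
# Read `ipvsadm -ln` output on stdin and print one line per virtual service:
# "<protocol> <address:port> <scheduler> <number of real servers>".
summarize_ipvs() {
    awk '
        function emit() { if (svc != "") print svc, count }
        # A line starting with TCP or UDP opens a new virtual service.
        $1 == "TCP" || $1 == "UDP" { emit(); svc = $1 " " $2 " " $3; count = 0; next }
        # Lines starting with "->" are real servers, except the column header.
        $1 == "->" && $2 != "RemoteAddress:Port" { count++ }
        END { emit() }
    '
}
```

On a node this could be used as `ipvsadm -ln | summarize_ipvs`. A virtual service that reports zero real servers corresponds to a service with no ready endpoints on that listing.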

### Why kube-proxy can't start IPVS mode

Use the following checklist to troubleshoot the problem:

**1. Specify proxy-mode=ipvs**

Check whether the kube-proxy mode has been set to `ipvs`.

**2. Install required kernel modules and packages**

Check whether the kernel modules required by IPVS have been compiled into the kernel or loaded, and whether the required packages are installed (see [Prerequisite](#prerequisite)).