- [IPVS](#ipvs)
  - [What is IPVS](#what-is-ipvs)
  - [IPVS vs. IPTABLES](#ipvs-vs-iptables)
    - [When IPVS falls back to IPTABLES](#when-ipvs-falls-back-to-iptables)
  - [Run kube-proxy in IPVS mode](#run-kube-proxy-in-ipvs-mode)
    - [Prerequisite](#prerequisite)
    - [Local UP Cluster](#local-up-cluster)
    - [GCE Cluster](#gce-cluster)
    - [Cluster Created by Kubeadm](#cluster-created-by-kubeadm)
  - [Debug](#debug)
    - [Check IPVS proxy rules](#check-ipvs-proxy-rules)
    - [Why kube-proxy can't start IPVS mode](#why-kube-proxy-cant-start-ipvs-mode)

# IPVS

This document intends to show users
- what IPVS is
- the differences between IPVS and IPTABLES
- how to run kube-proxy in IPVS mode and how to debug it

## What is IPVS

**IPVS (IP Virtual Server)** implements transport-layer load balancing, usually called Layer 4 LAN switching, as part of
the Linux kernel.

IPVS runs on a host and acts as a load balancer in front of a cluster of real servers. IPVS can direct requests for TCP-
and UDP-based services to the real servers, and make services of real servers appear as virtual services on a single IP address.
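
For illustration only, here is a minimal sketch of what IPVS does outside of Kubernetes, using the `ipvsadm` CLI; the virtual IP `10.10.0.1` and the real-server addresses are made-up values:

```shell
# Create a virtual service on 10.10.0.1:80 using round-robin scheduling,
# then add two real servers behind it in NAT (masquerade) mode.
ipvsadm -A -t 10.10.0.1:80 -s rr
ipvsadm -a -t 10.10.0.1:80 -r 192.168.1.10:80 -m
ipvsadm -a -t 10.10.0.1:80 -r 192.168.1.11:80 -m

# List the resulting virtual service and its real servers.
ipvsadm -ln
```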

## IPVS vs. IPTABLES

IPVS mode was introduced in Kubernetes v1.8, went beta in v1.9, and went GA in v1.11. IPTABLES mode was added in v1.1 and has been the default operating mode since v1.2. Both IPVS and IPTABLES are based on `netfilter`.
The differences between IPVS mode and IPTABLES mode are as follows:

1. IPVS provides better scalability and performance for large clusters.

2. IPVS supports more sophisticated load balancing algorithms than IPTABLES (least load, least connections, locality, weighted, etc.).

3. IPVS supports server health checking and connection retries, etc.

### When IPVS falls back to IPTABLES

The IPVS proxier employs IPTABLES for packet filtering, SNAT, and masquerading.
Specifically, the IPVS proxier uses ipset to store the source or destination addresses of traffic that must be dropped or masqueraded, so that the number of IPTABLES rules stays constant no matter how many services we have.

Here is the table of ipset sets that the IPVS proxier uses.

| set name                       | members                                  | usage                                    |
| :----------------------------- | ---------------------------------------- | ---------------------------------------- |
| KUBE-CLUSTER-IP                | All service IP + port                    | Mark-Masq for cases where `masquerade-all=true` or `clusterCIDR` is specified |
| KUBE-LOOP-BACK                 | All service IP + port + IP               | masquerade for resolving hairpin traffic |
| KUBE-EXTERNAL-IP               | service external IP + port               | masquerade for packets to external IPs   |
| KUBE-LOAD-BALANCER             | load balancer ingress IP + port          | masquerade for packets to load balancer type service  |
| KUBE-LOAD-BALANCER-LOCAL       | load balancer ingress IP + port with `externalTrafficPolicy=local` | accept packets to load balancer with `externalTrafficPolicy=local` |
| KUBE-LOAD-BALANCER-FW          | load balancer ingress IP + port with `loadBalancerSourceRanges` | packet filter for load balancer with `loadBalancerSourceRanges` specified |
| KUBE-LOAD-BALANCER-SOURCE-CIDR | load balancer ingress IP + port + source CIDR | packet filter for load balancer with `loadBalancerSourceRanges` specified |
| KUBE-NODE-PORT-TCP             | NodePort type service TCP port           | masquerade for packets to nodePort (TCP) |
| KUBE-NODE-PORT-LOCAL-TCP       | NodePort type service TCP port with `externalTrafficPolicy=local` | accept packets to NodePort service with `externalTrafficPolicy=local` |
| KUBE-NODE-PORT-UDP             | NodePort type service UDP port           | masquerade for packets to nodePort (UDP) |
| KUBE-NODE-PORT-LOCAL-UDP       | NodePort type service UDP port with `externalTrafficPolicy=local` | accept packets to NodePort service with `externalTrafficPolicy=local` |
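
To see which addresses kube-proxy has placed in one of these sets on a node, you can list the set members with the `ipset` tool (set names as in the table above; output varies by cluster):

```shell
# Show the members of the KUBE-CLUSTER-IP set (service cluster IPs and ports).
ipset list KUBE-CLUSTER-IP

# Show the names of all kube-proxy managed sets.
ipset list -n | grep KUBE
```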

The IPVS proxier falls back to IPTABLES in the following scenarios.

**1. kube-proxy starts with --masquerade-all=true**

If kube-proxy starts with `--masquerade-all=true`, the IPVS proxier masquerades all traffic accessing the service Cluster IP, which behaves the same as the IPTABLES proxier. Suppose kube-proxy has the flag `--masquerade-all=true` specified; then the IPTABLES rules installed by the IPVS proxier should look like what is shown below.

```shell
# iptables -t nat -nL

Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
KUBE-POSTROUTING  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes postrouting rules */

Chain KUBE-MARK-MASQ (2 references)
target     prot opt source               destination
MARK       all  --  0.0.0.0/0            0.0.0.0/0            MARK or 0x4000

Chain KUBE-POSTROUTING (1 references)
target     prot opt source               destination
MASQUERADE  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
MASQUERADE  all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-LOOP-BACK dst,dst,src

Chain KUBE-SERVICES (2 references)
target     prot opt source               destination
KUBE-MARK-MASQ  all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-CLUSTER-IP dst,dst
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-CLUSTER-IP dst,dst
```
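
As a rough sketch (the exact way kube-proxy is launched depends on your deployment), the corresponding kube-proxy flags would look like:

```shell
# Run kube-proxy in IPVS mode and masquerade all service traffic.
kube-proxy --proxy-mode=ipvs --masquerade-all=true
```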

**2. Specify cluster CIDR in kube-proxy startup**

If kube-proxy starts with `--cluster-cidr=<cidr>`, the IPVS proxier masquerades off-cluster traffic accessing the service Cluster IP, which behaves the same as the IPTABLES proxier. Suppose kube-proxy is provided with the cluster CIDR `10.244.16.0/24`; then the IPTABLES rules installed by the IPVS proxier should look like what is shown below.

```shell
# iptables -t nat -nL

Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
KUBE-POSTROUTING  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes postrouting rules */

Chain KUBE-MARK-MASQ (3 references)
target     prot opt source               destination
MARK       all  --  0.0.0.0/0            0.0.0.0/0            MARK or 0x4000

Chain KUBE-POSTROUTING (1 references)
target     prot opt source               destination
MASQUERADE  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
MASQUERADE  all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-LOOP-BACK dst,dst,src

Chain KUBE-SERVICES (2 references)
target     prot opt source               destination
KUBE-MARK-MASQ  all  -- !10.244.16.0/24       0.0.0.0/0            match-set KUBE-CLUSTER-IP dst,dst
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-CLUSTER-IP dst,dst
```
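
Similarly, a rough sketch of the matching kube-proxy flags for this scenario (again, how the flags are actually wired in depends on your deployment):

```shell
# Run kube-proxy in IPVS mode and masquerade only off-cluster traffic,
# using the example pod network CIDR 10.244.16.0/24.
kube-proxy --proxy-mode=ipvs --cluster-cidr=10.244.16.0/24
```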

**3. Load Balancer type service**

For a LoadBalancer type service, the IPVS proxier installs IPTABLES rules that match the ipset `KUBE-LOAD-BALANCER`.
In particular, when the service's `loadBalancerSourceRanges` is specified or `externalTrafficPolicy=local` is set,
the IPVS proxier creates the ipset sets `KUBE-LOAD-BALANCER-LOCAL`/`KUBE-LOAD-BALANCER-FW`/`KUBE-LOAD-BALANCER-SOURCE-CIDR`
and installs IPTABLES rules accordingly, which should look like what is shown below.

```shell
# iptables -t nat -nL

Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
KUBE-POSTROUTING  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes postrouting rules */

Chain KUBE-FIREWALL (1 references)
target     prot opt source               destination
RETURN     all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-LOAD-BALANCER-SOURCE-CIDR dst,dst,src
KUBE-MARK-DROP  all  --  0.0.0.0/0            0.0.0.0/0

Chain KUBE-LOAD-BALANCER (1 references)
target     prot opt source               destination
KUBE-FIREWALL  all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-LOAD-BALANCER-FW dst,dst
RETURN     all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-LOAD-BALANCER-LOCAL dst,dst
KUBE-MARK-MASQ  all  --  0.0.0.0/0            0.0.0.0/0

Chain KUBE-MARK-DROP (1 references)
target     prot opt source               destination
MARK       all  --  0.0.0.0/0            0.0.0.0/0            MARK or 0x8000

Chain KUBE-MARK-MASQ (2 references)
target     prot opt source               destination
MARK       all  --  0.0.0.0/0            0.0.0.0/0            MARK or 0x4000

Chain KUBE-POSTROUTING (1 references)
target     prot opt source               destination
MASQUERADE  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
MASQUERADE  all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-LOOP-BACK dst,dst,src

Chain KUBE-SERVICES (2 references)
target     prot opt source               destination
KUBE-LOAD-BALANCER  all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-LOAD-BALANCER dst,dst
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-LOAD-BALANCER dst,dst
```
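
For reference, a hypothetical Service manifest that exercises these sets might look like the following; the name, selector, ports, and source range are made-up values:

```shell
# Create a LoadBalancer Service with a source-range restriction and local traffic policy.
kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: my-lb-service
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  loadBalancerSourceRanges:
  - 203.0.113.0/24
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080
EOF
```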

**4. NodePort type service**

For a NodePort type service, the IPVS proxier installs IPTABLES rules that match the ipsets `KUBE-NODE-PORT-TCP`/`KUBE-NODE-PORT-UDP`.
When `externalTrafficPolicy=local` is specified, the IPVS proxier creates the ipset sets `KUBE-NODE-PORT-LOCAL-TCP`/`KUBE-NODE-PORT-LOCAL-UDP`
and installs IPTABLES rules accordingly, which should look like what is shown below.

Suppose we have a service with a TCP type nodePort.

```shell
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
KUBE-POSTROUTING  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes postrouting rules */

Chain KUBE-MARK-MASQ (2 references)
target     prot opt source               destination
MARK       all  --  0.0.0.0/0            0.0.0.0/0            MARK or 0x4000

Chain KUBE-NODE-PORT (1 references)
target     prot opt source               destination
RETURN     all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-NODE-PORT-LOCAL-TCP dst
KUBE-MARK-MASQ  all  --  0.0.0.0/0            0.0.0.0/0

Chain KUBE-POSTROUTING (1 references)
target     prot opt source               destination
MASQUERADE  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
MASQUERADE  all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-LOOP-BACK dst,dst,src

Chain KUBE-SERVICES (2 references)
target     prot opt source               destination
KUBE-NODE-PORT  all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-NODE-PORT-TCP dst
```
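
For a TCP node port, one way to exercise the local-only path is to switch an existing NodePort Service (the Service name `my-nodeport-service` is hypothetical) to the local traffic policy, after which its node port should appear in the `KUBE-NODE-PORT-LOCAL-TCP` set:

```shell
# Switch an existing NodePort Service to local external traffic policy.
kubectl patch svc my-nodeport-service -p '{"spec":{"externalTrafficPolicy":"Local"}}'

# Verify the node port shows up in the local-only ipset on the node.
ipset list KUBE-NODE-PORT-LOCAL-TCP
```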

**5. Service with externalIPs specified**

For a service with `externalIPs` specified, the IPVS proxier installs IPTABLES rules that match the ipset `KUBE-EXTERNAL-IP`.
Suppose we have a service with `externalIPs` specified; the IPTABLES rules should look like what is shown below.

```shell
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
KUBE-POSTROUTING  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes postrouting rules */

Chain KUBE-MARK-MASQ (2 references)
target     prot opt source               destination
MARK       all  --  0.0.0.0/0            0.0.0.0/0            MARK or 0x4000

Chain KUBE-POSTROUTING (1 references)
target     prot opt source               destination
MASQUERADE  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
MASQUERADE  all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-LOOP-BACK dst,dst,src

Chain KUBE-SERVICES (2 references)
target     prot opt source               destination
KUBE-MARK-MASQ  all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-EXTERNAL-IP dst,dst
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-EXTERNAL-IP dst,dst PHYSDEV match ! --physdev-is-in ADDRTYPE match src-type !LOCAL
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-EXTERNAL-IP dst,dst ADDRTYPE match dst-type LOCAL
```
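
As one way to try this out, `kubectl expose` can attach an external IP to a Service; the deployment name `my-app` and the address `198.51.100.10` below are made-up values:

```shell
# Expose a deployment as a Service with an additional external IP.
kubectl expose deployment my-app --port=80 --target-port=8080 \
  --external-ip=198.51.100.10

# The external IP and port should now appear in the KUBE-EXTERNAL-IP set.
ipset list KUBE-EXTERNAL-IP
```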

## Run kube-proxy in IPVS mode

Currently, local-up scripts, GCE scripts, and kubeadm support switching to IPVS proxy mode by exporting environment variables or specifying flags.

### Prerequisite

Ensure that the kernel modules required by IPVS (**Notes**: use `nf_conntrack` instead of `nf_conntrack_ipv4` for Linux kernel 4.19 and later)
```shell
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack_ipv4
```
1. have been compiled into the node kernel. Use

`grep -e ipvs -e nf_conntrack_ipv4 /lib/modules/$(uname -r)/modules.builtin`

and you will get results like the following if they are compiled into the kernel.
```
kernel/net/ipv4/netfilter/nf_conntrack_ipv4.ko
kernel/net/netfilter/ipvs/ip_vs.ko
kernel/net/netfilter/ipvs/ip_vs_rr.ko
kernel/net/netfilter/ipvs/ip_vs_wrr.ko
kernel/net/netfilter/ipvs/ip_vs_lc.ko
kernel/net/netfilter/ipvs/ip_vs_wlc.ko
kernel/net/netfilter/ipvs/ip_vs_fo.ko
kernel/net/netfilter/ipvs/ip_vs_ovf.ko
kernel/net/netfilter/ipvs/ip_vs_lblc.ko
kernel/net/netfilter/ipvs/ip_vs_lblcr.ko
kernel/net/netfilter/ipvs/ip_vs_dh.ko
kernel/net/netfilter/ipvs/ip_vs_sh.ko
kernel/net/netfilter/ipvs/ip_vs_sed.ko
kernel/net/netfilter/ipvs/ip_vs_nq.ko
kernel/net/netfilter/ipvs/ip_vs_ftp.ko
```

OR

2. have been loaded.
```shell
# load module <module_name>
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4

# to check loaded modules, use
lsmod | grep -e ip_vs -e nf_conntrack_ipv4
# or
cut -f1 -d " "  /proc/modules | grep -e ip_vs -e nf_conntrack_ipv4
```
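
On systemd-based nodes, a common way to make these modules load on every boot is a `modules-load.d` drop-in; the file path below is conventional and the module list assumes a pre-4.19 kernel:

```shell
# Persist the module list so systemd-modules-load loads it on every boot.
cat <<EOF > /etc/modules-load.d/ipvs.conf
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack_ipv4
EOF
```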

Packages such as `ipset` should also be installed on the node before using IPVS mode.
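
For example, on common distributions these tools can usually be installed with the system package manager; the package names `ipset` and `ipvsadm` below are typical but may differ on your distribution:

```shell
# Debian/Ubuntu
apt-get install -y ipset ipvsadm
# RHEL/CentOS
yum install -y ipset ipvsadm
```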

Kube-proxy will fall back to IPTABLES mode if those requirements are not met.

### Local UP Cluster

Kube-proxy will run in IPTABLES mode by default in a [local-up cluster](https://github.com/kubernetes/community/blob/master/contributors/devel/running-locally.md).

To use IPVS mode, users should export the env `KUBE_PROXY_MODE=ipvs` to specify the IPVS mode before [starting the cluster](https://github.com/kubernetes/community/blob/master/contributors/devel/running-locally.md#starting-the-cluster):
```shell
# before running `hack/local-up-cluster.sh`
export KUBE_PROXY_MODE=ipvs
```

### GCE Cluster

Similar to a local-up cluster, kube-proxy in [clusters running on GCE](https://kubernetes.io/docs/getting-started-guides/gce/) runs in IPTABLES mode by default. Users need to export the env `KUBE_PROXY_MODE=ipvs` before [starting a cluster](https://kubernetes.io/docs/getting-started-guides/gce/#starting-a-cluster):
```shell
# before running one of the commands chosen to start a cluster:
# curl -sS https://get.k8s.io | bash
# wget -q -O - https://get.k8s.io | bash
# cluster/kube-up.sh
export KUBE_PROXY_MODE=ipvs
```

### Cluster Created by Kubeadm

If you are using kubeadm with a [configuration file](https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-init/#config-file), you have to add `mode: ipvs` in a `KubeProxyConfiguration` document (separated by `---` from the other documents in the configuration file that is passed to `kubeadm init`):

```yaml
...
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
...
```

before running

`kubeadm init --config <path_to_configuration_file>`

to deploy the cluster with IPVS mode enabled.
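
Putting this together, a hypothetical multi-document configuration file might look like the following; the `kubeadm.k8s.io` API version shown is an assumption and depends on your kubeadm release:

```shell
# Write a kubeadm config that enables IPVS mode for kube-proxy, then init the cluster.
cat <<EOF > kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta2    # adjust to the API version your kubeadm supports
kind: ClusterConfiguration
kubernetesVersion: stable
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
EOF

kubeadm init --config kubeadm-config.yaml
```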

**Notes**
If IPVS mode is successfully enabled, you should see IPVS proxy rules (use `ipvsadm`) like
```shell
 # ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.0.0.1:443 rr persistent 10800
  -> 192.168.0.1:6443             Masq    1      1          0
```
or see similar entries in the kube-proxy logs (for example, `/tmp/kube-proxy.log` for a local-up cluster) when the local cluster is running:
```
Using ipvs Proxier.
```
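
In a kubeadm-provisioned cluster, kube-proxy runs as a DaemonSet, so the equivalent check is to read its pod logs; the `k8s-app=kube-proxy` label used below is the one kubeadm typically applies:

```shell
# Read the kube-proxy pod logs and look for the proxier in use.
kubectl logs -n kube-system -l k8s-app=kube-proxy | grep -i proxier
```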

If there are no IPVS proxy rules, or the following logs appear, then kube-proxy has failed to use IPVS mode:
```
Can't use ipvs proxier, trying iptables proxier
Using iptables Proxier.
```
See the following section for more details on debugging.

## Debug

### Check IPVS proxy rules

Users can use the `ipvsadm` tool to check whether kube-proxy is maintaining IPVS rules correctly. For example, suppose we have the following services in the cluster:

```
 # kubectl get svc --all-namespaces
NAMESPACE     NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
default       kubernetes   ClusterIP   10.0.0.1     <none>        443/TCP         1d
kube-system   kube-dns     ClusterIP   10.0.0.10    <none>        53/UDP,53/TCP   1d
```
We may get IPVS proxy rules like:

```shell
 # ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.0.0.1:443 rr persistent 10800
  -> 192.168.0.1:6443             Masq    1      1          0
TCP  10.0.0.10:53 rr
  -> 172.17.0.2:53                Masq    1      0          0
UDP  10.0.0.10:53 rr
  -> 172.17.0.2:53                Masq    1      0          0
```
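
To narrow the output down to a single virtual service, `ipvsadm` can also list one address at a time (here using the `kube-dns` cluster IP and port from the example above):

```shell
# Show only the virtual server for the kube-dns TCP service.
ipvsadm -ln -t 10.0.0.10:53
# The same, with packet/byte statistics.
ipvsadm -ln --stats -t 10.0.0.10:53
```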

### Why kube-proxy can't start IPVS mode

Use the following checklist to help you solve the problem:

**1. Specify proxy-mode=ipvs**

Check whether the kube-proxy mode has been set to `ipvs`.
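
One way to confirm which mode a running kube-proxy ended up in is to query its metrics endpoint on the node (assuming the default metrics bind address and port `10249`):

```shell
# Returns "ipvs" when the IPVS proxier is in use.
curl http://localhost:10249/proxyMode
```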

**2. Install required kernel modules and packages**

Check whether the kernel modules required by IPVS have been compiled into the kernel or loaded, and whether the required packages have been installed (see Prerequisite).
