# Scheduler Performance Test

This package contains the scheduler performance tests, often called scheduler_perf.
We use it for benchmarking the scheduler with in-tree plugins; the results are visible at [perf-dash](https://perf-dash.k8s.io/#/?jobname=scheduler-perf-benchmark&metriccategoryname=Scheduler&metricname=BenchmarkPerfResults&Metric=SchedulingThroughput&Name=SchedulingBasic%2F5000Nodes%2Fnamespace-2&extension_point=not%20applicable&result=not%20applicable).
You can also use it outside the Kubernetes repository with out-of-tree plugins by calling `RunBenchmarkPerfScheduling`.

## Motivation

We already have a performance testing system, Kubemark. However, Kubemark requires setting up and bootstrapping a whole cluster, which takes a lot of time.

We want a standard way to reproduce scheduling latency metrics and to benchmark the scheduler as simply and quickly as possible. We have the following goals:

- Save time on testing
  - The tests and benchmarks can be run on a single machine.
    We only set up the components necessary for scheduling, without booting up a cluster.
- Profile runtime metrics to find bottlenecks
  - Write scheduler integration tests that focus on performance measurement.
    Take advantage of Go profiling tools and collect fine-grained metrics,
    such as CPU, memory, and block profiles.
- Reproduce test results easily
  - We want a known place for scheduler performance tests.
    Developers should only need to run one script to collect all the information they need.

Currently the test suite has the following:

- benchmark
  - makes use of `go test -bench` and reports nanoseconds per operation.
  - schedules b.N pods when the cluster has N nodes and P scheduled pods. Since one round takes a relatively long time to finish, b.N is small: 10 - 100.

## How To Run

### Benchmark tests

```shell
# In Kubernetes root path
make test-integration WHAT=./test/integration/scheduler_perf ETCD_LOGLEVEL=warn KUBE_TEST_VMODULE="''" KUBE_TEST_ARGS="-run=^$$ -benchtime=1ns -bench=BenchmarkPerfScheduling"
```

The benchmark suite runs all the tests specified under config/performance-config.yaml.
By default, it runs all workloads that have the "performance" label. In the configuration,
labels can be added to a test case and/or individual workloads. Each workload also has
all labels of its test case. The `perf-scheduling-label-filter` command line flag can
be used to select workloads. It works like GitHub label filtering: the flag accepts
a comma-separated list of label names. Each label may have a `+` or `-` as prefix. Labels with
`+` or no prefix must be set for a workload for it to be run. `-` means that the label must not
be set. For example, this runs all performance benchmarks except those that are labeled
as "fast":
```shell
make test-integration WHAT=./test/integration/scheduler_perf ETCD_LOGLEVEL=warn KUBE_TEST_VMODULE="''" KUBE_TEST_ARGS="-run=^$$ -benchtime=1ns -bench=BenchmarkPerfScheduling -perf-scheduling-label-filter=performance,-fast"
```
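
The matching rule for the label filter can be sketched as a small shell function. This only illustrates the semantics described in the paragraph before the command; it is not the code that scheduler_perf uses, and the name `label_filter_matches` is hypothetical. It takes the filter string followed by the labels of one workload and succeeds if that workload would be selected.

```shell
# Illustrative sketch only: label_filter_matches is a hypothetical helper,
# not part of scheduler_perf.
# Usage: label_filter_matches FILTER LABEL...
# Succeeds (exit status 0) if a workload with the given labels would be selected.
label_filter_matches() (
  filter=$1
  shift
  labels=" $* "   # pad with spaces so whole label names can be matched
  IFS=','
  set -f          # disable globbing while the filter is split on commas
  for term in $filter; do
    case $term in
      '') continue ;;
      -*) # "-label": the label must not be set
        name=${term#-}
        case $labels in *" $name "*) exit 1 ;; esac ;;
      *)  # "+label" or "label": the label must be set
        name=${term#+}
        case $labels in *" $name "*) ;; *) exit 1 ;; esac ;;
    esac
  done
  exit 0
)

# A workload labeled "performance" is selected by "performance,-fast" ...
label_filter_matches "performance,-fast" performance && echo "selected"
# ... but one that is also labeled "fast" is not.
label_filter_matches "performance,-fast" performance fast || echo "skipped"
```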

Once the benchmark has finished, a JSON file with metrics is available in the current directory (test/integration/scheduler_perf). Look for `BenchmarkPerfScheduling_benchmark_YYYY-MM-DDTHH:MM:SSZ.json`.
You can use `-data-items-dir` to generate the metrics file elsewhere.
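
After several runs the directory may contain more than one metrics file. The following is a minimal sketch of a helper that picks the most recent one, assuming the file name pattern shown above; the name `latest_benchmark_result` is hypothetical and not part of the test suite.

```shell
# Illustrative sketch only: latest_benchmark_result is a hypothetical helper.
# Prints the newest metrics file in the given directory (default: current one).
# The timestamp in the file name sorts lexicographically, so the last
# name in sorted order belongs to the most recent run.
latest_benchmark_result() {
  dir=${1:-.}
  for f in "$dir"/BenchmarkPerfScheduling_benchmark_*.json; do
    [ -e "$f" ] && printf '%s\n' "$f"
  done | sort | tail -n 1
}

# Usage from the Kubernetes root path; prints nothing if no file exists yet.
latest_benchmark_result test/integration/scheduler_perf
```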

If you want to run a specific test in the suite, you can specify it through the `-bench` flag:

```shell
# In Kubernetes root path
make test-integration WHAT=./test/integration/scheduler_perf ETCD_LOGLEVEL=warn KUBE_TEST_VMODULE="''" KUBE_TEST_ARGS="-run=^$$ -benchtime=1ns -bench=BenchmarkPerfScheduling/SchedulingBasic/5000Nodes/5000InitPods/1000PodsToSchedule"
```

Also, the bench time is explicitly set to 1ns (`-benchtime=1ns` flag) so that each test is run only once.
Otherwise, the Go benchmark framework will try to run a test more than once if it ran for less than 1s.

To produce a CPU profile:

```shell
# In Kubernetes root path
make test-integration WHAT=./test/integration/scheduler_perf KUBE_TIMEOUT="-timeout=3600s" ETCD_LOGLEVEL=warn KUBE_TEST_VMODULE="''" KUBE_TEST_ARGS="-run=^$$ -benchtime=1ns -bench=BenchmarkPerfScheduling -cpuprofile ~/cpu-profile.out"
```
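
Once the run has finished, the profile can be inspected with `go tool pprof`. The following is a minimal sketch, assuming the profile was written to `~/cpu-profile.out` as in the command above and that the Go toolchain is on `PATH`; the helper name `show_cpu_hotspots` is hypothetical.

```shell
# Illustrative sketch only: show_cpu_hotspots is a hypothetical helper.
# Prints the functions that consumed the most CPU time in a profile.
show_cpu_hotspots() {
  profile=${1:-"$HOME/cpu-profile.out"}
  if [ ! -f "$profile" ]; then
    echo "profile not found: $profile" >&2
    return 1
  fi
  go tool pprof -top "$profile"
}

# Usage, after the benchmark run has written the profile:
#   show_cpu_hotspots ~/cpu-profile.out
```

For interactive exploration, `go tool pprof -http=:8080 ~/cpu-profile.out` serves a web UI with graph and flame graph views.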

### How to configure benchmark tests

The configuration file located under `config/performance-config.yaml` contains a list of templates.
Each template allows you to set:
- node manifest
- manifests for initial and testing pods
- number of nodes, number of initial and testing pods
- templates for PVs and PVCs
- feature gates

See the `op` data type implementation in [scheduler_perf_test.go](scheduler_perf_test.go)
for the operations available to build a `WorkloadTemplate`.

Initial pods create the state of the cluster before the scheduler performance measurement begins.
Testing pods are then subject to performance measurement.

The configuration file under `config/performance-config.yaml` contains a default list of templates that cover
various scenarios. If you want to add your own, you can extend the list with new templates.
It's also possible to extend the `op` data type, or its underlying data types,
to extend the configuration of possible test cases.

### Logging

The default verbosity is 2 (the recommended value for production). `-v` can be
used to change this. The log format can be changed with
`-logging-format=text|json`. The default is to write into a log file (when using
the text format) or stderr (when using JSON). Together these options allow
simulating different real production configurations and comparing their
performance.

During interactive debugging sessions it is possible to enable per-test output
via `-use-testing-log`.

### Integration tests

To run integration tests, use:
```
make test-integration WHAT=./test/integration/scheduler_perf KUBE_TEST_ARGS=-use-testing-log
```

Integration testing uses the same `config/performance-config.yaml` as
benchmarking. By default, workloads labeled as `integration-test` are executed
as part of integration testing. `-test-scheduling-label-filter` can be used to
change that.