Documentation: cuelang.org/go/doc/tutorial/kubernetes

# Kubernetes tutorial

In this tutorial we show how to convert Kubernetes configuration files
for a collection of microservices to CUE.

The configuration files are scrubbed and renamed versions of
real-life configuration files.
The files are organized in a directory hierarchy grouping related services
in subdirectories.
This is a common pattern.
The `cue` tooling has been optimized for this use case.

In this tutorial we will address the following topics:

1. convert the given YAML files to CUE
1. hoist common patterns to parent directories
1. use the tooling to rewrite CUE files to drop unnecessary fields
1. repeat from step 2 for different subdirectories
1. define commands to operate on the configuration
1. extract CUE templates directly from Kubernetes Go source
1. manually tailor the configuration
1. map a Kubernetes configuration to `docker-compose` (TODO)


## The given data set

The data set is based on a real-life case, using different names for the
services.
All the inconsistencies of the real setup are replicated in the files
to get a realistic impression of how a conversion to CUE would behave
in practice.

The given YAML files are organized in the following directory hierarchy
(you can use `find` if you don't have `tree`):

```
$ tree ./original | head
.
└── services
    ├── frontend
    │   ├── bartender
    │   │   └── kube.yaml
    │   ├── breaddispatcher
    │   │   └── kube.yaml
    │   ├── host
    │   │   └── kube.yaml
    │   ├── maitred
...
```

Each subdirectory contains related microservices that often share similar
characteristics and configurations.
The configurations include a large variety of Kubernetes objects, including
services, deployments, config maps,
a daemon set, a stateful set, and a cron job.

The result of the first tutorial is in the `quick` (for "quick and dirty")
directory.
A manually optimized configuration can be found in the `manual`
directory.


## Importing existing configuration

We first make a copy of the data directory.

```
$ cp -a original tmp
$ cd tmp
```

We initialize a module so that we can treat all our configuration files
in the subdirectories as part of one package.
We do that later by giving all files the same package name.

```
$ cue mod init
```

Creating a module also allows our packages to import external packages.

We also initialize a Go module so that later we can resolve the
`k8s.io/api/apps/v1` Go package dependency:

```
$ go mod init mod.test
```

Let's try to use the `cue import` command to convert the given YAML files
into CUE.

```
$ cd services
```

Since we have multiple packages and files, we need to specify the package to
which they should belong.

```
$ cue import ./... -p kube
path, list, or files flag needed to handle multiple objects in file ./services/frontend/bartender/kube.yaml
```

Many of the files contain more than one Kubernetes object.
Moreover, we are creating a single configuration that contains all objects
from all files.
We need to organize all Kubernetes objects such that each is individually
identifiable within the single configuration.
We do so by defining a separate struct for each kind of object and putting
each object in its respective struct, keyed by its name.
This allows objects of different types to share the same name,
just as is allowed by Kubernetes.
To accomplish this, we tell `cue` to put each object in the configuration
tree at the path with the "kind" as first element and "name" as second.

```
$ cue import ./... -p kube -l 'strings.ToCamel(kind)' -l metadata.name -f
```

The added `-l` flags define the labels for each object, based on values from
each object, using the usual CUE syntax for field labels.
In this case, we use a camelcase variant of the `kind` field of each object and
use the `name` field of the `metadata` section as the name for each object.
We also added the `-f` flag to overwrite the few files that succeeded before.
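
As an illustration of the resulting layout, the `Deployment` named
`breaddispatcher` should end up at the path `deployment.breaddispatcher`.
The sketch below is abbreviated and only shows the shape of the imported
file; it is not the literal output of `cue import`:

```cue
package kube

// strings.ToCamel("Deployment") yields the first label: deployment
// metadata.name yields the second label:                breaddispatcher
deployment: breaddispatcher: {
	apiVersion: "apps/v1"
	kind:       "Deployment"
	metadata: name: "breaddispatcher"
	// remaining fields of the original object elided
}
```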

Let's see what happened:

```
$ tree . | head
.
└── services
    ├── frontend
    │   ├── bartender
    │   │   ├── kube.cue
    │   │   └── kube.yaml
    │   ├── breaddispatcher
    │   │   ├── kube.cue
    │   │   └── kube.yaml
...
```

Each of the YAML files is converted to a corresponding CUE file.
Comments in the YAML files are preserved.

The result is not fully pleasing, though.
Take a look at `mon/prometheus/configmap.cue`.

```
$ cat mon/prometheus/configmap.cue
package kube

apiVersion: "v1"
kind:       "ConfigMap"
metadata: name: "prometheus"
data: {
    "alert.rules": """
        groups:
        - name: rules.yaml
...
```

The configuration file still contains YAML embedded in a string value of one
of the fields.
The original YAML file might have looked like it was all structured data, but
the majority of it was a string containing, hopefully, valid YAML.

The `-R` option attempts to detect structured YAML or JSON strings embedded
in the configuration files and then converts these recursively.

<!-- TODO: update import label format -->

```
$ cue import ./... -p kube -l 'strings.ToCamel(kind)' -l metadata.name -f -R
```

Now the file looks like:

```
$ cat mon/prometheus/configmap.cue
package kube

import "encoding/yaml"

configMap: prometheus: {
    apiVersion: "v1"
    kind:       "ConfigMap"
    metadata: name: "prometheus"
    data: {
        "alert.rules": yaml.Marshal(_cue_alert_rules)
        _cue_alert_rules: {
            groups: [{
...
```

That looks better!
The resulting configuration file replaces the original embedded string
with a call to `yaml.Marshal` converting a structured CUE source to
a string with an equivalent YAML file.
Fields starting with an underscore (`_`) are not included when emitting
a configuration file (unless they are enclosed in double quotes).
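
The difference between hidden and quoted fields can be seen in a small,
made-up example (the field names are purely illustrative):

```cue
example: {
	_hidden:   "not emitted"     // hidden field: omitted on export
	"_quoted": "emitted"         // regular field whose name starts with "_"
	visible:   "uses \(_hidden)" // hidden fields can still be referenced
}
```

Exporting this value would include `"_quoted"` and `visible`, but not `_hidden`.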

```
$ cue eval ./mon/prometheus -e configMap.prometheus
apiVersion: "v1"
kind:       "ConfigMap"
metadata: {
    name: "prometheus"
}
data: {
    "alert.rules": """
        groups:
          - name: rules.yaml
...
```

Yay!


## Quick 'n Dirty Conversion

In this tutorial we show how to quickly eliminate boilerplate from a set
of configurations.
Manual tailoring will usually give better results, but takes considerably
more thought, while taking the quick and dirty approach gets you mostly there.
The result of such a quick conversion also forms a good basis for
a more thoughtful manual optimization.

### Create top-level template

Now that we have imported the YAML files we can start the simplification
process.

Before we start the restructuring, let's save a full evaluation so that we
can verify that simplifications yield the same results.

```
$ cue eval -c ./... >snapshot
```

The `-c` option tells `cue` that only concrete values, that is, valid JSON,
are allowed.
We focus on the objects defined in the various `kube.cue` files.
A quick inspection reveals that many of the Deployments and Services share
common structure.

We copy one of the files containing both as a basis for creating our template
to the root of the directory tree.

```
$ cp frontend/breaddispatcher/kube.cue .
```

Modify this file as below.

```
$ cat <<EOF > kube.cue
package kube

service: [ID=_]: {
	apiVersion: "v1"
	kind:       "Service"
	metadata: {
		name: ID
		labels: {
			app:       ID     // by convention
			domain:    "prod" // always the same in the given files
			component: string // varies per directory
		}
	}
	spec: {
		// Any port has the following properties.
		ports: [...{
			port:     int
			protocol: *"TCP" | "UDP" // from the Kubernetes definition
			name:     string | *"client"
		}]
		selector: metadata.labels // we want those to be the same
	}
}

deployment: [ID=_]: {
	apiVersion: "apps/v1"
	kind:       "Deployment"
	metadata: name: ID
	spec: {
		// 1 is the default, but we allow any number
		replicas: *1 | int
		template: {
			metadata: labels: {
				app:       ID
				domain:    "prod"
				component: string
			}
			// we always have one namesake container
			spec: containers: [{name: ID}]
		}
	}
}
EOF
```

By replacing the service and deployment name with `[ID=_]` we have changed the
definition into a template matching any field.
CUE binds the field name to `ID` as a result.
During importing we used `metadata.name` as a key for the object names,
so we can now set this field to `ID`.
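
A stripped-down sketch of this mechanism, using a bare `waiter` entry
rather than the full template:

```cue
service: [ID=_]: {
	metadata: name: ID // ID is bound to the label of each matching field
}

// Declaring an entry is enough for the template to apply to it.
service: waiter: {}
```

Evaluating this yields `service: waiter: metadata: name: "waiter"`.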

Templates are applied to (are unified with) all entries in the struct in which
they are defined,
so we need to either generalize the fields specific to the `breaddispatcher`
definition or remove them.

One of the labels defined in the Kubernetes metadata seems to be always set
to the parent directory name.
We enforce this by defining `component: string`, meaning that a field
of name `component` must be set to some string value, and then define this
later on.
Any underspecified field results in an error when converting to, for instance,
JSON.
So a deployment or service will only be valid if this label is defined.

<!-- TODO: once cycles in disjunctions are implemented
    port:       targetPort | int   // by default the same as targetPort
    targetPort: port | int         // by default the same as port

Note that ports definition for service contains a cycle.
Specifying one of the ports will break the cycle.
The meaning of cycles are well-defined in CUE.
In practice this means that a template writer does not have to make any
assumptions about which of the fields that can be mutually derived from each
other a user of the template will want to specify.
-->

Let's compare the result of merging our new template to our original snapshot.

```
$ cue eval -c ./... >snapshot2
// /workdir/services/mon/alertmanager
deployment.alertmanager.spec.template.metadata.labels.component: incomplete value string:
    ./kube.cue:36:16
service.alertmanager.metadata.labels.component: incomplete value string:
    ./kube.cue:11:15
service.alertmanager.spec.selector.component: incomplete value string:
    ./kube.cue:11:15
...
```

Oops.
The alert manager does not specify the `component` label.
This demonstrates how constraints can be used to catch inconsistencies
in your configurations.

As there are very few objects that do not specify this label, we will modify
the configurations to include it everywhere.
We do this by setting a newly defined top-level field in each directory
to the directory name and modifying our master template file to use it.

<!--
```
$ cue add */kube.cue -p kube --list <<EOF
#Component: "{{.DisplayPath}}"
EOF
```
-->

```
# set the component label to our new top-level field
$ sed -i.bak 's/component:.*string/component: #Component/' kube.cue
$ rm kube.cue.bak

# add the new top-level field to our previous template definitions
$ cat <<EOF >> kube.cue

#Component: string
EOF

# add a file with the component label to each directory
$ ls -d */ | sed 's/.$//' | xargs -I DIR sh -c 'cd DIR; echo "package kube

#Component: \"DIR\"
" > kube.cue; cd ..'

# format the files
$ cue fmt kube.cue */kube.cue
```

Let's try again to see if it is fixed:

```
$ cue eval -c ./... >snapshot2
$ diff -wu snapshot snapshot2
...
```

Except for having more consistent labels and some reordering, nothing changed.
We are happy and save the result as the new baseline.

```
$ cp snapshot2 snapshot
```

The corresponding boilerplate can now be removed with `cue trim`.

```
$ find . | grep kube.cue | xargs wc -l | tail -1
 1887 total
$ cue trim ./...
$ find . | grep kube.cue | xargs wc -l | tail -1
 1312 total
```

`cue trim` removes configuration from files that is already generated
by templates or comprehensions.
In doing so it removed over 500 lines of configuration, or over 30%!

The following is proof that nothing changed semantically:

```
$ cue eval -c ./... >snapshot2
$ diff -wu snapshot snapshot2 | wc -l
0
```

We can do better, though.
A first thing to note is that DaemonSets and StatefulSets share a similar
structure to Deployments.
We generalize the top-level template as follows:

```
$ cat <<EOF >> kube.cue

daemonSet: [ID=_]: _spec & {
	apiVersion: "apps/v1"
	kind:       "DaemonSet"
	_name:      ID
}

statefulSet: [ID=_]: _spec & {
	apiVersion: "apps/v1"
	kind:       "StatefulSet"
	_name:      ID
}

deployment: [ID=_]: _spec & {
	apiVersion: "apps/v1"
	kind:       "Deployment"
	_name:      ID
	spec: replicas: *1 | int
}

configMap: [ID=_]: {
	metadata: name: ID
	metadata: labels: component: #Component
}

_spec: {
	_name: string

	metadata: name: _name
	metadata: labels: component: #Component
	spec: selector: {}
	spec: template: {
		metadata: labels: {
			app:       _name
			component: #Component
			domain:    "prod"
		}
		spec: containers: [{name: _name}]
	}
}
EOF
$ cue fmt
```

The common configuration has been factored out into `_spec`.
We introduced `_name` to aid both specifying and referring
to the name of an object.
For completeness, we added `configMap` as a top-level entry.

Note that we have not yet removed the old definition of deployment.
This is fine.
As it is equivalent to the new one, unifying them will have no effect.
We leave its removal as an exercise to the reader.
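
That repeating a constraint is harmless can be checked in isolation.
In this minimal sketch (not taken from the tutorial files) the same template
is declared twice, and the result is the same as if it had been declared once:

```cue
deployment: [string]: {
	kind: "Deployment"
	spec: replicas: *1 | int
}

// A second, identical declaration: unifying it changes nothing.
deployment: [string]: {
	kind: "Deployment"
	spec: replicas: *1 | int
}

deployment: bartender: {}
```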

Next we observe that all deployments, stateful sets and daemon sets have
an accompanying service which shares many of the same fields.
We add:

```
$ cat <<EOF >> kube.cue

// Define the _export option and set the default to true
// for all ports defined in all containers.
_spec: spec: template: spec: containers: [...{
	ports: [...{
		_export: *true | false // include the port in the service
	}]
}]

for x in [deployment, daemonSet, statefulSet] for k, v in x {
	service: "\(k)": {
		spec: selector: v.spec.template.metadata.labels

		spec: ports: [
			for c in v.spec.template.spec.containers
			for p in c.ports
			if p._export {
				let Port = p.containerPort // Port is an alias
				port:       *Port | int
				targetPort: *Port | int
			},
		]
	}
}
EOF
$ cue fmt
```

This example introduces a few new concepts.
Open-ended lists are indicated with an ellipsis (`...`).
The value following an ellipsis is unified with any subsequent elements and
defines the "type", or template, for additional list elements.
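
A small sketch of an open-ended list acting as a template for its elements
(the values are made up for illustration):

```cue
// Every element, however many there are, must satisfy the value after "...".
ports: [...{
	port:     int
	protocol: *"TCP" | "UDP"
}]

ports: [{port: 8080}, {port: 53, protocol: "UDP"}]
```

The first element picks up the default protocol `"TCP"`; the second overrides it.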

The `Port` declaration is an alias.
Aliases are only visible in their lexical scope and are not part of the model.
They can be used to make shadowed fields visible within nested scopes or,
in this case, to reduce boilerplate without introducing new fields.
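
As a minimal illustration outside of a comprehension (the field names are
chosen for this example only):

```cue
container: {
	containerPort: 8080

	let Port = containerPort // not a field, so absent from the output
	port:       Port
	targetPort: Port
}
```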

Finally, this example introduces list and field comprehensions.
List comprehensions are analogous to list comprehensions found in other
languages.
Field comprehensions allow inserting fields in structs.
In this case, the field comprehension adds a namesake service for any
deployment, daemonSet, and statefulSet.
Field comprehensions can also be used to add a field conditionally.
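
A conditional field might look as follows; `_withProbe` is a hypothetical
option introduced only for this sketch:

```cue
_withProbe: true

container: {
	name: "example"
	// The field is only added when the condition holds.
	if _withProbe {
		livenessProbe: httpGet: path: "/debug/health"
	}
}
```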

Specifying the `targetPort` is not necessary, but since many files define it,
defining it here will allow those definitions to be removed
using `cue trim`.
We add an option `_export` for ports defined in containers to specify whether
to include them in the service and explicitly set this to false
for the respective ports in `infra/events`, `infra/tasks`, and `infra/watcher`.

For the purpose of this tutorial, here are some quick patches:

```
$ cat <<EOF >>infra/events/kube.cue

deployment: events: spec: template: spec: containers: [{ ports: [{_export: false}, _] }]
EOF
$ cat <<EOF >>infra/tasks/kube.cue

deployment: tasks: spec: template: spec: containers: [{ ports: [{_export: false}, _] }]
EOF
$ cat <<EOF >>infra/watcher/kube.cue

deployment: watcher: spec: template: spec: containers: [{ ports: [{_export: false}, _] }]
EOF
```

In practice it would be better form to add this field in the original
port declaration.

We verify that all changes are acceptable and store another snapshot.
Then we run trim to further reduce our configuration:

```
$ cue trim ./...
$ find . | grep kube.cue | xargs wc -l | tail -1
 1242 total
```

This is after removing the rewritten and now redundant deployment definition.

We shaved off another 70 lines, even after adding the template.
You can verify that the service definitions are now gone in most of the files.
What remains is either some additional configuration, or inconsistencies that
should probably be cleaned up.

But we have another trick up our sleeve.
With the `-s` or `--simplify` option we can tell `trim` or `fmt` to collapse
structs with a single element onto a single line. For instance:

```
$ head frontend/breaddispatcher/kube.cue
package kube

deployment: breaddispatcher: {
	spec: {
		template: {
			metadata: {
				annotations: {
					"prometheus.io.scrape": "true"
					"prometheus.io.port":   "7080"
				}
$ cue trim ./... -s
$ head -7 frontend/breaddispatcher/kube.cue
package kube

deployment: breaddispatcher: spec: template: {
	metadata: annotations: {
		"prometheus.io.scrape": "true"
		"prometheus.io.port":   "7080"
	}
$ find . | grep kube.cue | xargs wc -l | tail -1
 1090 total
```

Another 150 lines lost!
Collapsing lines like this can improve the readability of a configuration
by removing considerable amounts of punctuation.

We save the result as the new baseline:

```
$ cue eval -c ./... >snapshot2
$ cp snapshot2 snapshot
```


### Repeat for several subdirectories

In the previous section we defined templates for services and deployments
in the root of our directory structure to capture the common traits of all
services and deployments.
In addition, we defined a directory-specific label.
In this section we will look into generalizing the objects per directory.


#### Directory `frontend`

We observe that all deployments in subdirectories of `frontend`
have a single container with one port,
which is usually `7080`, but sometimes `8080`.
Also, most have two prometheus-related annotations, while some have one.
We leave the inconsistencies in ports, but add both annotations
unconditionally.

```
$ cat <<EOF >> frontend/kube.cue

deployment: [string]: spec: template: {
	metadata: annotations: {
		"prometheus.io.scrape": "true"
		"prometheus.io.port":   "\(spec.containers[0].ports[0].containerPort)"
	}
	spec: containers: [{
		ports: [{containerPort: *7080 | int}] // 7080 is the default
	}]
}
EOF
$ cue fmt ./frontend

# check differences
$ cue eval -c ./... >snapshot2
$ diff -wu snapshot snapshot2
--- snapshot	2022-02-21 06:04:10.919832150 +0000
+++ snapshot2	2022-02-21 06:04:11.907780310 +0000
@@ -188,6 +188,7 @@
                 metadata: {
                     annotations: {
                         "prometheus.io.scrape": "true"
+                        "prometheus.io.port":   "7080"
                     }
                     labels: {
                         app:       "host"
@@ -327,6 +328,7 @@
                 metadata: {
                     annotations: {
                         "prometheus.io.scrape": "true"
+                        "prometheus.io.port":   "8080"
                     }
                     labels: {
                         app:       "valeter"
$ cp snapshot2 snapshot
```

Two annotation lines were added, improving consistency.
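
The port annotation is derived through string interpolation.
A reduced sketch of the same construction, with the surrounding deployment
structure left out:

```cue
spec: containers: [{
	ports: [{containerPort: *7080 | int}] // 7080 unless overridden
}]

// Interpolation converts the resolved port number to a string.
annotations: "prometheus.io.port": "\(spec.containers[0].ports[0].containerPort)"
```

With no override this evaluates to `"7080"`; an instance that sets
`containerPort: 8080` gets `"8080"` instead.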

```
$ cue trim ./frontend/... -s
$ find . | grep kube.cue | xargs wc -l | tail -1
 1046 total
```

Another 40-odd lines removed.
We may have gotten used to larger reductions, but at this point there is just
not much left to remove: in some of the frontend files there are only 4 lines
of configuration left.

We save the result as the new baseline:

```
$ cue eval -c ./... >snapshot2
$ cp snapshot2 snapshot
```


#### Directory `kitchen`

In this directory we observe that all deployments have, without exception,
one container with port `8080`, all have the same liveness probe,
a single line of prometheus annotation, and most have
two or three disks with similar patterns.

Let's add everything but the disks for now:

```
$ cat <<EOF >> kitchen/kube.cue

deployment: [string]: spec: template: {
	metadata: annotations: "prometheus.io.scrape": "true"
	spec: containers: [{
		ports: [{
			containerPort: 8080
		}]
		livenessProbe: {
			httpGet: {
				path: "/debug/health"
				port: 8080
			}
			initialDelaySeconds: 40
			periodSeconds:       3
		}
	}]
}
EOF
$ cue fmt ./kitchen
```

A diff reveals that one prometheus annotation was added to a service.
We assume this to be an accidental omission and accept the differences.

Disks need to be defined both in the template spec section and in
the container where they are used.
We prefer to keep these two definitions together.
We take the volumes definition from `expiditer` (the first config in that
directory with two disks), and generalize it:

```
$ cat <<EOF >> kitchen/kube.cue

deployment: [ID=_]: spec: template: spec: {
	_hasDisks: *true | bool

	// field comprehension using just "if"
	if _hasDisks {
		volumes: [{
			name: *"\(ID)-disk" | string
			gcePersistentDisk: pdName: *"\(ID)-disk" | string
			gcePersistentDisk: fsType: "ext4"
		}, {
			name: *"secret-\(ID)" | string
			secret: secretName: *"\(ID)-secrets" | string
		}, ...]

		containers: [{
			volumeMounts: [{
				name:      *"\(ID)-disk" | string
				mountPath: *"/logs" | string
			}, {
				mountPath: *"/etc/certs" | string
				name:      *"secret-\(ID)" | string
				readOnly:  true
			}, ...]
		}]
	}
}
EOF

$ cat <<EOF >> kitchen/souschef/kube.cue

deployment: souschef: spec: template: spec: {
	_hasDisks: false
}

EOF
$ cue fmt ./kitchen/...
```

This template definition is not ideal: the definitions are positional, so if
configurations were to define the disks in a different order, there would be
no reuse or even conflicts.
Also note that in order to deal with this restriction, almost all field values
are just default values and can be overridden by instances.
A better way would be to define a map of volumes,
similarly to how we organized the top-level Kubernetes objects,
and then generate these two sections from this map.
This requires some design, though, and does not belong in a
"quick-and-dirty" tutorial.
Later in this document we introduce a manually optimized configuration.
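
To give a rough idea of what such a map-based design could look like, here is
a sketch. It is not taken from the tutorial files, and `_volumes`,
`mountPath`, and `source` are names invented for this example:

```cue
deployment: [string]: spec: template: spec: {
	// Hypothetical map of volumes, keyed by volume name.
	_volumes: [Name=_]: {
		name:      Name
		mountPath: string
		source: {...} // e.g. gcePersistentDisk or secret settings
	}

	// Both sections are generated from the same map.
	volumes: [for v in _volumes {name: v.name, v.source}]
	containers: [{
		volumeMounts: [for v in _volumes {
			name:      v.name
			mountPath: v.mountPath
		}]
	}]
}

// An instance then describes each volume only once.
deployment: expiditer: spec: template: spec: _volumes: {
	"expiditer-disk": {
		mountPath: "/logs"
		source: gcePersistentDisk: {pdName: "expiditer-disk", fsType: "ext4"}
	}
}
```

Because entries are keyed by name rather than by position, the order in which
instances declare their volumes would no longer matter.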

We add the two disks by default and define a `_hasDisks` option to opt out.
The `souschef` configuration is the only one that defines no disks.

```
$ cue trim -s ./kitchen/...

# check differences
$ cue eval -c ./... >snapshot2
$ diff -wu snapshot snapshot2
...
$ cp snapshot2 snapshot
$ find . | grep kube.cue | xargs wc -l | tail -1
  925 total
```

The diff shows that we added the `_hasDisks` option, but otherwise reveals no
differences.
We also reduced the configuration by a sizeable amount once more.

However, on closer inspection of the remaining files we see a lot of remaining
fields in the disk specifications as a result of inconsistent naming.
Reducing configurations like we did in this exercise exposes inconsistencies.
The inconsistencies can be removed by simply deleting the overrides in the
specific configuration.
Leaving them as is gives a clear signal that a configuration is inconsistent.


### Conclusion of Quick 'n Dirty tutorial

There is still some gain to be made with the other directories.
At a reduction of nearly 1000 lines, or about 51%, we leave the rest as an
exercise to the reader.

We have shown how CUE can be used to reduce boilerplate, enforce consistency,
and detect inconsistencies.
Being able to deal with consistencies and inconsistencies is a consequence of
the constraint-based model and harder to do with inheritance-based languages.

We have indirectly also shown how CUE is well-suited for machine manipulation.
This is a result of its syntax and of the order independence that follows from
its semantics.
The `trim` command is one of many possible automated refactoring tools made
possible by this property.
This, too, would be harder to do with inheritance-based configuration
languages.


## Define commands

The `cue export` command can be used to convert the created configuration back
to JSON.
In our case, this requires a top-level "emit value"
to convert our mapped Kubernetes objects back to a list.
Typically, this output is piped to tools like `kubectl` or `etcdctl`.

In practice this means typing the same commands ad nauseam.
The next step is often to write wrapper tools.
But as there is often no one-size-fits-all solution, this leads to the
proliferation of marginally useful tools.
The `cue` tool provides an alternative by allowing the declaration of
frequently used commands in CUE itself.
Advantages:

- added domain knowledge that CUE may use for improved analysis,
- only one language to learn,
- easy discovery of commands,
- no further configuration required,
- enforce uniform CLI standards across commands,
- standardized commands across an organization.

Commands are defined in files ending with `_tool.cue`, in the same package as
the configuration files on which the commands should operate.
Top-level values in the configuration are visible to the tool files
as long as they are not shadowed by top-level fields in the tool files.
Top-level fields in the tool files are not visible in the configuration files
and are not part of any model.

The tool definitions also have access to additional builtin packages.
A CUE configuration is fully hermetic, disallowing any outside influence.
This property enables automated analysis and manipulation
such as the `trim` command.
The tool definitions, however, have access to such things as command line flags
and environment variables, random generators, file listings, and so on.

We define the following tools for our example:

- ls: list the Kubernetes objects defined in our configuration
- dump: dump all selected objects as a YAML stream
- create: send all selected objects to `kubectl` for creation

### Preparations

To work with Kubernetes we need to convert our map of Kubernetes objects
back to a simple list.
We create the tool file to do just that.

```
$ cat <<EOF > kube_tool.cue
package kube

objects: [for v in objectSets for x in v {x}]

objectSets: [
	service,
	deployment,
	statefulSet,
	daemonSet,
	configMap,
]
EOF
```
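
The nested comprehension flattens a list of maps into a single list.
A reduced sketch with made-up values in place of Kubernetes objects:

```cue
objectSets: [
	{a: {n: 1}, b: {n: 2}},
	{c: {n: 3}},
]

// Iterate over each map, then over each value in that map.
objects: [for v in objectSets for x in v {x}]
```

Here `objects` evaluates to `[{n: 1}, {n: 2}, {n: 3}]`.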
   900
   901### Listing objects
   902
   903Commands are defined in the `command` section at the top-level of a tool file.
   904A `cue` command defines command line flags, environment variables, as well as
   905a set of tasks.
   906Examples tasks are load or write a file, dump something to the console,
   907download a web page, or execute a command.
   908
   909We start by defining the `ls` command which dumps all our objects
   910
   911```
   912$ cat <<EOF > ls_tool.cue
   913package kube
   914
   915import (
   916	"text/tabwriter"
   917	"tool/cli"
   918	"tool/file"
   919)
   920
   921command: ls: {
   922	task: print: cli.Print & {
   923		text: tabwriter.Write([
   924			for x in objects {
   925				"\(x.kind)  \t\(x.metadata.labels.component)  \t\(x.metadata.name)"
   926			},
   927		])
   928	}
   929
   930	task: write: file.Create & {
   931		filename: "foo.txt"
   932		contents: task.print.text
   933	}
   934}
   935EOF
   936```
   937<!-- TODO: use "let" once implemented-->
   938
NOTE: THE API OF THE TASK DEFINITIONS WILL CHANGE,
although we may keep supporting this form if needed.
   941
   942The command is now available in the `cue` tool:
   943
   944```
   945$ cue cmd ls ./frontend/maitred
   946Service      frontend   maitred
   947Deployment   frontend   maitred
   948```
   949
   950As long as the name does not conflict with an existing command it can be
   951used as a top-level command as well:
   952```
   953$ cue ls ./frontend/maitred
   954Service      frontend   maitred
   955Deployment   frontend   maitred
   956```
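
Other commands follow the same pattern.
As a sketch, a hypothetical `count` command (not part of the tutorial) could report how many objects are selected, reusing the `objects` list from `kube_tool.cue`:

```
// file count_tool.cue (hypothetical)
package kube

import "tool/cli"

command: count: {
	task: print: cli.Print & {
		// objects is the flattened list defined in kube_tool.cue.
		text: "\(len(objects)) objects selected"
	}
}
```

It would be invoked in the same way as `ls`, for instance with `cue cmd count ./frontend/maitred`.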
   957
   958If more than one instance is selected the `cue` tool may either operate
   959on them one by one or merge them.
   960The default is to merge them.
   961Different instances of a package are typically not compatible:
   962different subdirectories may have different specializations.
   963A merge pre-expands templates of each instance and then merges their root
   964values.
The result may contain conflicts, such as our top-level `#Component` field,
but our per-type maps of Kubernetes objects should be free of conflict
(if they are not, we have a problem with Kubernetes down the line).
   968A merge thus gives us a unified view of all objects.
   969
   970```
   971$ cue ls ./...
   972Service       frontend   bartender
   973Service       frontend   breaddispatcher
   974Service       frontend   host
   975Service       frontend   maitred
   976Service       frontend   valeter
   977Service       frontend   waiter
   978Service       frontend   waterdispatcher
   979Service       infra      download
   980Service       infra      etcd
   981Service       infra      events
   982...
   983
   984Deployment    proxy      nginx
   985StatefulSet   infra      etcd
   986DaemonSet     mon        node-exporter
   987ConfigMap     mon        alertmanager
   988ConfigMap     mon        prometheus
   989ConfigMap     proxy      authproxy
   990ConfigMap     proxy      nginx
   991```
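
The effect of such a merge can be imitated in plain CUE.
The sketch below is only an illustration of the idea, not how the `cue` tool implements it; the hidden fields `_frontend` and `_infra` are hypothetical stand-ins for the root values of two instances.

```
// Hypothetical root values of two instances, reduced to two fields each.
_frontend: {
	#Component: "frontend"
	service: maitred: kind: "Service"
}
_infra: {
	#Component: "infra"
	service: etcd: kind: "Service"
}

// The per-type maps use distinct keys, so unifying them is conflict-free
// and yields one map holding both services.
merged: service: _frontend.service & _infra.service

// Unifying the definitions would fail: "frontend" & "infra" is an error.
// conflict: _frontend.#Component & _infra.#Component
```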
   992
   993### Dumping a YAML Stream
   994
   995The following adds a command to dump the selected objects as a YAML stream.
   996
   997<!--
   998TODO: add command line flags to filter object types.
   999-->
  1000```
  1001$ cat <<EOF > dump_tool.cue
  1002package kube
  1003
  1004import (
  1005	"encoding/yaml"
  1006	"tool/cli"
  1007)
  1008
  1009command: dump: {
  1010	task: print: cli.Print & {
  1011		text: yaml.MarshalStream(objects)
  1012	}
  1013}
  1014EOF
  1015```
  1016
  1017<!--
  1018TODO: with new API as well as conversions implemented
  1019command dump task print: cli.Print(text: yaml.MarshalStream(objects))
  1020
  1021or without conversions:
  1022command dump task print: cli.Print & {text: yaml.MarshalStream(objects)}
  1023-->
  1024
The `MarshalStream` function converts the list of objects to a '`---`'-separated
stream of YAML values.
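
A minimal sketch of the function in isolation, using a hypothetical two-element list in place of `objects`:

```
import "encoding/yaml"

// Hypothetical two-element list standing in for `objects`.
objs: [
	{kind: "Service", metadata: name: "maitred"},
	{kind: "Deployment", metadata: name: "maitred"},
]

// stream is a single string holding both documents, separated by "---".
stream: yaml.MarshalStream(objs)
```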
  1027
  1028
  1029### Creating Objects
  1030
  1031The `create` command sends a list of objects to `kubectl create`.
  1032
  1033```
  1034$ cat <<EOF > create_tool.cue
  1035package kube
  1036
  1037import (
  1038	"encoding/yaml"
  1039	"tool/exec"
  1040	"tool/cli"
  1041)
  1042
  1043command: create: {
  1044	task: kube: exec.Run & {
  1045		cmd:    "kubectl create --dry-run=client -f -"
  1046		stdin:  yaml.MarshalStream(objects)
  1047		stdout: string
  1048	}
  1049
  1050	task: display: cli.Print & {
  1051		text: task.kube.stdout
  1052	}
  1053}
  1054EOF
  1055```
  1056
  1057This command has two tasks, named `kube` and `display`.
  1058The `display` task depends on the output of the `kube` task.
The `cue` tool does a static analysis of the dependencies and runs all
tasks whose dependencies are satisfied in parallel, while blocking tasks
for which an input is missing.
  1062
  1063```
  1064$ cue create ./frontend/...
  1065service/bartender created (dry run)
  1066service/breaddispatcher created (dry run)
  1067service/host created (dry run)
  1068service/maitred created (dry run)
  1069service/valeter created (dry run)
  1070service/waiter created (dry run)
  1071service/waterdispatcher created (dry run)
  1072deployment.apps/bartender created (dry run)
  1073deployment.apps/breaddispatcher created (dry run)
  1074deployment.apps/host created (dry run)
  1075deployment.apps/maitred created (dry run)
  1076deployment.apps/valeter created (dry run)
  1077deployment.apps/waiter created (dry run)
  1078deployment.apps/waterdispatcher created (dry run)
  1079```
  1080
A production version of this should, of course, omit the
`--dry-run=client` flag.
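
The dependency analysis described for `create` can be tried without a cluster.
The following sketch uses a hypothetical `hello` command with three tasks, assuming an `echo` binary is available on the system:

```
// file hello_tool.cue (hypothetical)
package kube

import (
	"tool/cli"
	"tool/exec"
)

command: hello: {
	// No inputs from other tasks: can start right away.
	task: say: exec.Run & {
		cmd:    "echo hello"
		stdout: string
	}

	// References task.say.stdout, so it waits for say to finish.
	task: show: cli.Print & {
		text: task.say.stdout
	}

	// Independent of the other two: may run in parallel with say.
	task: other: cli.Print & {
		text: "no dependencies"
	}
}
```

No ordering is declared anywhere; it follows from the reference to `task.say.stdout` alone.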
  1083
  1084### Extract CUE templates directly from Kubernetes Go source
  1085
  1086In order for `cue get go` to generate the CUE templates from Go sources, you first need to have the sources locally:
  1087
  1088```
  1089$ go get k8s.io/api/apps/v1@v0.23.4
  1090$ cue get go k8s.io/api/apps/v1
  1091
  1092```
  1093
  1094Now that we have the Kubernetes definitions in our module, we can import and use them:
  1095
  1096```
  1097$ cat <<EOF > k8s_defs.cue
  1098package kube
  1099
  1100import (
  1101	"k8s.io/api/core/v1"
  1102	apps_v1 "k8s.io/api/apps/v1"
  1103)
  1104
  1105service: [string]:     v1.#Service
  1106deployment: [string]:  apps_v1.#Deployment
  1107daemonSet: [string]:   apps_v1.#DaemonSet
  1108statefulSet: [string]: apps_v1.#StatefulSet
  1109EOF
  1110```
  1111
  1112And, finally, we'll format again:
  1113
  1114```
$ cue fmt
  1116```
  1117
  1118## Manually tailored configuration
  1119
  1120In Section "Quick 'n Dirty" we showed how to quickly get going with CUE.
  1121With a bit more deliberation, one can reduce configurations even further.
  1122Also, we would like to define a configuration that is more generic and less tied
  1123to Kubernetes.
  1124
We will rely heavily on CUE's order independence, which makes it easy to
combine two configurations of the same object in a well-defined way.
This makes it easy, for instance, to put frequently used fields in one file
and more esoteric ones in another and then combine them without fear that one
will override the other.
  1130We will take this approach in this section.
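
As a small sketch of this splitting, the two declarations below could live in two different files of the same package.
The file names, the image name, and the annotation are hypothetical.

```
// file common.cue (hypothetical): frequently used fields
deployment: bartender: {
	image: "example.com/bartender:v1" // hypothetical image name
	expose: port: http: 7080
}

// file extras.cue (hypothetical): more esoteric fields
deployment: bartender: kubernetes: spec: template: metadata: annotations: {
	"prometheus.io.scrape": "true"
}
```

Both declarations unify into a single `deployment.bartender` value, and the result is the same whichever one is read first.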
  1131
  1132The end result of this tutorial is in the `manual` directory.
  1133In the next sections we will show how to get there.
  1134
  1135
  1136### Outline
  1137
  1138The basic premise of our configuration is to maintain two configurations,
  1139a simple and abstract one, and one compatible with Kubernetes.
  1140The Kubernetes version is automatically generated from the simple configuration.
Each simplified object has a `kubernetes` section that gets merged into
the Kubernetes object upon conversion.
  1143
  1144We define one top-level file with our generic definitions.
  1145
  1146```
  1147// file cloud.cue
  1148package cloud
  1149
  1150service: [Name=_]: {
  1151    name: *Name | string // the name of the service
  1152
  1153    ...
  1154
  1155    // Kubernetes-specific options that get mixed in when converting
  1156    // to Kubernetes.
  1157    kubernetes: {
  1158    }
  1159}
  1160
  1161deployment: [Name=_]: {
  1162    name: *Name | string
  1163   ...
  1164}
  1165```
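
The pattern `[Name=_]` binds the key of each entry to `Name`, which then serves as the default for the `name` field.
A minimal sketch with two hypothetical services:

```
service: [Name=_]: {
	name: *Name | string
}

// name is not set, so it defaults to the key: "waiter".
service: waiter: {}

// An explicit name replaces the default.
service: maitred: name: "maitre-d"
```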
  1166
  1167A Kubernetes-specific file then contains the definitions to
  1168convert the generic objects to Kubernetes.
  1169
Overall, the code modeling our services and the code generating the Kubernetes
code are separated, while still allowing us to inject Kubernetes-specific
data into our general model.
At the same time, we can add additional information to our model without
it ending up in the Kubernetes definitions and causing Kubernetes to barf.
  1175
  1176
  1177### Deployment Definition
  1178
  1179For our design we assume that all Kubernetes Pod derivatives only define one
  1180container.
This is clearly not the case in general, but it often is, and it is good
practice.
  1183Conveniently, it simplifies our model as well.
  1184
  1185We base the model loosely on the master templates we derived in
  1186Section "Quick 'n Dirty".
  1187The first step we took is to eliminate `statefulSet` and `daemonSet` and
  1188rather just have a `deployment` allowing different kinds.
  1189
  1190```
  1191deployment: [Name=_]: _base & {
  1192    name:     *Name | string
  1193    ...
  1194```
  1195
  1196The kind only needs to be specified if the deployment is a stateful set or
  1197daemonset.
  1198This also eliminates the need for `_spec`.
  1199
The next step is to pull common fields, such as `image`, to the top level.
  1201
  1202Arguments can be specified as a map.
  1203```
  1204    arg: [string]: string
  1205    args: [for k, v in arg { "-\(k)=\(v)" }] | [...string]
  1206```
  1207
  1208If order matters, users could explicitly specify the list as well.
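
The sketch below shows both ways of specifying arguments side by side.
The definition name `#Args` and the argument values are hypothetical, and the leading `*` is an addition in this sketch: it marks the generated list as the default, so that it is picked when no explicit list is given.

```
#Args: {
	arg: [string]: string
	args: *[for k, v in arg {"-\(k)=\(v)"}] | [...string]
}

// Arguments given as a map: args is generated from it.
fromMap: #Args & {
	arg: {
		env:    "prod"
		logdir: "/logs"
	}
}

// Arguments given as an explicit list, for when order matters.
explicit: #Args & {
	args: ["-logdir=/logs", "-env=prod"]
}
```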
  1209
  1210For ports we define two simple maps from name to port number:
  1211
  1212```
    // expose port defines named ports that are exposed in the service
  1214    expose: port: [string]: int
  1215
  1216    // port defines a named port that is not exposed in the service.
  1217    port: [string]: int
  1218```
Both maps get defined in the container definition, but only `expose.port` gets
included in the service definition.
This may not be the best model, and does not support all features,
but it shows how one can choose a different representation.
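
A sketch of how the two maps could feed the container and the service.
The deployment, the port numbers, and the field names `containerPorts` and `servicePorts` are hypothetical.

```
// Hypothetical deployment with one exposed and one internal port.
deployment: waiter: {
	expose: port: http: 7080
	port: debug:        9090
}

// Container side: both maps, unified into one and turned into a list.
containerPorts: [
	for k, p in deployment.waiter.expose.port & deployment.waiter.port {
		name:          k
		containerPort: p
	},
]

// Service side: only the exposed ports.
servicePorts: [for k, p in deployment.waiter.expose.port {name: k, port: p}]
```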
  1223
  1224A similar story holds for environment variables.
In most cases mapping strings to strings suffices.
  1226The testdata uses other options though.
  1227We define a simple `env` map and an `envSpec` for more elaborate cases:
  1228
  1229```
  1230    env: [string]: string
  1231
  1232    envSpec: [string]: {}
  1233    envSpec: {
  1234        for k, v in env {
            "\(k)": value: v
  1236        }
  1237    }
  1238```
  1239The simple map automatically gets mapped into the more elaborate map
  1240which then presents the full picture.
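
The following sketch combines a simple and an elaborate variable.
The variable names and the `envList` field are hypothetical; `envList` only illustrates how a Kubernetes-style list might be derived from `envSpec`.

```
env: [string]: string

envSpec: [string]: {}
envSpec: {
	for k, v in env {
		"\(k)": value: v
	}
}

// Simple case: a plain string value.
env: LOG_LEVEL: "debug"

// Elaborate case (hypothetical): a value taken from the Pod's metadata.
envSpec: POD_NAME: valueFrom: fieldRef: fieldPath: "metadata.name"

// Both kinds end up in envSpec, from which a Kubernetes-style list follows.
envList: [for k, v in envSpec {v, name: k}]
```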
  1241
  1242Finally, our assumption that there is one container per deployment allows us
  1243to create a single definition for volumes, combining the information for
  1244volume spec and volume mount.
  1245
  1246```
  1247    volume: [Name=_]: {
  1248        name:       *Name | string
  1249        mountPath:  string
  1250        subPath:    null | string
  1251        readOnly:   bool
  1252        kubernetes: {}
  1253    }
  1254```
  1255
All other fields that we may want to define can go into a generic `kubernetes`
struct that gets merged in with all other generated Kubernetes data.
  1258This even allows us to augment generated data, such as adding additional
  1259fields to the container.
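
As a sketch of how one volume entry can serve both purposes, the example below derives a mount list and a volume list from a single definition.
The volume itself and the field names `volumeMounts` and `volumes` at the top level are hypothetical, and the template is reduced to the fields the sketch needs.

```
volume: [Name=_]: {
	name:       *Name | string
	mountPath:  string
	kubernetes: {}
}

// Hypothetical volume backed by a ConfigMap.
volume: "config-volume": {
	mountPath: "/etc/prometheus"
	kubernetes: configMap: name: "prometheus"
}

// Mount information goes to the container ...
volumeMounts: [for v in volume {name: v.name, mountPath: v.mountPath}]

// ... while the raw kubernetes struct is merged into the Pod's volume spec.
volumes: [for v in volume {v.kubernetes, name: v.name}]
```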
  1260
  1261
  1262### Service Definition
  1263
  1264The service definition is straightforward.
  1265As we eliminated stateful and daemon sets, the field comprehension to
  1266automatically derive a service is now a bit simpler:
  1267
  1268```
  1269// define services implied by deployments
  1270service: {
  1271    for k, spec in deployment {
  1272        "\(k)": {
  1273            // Copy over all ports exposed from containers.
  1274            for Name, Port in spec.expose.port {
  1275                port: "\(Name)": {
  1276                    port:       *Port | int
  1277                    targetPort: *Port | int
  1278                }
  1279            }
  1280
  1281            // Copy over the labels
  1282            label: spec.label
  1283        }
  1284    }
  1285}
  1286```
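
The defaults `*Port | int` leave room for overrides.
In the sketch below, which repeats the comprehension so that it is self-contained, a hypothetical `waiter` deployment exposes port 7080 and the derived service is then adjusted to listen on port 80:

```
deployment: waiter: {
	expose: port: http: 7080
	label: app: "waiter"
}

service: {
	for k, spec in deployment {
		"\(k)": {
			for Name, Port in spec.expose.port {
				port: "\(Name)": {
					port:       *Port | int
					targetPort: *Port | int
				}
			}
			label: spec.label
		}
	}
}

// Override only the service-facing port; targetPort keeps its default.
service: waiter: port: http: port: 80
```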
  1287
  1288The complete top-level model definitions can be found at
  1289[doc/tutorial/kubernetes/manual/services/cloud.cue](https://review.gerrithub.io/plugins/gitiles/cue-lang/cue/+/refs/heads/master/doc/tutorial/kubernetes/manual/services/cloud.cue).
  1290
  1291The tailorings for this specific project (the labels) are defined
  1292[here](https://review.gerrithub.io/plugins/gitiles/cue-lang/cue/+/refs/heads/master/doc/tutorial/kubernetes/manual/services/kube.cue).
  1293
  1294
  1295### Converting to Kubernetes
  1296
  1297Converting services is fairly straightforward.
  1298
  1299```
  1300kubernetes: services: {
  1301    for k, x in service {
  1302        "\(k)": x.kubernetes & {
  1303            apiVersion: "v1"
  1304            kind:       "Service"
  1305
  1306            metadata: name:   x.name
  1307            metadata: labels: x.label
  1308            spec: selector:   x.label
  1309
  1310            spec: ports: [for p in x.port { p }]
  1311        }
  1312    }
  1313}
  1314```
  1315
  1316We add the Kubernetes boilerplate, map the top-level fields and mix in
  1317the raw `kubernetes` fields for each service.
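
As an illustration of the mix-in, a service could carry a field that the generic model does not cover.
The fragment below is a hypothetical input; with the conversion shown above, `spec.type` would be unified into the generated Service next to the generated `spec.selector` and `spec.ports`.

```
// Hypothetical service that needs a field the generic model does not cover.
service: waiter: {
	label: app: "waiter"
	port: http: {port: 7080, targetPort: 7080}

	// Not part of the generic model; unified into the generated object.
	kubernetes: spec: type: "LoadBalancer"
}
```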
  1318
  1319Mapping deployments is a bit more involved, though analogous.
  1320The complete definitions for Kubernetes conversions can be found at
  1321[doc/tutorial/kubernetes/manual/services/k8s.cue](https://review.gerrithub.io/plugins/gitiles/cue-lang/cue/+/refs/heads/master/doc/tutorial/kubernetes/manual/services/k8s.cue).
  1322
  1323Converting the top-level definitions to concrete Kubernetes code is the hardest
  1324part of this exercise.
  1325That said, most CUE users will never have to resort to this level of CUE
  1326to write configurations.
  1327For instance, none of the files in the subdirectories contain comprehensions,
  1328not even the template files in these directories (such as `kitchen/kube.cue`).
  1329Furthermore, none of the configuration files in any of the
  1330leaf directories contain string interpolations.
  1331
  1332
  1333### Metrics
  1334
  1335The fully written out manual configuration can be found in the `manual`
  1336subdirectory.
Running our usual count yields:
  1338```
  1339$ find . | grep kube.cue | xargs wc | tail -1
  1340     542    1190   11520 total
  1341```
  1342This does not count our conversion templates.
  1343Assuming that the top-level templates are reusable, and if we don't count them
  1344for both approaches, the manual approach shaves off about another 150 lines.
  1345If we count the templates as well, the two approaches are roughly equal.
  1346
  1347
  1348### Conclusions Manual Configuration
  1349
  1350We have shown that we can further compact a configuration by manually
  1351optimizing template files.
  1352However, we have also shown that the manual optimization only gives
  1353a marginal benefit with respect to the quick-and-dirty semi-automatic reduction.
The benefit of the manual definition lies largely in the organizational
flexibility one gets.
  1356
  1357Manually tailoring your configurations allows creating an abstraction layer
  1358between logical definitions and Kubernetes-specific definitions.
  1359At the same time, CUE's order independence
  1360makes it easy to mix in low-level Kubernetes configuration wherever it is
  1361convenient and applicable.
  1362
  1363Manual tailoring also allows us to add our own definitions without breaking
  1364Kubernetes.
This is crucial for defining information that is relevant to our definitions,
but unrelated to Kubernetes, in the place where it belongs.
  1367
  1368Separating abstract from concrete configuration also allows us to create
different adaptors for the same configuration.
  1370
  1371
  1372<!-- TODO:
  1373## Conversion to `docker-compose`
  1374-->