# GEP-713: Metaresources and Policy Attachment

* Issue: [#713](https://github.com/kubernetes-sigs/gateway-api/issues/713)
* Status: Experimental

> **Note**: This GEP is exempt from the [Probationary Period][expprob] rules of
> our GEP overview as it existed before those rules did, and so it has been
> explicitly grandfathered in.

[expprob]:https://gateway-api.sigs.k8s.io/geps/overview/#probationary-period

## TLDR

This GEP aims to standardize terminology and processes around using one Kubernetes
object to modify the functions of one or more other objects.

This GEP defines some terms, firstly: _Metaresource_.

A Kubernetes object that _augments_ the behavior of another object
in a standard way is called a _Metaresource_.

This document proposes controlling the creation of configuration in the underlying
Gateway data plane using two types of Policy Attachment.
A "Policy Attachment" is a specific type of _metaresource_ that can affect specific
settings across either one object (this is "Direct Policy Attachment"), or objects
in a hierarchy (this is "Inherited Policy Attachment").

Individual policy APIs:
- must be their own CRDs (e.g. `TimeoutPolicy`, `RetryPolicy` etc.),
- can be included in the Gateway API group and installation or be defined by
  implementations,
- must include a common `TargetRef` struct in their specification to identify
  how and where to apply that policy,
- _may_ include either a `defaults` section, an `overrides` section, or both. If
  these are included, the Policy is an Inherited Policy, and should use the
  inheritance rules defined in this document.

For Inherited Policies, this GEP also describes a set of expected behaviors
for how settings can flow across a defined hierarchy.

## Goals

* Establish a pattern for Policy resources which will be used for any policies
  included in the Gateway API spec
* Establish a pattern for Policy attachment, whether Direct or Inherited,
  which must be used for any implementation specific policies used with
  Gateway API resources
* Provide a way to distinguish between required and default values for all
  policy API implementations
* Enable policy attachment at all relevant scopes, including Gateways, Routes,
  Backends, along with how values should flow across a hierarchy if necessary
* Ensure the policy attachment specification is generic and forward thinking
  enough that it could be easily adapted to other grouping mechanisms like
  Namespaces in the future
* Provide a means of attachment that works for both ingress and mesh
  implementations of this API
* Provide a consistent specification that will ensure familiarity between both
  included and implementation-specific policies so they can both be interpreted
  the same way.

## Out of scope

* Define all potential policies that may be attached to resources
* Design the full structure and configuration of policies

## Background and concepts

When designing Gateway API, one of the things we’ve found is that we often need to be
able to change the behavior of objects without being able to make changes to the spec
of those objects. Sometimes, this is because we can’t change the spec of the object
to hold the information we need (ReferenceGrant, from
[GEP-709](https://gateway-api.sigs.k8s.io/geps/gep-709/), affecting Secrets
and Services is an example, as is Direct Policy Attachment), and sometimes it’s
because we want the behavior change to flow across multiple objects
(this is what Inherited Policy Attachment is for).

To put this another way, sometimes we need ways to be able to affect how an object
is interpreted in the API, without representing the description of those effects
inside the spec of the object.

This document describes the ways we design objects to meet these two use cases,
and why you might choose one or the other.

We use the term “metaresource” to describe the class of objects that _only_ augment
the behavior of another Kubernetes object, regardless of what they are targeting.

“Meta” here is used in its Greek sense of “more comprehensive”
or “transcending”, and “resource” rather than “object” because “metaresource”
is more pronounceable than “metaobject”. Additionally, a single word is better
than a phrase like “wrapper object” or “wrapper resource” overall, although both
of those terms are effectively synonymous with “metaresource”.

A "Policy Attachment" is a metaresource that affects the fields in existing objects
(like Gateway or Routes), or influences the configuration that's generated in an
underlying data plane.

"Direct Policy Attachment" is when a Policy object references a single object _only_,
and only modifies the fields of or the configuration associated with that object.

"Inherited Policy Attachment" is when a Policy object references a single object
_and any child objects of that object_ (according to some defined hierarchy), and
modifies fields of the child objects, or configuration associated with the child
objects.

In either case, a Policy may either affect an object by controlling the value
of one of the existing _fields_ in the `spec` of an object, or it may add
additional fields that are _not_ in the `spec` of the object.

### Direct Policy Attachment

A Direct Policy Attachment is tightly bound to one instance of a particular
Kind within a single namespace (or to an instance of a single Kind at cluster scope),
and only modifies the behavior of the object that matches its binding.

As an example, one use case that Gateway API currently does not support is how
to configure details of the TLS required to connect to a backend (in other words,
if the process running inside the backend workload expects TLS, not that some
automated infrastructure layer is provisioning TLS as in the Mesh case).

A hypothetical TLSConnectionPolicy that targets a Service could be used for this,
relying on the Service's role of describing a set of endpoints. (It
should also be noted this is not the only way to solve this problem, just an
example to illustrate Direct Policy Attachment.)

The TLSConnectionPolicy would look something like this:

```yaml
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: TLSConnectionPolicy
metadata:
  name: tlsport8443
  namespace: foo
spec:
  targetRef: # This struct is defined as part of Gateway API
    group: "" # Empty string means core - this is a standard convention
    kind: Service
    name: fooService
  tls:
    certificateAuthorityRefs:
      - name: CAcert
    port: 8443
```

All this does is tell an implementation that, when connecting to port `8443` on the
Service `fooService`, it should assume that the connection is TLS, and expect the
service's certificate to be validated by the chain in the `CAcert` Secret.

Importantly, this would apply to _every_ usage of that Service across any HTTPRoutes
in that namespace, which could be useful for a Service that is reused in a lot of
HTTPRoutes.

With this example in mind, here are some guidelines for when to consider
using Direct Policy Attachment:

* The number or scope of objects to be modified is limited or singular. Direct
  Policy Attachments must target one specific object.
* The modifications to be made to the objects don’t have any transitive information -
  that is, the modifications only affect the single object that the targeted
  metaresource is bound to, and don’t have ramifications that flow beyond that
  object.
* In terms of status, it should be reasonably easy for a user to understand that
  everything is working - basically, as long as the targeted object exists, and
  the modifications are valid, the metaresource is valid, and this should be
  straightforward to communicate in one or two Conditions. Note that at the time
  of writing, this is *not* completed.
* Direct Policy Attachment _should_ only be used to target objects in the same
  namespace as the Policy object. Allowing cross-namespace references brings in
  significant security concerns, and/or difficulties about merging cross-namespace
  policy objects. Notably, Mesh use cases may need to do something like this for
  consumer policies, but in general, Policy objects that modify the behavior of
  things outside their own namespace should be avoided unless they use a handshake
  of some sort, where the things outside the namespace can opt out of the behavior.
  (Notably, this is the design that we used for ReferenceGrant).

### Inherited Policy Attachment: It's all about the defaults and overrides

Because an Inherited Policy is a metaresource, it targets some other resource
and _augments_ its behavior.

But why have this distinct from other types of metaresource? Because Inherited
Policy resources are designed to have a way for settings to flow down a hierarchy.

Defaults set the default value for something, and can be overridden by the
“lower” objects (like a connection timeout default policy on a Gateway being
overridable inside an HTTPRoute), and Overrides cannot be overridden by “lower”
objects (like setting a maximum client timeout to some non-infinite value at the
Gateway level to stop HTTPRoute owners from leaking connections over time).

Here are some guidelines for when to consider using an Inherited Policy object:

* The settings or configuration are bound to one containing object, but affect
  other objects attached to that one (for example, affecting HTTPRoutes attached
  to a single Gateway, or all HTTPRoutes in a GatewayClass).
* The settings need to be able to be defaulted, but can be overridden on a per-object
  basis.
* The settings must be enforced by one persona, and not modifiable or removable
  by a lesser-privileged persona. (The owner of a GatewayClass may want to restrict
  something about all Gateways in a GatewayClass, regardless of who owns the Gateway,
  or a Gateway owner may want to enforce some setting across all attached HTTPRoutes).
* In terms of status, a good accounting for how to record that the Policy is
  attached is easy, but recording what resources the Policy is being applied to
  is not, and needs to be carefully designed to avoid fanout apiserver load.
  (This is not built at all in the current design either).

When multiple Inherited Policies are used, they can interact in various ways.
Those interactions are governed by the following rules, which are expanded on
later in this document.

* If a Policy does not affect an object's fields directly, then the resultant
  Policy should be the set of all distinct fields inside the relevant Policy objects,
  as set out by the rules below.
* For Policies that affect an object's existing fields, multiple instances of the
  same Policy Kind affecting an object's fields will be evaluated as
  though only a single Policy "wins" the right to affect each field. This operation
  is performed on a _per-distinct-field_ basis.
* Settings in `overrides` stanzas will win over the same setting in a `defaults`
  stanza.
* `overrides` settings operate in a "less specific beats more specific" fashion -
  Policies attached _higher_ up the hierarchy will beat the same type of Policy
  attached further down the hierarchy.
* `defaults` settings operate in a "more specific beats less specific" fashion -
  Policies attached _lower down_ the hierarchy will beat the same type of Policy
  attached further _up_ the hierarchy.
* For `defaults`, the _most specific_ value is the one _inside the object_ that
  the Policy applies to; that is, if a Policy specifies a `default`, and an object
  specifies a value, the _object's_ value will win.
* Policies interact with the fields they are controlling in a "replace value"
  fashion.
  * Fields where the value is a scalar (like a string or a number)
    should have their value _replaced_ by the value in the Policy if it wins.
    Notably, this means that a `default` will only ever replace an empty or unset
    value in an object.
  * For fields where the value is an object, the Policy should include the fields
    in the object in its definition, so that the replacement can be on simple fields
    rather than complex ones.
  * For fields where the final value is non-scalar, but is not an _object_ with
    fields of its own, the value should be entirely replaced, _not_ merged. This
    means that lists of strings or lists of ints specified in a Policy will overwrite
    the empty list (in the case of a `default`) or any specified list (in the case
    of an `override`). The same applies to `map[string]string` fields. An example
    here would be a field that stores a map of annotations - specifying a Policy
    that overrides annotations will mean that a final object specifying those
    annotations will have its value _entirely replaced_ by an `override` setting.
* In the case that two Policies of the same type specify different fields, then
  _all_ of the specified fields should take effect on the affected object.

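As a rough illustration of the rules above, the sketch below resolves the
effective settings for one object. It assumes a hypothetical flattened
representation (`Settings`, a map from field name to scalar value) and a
hypothetical `resolve` helper; neither is part of Gateway API, and the field
names and values are invented for the example.

```go
package main

import "fmt"

// Settings is a hypothetical flattened view of one `defaults` or `overrides`
// stanza (or of the object's own spec): field name -> scalar value.
type Settings map[string]string

// resolve returns the effective settings for a single object.
// defaults and overrides are both ordered from the top of the hierarchy
// (least specific) to the bottom (most specific).
func resolve(object Settings, defaults, overrides []Settings) Settings {
	out := Settings{}
	// Defaults: apply top to bottom, so lower (more specific) levels win.
	for _, stanza := range defaults {
		for field, value := range stanza {
			out[field] = value
		}
	}
	// A value set in the object itself beats every default.
	for field, value := range object {
		out[field] = value
	}
	// Overrides: apply bottom to top, so higher (less specific) levels win.
	for i := len(overrides) - 1; i >= 0; i-- {
		for field, value := range overrides[i] {
			out[field] = value
		}
	}
	return out
}

func main() {
	// Index 0 is the Gateway level, index 1 is the HTTPRoute level.
	defaults := []Settings{
		{"connectionTimeout": "30s", "retries": "5"},
		{"connectionTimeout": "5s", "drainTimeout": "10s"},
	}
	overrides := []Settings{
		{"drainTimeout": "60s"},
		nil, // the HTTPRoute-level Policy sets no overrides
	}
	object := Settings{"retries": "2"}

	effective := resolve(object, defaults, overrides)
	for _, field := range []string{"connectionTimeout", "drainTimeout", "retries"} {
		fmt.Printf("%s=%s\n", field, effective[field])
	}
}
```

In this sketch the Route-level default wins for `connectionTimeout`, the
Gateway-level override wins for `drainTimeout`, and the object's own `retries`
value beats the default, so fields set by different stanzas all take effect.
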
Examples to further illustrate these rules are given below.

## Naming Policy objects

The preceding rules discuss how Policy objects should _behave_, but this section
describes how Policy objects should be _named_.

Policy objects should be clearly named so as to indicate that they are Policy
metaresources.

The simplest way to do that is to ensure that the type's name contains the `Policy`
string.

Implementations _should_ use `Policy` as the last part of the names of object types
that use this pattern.

If an implementation does not, then it _must_ clearly document which objects
are Policy metaresources in its documentation. Again, this is _not recommended_
without a _very_ good reason.

## Policy Attachment examples and behavior

This approach is building on concepts from all of the alternatives discussed
below. This is very similar to the (now removed) BackendPolicy resource in the API,
but also borrows some concepts from the [ServicePolicy
proposal](https://github.com/kubernetes-sigs/gateway-api/issues/611).

### Policy Attachment for Ingress
Attaching a Directly Attached Policy to Gateway resources for ingress use cases
is relatively straightforward. A policy can reference the resource it wants to
apply to.

Access is granted with RBAC - anyone that has access to create a RetryPolicy in
a given namespace can attach it to any resource within that namespace.

![Simple Ingress Example](images/713-ingress-simple.png)

An Inherited Policy can attach to a parent resource, and then each policy
applies to the referenced resource and everything below it in terms of hierarchy.
Although this example is likely more complex than many real world
use cases, it helps demonstrate how policy attachment can work across
namespaces.

![Complex Ingress Example](images/713-ingress-complex.png)

### Policy Attachment for Mesh
Although there is a great deal of overlap between ingress and mesh use cases,
mesh enables more complex policy attachment scenarios. For example, you may want
to apply policy to requests from a specific namespace to a backend in another
namespace.

![Simple Mesh Example](images/713-mesh-simple.png)

Policy attachment can be quite simple with mesh. Policy can be applied to any
resource in any namespace, but if the target is in a different namespace, the
policy can only apply to requests originating from the policy's own namespace.

At the other extreme, policy can be used to apply to requests from a specific
workload to a backend in another namespace. A route can be used to intercept
these requests and split them between different backends (foo-a and foo-b in
this case).

![Complex Mesh Example](images/713-mesh-complex.png)

### Policy TargetRef API

Each Policy resource MUST include a single `targetRef` field. It must not
target more than one resource at a time, but it can be used to target larger
resources such as Gateways or Namespaces that may apply to multiple child
resources.

As with most APIs, there are countless ways we could choose to expand this in
the future. This includes supporting multiple targetRefs and/or label selectors.
Although this would enable compelling functionality, it would increase the
complexity of an already complex API and potentially result in more conflicts
between policies. Although we may choose to expand the targeting capabilities
in the future, at this point it is strongly preferred to start with a simpler
pattern that still leaves room for future expansion.

The `targetRef` field MUST have the following structure:

```go
// PolicyTargetReference identifies an API object to apply policy to.
type PolicyTargetReference struct {
    // Group is the group of the target resource.
    //
    // +kubebuilder:validation:MinLength=1
    // +kubebuilder:validation:MaxLength=253
    Group string `json:"group"`

    // Kind is the kind of the target resource.
    //
    // +kubebuilder:validation:MinLength=1
    // +kubebuilder:validation:MaxLength=253
    Kind string `json:"kind"`

    // Name is the name of the target resource.
    //
    // +kubebuilder:validation:MinLength=1
    // +kubebuilder:validation:MaxLength=253
    Name string `json:"name"`

    // Namespace is the namespace of the referent. When unspecified, the local
    // namespace is inferred. Even when policy targets a resource in a different
    // namespace, it may only apply to traffic originating from the same
    // namespace as the policy.
    //
    // +kubebuilder:validation:MinLength=1
    // +kubebuilder:validation:MaxLength=253
    // +optional
    Namespace string `json:"namespace,omitempty"`
}
```

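The namespace inference described in the `Namespace` comment can be sketched
as follows. This is only an illustration for namespaced targets:
`targetNamespace` and `crossesNamespace` are hypothetical helpers, and the
struct is repeated without its markers so that the snippet compiles on its own.

```go
package main

import "fmt"

// PolicyTargetReference repeats the fields of the struct above, without the
// kubebuilder markers, so that this sketch is self-contained.
type PolicyTargetReference struct {
	Group     string
	Kind      string
	Name      string
	Namespace string
}

// targetNamespace returns the namespace a reference points at. An empty
// Namespace means "the namespace of the Policy itself".
func targetNamespace(ref PolicyTargetReference, policyNamespace string) string {
	if ref.Namespace == "" {
		return policyNamespace
	}
	return ref.Namespace
}

// crossesNamespace reports whether the Policy targets a resource outside its
// own namespace. In that case the Policy may only apply to traffic that
// originates from the Policy's namespace.
func crossesNamespace(ref PolicyTargetReference, policyNamespace string) bool {
	return targetNamespace(ref, policyNamespace) != policyNamespace
}

func main() {
	local := PolicyTargetReference{Kind: "Service", Name: "foo"}
	remote := PolicyTargetReference{Kind: "Service", Name: "foo", Namespace: "other"}

	fmt.Println(targetNamespace(local, "baker"), crossesNamespace(local, "baker"))
	fmt.Println(targetNamespace(remote, "baker"), crossesNamespace(remote, "baker"))
}
```
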
### Sample Policy API
The following structure can be used as a starting point for any Policy resource
using this API pattern. Note that the PolicyTargetReference struct defined above
will be distributed as part of the Gateway API.

```go
// ACMEServicePolicy provides a way to apply Service policy configuration with
// the ACME implementation of the Gateway API.
type ACMEServicePolicy struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`

    // Spec defines the desired state of ACMEServicePolicy.
    Spec ACMEServicePolicySpec `json:"spec"`

    // Status defines the current state of ACMEServicePolicy.
    Status ACMEServicePolicyStatus `json:"status,omitempty"`
}

// ACMEServicePolicySpec defines the desired state of ACMEServicePolicy.
type ACMEServicePolicySpec struct {
    // TargetRef identifies an API object to apply policy to.
    TargetRef gatewayv1a2.PolicyTargetReference `json:"targetRef"`

    // Override defines policy configuration that should override policy
    // configuration attached below the targeted resource in the hierarchy.
    // +optional
    Override *ACMEPolicyConfig `json:"override,omitempty"`

    // Default defines default policy configuration for the targeted resource.
    // +optional
    Default *ACMEPolicyConfig `json:"default,omitempty"`
}

// ACMEPolicyConfig contains ACME policy configuration.
type ACMEPolicyConfig struct {
    // Add configurable policy here
}

// ACMEServicePolicyStatus defines the observed state of ACMEServicePolicy.
type ACMEServicePolicyStatus struct {
    // Conditions describe the current conditions of the ACMEServicePolicy.
    //
    // +optional
    // +listType=map
    // +listMapKey=type
    // +kubebuilder:validation:MaxItems=8
    Conditions []metav1.Condition `json:"conditions,omitempty"`
}
```

### Hierarchy
Each policy MAY include default or override values. Default values are given
precedence from the bottom up, while override values are top down. That means
that a default attached to a Backend will have the highest precedence among
default values while an override value attached to a GatewayClass will have the
highest precedence overall.

![Ingress and Sidecar Hierarchy](images/713-hierarchy.png)

To illustrate this, consider 3 resources with the following hierarchy:
A > B > C. When attaching the concept of defaults and overrides to that, the
hierarchy would be expanded to this:

A override > B override > C override > C default > B default > A default.

Note that the hierarchy is reversed for defaults. The rationale here is that
overrides usually need to be enforced top down while defaults should apply to
the lowest resource first. For example, if an admin needs to attach required
policy, they can attach it as an override to a Gateway. That would have
precedence over Routes and Services below it. On the other hand, an app owner
may want to set a default timeout for their Service. That would have precedence
over defaults attached at higher levels such as Route or Gateway.

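The expanded ordering can be generated mechanically. The following sketch uses
a hypothetical `precedence` helper (not part of Gateway API) that takes the
hierarchy from top to bottom and returns the stanzas from highest to lowest
precedence.

```go
package main

import (
	"fmt"
	"strings"
)

// precedence lists stanzas from highest to lowest precedence for a hierarchy
// given top (least specific) to bottom (most specific).
func precedence(hierarchy []string) []string {
	order := make([]string, 0, 2*len(hierarchy))
	// Overrides are enforced top down.
	for _, level := range hierarchy {
		order = append(order, level+" override")
	}
	// Defaults are reversed: the lowest resource is considered first.
	for i := len(hierarchy) - 1; i >= 0; i-- {
		order = append(order, hierarchy[i]+" default")
	}
	return order
}

func main() {
	fmt.Println(strings.Join(precedence([]string{"A", "B", "C"}), " > "))
	// Prints: A override > B override > C override > C default > B default > A default
}
```
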
If using defaults _and_ overrides, each policy resource MUST include 2 structs
within the spec. One with override values and the other with default values.

In the following example, the policy attached to the Gateway requires cdn to
be enabled and provides some default configuration for that. The policy attached
to the Route changes the value for one of those fields (includeQueryString).

```yaml
kind: CDNCachingPolicy # Example of implementation specific policy name
spec:
  override:
    cdn:
      enabled: true
  default:
    cdn:
      cachePolicy:
        includeHost: true
        includeProtocol: true
        includeQueryString: true
  targetRef:
    kind: Gateway
    name: example
---
kind: CDNCachingPolicy
spec:
  default:
    cdn:
      cachePolicy:
        includeQueryString: false
  targetRef:
    kind: HTTPRoute
    name: example
```

In this final example, we can see how the override attached to the Gateway has
precedence over the default drainTimeout value attached to the Route. At the
same time, we can see that the default connectionTimeout attached to the Route
has precedence over the default attached to the Gateway.

Also note how the different resources interact - fields that are not common across
objects _may_ both end up affecting the final object.

![Inherited Policy Example](images/713-policy-hierarchy.png)

#### Supported Resources
It is important to note that not every implementation will be able to support
policy attachment to each resource described in the hierarchy above. When that
is the case, implementations MUST clearly document which resources a policy may
be attached to.

#### Attaching Policy to GatewayClass
GatewayClass may be the trickiest resource to attach policy to. Policy
attachment relies on the policy being defined within the same scope as the
target. This ensures that only users with write access to a policy resource in a
given scope will be able to modify policy at that level. Since GatewayClass is a
cluster scoped resource, this means that any policy attached to it must also be
cluster scoped.

GatewayClass parameters provide an alternative to policy attachment that may be
easier for some implementations to support. These parameters can similarly be
used to set defaults and requirements for an entire GatewayClass.

### Targeting External Services
In some cases (likely limited to mesh) we may want to apply policies to requests
to external services. To accomplish this, implementations can choose to support
a reference to a virtual resource type:

```yaml
apiVersion: networking.acme.io/v1alpha1
kind: RetryPolicy
metadata:
  name: foo
spec:
  default:
    maxRetries: 5
  targetRef:
    group: networking.acme.io
    kind: ExternalService
    name: foo.com
```

### Merging into existing `spec` fields

It's possible (even likely) that configuration in a Policy may need to be merged
into an existing object's fields somehow, particularly for Inherited policies.

When merging into existing fields inside an object, Policy objects should
merge values at a scalar level, not at a struct or object level.

For example, in the `CDNCachingPolicy` example above, the `cdn` struct contains
a `cachePolicy` struct that contains fields. If an implementation was merging
this configuration into an existing object that contained the same fields, it
should merge the fields at a scalar level, with the `includeHost`,
`includeProtocol`, and `includeQueryString` values being defaulted if they were
not specified in the object being controlled. Similarly, for `overrides`, the
values of the innermost scalar fields should overwrite the scalar fields in the
affected object.

Implementations should not copy any structs from the Policy object directly into the
affected object; any fields that _are_ overridden should be overridden on a per-field
basis.

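A minimal sketch of scalar-level merging for the `cachePolicy` fields is shown
below. The `CachePolicy` Go type, the pointer-based "unset" convention, and the
helper functions are assumptions made for illustration; they are not defined by
this GEP.

```go
package main

import "fmt"

// CachePolicy is a hypothetical Go shape for the `cachePolicy` struct in the
// CDNCachingPolicy example. Pointers distinguish "unset" from "false".
type CachePolicy struct {
	IncludeHost        *bool
	IncludeProtocol    *bool
	IncludeQueryString *bool
}

// defaulted returns the object's value if set, otherwise the Policy default.
func defaulted(object, def *bool) *bool {
	if object != nil {
		return object
	}
	return def
}

// overridden returns the Policy override if set, otherwise the object's value.
func overridden(object, override *bool) *bool {
	if override != nil {
		return override
	}
	return object
}

// applyDefaults merges field by field; the Policy struct is never copied whole.
func applyDefaults(object, defaults CachePolicy) CachePolicy {
	return CachePolicy{
		IncludeHost:        defaulted(object.IncludeHost, defaults.IncludeHost),
		IncludeProtocol:    defaulted(object.IncludeProtocol, defaults.IncludeProtocol),
		IncludeQueryString: defaulted(object.IncludeQueryString, defaults.IncludeQueryString),
	}
}

// applyOverrides overwrites only the scalar fields that the override sets.
func applyOverrides(object, overrides CachePolicy) CachePolicy {
	return CachePolicy{
		IncludeHost:        overridden(object.IncludeHost, overrides.IncludeHost),
		IncludeProtocol:    overridden(object.IncludeProtocol, overrides.IncludeProtocol),
		IncludeQueryString: overridden(object.IncludeQueryString, overrides.IncludeQueryString),
	}
}

func boolPtr(b bool) *bool { return &b }

func main() {
	object := CachePolicy{IncludeQueryString: boolPtr(false)}
	defaults := CachePolicy{
		IncludeHost:        boolPtr(true),
		IncludeProtocol:    boolPtr(true),
		IncludeQueryString: boolPtr(true),
	}
	merged := applyDefaults(object, defaults)
	fmt.Println(*merged.IncludeHost, *merged.IncludeProtocol, *merged.IncludeQueryString)
	// Prints: true true false
}
```
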
In the case that the field in the Policy affects a struct that is a member of a list,
each existing item in the list in the affected object should have each of its
fields compared to the corresponding fields in the Policy.

For non-scalar field _values_, like a list of strings, or a `map[string]string`
value, the _entire value_ must be overwritten by the value from the Policy. No
merging should take place. This mainly applies to `overrides`, since for
`defaults`, there should be no value present in a field on the final object.

This table shows how this works for various types:

|Type|Object config|Override Policy config|Result|
|----|-------------|----------------------|------|
|string| `key: "foo"` | `key: "bar"`  | `key: "bar"` |
|list| `key: ["a","b"]` | `key: ["c","d"]` | `key: ["c","d"]` |
|`map[string]string`| `key: {"foo": "a", "bar": "b"}` | `key: {"foo": "c", "bar": "d"}` | `key: {"foo": "c", "bar": "d"}` |

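The "replace, do not merge" behavior for lists and maps can be sketched as
below. `overrideList` and `overrideMap` are hypothetical helpers, and treating a
nil Policy value as "field not set" is an assumption of this sketch. Unlike the
table, the map example uses different keys on each side, to show that keys
absent from the Policy do not survive.

```go
package main

import "fmt"

// overrideList returns the value an `override` produces for a list field.
// If the Policy sets the field, its list replaces the object's list outright.
func overrideList(object, policy []string) []string {
	if policy == nil {
		return object // the Policy does not set this field
	}
	out := make([]string, len(policy))
	copy(out, policy)
	return out
}

// overrideMap does the same for map[string]string fields: no key-level merge.
func overrideMap(object, policy map[string]string) map[string]string {
	if policy == nil {
		return object // the Policy does not set this field
	}
	out := make(map[string]string, len(policy))
	for key, value := range policy {
		out[key] = value
	}
	return out
}

func main() {
	fmt.Println(overrideList([]string{"a", "b"}, []string{"c", "d"}))

	annotations := overrideMap(
		map[string]string{"foo": "a", "bar": "b"},
		map[string]string{"foo": "c"},
	)
	// "bar" is gone: the whole map was replaced, not merged.
	fmt.Println(len(annotations), annotations["foo"])
}
```
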
### Conflict Resolution
It is possible for multiple policies to target the same object _and_ the same
fields inside that object. If multiple policy resources target
the same resource _and_ have an identical field specified with different values,
precedence MUST be determined in order of the following criteria, continuing on
ties:

* Direct Policies override Inherited Policies. If preventing settings from
  being overwritten is important, implementations should only use Inherited
  Policies, and the `override` stanza that implies. Note also that it's not
  intended that Direct and Inherited Policies should overlap, so this should
  only come up in exceptional circumstances.
* Inside Inherited Policies, the same setting in `overrides` beats the one in
  `defaults`.
* The oldest Policy based on creation timestamp. For example, a Policy with a
  creation timestamp of "2021-07-15 01:02:03" is given precedence over a Policy
  with a creation timestamp of "2021-07-15 01:02:04".
* The Policy appearing first in alphabetical order by `{namespace}/{name}`. For
  example, foo/bar is given precedence over foo/baz.

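The tie-breaking order can be expressed as a comparison function. The sketch
below is illustrative only: the `Policy` type, its `Direct` flag, and the `at`
helper are hypothetical, and the `overrides`-versus-`defaults` criterion is not
modelled because it applies per stanza rather than per Policy object.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Policy is a hypothetical, minimal view of a Policy object that carries
// only what the tie-breaking rules need.
type Policy struct {
	Namespace string
	Name      string
	Created   time.Time
	Direct    bool // true for a Direct Policy, false for an Inherited Policy
}

// takesPrecedence reports whether a wins over b for a conflicting field.
func takesPrecedence(a, b Policy) bool {
	// Direct Policies override Inherited Policies.
	if a.Direct != b.Direct {
		return a.Direct
	}
	// The oldest Policy by creation timestamp wins.
	if !a.Created.Equal(b.Created) {
		return a.Created.Before(b.Created)
	}
	// Final tie-break: alphabetical order of {namespace}/{name}.
	return a.Namespace+"/"+a.Name < b.Namespace+"/"+b.Name
}

// at is a hypothetical helper that builds 2021-07-15 01:02:SS UTC timestamps.
func at(second int) time.Time {
	return time.Date(2021, time.July, 15, 1, 2, second, 0, time.UTC)
}

func main() {
	policies := []Policy{
		{Namespace: "foo", Name: "qux", Created: at(4)},
		{Namespace: "foo", Name: "baz", Created: at(3)},
		{Namespace: "foo", Name: "bar", Created: at(3)},
	}
	// Sort so that the winning Policy comes first.
	sort.SliceStable(policies, func(i, j int) bool {
		return takesPrecedence(policies[i], policies[j])
	})
	for _, p := range policies {
		fmt.Printf("%s/%s created %s\n", p.Namespace, p.Name, p.Created.Format(time.RFC3339))
	}
}
```
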
For a better user experience, a validating webhook can be implemented to prevent
these kinds of conflicts altogether.

## Status and the Discoverability Problem

So far, this document has talked about what Policy Attachment is, different types
of attachment, and how those attachments work.

Probably the biggest impediment to this GEP moving forward is the discoverability
problem; that is, it’s critical that an object owner be able to know what policy
is affecting their object, and ideally its contents.

To understand this a bit better, let’s consider this parable, with thanks to Flynn:

### The Parable

   593It's a sunny Wednesday afternoon, and the lead microservices developer for
   594Evil Genius Cupcakes is windsurfing. Work has been eating Ana alive for the
   595past two and a half weeks, but after successfully deploying version 3.6.0 of
   596the `baker` service this morning, she's escaped early to try to unwind a bit.
   597
   598Her shoulders are just starting to unknot when her phone pings with a text
   599from Charlie, down in the NOC. Waterproof phones are a blessing, but also a
   600curse.
   601
   602**Charlie**: _Hey Ana. Things are still running, more or less, but latencies
   603on everything in the `baker` namespace are crazy high after your last rollout,
   604and `baker` itself has a weirdly high load. Sorry to interrupt you on the lake
   605but can you take a look? Thanks!!_
   606
   607Ana stares at the phone for a long moment, heart sinking, then sighs and
   608turns back to shore.
   609
   610What she finds when dries off and grabs her laptop is strange. `baker` does
   611seem to be taking much more load than its clients are sending, and its clients
   612report much higher latencies than they’d expect. She doublechecks the
   613Deployment, the Service, and all the HTTPRoutes around `baker`; everything
   614looks good. `baker`’s logs show her mostly failed requests... with a lot of
   615duplicates? Ana checks her HTTPRoute again, though she's pretty sure you
   616can't configure retries there, and finds nothing. But it definitely looks like
   617clients are retrying when they shouldn’t be.
   618
   619She pings Charlie.
   620
**Ana**: _Hey Charlie. Something weird is up, looks like requests to `baker`
are failing but getting retried??_

A minute later they answer.

**Charlie**: 🤷 _Did you configure retries?_

**Ana**: _Dude. I don’t even know how to._ 😂

**Charlie**: _You just attach a RetryPolicy to your HTTPRoute._

**Ana**: _Nope. Definitely didn’t do that._

She types `kubectl get retrypolicy -n baker` and gets a permission error.

**Ana**: _Huh, I actually don’t have permissions for RetryPolicy._ 🤔

**Charlie**: 🤷 _Feels like you should but OK, guess that can’t be it._

Minutes pass while both look at logs.

**Charlie**: _I’m an idiot. There’s a RetryPolicy for the whole namespace –
sorry, too many policies in the dashboard and I missed it. Deleting that since
you don’t want retries._

**Ana**: _Are you sure that’s a good–_

Ana’s phone shrills while she’s typing, and she drops it. When she picks it
up again she sees a stack of alerts. She goes pale as she quickly flips
through them: there’s one for every single service in the `baker` namespace.

**Ana**: _PUT IT BACK!!_

**Charlie**: _Just did. Be glad you couldn't hear all the alarms here._ 😕

**Ana**: _What the hell just happened??_

**Charlie**: _At a guess, all the workloads in the `baker` namespace actually
fail a lot, but they seem OK because there are retries across the whole
namespace?_ 🤔

Ana's blood runs cold.

**Charlie**: _Yeah. Looking a little closer, I think your `baker` rollout this
morning would have failed without those retries._ 😕

There is a pause while Ana's mind races through increasingly unpleasant
possibilities.

**Ana**: _I don't even know where to start here. How long ago did that
RetryPolicy go in? Is it the only thing like it?_

**Charlie**: _Didn’t look closely before deleting it, but I think it said a few
months ago. And there are lots of different kinds of policy and lots of
individual policies, hang on a minute..._

**Charlie**: _Looks like about 47 for your chunk of the world, a couple hundred
system-wide._

**Ana**: 😱 _Can you tell me what they’re doing for each of our services? I
can’t even_ look _at these things._ 😕

**Charlie**: _That's gonna take a while. Our tooling to show us which policies
bind to a given workload doesn't go the other direction._

**Ana**: _...wait. You have to_ build tools _to know if retries are turned on??_

Pause.

**Charlie**: _Policy attachment is more complex than we’d like, yeah._ 😐
_Look, how ‘bout roll back your `baker` change for now? We can get together in
the morning and start sorting this out._

Ana shakes her head and rolls back her edits to the `baker` Deployment, then
sits looking out over the lake as the deployment progresses.

**Ana**: _Done. Are things happier now?_

**Charlie**: _Looks like, thanks. Reckon you can get back to your sailboard._ 🙂

Ana sighs.

**Ana**: _Wish I could. Wind’s died down, though, and it'll be dark soon.
Just gonna head home._

**Charlie**: _Ouch. Sorry to hear that._ 😐

One more look out at the lake.

**Ana**: _Thanks for the help. Wish we’d found better answers._ 😢

### The Problem, restated

What this parable makes clear is that, in the absence of information about what
Policy is affecting an object, it’s very easy to make poor decisions.

It’s critical that this proposal solve the problem of showing up to three things,
listed in increasing order of desirability:

- _That_ some Policy is affecting a particular object
- _Which_ Policy is (or Policies are) affecting a particular object
- _What_ settings in the Policy are affecting the object

In the parable, if Ana and Charlie had known that there were Policies affecting
the relevant object, then they could have gone looking for the relevant Policies
and things would have played out differently. If they had known which Policies,
they would not have needed to look as hard, and if they had known what settings
were being applied, the parable would have been very short indeed.

(There’s also another use case to consider, in that Charlie should have been able
to see that the Policy on the namespace was in use in many places before deleting
it.)

To put this another way, Policy Attachment is effectively adding a fourth Persona,
the Policy Admin, to Gateway API’s persona list, and without a solution to the
discoverability problem, their actions are largely invisible to the Application
Developer. Not only that, but their concerns cut across the previously established
levels.

![Gateway API diagram with Policy Admin](images/713-the-diagram-with-policy-admin.png)

From the Policy Admin’s point of view, they need to know across their whole remit
(which conceivably could be the whole cluster):

- _What_ Policy has been created
- _Where_ it’s applied
- _What_ the resultant policy is saying

These, again, come down to discoverability, and can probably be addressed at an
API level in ways similar to the Application Developer's concerns.

An important note here is that a key piece of information for Policy Admins and
Cluster Operators is “How many things does this Policy affect?”. In the parable,
this would have enabled Charlie to know that deleting the Namespace Policy would
affect many more people than just Ana.

### Problems we need to solve

Before we can get into solutions, we need to discuss the problems that solutions
may need to solve, so that we have some criteria for evaluating those solutions.

#### User discoverability

Let's go through the various users of Gateway API and what they need to know about
Policy Attachment.

In all of these cases, we should aim to keep the troubleshooting distance low;
that is, there should be a minimum of hops required between objects from the
one owned by the user to the one responsible for a setting.

Another way to think of the troubleshooting distance in this context is "How many
`kubectl` commands would the user need to run to understand that a Policy is relevant,
which Policy is relevant, and what configuration the full set of Policy is setting?"

##### Application Developer Discoverability

How does Ana, or any Application Developer who owns one or more Route objects, know
that their object is affected by Policy, which Policy is affecting it, and what
the content of the Policy is?

The best outcome is that Ana needs to look only at a specific Route to know what
Policy settings are being applied to that Route, and where they come from.
However, some of the other problems below make it very difficult to achieve this.

##### Policy Admin Discoverability

How does the Policy Admin know what Policy is applied where, and what the content
of that Policy is?
How do they validate that Policy is being used in ways acceptable to their organization?
For any given Policy object, how do they know how many places it's being used?

##### Cluster Admin Discoverability

The Cluster Admin has similar concerns to the Policy Admin, but with a focus on
being able to determine what's relevant when something is broken.

How does the Cluster Admin know what Policy is applied where, and what the content
of that Policy is?

For any given Policy object, how do they know how many places it's being used?

#### Evaluating and Displaying Resultant Policy

For any given Policy type, whether Direct Attached or Inherited, implementations
will need to be able to _calculate_ the resultant set of Policy to be able to
apply that Policy to the correct parts of their data plane configuration.
However, _displaying_ that resultant set of Policy in a way that is straightforward
for the various personas to consume is much harder.

The easiest possible option for Application Developers would be for the
implementation to make the full resultant set of Policy available in the status
of objects that the Policy affects. However, this runs into a few problems:

- The status needs to be namespaced by the implementation.
- The status could get large if there are a lot of Policy objects affecting an
  object.
- Building a common data representation pattern that can fit into a single common
  schema is not straightforward.
- Updating one Policy object could cause many affected objects to need to be
  updated themselves. This sort of fan-out problem can be very bad for apiserver
  load, particularly if Policy changes rapidly, there are a lot of objects, or both.

##### Status needs to be namespaced by implementation

Because an object can be affected by multiple implementations at once, any status
we add must be namespaced by the implementation.

In Route Parent status, we've used the parentRef plus the controller name for this.

For Policy, we can do something similar and namespace by the reference to the
implementation's controller name.

We can't easily namespace by the originating Policy because the source could be
more than one Policy object.

##### Creating common data representation patterns

The problem here is that we need to have a _common_ pattern for including the
details of an _arbitrarily defined_ object, and that pattern needs to be included
in the base API.

So we can't use structured data, because we have no way of knowing what the
structure will be beforehand.

This suggests that we need to use unstructured data for representing the main
body of an arbitrary Policy object.

Practically, this will need to be a string representation of the YAML form of the
body of the Policy object (absent the metadata part of every Kubernetes object).

Policy Attachment does not mandate anything about the design of the object's top
level except that it must be a Kubernetes object, so the only thing we can rely
on is the presence of the Kubernetes metadata elements: `apiVersion`, `kind`,
and `metadata`.

A string representation of the rest of the file is the best we can do here.

##### Fanout status update problems

The fanout problem is that, when an update takes place in a single object (a
Policy, or an object with a Policy attached), an implementation may need to
update _many_ objects if it needs to place details of what Policy applies, or
what the resultant set of policy is, on _every_ object.

Historically, this has been a risky strategy that needs to be carefully applied, as
it's an excellent way to create apiserver load problems, which can produce a large
range of bad effects for cluster stability.

This does not mean that we can't do anything at all that affects multiple objects,
but that we need to carefully consider what information is stored in status so
that not _every_ Policy update requires a status update.

#### Solution summary

Because Policy Attachment is a pattern for APIs, not an API, and needs to address
all the problems above, the strategy this GEP proposes is to define a range of
options for increasing the discoverability of Policy resources, and provide
guidelines for when they should be used.

It's likely that at some stage, the Gateway API CRDs will include some Policy
resources, and these will be designed with all these discoverability solutions
in mind.

### Solution cookbook

This section contains some required patterns for Policy objects and some
suggestions. Each will be marked as MUST, SHOULD, or MAY, using the standard
meanings of those terms.

Additionally, the status of each solution is noted at the beginning of the section.

#### Standard label on CRD objects

Status: Required

Each CRD that defines a Policy object MUST include a label that specifies that
it is a Policy object, and that label MUST specify the _type_ of Policy attachment
in use.

The label is `gateway.networking.k8s.io/policy: inherited|direct`.

This solution is intended to allow both users and tooling to identify which CRDs
in the cluster should be treated as Policy objects, and so is intended to help
with discoverability generally. It will also be used by the forthcoming `kubectl`
plugin.

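As a rough illustration of how tooling might consume this label, the sketch below classifies CRDs from their label maps. The function name `AttachmentType`, the CRD names, and the choice to match the values exactly as written above are assumptions made for this example, not part of the pattern.

```go
package main

import "fmt"

// PolicyLabelKey is the label key named in this section.
const PolicyLabelKey = "gateway.networking.k8s.io/policy"

// AttachmentType classifies a CRD from its labels. It returns the attachment
// type ("inherited" or "direct") and true for a Policy CRD, or "" and false
// when the label is absent or carries a value this sketch does not recognize.
// Exact, case-sensitive matching is an assumption of this example.
func AttachmentType(crdLabels map[string]string) (string, bool) {
	switch v := crdLabels[PolicyLabelKey]; v {
	case "inherited", "direct":
		return v, true
	default:
		return "", false
	}
}

func main() {
	// Hypothetical CRDs and label sets, as they might appear in CRD metadata.
	crds := []struct {
		name   string
		labels map[string]string
	}{
		{"retrypolicies.example.net", map[string]string{PolicyLabelKey: "inherited"}},
		{"tlspolicies.example.net", map[string]string{PolicyLabelKey: "direct"}},
		{"widgets.example.net", nil},
	}
	for _, crd := range crds {
		kind, ok := AttachmentType(crd.labels)
		fmt.Printf("%s: policy=%t type=%q\n", crd.name, ok, kind)
	}
}
```

A tool that lists CRDs could apply a check like this to each one to build the set of Policy kinds present in a cluster.
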
##### Design considerations

This is already part of the API pattern, but is being lifted to more prominence
here.

#### Standard status struct

Status: Experimental

Policy objects SHOULD use the upstream `PolicyAncestorStatus` struct in their respective
Status structs. Please see the included `PolicyAncestorStatus` struct, and its use in
the `BackendTLSPolicy` object for detailed examples. Included here is a representative
version.

This pattern enables different conditions to be set for different "Ancestors"
of the target resource. This is particularly helpful for policies that may be
implemented by multiple controllers or attached to resources with different
capabilities. This pattern also provides a clear view of what resources a
policy is affecting.

For the best integration with community tooling and consistency across
the broader community, we recommend that all implementations transition
to Policy status with this kind of nested structure.

This is an `Ancestor` status rather than a `Parent` status (as in the Route status)
because, for Policy attachment, the relevant object may or may not be the direct
parent.

For example, `BackendTLSPolicy` directly attaches to a Service, which may be included
in multiple Routes, in multiple Gateways. However, for many implementations,
the status of the `BackendTLSPolicy` will be different only at the Gateway level,
so Gateway is the relevant Ancestor for the status.

Each Gateway that has a Route that includes a backend with an attached `BackendTLSPolicy`
MUST have a separate `PolicyAncestorStatus` section in the `BackendTLSPolicy`'s
`status.ancestors` stanza, which mandates that entries must be distinct using the
combination of the `AncestorRef` and the `ControllerName` fields as a key.

See [GEP-1897][gep-1897] for the exact details.

[gep-1897]: /geps/gep-1897

```go
// PolicyAncestorStatus describes the status of a policy with respect to an
// associated Ancestor.
//
// Ancestors refer to objects that are either the Target of a policy or above it in terms
// of object hierarchy. For example, if a policy targets a Service, an Ancestor could be
// a Route or a Gateway.
//
// In the context of policy attachment, the Ancestor is used to distinguish which
// resource results in a distinct application of this policy. For example, if a policy
// targets a Service, it may have a distinct result per attached Gateway.
//
// Policies targeting the same resource may have different effects depending on the
// ancestors of those resources. For example, different Gateways targeting the same
// Service may have different capabilities, especially if they have different underlying
// implementations.
//
// For example, in BackendTLSPolicy, the Policy attaches to a Service that is
// used as a backend in a HTTPRoute that is itself attached to a Gateway.
// In this case, the relevant object for status is the Gateway, and that is the
// ancestor object referred to in this status.
//
// Note that a Target of a Policy is also a valid Ancestor, so for objects where
// the Target is the relevant object for status, this struct SHOULD still be used.
type PolicyAncestorStatus struct {
	// AncestorRef corresponds with a ParentRef in the spec that this
	// PolicyAncestorStatus struct describes the status of.
	AncestorRef ParentReference `json:"ancestorRef"`

	// ControllerName is a domain/path string that indicates the name of the
	// controller that wrote this status. This corresponds with the
	// controllerName field on GatewayClass.
	//
	// Example: "example.net/gateway-controller".
	//
	// The format of this field is DOMAIN "/" PATH, where DOMAIN and PATH are
	// valid Kubernetes names
	// (https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names).
	//
	// Controllers MUST populate this field when writing status. Controllers should ensure that
	// entries to status populated with their ControllerName are cleaned up when they are no
	// longer necessary.
	ControllerName GatewayController `json:"controllerName"`

	// Conditions describes the status of the Policy with respect to the given Ancestor.
	//
	// +listType=map
	// +listMapKey=type
	// +kubebuilder:validation:MinItems=1
	// +kubebuilder:validation:MaxItems=8
	Conditions []metav1.Condition `json:"conditions,omitempty"`
}

// PolicyStatus defines the common attributes that all Policies SHOULD include
// within their status.
type PolicyStatus struct {
	// Ancestors is a list of ancestor resources (usually Gateways) that are
	// associated with the policy, and the status of the policy with respect to
	// each ancestor. When this policy attaches to a parent, the controller that
	// manages the parent and the ancestors MUST add an entry to this list when
	// the controller first sees the policy and SHOULD update the entry as
	// appropriate when the relevant ancestor is modified.
	//
	// Note that choosing the relevant ancestor is left to the Policy designers;
	// an important part of Policy design is designing the right object level at
	// which to namespace this status.
	//
	// Note also that implementations MUST ONLY populate ancestor status for
	// the Ancestor resources they are responsible for. Implementations MUST
	// use the ControllerName field to uniquely identify the entries in this list
	// that they are responsible for.
	//
	// A maximum of 32 ancestors will be represented in this list. An empty list
	// means the Policy is not relevant for any ancestors.
	//
	// +kubebuilder:validation:MaxItems=32
	Ancestors []PolicyAncestorStatus `json:"ancestors"`
}
```

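To make the keying rule concrete, here is a minimal sketch of how a controller might add or update its own entry in `status.ancestors` while leaving other controllers' entries alone. The types are simplified stand-ins defined only for this example (they are not the upstream `ParentReference` or `metav1.Condition`), and `UpsertAncestor` is a hypothetical helper name.

```go
package main

import "fmt"

// AncestorRef is a simplified stand-in for the upstream ParentReference.
type AncestorRef struct {
	Kind, Namespace, Name string
}

// Condition is a simplified stand-in for metav1.Condition.
type Condition struct {
	Type, Status, Reason string
}

// AncestorStatus is a simplified stand-in for PolicyAncestorStatus.
type AncestorStatus struct {
	AncestorRef    AncestorRef
	ControllerName string
	Conditions     []Condition
}

// UpsertAncestor replaces the entry whose (AncestorRef, ControllerName) key
// matches entry, or appends entry when no such key exists. Entries written
// by other controllers are left untouched.
func UpsertAncestor(ancestors []AncestorStatus, entry AncestorStatus) []AncestorStatus {
	for i := range ancestors {
		if ancestors[i].AncestorRef == entry.AncestorRef &&
			ancestors[i].ControllerName == entry.ControllerName {
			ancestors[i] = entry
			return ancestors
		}
	}
	return append(ancestors, entry)
}

func main() {
	gw := AncestorRef{Kind: "Gateway", Namespace: "infra", Name: "edge"}
	accepted := []Condition{{Type: "Accepted", Status: "True", Reason: "Accepted"}}
	conflicted := []Condition{{Type: "Accepted", Status: "False", Reason: "Conflicted"}}

	var status []AncestorStatus
	status = UpsertAncestor(status, AncestorStatus{gw, "example.net/gateway-controller", accepted})
	status = UpsertAncestor(status, AncestorStatus{gw, "example.org/other-controller", accepted})
	// Same key as the first entry, so that entry is updated in place.
	status = UpsertAncestor(status, AncestorStatus{gw, "example.net/gateway-controller", conflicted})

	for _, s := range status {
		fmt.Println(s.ControllerName, s.Conditions[0].Reason)
	}
}
```

The list ends up with two entries for the same Gateway, one per controller, which is the situation the combined key is meant to allow. A real controller would also need to respect the `MaxItems` limit and remove entries it no longer needs.
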
##### Design considerations

This is recommended as the base for a Policy object's status. As Policy Attachment
is a pattern, not an API, "recommended" is the strongest we can make this, but
we believe that standardizing this will help a lot with discoverability.

Note that it is likely that all Gateway API tooling will expect policy status to follow
this structure. To benefit from broader consistency and discoverability, we
recommend transitioning to this structure for all Gateway API Policies.

#### Standard status Condition on Policy-affected objects

Status: Provisional

This solution is IN PROGRESS and so is not binding yet.

This solution requires definition in a GEP of its own to become binding.

**The description included here is intended to illustrate the sort of solution
that an eventual GEP will need to provide, _not to be a binding design_.**

Implementations that use Policy objects MUST put a Condition into `status.Conditions`
of any objects affected by a Policy.

That Condition must have a `type` ending in `PolicyAffected` (like
`gateway.networking.k8s.io/PolicyAffected`),
and have the optional `observedGeneration` field kept up to date when the `spec`
of the Policy-attached object changes.

Implementations _should_ use their own unique domain prefix for this Condition
`type` - it is recommended that implementations use the same domain as in the
`controllerName` field on GatewayClass (or some other implementation-unique
domain for implementations that do not use GatewayClass).

For objects that do _not_ have a `status.Conditions` field available (`Secret`
is a good example), that object MUST instead have an annotation of
`gateway.networking.k8s.io/PolicyAffected: true` (or with an
implementation-specific domain prefix) added.

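The naming recommendation above could be implemented along these lines. This is only a sketch: deriving the prefix by taking the DOMAIN part of `controllerName` is one reading of the recommendation, and both function names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// PolicyAffectedType derives a Condition type from a controllerName of the
// form DOMAIN "/" PATH by reusing DOMAIN as the prefix. This derivation is an
// assumption for illustration; the text only requires that the type end in
// "PolicyAffected".
func PolicyAffectedType(controllerName string) (string, error) {
	domain, _, ok := strings.Cut(controllerName, "/")
	if !ok || domain == "" {
		return "", fmt.Errorf("controllerName %q is not DOMAIN/PATH", controllerName)
	}
	return domain + "/PolicyAffected", nil
}

// IsPolicyAffected reports whether any condition type in the list marks the
// object as affected by Policy, whichever implementation wrote it.
func IsPolicyAffected(conditionTypes []string) bool {
	for _, t := range conditionTypes {
		if strings.HasSuffix(t, "PolicyAffected") {
			return true
		}
	}
	return false
}

func main() {
	condType, err := PolicyAffectedType("example.net/gateway-controller")
	if err != nil {
		panic(err)
	}
	fmt.Println(condType)
	fmt.Println(IsPolicyAffected([]string{"Accepted", condType}))
}
```

Matching on the suffix rather than on a full type lets a reader or tool detect the breadcrumb without knowing which implementations are installed.
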
##### Design Considerations

The intent here is to add at least a breadcrumb that leads object owners to have
some way to know that their object is being affected by another object, while
minimizing the number of updates necessary.

Minimizing the object updates is done by only having an update be necessary when
the affected object starts or stops being affected by a Policy, rather than if
the Policy itself has been updated.

There is already a similar Condition to be placed on _Policy_ objects, rather
than on the _targeted_ objects, so this solution is also being included in the
Conditions section below.

#### GatewayClass status Extension Types listing

Status: Provisional

This solution is IN PROGRESS, and so is not binding yet.

Each implementation MUST list all relevant CRDs in its GatewayClass status (like
Policy, and other extension types, like paramsRef targets, filters, and so on).

This is going to be tracked in its own GEP;
https://github.com/kubernetes-sigs/gateway-api/discussions/2118
is the initial discussion. This document will be updated with the details once
that GEP is opened.

##### Design Considerations

This solution:

- is low cost in terms of apiserver updates (because it's only on the GatewayClass,
  and only on implementation startup)
- provides a standard place for all users to look for relevant objects
- ties in to the Conformance Profiles design and other efforts about GatewayClass
  status

#### Standard status stanza

Status: Provisional

This solution is IN PROGRESS and so is not binding yet.

This solution requires definition in a GEP of its own to become binding.

**The description included here is intended to illustrate the sort of solution
that an eventual GEP will need to provide, _not to be a binding design_. THIS IS
AN EXPERIMENTAL SOLUTION, DO NOT USE THIS YET.**

An implementation SHOULD include the name, namespace, apiGroup and Kind of Policies
affecting an object in the new `effectivePolicy` status stanza on Gateway API
objects.

This stanza looks like this:

```yaml
kind: Gateway
...
status:
  effectivePolicy:
  - name: some-policy
    namespace: some-namespace
    apiGroup: implementation.io
    kind: AwesomePolicy
  ...
```

##### Design Considerations

This solution is designed to limit the number of status updates required by an
implementation to when a Policy starts or stops being relevant for an object,
rather than if that Policy's settings are updated.

It helps a lot with discoverability, but comes at the cost of a reasonably high
fanout cost. Implementations using this solution should ensure that status updates
are deduplicated and only sent to the apiserver when absolutely necessary.

Ideally, these status updates SHOULD be in a separate, lower-priority queue than
other status updates, or use a similar solution.

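One way to deduplicate these updates is to compare the set of Policy references already recorded on an object with the set the implementation has just computed, and write status only when the sets differ. The sketch below assumes that approach; `PolicyRef` and `NeedsStatusUpdate` are hypothetical names that mirror the fields of the illustrative `effectivePolicy` stanza.

```go
package main

import (
	"fmt"
	"sort"
)

// PolicyRef mirrors one entry of the illustrative effectivePolicy stanza.
type PolicyRef struct {
	Name, Namespace, APIGroup, Kind string
}

// key returns a sortable identity for a PolicyRef.
func (p PolicyRef) key() string {
	return p.APIGroup + "/" + p.Kind + "/" + p.Namespace + "/" + p.Name
}

// keys returns the sorted, de-duplicated keys of refs.
func keys(refs []PolicyRef) []string {
	seen := make(map[string]struct{}, len(refs))
	out := make([]string, 0, len(refs))
	for _, r := range refs {
		k := r.key()
		if _, dup := seen[k]; !dup {
			seen[k] = struct{}{}
			out = append(out, k)
		}
	}
	sort.Strings(out)
	return out
}

// NeedsStatusUpdate reports whether the set of Policies affecting an object
// has changed. Order and duplicates are ignored, so re-evaluating the same
// Policies (for example after a settings-only edit) yields false.
func NeedsStatusUpdate(current, desired []PolicyRef) bool {
	a, b := keys(current), keys(desired)
	if len(a) != len(b) {
		return true
	}
	for i := range a {
		if a[i] != b[i] {
			return true
		}
	}
	return false
}

func main() {
	first := PolicyRef{"some-policy", "some-namespace", "implementation.io", "AwesomePolicy"}
	second := PolicyRef{"other-policy", "some-namespace", "implementation.io", "AwesomePolicy"}

	// Same set in a different order: no write needed.
	fmt.Println(NeedsStatusUpdate([]PolicyRef{first, second}, []PolicyRef{second, first}))
	// A Policy became relevant: a write is needed.
	fmt.Println(NeedsStatusUpdate([]PolicyRef{first}, []PolicyRef{first, second}))
}
```

Because only Policy identities are compared, a change to a Policy's settings never triggers a write to the affected objects, which is the property this solution relies on to keep fanout bounded.
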
#### PolicyBinding resource

Status: Provisional

This solution is IN PROGRESS and so is not binding yet.

This solution requires definition in a GEP of its own to become binding.

**The description included here is intended to illustrate the sort of solution
that the eventual GEP will need to provide, _not to be a binding design_. THIS IS
AN EXPERIMENTAL SOLUTION, DO NOT USE THIS YET.**

Implementations SHOULD create an instance of a new `gateway.networking.k8s.io/EffectivePolicy`
object when one or more Policy objects become relevant to the target object.

The `EffectivePolicy` object MUST be in the same namespace as the object targeted
by the Policy, and must have the _same name_ as the object targeted by the Policy.
This is intended to mirror the Services/Endpoints naming convention, to allow for
ease of discovery.

The `EffectivePolicy` object MUST set the following information:

- The name, namespace, apiGroup and Kind of Policy objects affecting the targeted
  object.
- The full resultant set of Policy affecting the targeted object.

The above details MUST be namespaced using the `controllerName` of the implementation
(could also be by GatewayClass), similar to Route status being namespaced by
`parentRef`.

An example `EffectivePolicy` object is included here - this may be superseded by
a later GEP and should be updated or removed in that case. Note that it does
_not_ contain a `spec` and a `status` stanza - by definition this object _only_
contains `status` information.

```yaml
kind: EffectivePolicy
apiVersion: gateway.networking.k8s.io/v1alpha2
metadata:
  name: targeted-object
  namespace: targeted-object-namespace
policies:
- controllerName: implementation.io/ControllerName
  objects:
  - name: some-policy
    namespace: some-namespace
    apiGroup: implementation.io
    kind: AwesomePolicy
  resultantPolicy:
    awesomePolicy:
      configitem1:
        defaults:
          foo: 1
        overrides:
          bar: important-setting
```

Note here that the `resultantPolicy` setting is defined using the same mechanisms
as an `unstructured.Unstructured` object in the Kubernetes Go libraries - it's
effectively a `map[string]interface{}` that is stored as a `map[string]string` -
which allows an arbitrary object to be specified there.

Users or tools reading the config underneath `resultantPolicy` SHOULD display
it in its encoded form, and not try to deserialize it in any way.

The rendered YAML MUST be usable as the `spec` for the type given.

##### Design considerations

This will provide _full_ visibility to end users of the _actual settings_ being
applied to their object, which is a big discoverability win.

However, it relies on the establishment and communication of a convention ("An
EffectivePolicy is right next to your affected object") that may not be desirable.

Thus its status as EXPERIMENTAL, DO NOT USE YET.

#### Validating Admission Controller to inform users about relevant Policy

Implementations MAY supply a Validating Admission Webhook that will return a
WARNING message when an applied object is affected by some Policy, which may be
an inherited or indirect one.

The warning message MAY include the name, namespace, apiGroup and Kind of relevant
Policy objects.

##### Design Considerations

Pro:

- This gives object owners a very clear signal that some Policy is
  going to affect their object, at apply time, which helps a lot with discoverability.

Cons:

- Implementations would have to have a webhook, which is another thing to run.
- The webhook will need to have the same data model that the implementation uses,
  and keep track of which GatewayClasses, Gateways, Routes, and Policies are
  relevant. Experience suggests this will not be a trivial engineering exercise,
  and will add a lot of implementation complexity.

#### `kubectl` plugin or command-line tool

To help improve UX and standardization, a kubectl plugin will be developed that
will be capable of describing the computed sum of policy that applies to a given
resource, including policies applied to parent resources.

Each Policy CRD that wants to be supported by this plugin will need to follow
the API structure defined above and add the `gateway.networking.k8s.io/policy`
label described above, set to the Policy's attachment type, to the CRD.

  1259### Conditions
  1260
  1261Implementations using Policy objects MUST include a `spec` and `status` stanza, and the `status` stanza MUST contain a `conditions` stanza, using the standard Condition format.
  1262
  1263Policy authors should consider namespacing the `conditions` stanza with a
  1264`controllerName`, as in Route status, if more than one implementation will be
  1265reconciling the Policy type.
  1266
  1267#### On `Policy` objects
  1268
  1269Controllers using the Gateway API policy attachment model MUST populate the 
  1270`Accepted` condition and reasons as defined below on policy resources to provide
  1271a consistent experience across implementations.
  1272
```go
// PolicyConditionType is a type of condition for a policy.
type PolicyConditionType string

// PolicyConditionReason is a reason for a policy condition.
type PolicyConditionReason string

const (
  // PolicyConditionAccepted indicates whether the policy has been accepted or rejected
  // by a targeted resource, and why.
  //
  // Possible reasons for this condition to be True are:
  //
  // * "Accepted"
  //
  // Possible reasons for this condition to be False are:
  //
  // * "Conflicted"
  // * "Invalid"
  // * "TargetNotFound"
  //
  PolicyConditionAccepted PolicyConditionType = "Accepted"

  // PolicyReasonAccepted is used with the "Accepted" condition when the policy has been
  // accepted by the targeted resource.
  PolicyReasonAccepted PolicyConditionReason = "Accepted"

  // PolicyReasonConflicted is used with the "Accepted" condition when the policy has not
  // been accepted by a targeted resource because there is another policy that targets the same
  // resource and a merge is not possible.
  PolicyReasonConflicted PolicyConditionReason = "Conflicted"

  // PolicyReasonInvalid is used with the "Accepted" condition when the policy is syntactically
  // or semantically invalid.
  PolicyReasonInvalid PolicyConditionReason = "Invalid"

  // PolicyReasonTargetNotFound is used with the "Accepted" condition when the policy is attached to
  // an invalid target resource.
  PolicyReasonTargetNotFound PolicyConditionReason = "TargetNotFound"
)
```

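As a rough illustration of how a controller might populate this condition, the
sketch below builds an `Accepted` condition for a policy. The `Condition` struct
is a trimmed-down stand-in for `metav1.Condition` so the example needs only the
standard library, and `acceptedCondition` is a hypothetical helper, not part of
the Gateway API. A real controller would use the apimachinery types and write
the result into the policy's `status.conditions`.

```go
package main

import "fmt"

// Condition is a trimmed-down stand-in for metav1.Condition (assumption:
// only the fields needed for this sketch are included).
type Condition struct {
	Type               string
	Status             string // "True", "False", or "Unknown"
	Reason             string
	Message            string
	ObservedGeneration int64
}

// Condition type and reasons, mirroring the constants defined above.
const (
	conditionAccepted    = "Accepted"
	reasonAccepted       = "Accepted"
	reasonConflicted     = "Conflicted"
	reasonInvalid        = "Invalid"
	reasonTargetNotFound = "TargetNotFound"
)

// acceptedCondition (hypothetical helper) builds the Accepted condition for a
// policy. An empty failureReason means the policy was accepted.
func acceptedCondition(generation int64, failureReason, message string) Condition {
	c := Condition{
		Type:               conditionAccepted,
		Status:             "True",
		Reason:             reasonAccepted,
		Message:            message,
		ObservedGeneration: generation,
	}
	if failureReason != "" {
		c.Status = "False"
		c.Reason = failureReason
	}
	return c
}

func main() {
	c := acceptedCondition(3, reasonTargetNotFound, "target Gateway does not exist")
	fmt.Printf("%s=%s (%s)\n", c.Type, c.Status, c.Reason)
}
```

Keeping the failure reasons to the three listed above ("Conflicted", "Invalid",
"TargetNotFound") is what gives users the same experience across implementations.
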
#### On targeted resources

(copied from [Standard Status Condition](#standard-status-condition))

This solution requires definition in a GEP of its own to become binding.

**The description included here is intended to illustrate the sort of solution
that an eventual GEP will need to provide, _not_ to be a binding design.**

Implementations that use Policy objects MUST put a Condition into `status.Conditions`
of any objects affected by a Policy.

That Condition must have a `type` ending in `PolicyAffected` (like
`gateway.networking.k8s.io/PolicyAffected`),
and have the optional `observedGeneration` field kept up to date when the `spec`
of the Policy-attached object changes.

Implementations _should_ use their own unique domain prefix for this Condition
`type` - it is recommended that implementations use the same domain as in the
`controllerName` field on GatewayClass (or some other implementation-unique
domain for implementations that do not use GatewayClass).

For objects that do _not_ have a `status.Conditions` field available (`Secret`
is a good example), that object MUST instead have an annotation of
`gateway.networking.k8s.io/PolicyAffected: true` (or with an
implementation-specific domain prefix) added.

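The naming rule above can be sketched in a few lines of Go. This is only an
illustration of the non-binding description: `policyAffectedType` and
`isPolicyAffected` are hypothetical helpers, and the domain `example.com` is a
placeholder for whatever domain an implementation uses in its `controllerName`.

```go
package main

import (
	"fmt"
	"strings"
)

const policyAffectedSuffix = "PolicyAffected"

// policyAffectedType returns the Condition type (or annotation key) for an
// implementation-specific domain, for example the domain part of a
// GatewayClass controllerName.
func policyAffectedType(domain string) string {
	return domain + "/" + policyAffectedSuffix
}

// isPolicyAffected reports whether an object is marked as affected by a
// Policy, either through a Condition type ending in PolicyAffected or, for
// objects without status.Conditions, through an annotation set to "true".
func isPolicyAffected(conditionTypes []string, annotations map[string]string) bool {
	for _, t := range conditionTypes {
		if strings.HasSuffix(t, policyAffectedSuffix) {
			return true
		}
	}
	for key, value := range annotations {
		if strings.HasSuffix(key, policyAffectedSuffix) && value == "true" {
			return true
		}
	}
	return false
}

func main() {
	condType := policyAffectedType("example.com")
	fmt.Println(condType)

	// A Gateway carries the marker as a Condition type.
	fmt.Println(isPolicyAffected([]string{"Accepted", condType}, nil))

	// A Secret has no status.Conditions, so the marker is an annotation.
	fmt.Println(isPolicyAffected(nil, map[string]string{condType: "true"}))
}
```

Matching on the suffix rather than a fixed string lets tooling find affected
objects regardless of which implementation set the marker.
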
### Interaction with Custom Filters and other extension points
There are multiple methods of custom extension in the Gateway API. Policy
attachment and custom Route filters are two of these. Policy attachment is
designed to provide arbitrary configuration fields that decorate Gateway API
resources. Route filters provide custom request/response filters embedded inside
Route resources. Both are extension methods for fields that cannot easily be
standardized as core or extended fields of the Gateway API. The following
guidance should be considered when introducing a custom field into any Gateway
controller implementation:

1. For any given field that a Gateway controller implementation needs, the
   possibility of using core or extended fields should always be considered
   before using custom policy resources. This is encouraged to promote standardization
   and, over time, to absorb capabilities into the API as first class fields,
   which offer a more streamlined UX than custom policy attachment.

2. Although it's possible that arbitrary fields could be supported by custom
   policy, custom route filters, and core/extended fields concurrently, it is
   recommended that implementations only use multiple mechanisms for
   representing the same fields when those fields really _need_ the defaulting
   and/or overriding behavior that Policy Attachment provides. For example, a
   custom filter that allowed the configuration of Authentication inside a
   HTTPRoute object might also have an associated Policy resource that allowed
   the filter's settings to be defaulted or overridden. It should be noted that
   doing this in the absence of a solution to the status problem is likely to
   be *very* difficult to troubleshoot.

### Conformance Level
This policy attachment pattern is associated with an "EXTENDED" conformance
level. The implementations that support this policy attachment model will have
the same behavior and semantics, although they may not be able to support
attachment of all types of policy at all potential attachment points.

### Apply Policies to Sections of a Resource
Policies can target specific matches within nested objects. For instance, rather than
applying a policy to the entire Gateway, we may want to attach it to a particular Gateway listener.

To achieve this, an optional `sectionName` field can be set in the `targetRef` of a policy
to refer to a specific listener within the target Gateway.

```yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: foo-gateway
spec:
  gatewayClassName: foo-lb
  listeners:
  - name: bar
    ...
---
apiVersion: networking.acme.io/v1alpha2
kind: AuthenticationPolicy
metadata:
  name: foo
spec:
  provider:
    issuer: "https://oidc.example.com"
  targetRef:
    name: foo-gateway
    group: gateway.networking.k8s.io
    kind: Gateway
    sectionName: bar
```

The `sectionName` field can also be used to target a specific section of other resources:

* Service.Ports.Name
* xRoute.Rules.Name

For example, the RetryPolicy below applies to a RouteRule inside an HTTPRoute.

```yaml
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: HTTPRoute
metadata:
  name: http-app-1
  labels:
    app: foo
spec:
  hostnames:
  - "foo.com"
  rules:
  - name: bar
    matches:
    - path:
        type: PathPrefix
        value: /bar
    backendRefs:
    - name: my-service1
      port: 8080
---
apiVersion: networking.acme.io/v1alpha2
kind: RetryPolicy
metadata:
  name: foo
spec:
  maxRetries: 5
  targetRef:
    name: http-app-1
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    sectionName: bar
```

This would require adding a `name` field to those sub-resources that currently lack a name. For example,
a `name` field could be added to the `RouteRule` object:
```go
type RouteRule struct {
    // Name is the name of the Route rule. If more than one Route Rule is
    // present, each Rule MUST specify a name. The names of Rules MUST be unique
    // within a Route.
    //
    // Support: Core
    //
    // +kubebuilder:validation:MinLength=1
    // +kubebuilder:validation:MaxLength=253
    // +optional
    Name string `json:"name,omitempty"`
    // ...
}
```

If a `sectionName` is specified, but does not exist on the targeted object, the Policy must fail to attach,
and the policy implementation should record a `resolvedRefs` or similar Condition in the Policy's status.

When multiple Policies of the same type target the same object, one with a `sectionName` specified, and one without,
the one with a `sectionName` is more specific, and so will have all its settings apply. The less-specific Policy will
not attach to the target.

Note that the `sectionName` is currently intended to be used only for Direct Policy Attachment when references to
SectionName are actually needed. Inherited Policies are always applied to the entire object.
The `PolicyTargetReferenceWithSectionName` API can be used to apply a direct Policy to a section of an object.

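The two attachment rules for `sectionName` (fail when the section does not
exist, and prefer the more specific Policy) could be sketched as follows. The
`policy` type and `resolveAttachment` function are hypothetical and only model
Policies of a single type that all target the same object. The sketch also
assumes that a Policy whose `sectionName` fails to resolve does not count as
"more specific", which this document does not state either way.

```go
package main

import "fmt"

// policy is a hypothetical, minimal view of a Direct Policy's targetRef.
type policy struct {
	Name        string
	SectionName string // empty means the whole object is targeted
}

// resolveAttachment decides which policies attach to a target that exposes
// the given named sections (for example, Gateway listener names). It returns
// the names of attached policies and a reason for each rejected policy.
func resolveAttachment(policies []policy, sections []string) (attached []string, rejected map[string]string) {
	known := make(map[string]bool, len(sections))
	for _, s := range sections {
		known[s] = true
	}

	rejected = make(map[string]string)
	var whole []string
	for _, p := range policies {
		switch {
		case p.SectionName == "":
			whole = append(whole, p.Name)
		case known[p.SectionName]:
			attached = append(attached, p.Name)
		default:
			// The referenced section does not exist: the policy must not attach.
			rejected[p.Name] = fmt.Sprintf("sectionName %q not found on target", p.SectionName)
		}
	}

	// A policy with a sectionName is more specific, so whole-object policies
	// only attach when no section-level policy attached.
	if len(attached) > 0 {
		for _, name := range whole {
			rejected[name] = "superseded by a more specific policy"
		}
		return attached, rejected
	}
	return whole, rejected
}

func main() {
	policies := []policy{
		{Name: "whole-gateway"},
		{Name: "listener-bar", SectionName: "bar"},
		{Name: "listener-missing", SectionName: "baz"},
	}
	attached, rejected := resolveAttachment(policies, []string{"bar"})
	fmt.Println("attached:", attached)
	for name, reason := range rejected {
		fmt.Printf("rejected %s: %s\n", name, reason)
	}
}
```

In a real controller, each rejection reason would be surfaced as a Condition in
the status of the corresponding Policy rather than returned in a map.
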
### Advantages
* Incredibly flexible approach that should work well for both ingress and mesh
* Conceptually similar to existing ServicePolicy proposal and BackendPolicy
  pattern
* Easy to attach policy to resources we don’t control (Service, ServiceImport,
  etc)
* Minimal API changes required
* Simplifies packaging an application for deployment as policy references do not
  need to be part of the templating

### Disadvantages
* May be difficult to understand which policies apply to a request

## Examples

This section provides some examples of various types of Policy objects, and how
merging, `defaults`, `overrides`, and other interactions work.

### Direct Policy Attachment

The following Policy sets the minimum TLS version required on a Gateway Listener:
```yaml
apiVersion: networking.example.io/v1alpha1
kind: TLSMinimumVersionPolicy
metadata:
  name: minimum12
  namespace: appns
spec:
  minimumTLSVersion: 1.2
  targetRef:
    name: internet
    group: gateway.networking.k8s.io
    kind: Gateway
```

Note that because there is no field controlling the minimum TLS version in the
Gateway `spec`, this is an example of a non-field Policy.

### Inherited Policy Attachment

It also could be useful to be able to _default_ the `minimumTLSVersion` setting
across multiple Gateways.

This version of the above Policy allows this:
```yaml
apiVersion: networking.example.io/v1alpha1
kind: TLSMinimumVersionPolicy
metadata:
  name: minimum12
  namespace: appns
spec:
  defaults:
    minimumTLSVersion: 1.2
  targetRef:
    name: appns
    group: ""
    kind: Namespace
```

This Inherited Policy is using the implicit hierarchy that all resources belong
to a namespace, so attaching a Policy to a namespace means affecting all possible
resources in a namespace. Multiple hierarchies are possible, even within Gateway
API, for example Gateway -> Route, Gateway -> Route -> Backend, Gateway -> Route
-> Service. GAMMA Policies could conceivably use a hierarchy of Service -> Route
as well.

Note that this will not be very discoverable for Gateway owners in the absence of
a solution to the Policy status problem. This is being worked on and this GEP will
be updated once we have a design.

Conceivably, a security or admin team may want to _force_ Gateways to have at least
a minimum TLS version of `1.2` - that would be a job for `overrides`, like so:

```yaml
apiVersion: networking.example.io/v1alpha1
kind: TLSMinimumVersionPolicy
metadata:
  name: minimum12
  namespace: appns
spec:
  overrides:
    minimumTLSVersion: 1.2
  targetRef:
    name: appns
    group: ""
    kind: Namespace
```

This will make it so that _all Gateways_ in the `appns` namespace _must_ use
a minimum TLS version of `1.2`, and this _cannot_ be changed by Gateway owners.
Only the Policy owner can change this Policy.

### Handling non-scalar values

In this example, we will assume that at some future point, HTTPRoute has grown
fields to configure retries, including a field called `retryOn` that reflects
the HTTP status codes that should be retried. The _value_ of this field is a
list of strings, being the HTTP codes that must be retried. The `retryOn` field
has no defaults in the field definitions (which is probably a bad design, but we
need to show this interaction somehow!)

We also assume that an Inherited `RetryOnPolicy` exists that allows both
defaulting and overriding of the `retryOn` field.

A full `RetryOnPolicy` to default the field to the codes `501`, `502`, and `503`
would look like this:
```yaml
apiVersion: networking.example.io/v1alpha1
kind: RetryOnPolicy
metadata:
  name: retryon5xx
  namespace: appns
spec:
  defaults:
    retryOn:
      - "501"
      - "502"
      - "503"
  targetRef:
    kind: Gateway
    group: gateway.networking.k8s.io
    name: we-love-retries
```

This means that, for HTTPRoutes that do _NOT_ explicitly set this field to something
else (in other words, they contain an empty list), the field will be set to
a list containing `501`, `502`, and `503`. (Notably, because of Go zero values, this
would also occur if the user explicitly set the value to the empty list.)

However, if an HTTPRoute owner sets any value other than the empty list, then that
value will remain, and the Policy will have _no effect_. These values are _not_
merged.

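A minimal Go sketch of this defaulting rule, assuming the hypothetical `retryOn`
field is modelled as a `[]string`, might look like the following. The
`applyDefault` function is illustrative only. It also shows why an unset field
and an explicitly empty list are indistinguishable: both have length zero.

```go
package main

import "fmt"

// applyDefault returns the effective retryOn list for an HTTPRoute. The
// Policy default is used only when the route's list is empty; a non-empty
// route value is kept as-is and is never merged with the default.
func applyDefault(routeValue, policyDefault []string) []string {
	if len(routeValue) == 0 {
		return policyDefault
	}
	return routeValue
}

func main() {
	policyDefault := []string{"501", "502", "503"}

	// Unset (nil) and explicitly empty lists both receive the default.
	fmt.Println(applyDefault(nil, policyDefault))
	fmt.Println(applyDefault([]string{}, policyDefault))

	// Any non-empty value wins over the default, without merging.
	fmt.Println(applyDefault([]string{"504"}, policyDefault))
}
```
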
If the Policy used `overrides` instead:
```yaml
apiVersion: networking.example.io/v1alpha1
kind: RetryOnPolicy
metadata:
  name: retryon5xx
  namespace: appns
spec:
  overrides:
    retryOn:
      - "501"
      - "502"
      - "503"
  targetRef:
    kind: Gateway
    group: gateway.networking.k8s.io
    name: you-must-retry
```

Then no matter what the value is in the HTTPRoute, it will be set to `501`, `502`,
`503` by the Policy override.

### Interactions between defaults, overrides, and field values

All HTTPRoutes that attach to the `you-must-retry` Gateway will have any value
_overwritten_ by this policy. The empty list, or any number of values, will all
be replaced with `501`, `502`, and `503`.

Now, let's also assume that we use the Namespace -> Gateway hierarchy on top of
the Gateway -> HTTPRoute hierarchy, and allow attaching a `RetryOnPolicy` to a
_namespace_. The expectation here is that this will affect all Gateways in a namespace
and all HTTPRoutes that attach to those Gateways. (Note that the HTTPRoutes
themselves may not necessarily be in the same namespace though.)

If we apply the default policy from earlier to the namespace:
```yaml
apiVersion: networking.example.io/v1alpha1
kind: RetryOnPolicy
metadata:
  name: retryon5xx
  namespace: appns
spec:
  defaults:
    retryOn:
      - "501"
      - "502"
      - "503"
  targetRef:
    kind: Namespace
    group: ""
    name: appns
```

Then this will have the same effect as applying that Policy to every Gateway in
the `appns` namespace - namely that every HTTPRoute that attaches to every
Gateway will have its `retryOn` field set to `501`, `502`, `503`, _if_ there is no
other setting in the HTTPRoute itself.

With two layers in the hierarchy, we have a more complicated set of interactions
possible.

Let's look at some tables for a particular HTTPRoute, assuming that it does _not_
configure the `retryOn` field, for various types of Policy at different levels.

#### Overrides interacting with defaults for RetryOnPolicy, empty list in HTTPRoute

| | None | Namespace override | Gateway override | HTTPRoute override |
|----|-----|-----|----|----|
| No default | Empty list | Namespace override | Gateway override | HTTPRoute override |
| Namespace default | Namespace default | Namespace override | Gateway override | HTTPRoute override |
| Gateway default | Gateway default | Namespace override | Gateway override | HTTPRoute override |
| HTTPRoute default | HTTPRoute default | Namespace override | Gateway override | HTTPRoute override |

#### Overrides interacting with other overrides for RetryOnPolicy, empty list in HTTPRoute

| | No override | Namespace override A | Gateway override A | HTTPRoute override A |
|----|-----|-----|----|----|
| No override | Empty list | Namespace override A | Gateway override A | HTTPRoute override A |
| Namespace override B | Namespace override B | Namespace override<br />first created wins<br />otherwise first alphabetically | Namespace override B | Namespace override B |
| Gateway override B | Gateway override B | Namespace override A | Gateway override<br />first created wins<br />otherwise first alphabetically | Gateway override B |
| HTTPRoute override B | HTTPRoute override B | Namespace override A | Gateway override A | HTTPRoute override<br />first created wins<br />otherwise first alphabetically |

#### Defaults interacting with other defaults for RetryOnPolicy, empty list in HTTPRoute

| | No default | Namespace default A | Gateway default A | HTTPRoute default A |
|----|-----|-----|----|----|
| No default | Empty list | Namespace default A | Gateway default A | HTTPRoute default A |
| Namespace default B | Namespace default B | Namespace default<br />first created wins<br />otherwise first alphabetically | Gateway default A | HTTPRoute default A |
| Gateway default B | Gateway default B | Gateway default B | Gateway default<br />first created wins<br />otherwise first alphabetically | HTTPRoute default A |
| HTTPRoute default B | HTTPRoute default B | HTTPRoute default B | HTTPRoute default B | HTTPRoute default<br />first created wins<br />otherwise first alphabetically |


Now, if the HTTPRoute _does_ specify a value for `retryOn`,
it's a bit easier, because we can basically disregard all defaults:

#### Overrides interacting with defaults for RetryOnPolicy, value in HTTPRoute

| | None | Namespace override | Gateway override | HTTPRoute override |
|----|-----|-----|----|----|
| No default | Value in HTTPRoute | Namespace override | Gateway override | HTTPRoute override |
| Namespace default | Value in HTTPRoute | Namespace override | Gateway override | HTTPRoute override |
| Gateway default | Value in HTTPRoute | Namespace override | Gateway override | HTTPRoute override |
| HTTPRoute default | Value in HTTPRoute | Namespace override | Gateway override | HTTPRoute override |

#### Overrides interacting with other overrides for RetryOnPolicy, value in HTTPRoute

| | No override | Namespace override A | Gateway override A | HTTPRoute override A |
|----|-----|-----|----|----|
| No override | Value in HTTPRoute | Namespace override A | Gateway override A | HTTPRoute override A |
| Namespace override B | Namespace override B | Namespace override<br />first created wins<br />otherwise first alphabetically | Namespace override B | Namespace override B |
| Gateway override B | Gateway override B | Namespace override A | Gateway override<br />first created wins<br />otherwise first alphabetically | Gateway override B |
| HTTPRoute override B | HTTPRoute override B | Namespace override A | Gateway override A | HTTPRoute override<br />first created wins<br />otherwise first alphabetically |

#### Defaults interacting with other defaults for RetryOnPolicy, value in HTTPRoute

| | No default | Namespace default A | Gateway default A | HTTPRoute default A |
|----|-----|-----|----|----|
| No default | Value in HTTPRoute | Value in HTTPRoute | Value in HTTPRoute | Value in HTTPRoute |
| Namespace default B | Value in HTTPRoute | Value in HTTPRoute | Value in HTTPRoute | Value in HTTPRoute |
| Gateway default B | Value in HTTPRoute | Value in HTTPRoute | Value in HTTPRoute | Value in HTTPRoute |
| HTTPRoute default B | Value in HTTPRoute | Value in HTTPRoute | Value in HTTPRoute | Value in HTTPRoute |

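The tables above can be condensed into one resolution procedure: any override
beats the HTTPRoute's own value, which in turn beats any default; among
overrides the level highest in the hierarchy wins, and among defaults the level
lowest in the hierarchy wins. The Go sketch below is one possible reading of
those tables, not a normative algorithm. The `retryOnPolicy` type, its `Created`
field, and `effectiveRetryOn` are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// level is a policy's attachment point, ordered from the top of the
// hierarchy to the bottom.
type level int

const (
	namespaceLevel level = iota
	gatewayLevel
	routeLevel
)

// retryOnPolicy is a hypothetical, flattened view of a RetryOnPolicy.
type retryOnPolicy struct {
	Name      string
	Level     level
	Created   int64 // creation time, e.g. Unix seconds (assumption)
	Defaults  []string
	Overrides []string
}

// pick returns the winning policy, or nil if there are no candidates.
// With topFirst the highest level wins (overrides); otherwise the lowest
// level wins (defaults). Ties on level go to the first created policy,
// then to the alphabetically first name.
func pick(cands []retryOnPolicy, topFirst bool) *retryOnPolicy {
	if len(cands) == 0 {
		return nil
	}
	sort.SliceStable(cands, func(i, j int) bool {
		a, b := cands[i], cands[j]
		if a.Level != b.Level {
			if topFirst {
				return a.Level < b.Level
			}
			return a.Level > b.Level
		}
		if a.Created != b.Created {
			return a.Created < b.Created
		}
		return a.Name < b.Name
	})
	return &cands[0]
}

// effectiveRetryOn computes the retryOn list that applies to one HTTPRoute.
func effectiveRetryOn(routeValue []string, policies []retryOnPolicy) []string {
	var overrides, defaults []retryOnPolicy
	for _, p := range policies {
		if len(p.Overrides) > 0 {
			overrides = append(overrides, p)
		}
		if len(p.Defaults) > 0 {
			defaults = append(defaults, p)
		}
	}
	if w := pick(overrides, true); w != nil {
		return w.Overrides // any override beats the route value and all defaults
	}
	if len(routeValue) > 0 {
		return routeValue // a value in the HTTPRoute beats every default
	}
	if w := pick(defaults, false); w != nil {
		return w.Defaults
	}
	return nil // no policy applies: the list stays empty
}

func main() {
	policies := []retryOnPolicy{
		{Name: "ns-default", Level: namespaceLevel, Defaults: []string{"501"}},
		{Name: "gw-default", Level: gatewayLevel, Defaults: []string{"502"}},
	}
	fmt.Println(effectiveRetryOn(nil, policies))             // Gateway default applies
	fmt.Println(effectiveRetryOn([]string{"504"}, policies)) // route value applies

	policies = append(policies, retryOnPolicy{
		Name: "ns-override", Level: namespaceLevel, Overrides: []string{"503"},
	})
	fmt.Println(effectiveRetryOn([]string{"504"}, policies)) // Namespace override applies
}
```

Note that the sketch treats each Policy's list as a single value, consistent
with the statement above that these values are not merged.
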

## Removing BackendPolicy
BackendPolicy represented the initial attempt to cover policy attachment for
Gateway API. Although this proposal ended up with a similar structure to
BackendPolicy, it is not clear that we ever found sufficient value or use cases
for BackendPolicy. Given that this proposal provides more powerful ways to
attach policy, BackendPolicy was removed.

## Alternatives

### 1. ServiceBinding for attaching Policies and Routes for Mesh
A new ServiceBinding resource has been proposed for mesh use cases. This would
provide a way to attach policies, including Routes, to a Service.

Most notably, these provide a way to attach different policies to requests
coming from namespaces or specific Gateways. In the example below, a
ServiceBinding in the consumer namespace would be applied to the selected
Gateway and affect all requests from that Gateway to the foo Service. Beyond
policy attachment, this would also support attaching Routes as policies, in this
case the attached HTTPRoute would split requests between the foo-a and foo-b
Service instead of the foo Service.

![Simple Service Binding Example](images/713-servicebinding-simple.png)

This approach can be used to attach a default set of policies to all requests
coming from a namespace. The example below shows a ServiceBinding defined in the
producer namespace that would apply to all requests from within the same
namespace or from other namespaces that did not have their own ServiceBindings
defined.

![Complex Service Binding Example](images/713-servicebinding-complex.png)

#### Advantages
* Works well for mesh and any use cases where requests don’t always transit
  through Gateways and Routes.
* Allows policies to apply to an entire namespace.
* Provides very clear attachment of policies, routes, and more to a specific
  Service.
* Works well for ‘shrink-wrap application developers’ - the packaged app does
  not need to know about hostnames or policies or have extensive templates.
* Works well for ‘dynamic’ / programmatic creation of workloads (Pods, etc. - see
  CertManager)
* It is easy to understand what policy applies to a workload - by listing the
  bindings in the namespace.

#### Disadvantages
* Unclear how this would work with an ingress model. If Gateways, Routes, and
  Backends are all in different namespaces, and each of those namespaces has
  different ServiceBindings applying different sets of policies, it’s difficult
  to understand which policy would be applied.
* Unclear if/how this would interact with the existing ingress-focused policy
  proposal described below. If both coexisted, would it be possible for a user
  to understand which policies were being applied to their requests?
* Route status could get confusing when Routes were referenced as a policy by
  ServiceBinding
* Introduces a new mesh specific resource.

### 2. Attaching Policies for Ingress
An earlier proposal for policy attachment in the Gateway API suggested adding
policy references to each Resource. This works very naturally for Ingress use
cases where all requests follow a path through Gateways, Routes, and Backends.
Adding policy attachment at each level enables different roles to define
defaults and allow overrides at different levels.

![Simple Ingress Attachment Example](images/713-ingress-attachment.png)

#### Advantages
* Consistent policy attachment at each level
* Clear which policies apply to each component
* Naturally translates to hierarchical Ingress model with ability to delegate
  policy decisions to different roles

#### Disadvantages
* Policy overrides could become complicated
* At least initially, policy attachment on Service would have to rely on Service
  annotations or references from policy to Service(s)
* No way to attach policy to other resources such as namespace or ServiceImport
* May be difficult to modify Routes and Services if other components/roles are
  managing them (e.g. Knative)

### 3. Shared Policy Resource
This is really just a slight variation or extension of the main proposal in this
GEP. We would introduce a shared policy resource. This resource would follow the
guidelines described above, including the `targetRef` as defined as well as
`default` and `override` fields. Instead of carefully crafted CRD schemas for
each of the `default` and `override` fields, we would use more generic
`map[string]string` values. This would allow similar flexibility to annotations
while still enabling the default and override concepts that are key to this
proposal.

Unfortunately this would be difficult to validate and would come with many of
the downsides of annotations. A validating webhook would be required for any
validation, which could result in just as much or more work to maintain than
CRDs. At this point we believe that the best experience will be from
implementations providing their own policy CRDs that follow the patterns
described in this GEP. We may want to explore tooling or guidance to simplify
the creation of these policy CRDs to help simplify implementation and extension
of this API.

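To make the validation concern concrete, here is a hedged sketch of what such a
shared resource might look like in Go. The `sharedPolicySpec` type, the
`maxRetries` key, and the `maxRetriesFrom` function are all hypothetical. The
point is only that, with `map[string]string` values, every setting has to be
looked up and parsed by hand instead of being checked by the CRD schema.

```go
package main

import (
	"fmt"
	"strconv"
)

// sharedPolicySpec is a hypothetical spec for the shared policy resource,
// using generic string maps instead of a typed schema. The targetRef is
// omitted for brevity.
type sharedPolicySpec struct {
	Default  map[string]string
	Override map[string]string
}

// maxRetriesFrom validates a single setting by hand. Nothing in the API
// schema prevents a missing, misspelled, or non-numeric value, so this
// check would have to live in a validating webhook or in the controller.
func maxRetriesFrom(settings map[string]string) (int, error) {
	raw, ok := settings["maxRetries"]
	if !ok {
		return 0, fmt.Errorf("maxRetries is not set")
	}
	n, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("maxRetries %q is not an integer: %w", raw, err)
	}
	if n < 0 {
		return 0, fmt.Errorf("maxRetries must not be negative, got %d", n)
	}
	return n, nil
}

func main() {
	spec := sharedPolicySpec{
		Default:  map[string]string{"maxRetries": "5"},
		Override: map[string]string{"maxRetries": "five"},
	}
	fmt.Println(maxRetriesFrom(spec.Default))
	fmt.Println(maxRetriesFrom(spec.Override))
}
```

A typed CRD field with schema validation would reject the second value at
admission time without any custom code.
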
## References

**Issues**
* [Extensible Service Policy and Configuration](https://github.com/kubernetes-sigs/gateway-api/issues/611)

**Docs**
* [Policy Attachment and Binding](https://docs.google.com/document/d/13fyptUtO9NV_ZAgkoJlfukcBf2PVGhsKWG37yLkppJo/edit?resourcekey=0-Urhtj9gBkGBkSL1gHgbWKw)
