# GEP-1742: HTTPRoute Timeouts

* Issue: [#1742](https://github.com/kubernetes-sigs/gateway-api/issues/1742)
* Status: Experimental

(See status definitions [here](overview.md#status).)

## TLDR

Create a design so that Gateway API objects can be used to configure
timeouts for different types of connection.

## Goals

- Create a method to configure some timeouts.
- Timeout config must be applicable to most if not all Gateway API implementations.

## Non-Goals

- A standard API for every possible timeout that implementations may support.

## Introduction

In talking about Gateway API objects, particularly HTTPRoute, we've mentioned
timeout configuration many times in the past, but finding the common ground
necessary for more generic configuration has been considered "too hard". This GEP
intends firstly to make this process less difficult, then to find common timeouts
that we can build into Gateway API.

For this initial round, we'll focus on Layer 7 HTTP traffic, while acknowledging
that Layer 4 connections have their own interesting timeouts as well.

The following sections will review all the implementations, then document what
timeouts are _available_ for the various data planes.

### Background on implementations

Most implementations that handle HTTPRoute objects use a proxy as the data plane
implementation that actually forwards flows as directed by Gateway API configuration.

The following table is a review of all the listed implementations of Gateway API
at the time of writing, with the data plane they use for Layer 7, based on what information
could be found online. If there are errors here, or if the implementation doesn't
support Layer 7, please feel free to correct them.

| Implementation | Data Plane |
|----------------|------------|
| Acnodal EPIC | Envoy |
| Apache APISIX | Nginx |
| BIG-IP Kubernetes Gateway | F5 BIG-IP |
| Cilium | Envoy |
| Contour | Envoy |
| Emissary Ingress | Envoy |
| Envoy Gateway | Envoy |
| Flomesh Service Mesh | Pipy |
| Gloo Edge | Envoy |
| Google Kubernetes Engine (GKE) | Similar to Envoy Timeouts |
| HAProxy Ingress | HAProxy |
| Hashicorp Consul | Envoy |
| Istio | Envoy |
| Kong | Nginx |
| Kuma | Envoy |
| Litespeed | Litespeed WebADC |
| NGINX Gateway Fabric | Nginx |
| Traefik | Traefik |

### Flow diagrams with available timeouts

The following flow diagrams are based on the basic diagram below, with all the
timeouts I could find included.

In general, timeouts are recorded with the setting name (or similar) that the data
plane uses for them, and are correct insofar as I've parsed the documentation
correctly.

Idle timeouts are marked as such.

```mermaid
sequenceDiagram
    participant C as Client
    participant P as Proxy
    participant U as Upstream
    C->>P: Connection Started
    C->>P: Starts sending Request
    C->>P: Finishes Headers
    C->>P: Finishes request
    P->>U: Connection Started
    P->>U: Starts sending Request
    P->>U: Finishes Headers
    P->>U: Finishes request
    U->>P: Starts Response
    U->>P: Finishes Headers
    U->>P: Finishes Response
    P->>C: Starts Response
    P->>C: Finishes Headers
    P->>C: Finishes Response
    Note right of P: Repeat if connection sharing
    U->>C: Connection ended
```

#### Envoy Timeouts

For Envoy, some timeouts are configurable at the HTTP Connection Manager
(very, very roughly equivalent to a Listener) level, the Route (equivalent to an
HTTPRoute) level, the Cluster (usually close to the Service) level, or some
combination. These are noted in the diagram below with a `CM`, `R`, or `Cluster`
prefix respectively.

```mermaid
sequenceDiagram
    participant C as Client
    participant P as Envoy
    participant U as Upstream
    C->>P: Connection Started
    activate P
    Note left of P: transport_socket_connect_timeout for TLS
    deactivate P
    C->>P: Starts sending Request
    activate C
    activate P
    activate P
    C->>P: Finishes Headers
    note left of P: CM request_headers_timeout
    C->>P: Finishes request
    deactivate P
    activate U
    note left of U: Cluster connect_timeout
    deactivate U
    P->>U: Connection Started
    activate U
    note right of U: CM idle_timeout<br />CM max_connection_duration
    P->>U: Starts sending Request
    P->>U: Finishes Headers
    note left of P: CM request_timeout
    P->>U: Finishes request
    deactivate P
    activate U
    U->>P: Starts Response
    U->>P: Finishes Headers
    note right of U: R timeout<br/>R per_try_timeout<br/>R per_try_idle_timeout
    U->>P: Finishes Response
    deactivate U
    P->>C: Starts Response
    P->>C: Finishes Headers
    P->>C: Finishes Response
    Note left of C: CM stream_idle_timeout<br />R idle_timeout<br />CM,R max_stream_duration<br/>TCP proxy idle_timeout<br />TCP protocol idle_timeout
    deactivate C
    Note right of P: Repeat if connection sharing
    U->>C: Connection ended
    deactivate U
```

#### Nginx timeouts

Nginx allows gRPC and general HTTP timeouts to be set separately, although their
purposes seem to be roughly equivalent.

```mermaid
sequenceDiagram
    participant C as Client
    participant P as Nginx
    participant U as Upstream
    C->>P: Connection Started
    activate P
    C->>P: Starts sending Request
    C->>P: Finishes Headers
    Note right of P: client_header_timeout
    deactivate P
    activate P
    C->>P: Finishes request
    deactivate P
    Note right of P: client_body_timeout
    activate U
    note left of U: proxy_connect_timeout<br/>grpc_connect_timeout
    deactivate U
    P->>U: Connection Started
    activate U
    activate U
    P->>U: Starts sending Request
    P->>U: Finishes Headers
    P->>U: Finishes request
    Note right of U: (between write operations)<br/>proxy_send_timeout<br/>grpc_send_timeout
    deactivate U
    activate U
    U->>P: Starts Response
    U->>P: Finishes Headers
    Note right of U: (between read operations)<br/>proxy_read_timeout<br/>grpc_read_timeout
    U->>P: Finishes Response
    deactivate U
    activate P
    P->>C: Starts Response
    P->>C: Finishes Headers
    P->>C: Finishes Response
    deactivate P
    Note left of P: send_timeout (only between two successive write operations)
    Note left of C: Repeat if connection is shared until server's keepalive_timeout is hit
    Note right of U: upstream's keepalive_timeout (if keepalive enabled)
    U->>C: Connection ended
    deactivate U
```

#### HAProxy timeouts

```mermaid
sequenceDiagram
    participant C as Client
    participant P as Proxy
    participant U as Upstream

    C->>P: Connection Started
    activate U
    activate C
    activate P
    note left of P: timeout client (idle)
    C->>P: Starts sending Request
    C->>P: Finishes Headers
    C->>P: Finishes request
    note left of C: timeout http-request
    deactivate C
    activate C
    note left of C: timeout client-fin
    deactivate C
    deactivate P
    activate U
    note left of U: timeout queue<br/>(wait for available server)
    deactivate U

    P->>U: Connection Started
    activate U
    P->>U: Starts sending Request
    activate U
    P->>U: Finishes Headers
    P->>U: Finishes request

    note right of U: timeout connect
    deactivate U
    note left of U: timeout server<br/>(idle timeout)
    deactivate U
    activate U
    note left of U: timeout server-fin
    deactivate U
    U->>P: Starts Response
    U->>P: Finishes Headers
    U->>P: Finishes Response
    P->>C: Starts Response
    P->>C: Finishes Headers
    P->>C: Finishes Response
    activate C
    note left of C: timeout http-keep-alive
    deactivate C
    Note right of P: Repeat if connection sharing
    Note right of U: timeout tunnel<br/>(for upgraded connections)
    deactivate U
    U->>C: Connection ended
```

#### Traefik timeouts

```mermaid
sequenceDiagram
    participant C as Client
    participant P as Proxy
    participant U as Upstream
    C->>P: Connection Started
    activate U
    C->>P: Starts sending Request
    activate P
    C->>P: Finishes Headers
    Note right of P: respondingTimeouts<br/>readTimeout
    C->>P: Finishes request
    deactivate P
    P->>U: Connection Started
    activate U
    Note right of U: forwardingTimeouts<br/>dialTimeout
    deactivate U
    P->>U: Starts sending Request
    P->>U: Finishes Headers
    P->>U: Finishes request
    U->>P: Starts Response
    activate U
    note right of U: forwardingTimeouts<br/>responseHeaderTimeout
    U->>P: Finishes Headers
    deactivate U
    U->>P: Finishes Response
    P->>C: Starts Response
    activate P
    P->>C: Finishes Headers
    Note right of P: respondingTimeouts<br/>writeTimeout
    P->>C: Finishes Response
    deactivate P
    Note right of P: Repeat if connection sharing
    Note right of U: respondingTimeouts<br/>idleTimeout<br/>Keepalive connections only
    deactivate U
    U->>C: Connection ended
```

#### F5 BIG-IP Timeouts

Could not find any HTTP-specific timeouts. PRs welcomed. 😊

#### Pipy Timeouts

Could not find any HTTP-specific timeouts. PRs welcomed. 😊

#### Litespeed WebADC Timeouts

Could not find any HTTP-specific timeouts. PRs welcomed. 😊

## API

The above diagrams show that there are many different kinds of configurable timeouts
supported by Gateway implementations: connect, idle, request, upstream, downstream.
Although there may be an opportunity to specify a common API for more of
them in the future, this GEP will focus on the L7 timeouts in HTTPRoutes that are
most valuable to clients.

From the above analysis, it appears that most implementations are capable of
supporting the configuration of simple client downstream request timeouts on HTTPRoute
rules. This is a relatively small addition that would benefit many users.

Some implementations support configuring a timeout for individual backend requests,
separate from the overall client request timeout. This is particularly useful if a
client HTTP request to a gateway can result in more than one call from the gateway
to the destination backend service, for example, if automatic retries are supported.
Adding support for this would also benefit many users.

### Timeout values

There are 2 kinds of timeouts that can be configured in an `HTTPRouteRule`:

1. `timeouts.request` is the timeout for the Gateway API implementation to send a
   response to a client HTTP request. Whether the gateway starts the timeout before
   or after the entire client request stream has been received is implementation-dependent.
   This field is optional and has `Extended` support.

1. `timeouts.backendRequest` is a timeout for a single request from the gateway to a backend.
   This field is optional and has `Extended` support. It is typically used in conjunction
   with retry configuration, if supported by an implementation.
   Note that retry configuration will be the subject of a separate GEP (GEP-1731).

```mermaid
sequenceDiagram
    participant C as Client
    participant P as Proxy
    participant U as Upstream
    C->>P: Connection Started
    note left of P: timeouts.request start time (min)
    C->>P: Starts sending Request
    C->>P: Finishes Headers
    C->>P: Finishes request
    note left of P: timeouts.request start time (max)
    P->>U: Connection Started
    note right of P: timeouts.backendRequest start time
    P->>U: Starts sending Request
    P->>U: Finishes Headers
    P->>U: Finishes request
    U->>P: Starts Response
    U->>P: Finishes Headers
    note right of P: timeouts.backendRequest end time
    note left of P: timeouts.request end time
    U->>P: Finishes Response
    note right of P: Repeat if retry
    P->>C: Starts Response
    P->>C: Finishes Headers
    P->>C: Finishes Response
    Note right of P: Repeat if connection sharing
    U->>C: Connection ended
```

Both timeout fields are [GEP-2257 Duration] values. A zero-valued timeout
("0s") MUST be interpreted as disabling the timeout; a non-zero-valued timeout
MUST be >= 1ms.

[GEP-2257 Duration]:/geps/gep-2257/

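The value rules above can be sketched as a small Go helper. This is a minimal illustration, not part of the proposed API: `parseTimeout` and `durationPattern` are hypothetical names, and the pattern is copied from the validation marker on the `Duration` type in this GEP. It relies on `time.ParseDuration`, which accepts a superset of the GEP-2257 format.

```go
package main

import (
	"fmt"
	"regexp"
	"time"
)

// durationPattern mirrors the validation pattern on the Duration type in this GEP.
var durationPattern = regexp.MustCompile(`^([0-9]{1,5}(h|m|s|ms)){1,4}$`)

// parseTimeout is a hypothetical helper. It returns the parsed duration and
// whether the timeout is enabled; a zero value such as "0s" disables it.
func parseTimeout(s string) (d time.Duration, enabled bool, err error) {
	if !durationPattern.MatchString(s) {
		return 0, false, fmt.Errorf("%q is not a valid GEP-2257 duration", s)
	}
	d, err = time.ParseDuration(s)
	if err != nil {
		return 0, false, err
	}
	if d == 0 {
		// Zero means "no timeout".
		return 0, false, nil
	}
	if d < time.Millisecond {
		return 0, false, fmt.Errorf("non-zero timeout %q must be >= 1ms", s)
	}
	return d, true, nil
}

func main() {
	for _, s := range []string{"10s", "1h30m", "0s", "1.5s"} {
		d, enabled, err := parseTimeout(s)
		fmt.Printf("%q: duration=%v enabled=%v err=%v\n", s, d, enabled, err)
	}
}
```

Because the pattern only admits whole numbers of `h`, `m`, `s`, and `ms`, any non-zero value that passes it is already at least 1ms; the explicit check is kept only to make the rule visible.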
### GO

```go
type HTTPRouteRule struct {
	// Timeouts defines the timeouts that can be configured for an HTTP request.
	//
	// Support: Extended
	//
	// +optional
	// <gateway:experimental>
	Timeouts *HTTPRouteTimeouts `json:"timeouts,omitempty"`

	// ...
}

// HTTPRouteTimeouts defines timeouts that can be configured for an HTTPRoute.
// Timeout values are represented with Gateway API Duration formatting.
// Specifying a zero value such as "0s" is interpreted as no timeout.
//
// +kubebuilder:validation:XValidation:message="backendRequest timeout cannot be longer than request timeout",rule="!(has(self.request) && has(self.backendRequest) && duration(self.request) != duration('0s') && duration(self.backendRequest) > duration(self.request))"
type HTTPRouteTimeouts struct {
	// Request specifies the maximum duration for a gateway to respond to an HTTP request.
	// If the gateway has not been able to respond before this deadline is met, the gateway
	// MUST return a timeout error.
	//
	// For example, setting the `rules.timeouts.request` field to the value `10s` in an
	// `HTTPRoute` will cause a timeout if a client request is taking longer than 10 seconds
	// to complete.
	//
	// This timeout is intended to cover as close to the whole request-response transaction
	// as possible although an implementation MAY choose to start the timeout after the entire
	// request stream has been received instead of immediately after the transaction is
	// initiated by the client.
	//
	// When this field is unspecified, request timeout behavior is implementation-specific.
	//
	// Support: Extended
	//
	// +optional
	Request *Duration `json:"request,omitempty"`

	// BackendRequest specifies a timeout for an individual request from the gateway
	// to a backend. This covers the time from when the request first starts being
	// sent from the gateway to when the full response has been received from the backend.
	//
	// An entire client HTTP transaction with a gateway, covered by the Request timeout,
	// may result in more than one call from the gateway to the destination backend,
	// for example, if automatic retries are supported.
	//
	// Because the Request timeout encompasses the BackendRequest timeout, the value of
	// BackendRequest must be <= the value of Request timeout.
	//
	// Support: Extended
	//
	// +optional
	BackendRequest *Duration `json:"backendRequest,omitempty"`
}

// Duration is a string value representing a duration in time. The format is as specified
// in GEP-2257, a strict subset of the syntax parsed by Golang time.ParseDuration.
//
// +kubebuilder:validation:Pattern=`^([0-9]{1,5}(h|m|s|ms)){1,4}$`
type Duration string
```
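
The CEL validation rule on `HTTPRouteTimeouts` can be read as the following Go check. This is a sketch for illustration, assuming both fields have already been parsed into `time.Duration` values; `validateTimeouts` and `dur` are hypothetical helpers, not part of Gateway API.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// validateTimeouts mirrors the CEL rule: the combination is rejected only when
// both fields are set, request is non-zero, and backendRequest exceeds request.
// A nil pointer means the field is unset.
func validateTimeouts(request, backendRequest *time.Duration) error {
	if request == nil || backendRequest == nil {
		return nil
	}
	if *request != 0 && *backendRequest > *request {
		return errors.New("backendRequest timeout cannot be longer than request timeout")
	}
	return nil
}

// dur is a small helper that returns a pointer to a duration.
func dur(d time.Duration) *time.Duration { return &d }

func main() {
	fmt.Println(validateTimeouts(dur(10*time.Second), dur(2*time.Second)))
	fmt.Println(validateTimeouts(dur(2*time.Second), dur(10*time.Second)))
	// A zero request timeout is disabled, so any backendRequest value is accepted.
	fmt.Println(validateTimeouts(dur(0), dur(10*time.Second)))
}
```

Note that the rule exempts a zero `request` value because zero means "no timeout" rather than "shortest possible timeout".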

### YAML

```yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: timeout-example
spec:
  ...
  rules:
  - backendRefs:
    - name: some-service
      port: 8080
    timeouts:
      request: 10s
      backendRequest: 2s
```
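
To show where the `timeouts` field sits on the wire, the sketch below decodes the JSON form of one rule from the example above into simplified copies of the Go types. The type definitions here are trimmed, hypothetical mirrors of the API types (only the fields needed for the example), `decodeRule` is a hypothetical helper, and JSON is used because the Go standard library has no YAML decoder.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Simplified mirrors of the types in this GEP, for illustration only.
type Duration string

type HTTPRouteTimeouts struct {
	Request        *Duration `json:"request,omitempty"`
	BackendRequest *Duration `json:"backendRequest,omitempty"`
}

type HTTPRouteRule struct {
	Timeouts *HTTPRouteTimeouts `json:"timeouts,omitempty"`
}

// decodeRule parses the JSON form of a single route rule. Fields that are not
// modeled here (such as backendRefs) are ignored by encoding/json.
func decodeRule(data []byte) (HTTPRouteRule, error) {
	var rule HTTPRouteRule
	err := json.Unmarshal(data, &rule)
	return rule, err
}

func main() {
	data := []byte(`{
		"backendRefs": [{"name": "some-service", "port": 8080}],
		"timeouts": {"request": "10s", "backendRequest": "2s"}
	}`)
	rule, err := decodeRule(data)
	if err != nil {
		panic(err)
	}
	fmt.Println("request:", *rule.Timeouts.Request)
	fmt.Println("backendRequest:", *rule.Timeouts.BackendRequest)
}
```

Both fields are pointers, so an omitted field decodes to `nil` and can be told apart from an explicit `"0s"`.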

### Conformance Details

Gateway implementations can indicate support for the optional behavior in this GEP using
the following feature names:

- `HTTPRouteRequestTimeout`: supports `rules.timeouts.request` in an `HTTPRoute`.
- `HTTPRouteBackendTimeout`: supports `rules.timeouts.backendRequest` in an `HTTPRoute`.

## Alternatives

Timeouts could be configured using policy attachments or in objects other than `HTTPRouteRule`.

### Policy Attachment

Instead of configuring timeouts directly on an API object, they could be configured using policy
attachments. The advantage of this approach would be that timeout policies could not only be
configured for an `HTTPRouteRule`, but could also be added or overridden at a finer
(e.g., `HTTPBackendRef`) or coarser (e.g., `HTTPRoute`) level of granularity.

The downside, however, is the complexity introduced for the most common use case: adding a simple
timeout for an HTTP request. Setting a single field in the route rule, instead of needing to
create a policy resource, seems much better for this simple case.

In the future, we could consider using policy attachments to configure less common kinds of
timeouts that may be needed, but it would probably be better to instead extend the proposed API
to support those timeouts as well.

The default values of the proposed timeout fields could also be overridden
using policy attachments in the future. For example, a policy attachment could be used to set the
default value of `rules.timeouts.request` for all routes under an `HTTPRoute` or `Gateway`.

### Other API Objects

The new timeouts field could be added to a different API struct, instead of `HTTPRouteRule`.

Putting it on an `HTTPBackendRef`, for example, would allow users to set different timeouts for different
backends. This is a feature that we believe has not been requested by existing proxy or service mesh
clients and is also not implementable using the available timeouts of most proxies.

Another alternative is to move the timeouts configuration up a level in the API to `HTTPRoute`. This
would be convenient when a user wants the same timeout on all rules, but would be overly restrictive.
Using policy attachments to override the default timeout value for all rules, as described in the
previous section, is likely a better way to handle timeout configuration above the route rule level.

## References

(Add any additional document links. Again, we should try to avoid
too much content not in version control to avoid broken links)