## Anti-Patterns of Client creation

### How to properly create a `ClientConn`: `grpc.NewClient`

[`grpc.NewClient`](https://pkg.go.dev/google.golang.org/grpc#NewClient) is the
function in the gRPC library that creates a virtual connection from a client
application to a gRPC server. It takes a target URI (which represents the name
of a logical backend service and resolves to one or more physical addresses) and
a list of options, and returns a
[`ClientConn`](https://pkg.go.dev/google.golang.org/grpc#ClientConn) object that
represents the virtual connection to the server. The `ClientConn` contains one
or more actual connections to real servers and attempts to maintain these
connections by automatically reconnecting to them when they break. `NewClient`
was introduced in gRPC-Go v1.63.

### The wrong way: `grpc.Dial`

[`grpc.Dial`](https://pkg.go.dev/google.golang.org/grpc#Dial) is a deprecated
function that also creates the same virtual connection pool as `grpc.NewClient`.
However, unlike `grpc.NewClient`, it immediately starts connecting and supports
a few additional `DialOption`s that control this initial connection attempt.
These are: `WithBlock`, `WithTimeout`, `WithReturnConnectionError`, and
`FailOnNonTempDialError`.

That `grpc.Dial` creates connections immediately is not a problem in and of
itself, but this behavior differs from how gRPC works in all other languages,
and it can be convenient to have a constructor that does not perform I/O. It
can also be confusing to users, as most people expect a function called `Dial`
to create _a_ connection which may need to be recreated if it is lost.

`grpc.Dial` uses "passthrough" as the default name resolver for backward
compatibility while `grpc.NewClient` uses "dns" as its default name resolver.
This subtle difference is important to legacy systems that also specified a
custom dialer and expected it to receive the target string directly.

For these reasons, using `grpc.Dial` is discouraged. Even though it is marked
as deprecated, we will continue to support it until a v2 is released (and no
plans for a v2 exist at the time this was written).

### Especially bad: using deprecated `DialOption`s

`FailOnNonTempDialError`, `WithBlock`, and `WithReturnConnectionError` are three
`DialOption`s that are only supported by `Dial` because they only affect the
behavior of `Dial` itself. `WithBlock` causes `Dial` to wait until the
`ClientConn` reports its state as `connectivity.Ready`. The other two deal
with returning connection errors before the timeout expires (the timeout is set
with `WithTimeout`, or on the context when using `DialContext`).

The reason these options can be a problem is that connections with a
`ClientConn` are dynamic -- they may come and go over time. If your client
successfully connects, the server could go down 1 second later, and your RPCs
will fail. "Knowing you are connected" does not tell you much in this regard.

Additionally, _all_ RPCs created on an "idle" or a "connecting" `ClientConn`
will wait until their deadline or until a connection is established before
failing. This means that you don't need to check that a `ClientConn` is "ready"
before starting your RPCs. By default, RPCs will fail if the `ClientConn`
enters the "transient failure" state, but setting `WaitForReady(true)` on a
call will cause it to queue even in the "transient failure" state, and it will
only ever fail due to a deadline, a server response, or a connection loss after
the RPC was sent to a server.

Some users of `Dial` use it as a way to validate the configuration of their
system. If you wish to maintain this behavior but migrate to `NewClient`, you
can call `Connect` to leave the idle state, then call `GetState` and
`WaitForStateChange` until the channel reports `connectivity.Ready`.
However, if this fails, it does not mean that your configuration was bad; it
could also mean the service is not reachable by the client due to connectivity
reasons.

## Best practices for error handling in gRPC

Instead of relying on failures at dial time, we strongly encourage developers to
rely on errors from RPCs. When a client makes an RPC, it can receive an error
response from the server. These errors can provide valuable information about
what went wrong, including information about network issues, server-side errors,
and incorrect usage of the gRPC API.

By handling errors from RPCs correctly, developers can write more reliable and
robust gRPC applications. Here are some best practices for error handling in
gRPC:

- Always check for error responses from RPCs and handle them appropriately.
- Use the `status` package (for example, `status.Code` or `status.FromError`)
  on the returned error to determine the type of error that occurred.
- When retrying failed RPCs, consider using the built-in retry mechanism
  provided by gRPC-Go, if available, instead of manually implementing retries.
  Refer to the [gRPC-Go retry example
  documentation](https://github.com/grpc/grpc-go/blob/master/examples/features/retry/README.md)
  for more information. Note that this is not a substitute for client-side
  retries, as errors that occur after an RPC starts on a server cannot be
  retried through gRPC's built-in mechanism.
- If making an outgoing RPC from a server handler, be sure to translate the
  status code before returning the error from your method handler. For example,
  if the error is an `INVALID_ARGUMENT` status code, that probably means
  your service has a bug (otherwise it shouldn't have triggered this error), in
  which case `INTERNAL` is more appropriate to return back to your users.

### Example: Handling errors from an RPC

The following code snippet demonstrates how to handle errors from an RPC in
gRPC:

```go
ctx, cancel := context.WithTimeout(context.Background(), time.Second)
defer cancel()

res, err := client.MyRPC(ctx, &MyRequest{})
if err != nil {
	// Handle the error appropriately,
	// log it & return an error to the caller, etc.
	log.Printf("Error calling MyRPC: %v", err)
	return nil, err
}

// Use the response as appropriate
log.Printf("MyRPC response: %v", res)
```

To determine the type of error that occurred, you can inspect the returned
error with the `status` package:

```go
resp, err := client.MakeRPC(context.TODO(), request)
if err != nil {
	// Named st rather than status so the status package is not shadowed.
	if st, ok := status.FromError(err); ok {
		// Handle the error based on its status code
		if st.Code() == codes.NotFound {
			log.Println("Requested resource not found")
		} else {
			log.Printf("RPC error: %v", st.Message())
		}
	} else {
		// Handle non-RPC errors
		log.Printf("Non-RPC error: %v", err)
	}
	return
}

// Use the response as needed
log.Printf("Response received: %v", resp)
```

### Example: Using a backoff strategy

When retrying failed RPCs, use a backoff strategy to avoid overwhelming the
server or exacerbating network issues:

```go
var res *MyResponse
var err error

retryableStatusCodes := map[codes.Code]bool{
	codes.Unavailable: true, // etc
}

// Attempt the RPC up to maxRetries times.
for i := 0; i < maxRetries; i++ {
	// Make the RPC.
	res, err = client.MyRPC(context.TODO(), &MyRequest{})

	// Stop if the RPC succeeded or failed with a non-retryable code.
	if !retryableStatusCodes[status.Code(err)] {
		break
	}

	// Out of attempts: skip the final sleep and report the error below.
	if i == maxRetries-1 {
		break
	}

	// The RPC is retryable; wait for a backoff period before retrying.
	backoff := time.Duration(i+1) * time.Second
	log.Printf("Error calling MyRPC: %v; retrying in %v", err, backoff)
	time.Sleep(backoff)
}

// Check if the RPC was successful after all retries.
if err != nil {
	// All retries failed, so handle the error appropriately
	log.Printf("Error calling MyRPC: %v", err)
	return nil, err
}

// Use the response as appropriate.
log.Printf("MyRPC response: %v", res)
```