# Checking Go API Compatibility

The `apidiff` tool in this directory determines whether two examples of a
package or module are compatible. The goal is to help the developer make an
informed choice of semantic version after they have changed the code of their
module.

`apidiff` reports two kinds of changes: incompatible ones, which require
incrementing the major part of the semantic version, and compatible ones, which
require a minor version increment. If no API changes are reported but there are
code changes that could affect client code, then the patch version should
be incremented.

`apidiff` may be used to display API differences between any two packages or
modules, not just different versions of the same thing. It does this by ignoring
the package import paths when directly comparing two packages, and
by ignoring module paths when comparing two modules. That is to say, when
comparing two modules, the package import paths **do** matter, but are compared
_relative_ to their respective module root.

## Compatibility Desiderata

Any tool that checks compatibility can offer only an approximation. No tool can
detect behavioral changes; and even if it could, whether a behavioral change is
a breaking change or not depends on many factors, such as whether it closes a
security hole or fixes a bug. Even a change that causes some code to fail to
compile may not be considered a breaking change by the developers or their
users. It may only affect code marked as experimental or unstable, for
example, or the break may only manifest in unlikely cases.

For a tool to be useful, its notion of compatibility must be relaxed enough to
allow reasonable changes, like adding a field to a struct, but strict enough to
catch significant breaking changes. A tool that is too lax will miss important
incompatibilities, and users will stop trusting it; one that is too strict may
generate so much noise that users will ignore it.

To a first approximation, this tool reports a change as incompatible if it could
cause client code to stop compiling. But `apidiff` ignores five ways in which
code may fail to compile after a change. Three of them are mentioned in the
[Go 1 Compatibility Guarantee](https://golang.org/doc/go1compat).

### Unkeyed Struct Literals

Code that uses an unkeyed struct literal would fail to compile if a field was
added to the struct, making any such addition an incompatible change. An example:

```
// old
type Point struct { X, Y int }

// new
type Point struct { X, Y, Z int }

// client
p := pkg.Point{1, 2} // fails in new because there are more fields than expressions
```

Here and below, we provide three snippets: the code in the old version of the
package, the code in the new version, and the code written in a client of the package,
which refers to it by the name `pkg`. The client code compiles against the old
code but not the new.

### Embedding and Shadowing

Adding an exported field to a struct can break code that embeds that struct,
because the newly added field may conflict with an identically named field
at the same struct depth. A selector referring to that name would become
ambiguous and thus fail to compile.

```
// old
type Point struct { X, Y int }

// new
type Point struct { X, Y, Z int }

// client
type z struct { Z int }

var v struct {
	pkg.Point
	z
}

_ = v.Z // fails in new
```

In the new version, the last line fails to compile because there are two embedded `Z`
fields at the same depth, one from `z` and one from `pkg.Point`.

### Using an Identical Type Externally

If it is possible for client code to write a type expression representing the
underlying type of a defined type in a package, then external code can use it in
assignments involving the package type, making any change to that type incompatible.

```
// old
type Point struct { X, Y int }

// new
type Point struct { X, Y, Z int }

// client
var p struct { X, Y int } = pkg.Point{} // fails in new because of Point's extra field
```

Here, the external code could have used the provided name `Point`, but chose not
to. I'll have more to say about this and related examples later.

### unsafe.Sizeof and Friends

Since `unsafe.Sizeof`, `unsafe.Offsetof` and `unsafe.Alignof` are constant
expressions, they can be used in an array type literal:

```
// old
type S struct{ X int }

// new
type S struct{ X, y int }

// client
var a [unsafe.Sizeof(pkg.S{})]int = [8]int{} // fails in new because S's size is not 8
```

Use of these operations could make many changes to a type potentially incompatible.

### Type Switches

A package change that merges two different types (with the same underlying type)
into a single new type may break type switches in clients that refer to both
original types:

```
// old
type T1 int
type T2 int

// new
type T1 int
type T2 = T1

// client
switch x.(type) {
case pkg.T1:
case pkg.T2:
} // fails with new because two cases have the same type
```

This sort of incompatibility is sufficiently esoteric to ignore; the tool allows
merging types.

## First Attempt at a Definition

Our first attempt at defining compatibility captures the idea that all the
exported names in the old package must have compatible equivalents in the new
package.

A new package is compatible with an old one if and only if:
- For every exported package-level name in the old package, the same name is
  declared in the new at package level, and
- the names denote the same kind of object (e.g. both are variables), and
- the types of the objects are compatible.

We will work out the details (and make some corrections) below, but it is clear
already that we will need to determine what makes two types compatible. And
whatever the definition of type compatibility, it's certainly true that if two
types are the same, they are compatible. So we will need to decide what makes an
old and new type the same. We will call this sameness relation _correspondence_.

## Type Correspondence

Go already has a definition of when two types are the same:
[type identity](https://golang.org/ref/spec#Type_identity).
But identity isn't adequate for our purpose: it says that two defined
types are identical if they arise from the same definition, but it's unclear
what "same" means when talking about two different packages (or two versions of
a single package).

The obvious change to the definition of identity is to require that old and new
[defined types](https://golang.org/ref/spec#Type_definitions)
have the same name instead. But that doesn't work either, for two
reasons. First, type aliases can equate two defined types with different names:

```
// old
type E int

// new
type t int
type E = t
```

Second, an unexported type can be renamed:

```
// old
type u1 int
var V u1

// new
type u2 int
var V u2
```

Here, even though `u1` and `u2` are unexported, their exported fields and
methods are visible to clients, so they are part of the API. But since the name
`u1` is not visible to clients, it can be changed compatibly. We say that `u1`
and `u2` are _exposed_: a type is exposed if a client package can declare
variables of that type.

We will say that an old defined type _corresponds_ to a new one if they have the
same name, or one can be renamed to the other without otherwise changing the
API. In the first example above, old `E` and new `t` correspond. In the second,
old `u1` and new `u2` correspond.

Two or more old defined types can correspond to a single new type: we consider
"merging" two types into one to be a compatible change. As mentioned above,
code that uses both names in a type switch will fail, but we deliberately ignore
this case. However, a single old type can correspond to only one new type.

So far, we've explained what correspondence means for defined types. To extend
the definition to all types, we parallel the language's definition of type
identity. So, for instance, an old and a new slice type correspond if their
element types correspond.

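The many-to-one nature of correspondence can be illustrated with a small sketch. The `correspondence` type and its `record` method below are hypothetical names invented for this illustration, not part of `apidiff`; the sketch tracks type names only and assumes the caller has already decided that a pairing is plausible.

```go
package main

import "fmt"

// correspondence records which new defined type each old defined type maps to.
// It is a hypothetical helper for illustration, not apidiff's implementation.
type correspondence struct {
	oldToNew map[string]string
}

func newCorrespondence() *correspondence {
	return &correspondence{oldToNew: make(map[string]string)}
}

// record notes that oldName corresponds to newName. It reports false if
// oldName was already paired with a different new type. Several old types
// may share one new type, which models "merging".
func (c *correspondence) record(oldName, newName string) bool {
	if prev, ok := c.oldToNew[oldName]; ok {
		return prev == newName
	}
	c.oldToNew[oldName] = newName
	return true
}

func main() {
	c := newCorrespondence()
	fmt.Println(c.record("T1", "T1")) // true
	fmt.Println(c.record("T2", "T1")) // true: merging T1 and T2 is allowed
	fmt.Println(c.record("T2", "T3")) // false: T2 already corresponds to T1
}
```

The map goes from old names to new names so that the "only one new type per old type" constraint falls out of the map's key uniqueness.
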
## Definition of Compatibility

We can now present the definition of compatibility used by `apidiff`.

### Module Compatibility

> A new module is compatible with an old one if:
> 1. Each package present in the old module also appears in the new module,
>    with matching import paths relative to their respective module root, and
> 2. Each package present in both modules fulfills Package Compatibility as
>    defined below.
>
> Otherwise the modules are incompatible.

If a package is converted into a nested module of the original module then
comparing two versions of the module, before and after nested module creation,
will produce an incompatible package removal message. This removal message does
not necessarily mean that client code will need to change. If the package API
retains Package Compatibility after nested module creation, then only the
`go.mod` of the client code will need to change. Take the following example:

```
./
  go.mod
  go.sum
  foo.go
  bar/bar.go
```

Where `go.mod` is:

```
module example.com/foo

go 1.20
```

Where `bar/bar.go` is:

```
package bar

var V int
```

And `foo.go` is:

```
package foo

import "example.com/foo/bar"

var _ = bar.V
```

Creating a nested module with the package `bar` while retaining Package
Compatibility is _code_ compatible, because the import path of the package does
not change:

```
./
  go.mod
  go.sum
  foo.go
  bar/
    bar.go
    go.mod
    go.sum
```

Where `bar/go.mod` is:

```
module example.com/foo/bar

go 1.20
```

And the top-level `go.mod` becomes:

```
module example.com/foo

go 1.20

// New dependency on nested module.
require example.com/foo/bar v1.0.0
```

If during nested module creation either Package Compatibility is broken, like so
in `bar/bar.go`:

```
package bar

// Changed from V to T.
var T int
```

Or the nested module uses a name other than the original package's import path,
like so in `bar/go.mod`:

```
// Completely different module name
module example.com/qux

go 1.20
```

Then the move is backwards incompatible for client code.

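As a rough illustration of clause 1 of Module Compatibility, the sketch below compares package paths relative to their module roots. The function names `rel` and `removedPackages` are made up for this example, and the code assumes every package path is the module root or lies below it; it is not how `apidiff` itself is implemented.

```go
package main

import (
	"fmt"
	"strings"
)

// rel returns pkgPath relative to modulePath. It assumes pkgPath is the
// module root or lies below it.
func rel(modulePath, pkgPath string) string {
	return strings.TrimPrefix(strings.TrimPrefix(pkgPath, modulePath), "/")
}

// removedPackages returns the module-relative paths of packages that are
// present in the old module but missing from the new one.
func removedPackages(oldMod string, oldPkgs []string, newMod string, newPkgs []string) []string {
	inNew := make(map[string]bool)
	for _, p := range newPkgs {
		inNew[rel(newMod, p)] = true
	}
	var removed []string
	for _, p := range oldPkgs {
		if r := rel(oldMod, p); !inNew[r] {
			removed = append(removed, r)
		}
	}
	return removed
}

func main() {
	oldPkgs := []string{"example.com/foo", "example.com/foo/bar"}
	// After bar becomes a nested module, the parent module no longer contains it.
	newPkgs := []string{"example.com/foo"}
	fmt.Println(removedPackages("example.com/foo", oldPkgs, "example.com/foo", newPkgs)) // [bar]
}
```

Because only relative paths are compared, two modules with different module paths but the same package layout would report no removals.
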
### Package Compatibility

> A new package is compatible with an old one if:
> 1. Each exported name in the old package's scope also appears in the new
>    package's scope, and the object (constant, variable, function or type) denoted
>    by that name in the old package is compatible with the object denoted by the
>    name in the new package, and
> 2. For every exposed type that implements an exposed interface in the old package,
>    its corresponding type should implement the corresponding interface in the new package.
>
> Otherwise the packages are incompatible.

As an aside, the tool also finds exported names in the new package that are not
exported in the old, and marks them as compatible changes.

Clause 2 is discussed further in "Whole-Package Compatibility."

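The name-presence half of clause 1 can be sketched with the standard `go/types` package. The helpers `check` and `diffNames` are hypothetical names for this illustration; the sketch only handles a single source file with no imports and does not test whether the objects themselves are compatible.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// check parses and type-checks one source file that imports nothing.
func check(src string) (*types.Package, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "p.go", src, 0)
	if err != nil {
		return nil, err
	}
	var conf types.Config // no Importer is set, so src must not import anything
	return conf.Check("p", fset, []*ast.File{f}, nil)
}

// diffNames returns the exported package-level names that were removed
// (incompatible) and added (compatible) between oldPkg and newPkg.
func diffNames(oldPkg, newPkg *types.Package) (removed, added []string) {
	for _, name := range oldPkg.Scope().Names() {
		if ast.IsExported(name) && newPkg.Scope().Lookup(name) == nil {
			removed = append(removed, name)
		}
	}
	for _, name := range newPkg.Scope().Names() {
		if ast.IsExported(name) && oldPkg.Scope().Lookup(name) == nil {
			added = append(added, name)
		}
	}
	return removed, added
}

func main() {
	oldPkg, err := check("package p\nvar V int\nfunc F() {}\n")
	if err != nil {
		panic(err)
	}
	newPkg, err := check("package p\nvar V int\nvar W string\n")
	if err != nil {
		panic(err)
	}
	removed, added := diffNames(oldPkg, newPkg)
	fmt.Println(removed, added) // [F] [W]
}
```
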
### Object Compatibility

This section provides compatibility rules for constants, variables, functions
and types.

#### Constants

> A new exported constant is compatible with an old one of the same name if and only if
> 1. Their types correspond, and
> 2. Their values are identical.

It is tempting to allow changing a typed constant to an untyped one. That may
seem harmless, but it can break code like this:

```
// old
const C int64 = 1

// new
const C = 1

// client
var x = pkg.C   // old type is int64, new is int
var y int64 = x // fails with new: different types in assignment
```

A change to the value of a constant can break compatibility if the value is used
in an array type:

```
// old
const C = 1

// new
const C = 2

// client
var a [pkg.C]int = [1]int{} // fails with new because [2]int and [1]int are different types
```

Changes to constant values are rare, and determining whether they are compatible
or not is better left to the user, so the tool reports them.

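The two constant rules can be approximated with `go/types` and `go/constant`. In this sketch, `lookupConst`, `must` and `constCompatible` are hypothetical helpers, and `types.Identical` stands in for correspondence, which is only a reasonable substitute here because the examples use predeclared types.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/constant"
	"go/parser"
	"go/token"
	"go/types"
)

// lookupConst type-checks an import-free source file and returns the
// package-level constant with the given name.
func lookupConst(src, name string) (*types.Const, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "p.go", src, 0)
	if err != nil {
		return nil, err
	}
	var conf types.Config
	pkg, err := conf.Check("p", fset, []*ast.File{f}, nil)
	if err != nil {
		return nil, err
	}
	c, ok := pkg.Scope().Lookup(name).(*types.Const)
	if !ok {
		return nil, fmt.Errorf("%s is not a constant", name)
	}
	return c, nil
}

func must(c *types.Const, err error) *types.Const {
	if err != nil {
		panic(err)
	}
	return c
}

// constCompatible applies both rules: same type (identity is used here as a
// stand-in for correspondence) and identical value.
func constCompatible(oldC, newC *types.Const) bool {
	return types.Identical(oldC.Type(), newC.Type()) &&
		constant.Compare(oldC.Val(), token.EQL, newC.Val())
}

func main() {
	typed := must(lookupConst("package p; const C int64 = 1", "C"))
	untyped := must(lookupConst("package p; const C = 1", "C"))
	two := must(lookupConst("package p; const C = 2", "C"))
	fmt.Println(constCompatible(typed, untyped))   // false: int64 vs untyped int
	fmt.Println(constCompatible(untyped, two))     // false: values differ
	fmt.Println(constCompatible(untyped, untyped)) // true
}
```
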
#### Variables

> A new exported variable is compatible with an old one of the same name if and
> only if their types correspond.

Correspondence doesn't look past names, so this rule does not prevent adding a
field to `MyStruct` if the package declares `var V MyStruct`. It does, however, mean that

```
var V struct { X int }
```

is incompatible with

```
var V struct { X, Y int }
```

I discuss this at length below in the section "Compatibility, Types and Names."

#### Functions

> A new exported function or variable is compatible with an old function of the
> same name if and only if their types (signatures) correspond.

This rule captures the fact that, although many signature changes are compatible
for all call sites, none are compatible for assignment:

```
var v func(int) = pkg.F
```

Here, `F` must be of type `func(int)` and not, for instance, `func(...int)` or `func(interface{})`.

Note that the rule permits changing a function to a variable. This is a common
practice, usually done for test stubbing, and cannot break any code at compile
time.

#### Exported Types

> A new exported type is compatible with an old one if and only if their
> names are the same and their types correspond.

This rule seems far too strict. But, ignoring aliases for the moment, it demands only
that the old and new _defined_ types correspond. Consider:

```
// old
type T struct { X int }

// new
type T struct { X, Y int }
```

The addition of `Y` is a compatible change, because this rule does not require
that the struct literals correspond, only that the defined types
denoted by `T` correspond. (Remember that correspondence stops at type
names.)

If one type is an alias that refers to the corresponding defined type, the
situation is the same:

```
// old
type T struct { X int }

// new
type u struct { X, Y int }
type T = u
```

Here, the only requirement is that old `T` corresponds to new `u`, not that the
struct types correspond. (We can't tell from this snippet that the old `T` and
the new `u` do correspond; that depends on whether `u` replaces `T` throughout
the API.)

However, the following change is incompatible, because the names do not
denote corresponding types:

```
// old
type T = struct { X int }

// new
type T = struct { X, Y int }
```

### Type Literal Compatibility

Only five kinds of types can differ compatibly: defined types, structs,
interfaces, channels and numeric types. We only consider the compatibility of
the last four when they are the underlying type of a defined type. See
"Compatibility, Types and Names" for a rationale.

We justify the compatibility rules by enumerating all the ways a type
can be used, and by showing that the allowed changes cannot break any code that
uses values of the type in those ways.

Values of all types can be used in assignments (including argument passing and
function return), but we do not require that old and new types are assignment
compatible. That is because we assume that the old and new packages are never
used together: any given binary will link in either the old package or the new.
So in describing how a type can be used in the sections below, we omit
assignment.

Any type can also be used in a type assertion or conversion. The changes we allow
below may affect the run-time behavior of these operations, but they cannot affect
whether they compile. The only such breaking change would be to change
the type `T` in an assertion `x.(T)` so that it no longer implements the interface
type of `x`; but the rules for interfaces below disallow that.

> A new type is compatible with an old one if and only if they correspond, or
> one of the cases below applies.

#### Defined Types

Other than assignment, the only ways to use a defined type are to access its
methods, or to make use of the properties of its underlying type. Rule 2 below
covers the latter, and rules 3 and 4 cover the former.

> A new defined type is compatible with an old one if and only if all of the
> following hold:
> 1. They correspond.
> 2. Their underlying types are compatible.
> 3. The new exported value method set is a superset of the old.
> 4. The new exported pointer method set is a superset of the old.

An exported method set is a method set with all unexported methods removed.
When comparing methods of a method set, we require identical names and
corresponding signatures.

Removing an exported method is clearly a breaking change. But removing an
unexported one (or changing its signature) can be breaking as well, if it
results in the type no longer implementing an interface. See "Whole-Package
Compatibility," below.

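Rules 3 and 4 can be sketched with `types.NewMethodSet` from `go/types`. The helper names below (`lookupType`, `exportedMethods`, `superset`, `methodsCompatible`) are invented for this example, and the sketch compares method names only; it does not check that signatures correspond.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// lookupType type-checks an import-free source file and returns the type
// of the package-level name.
func lookupType(src, name string) types.Type {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "p.go", src, 0)
	if err != nil {
		panic(err)
	}
	var conf types.Config
	pkg, err := conf.Check("p", fset, []*ast.File{f}, nil)
	if err != nil {
		panic(err)
	}
	return pkg.Scope().Lookup(name).Type()
}

// exportedMethods returns the names of the exported methods in t's method set.
func exportedMethods(t types.Type) map[string]bool {
	names := make(map[string]bool)
	ms := types.NewMethodSet(t)
	for i := 0; i < ms.Len(); i++ {
		if m := ms.At(i).Obj(); m.Exported() {
			names[m.Name()] = true
		}
	}
	return names
}

// superset reports whether every name in oldSet is also in newSet.
func superset(newSet, oldSet map[string]bool) bool {
	for name := range oldSet {
		if !newSet[name] {
			return false
		}
	}
	return true
}

// methodsCompatible checks the value method sets (rule 3) and the pointer
// method sets (rule 4), by name only.
func methodsCompatible(oldT, newT types.Type) bool {
	return superset(exportedMethods(newT), exportedMethods(oldT)) &&
		superset(exportedMethods(types.NewPointer(newT)), exportedMethods(types.NewPointer(oldT)))
}

func main() {
	oldT := lookupType("package p; type T int; func (T) A() {}; func (*T) B() {}", "T")
	moved := lookupType("package p; type T int; func (*T) A() {}; func (*T) B() {}", "T")
	grown := lookupType("package p; type T int; func (T) A() {}; func (*T) B() {}; func (T) C() {}", "T")
	fmt.Println(methodsCompatible(oldT, moved)) // false: A left the value method set
	fmt.Println(methodsCompatible(oldT, grown)) // true: only additions
}
```

The first case shows why the value and pointer method sets are checked separately: changing a receiver from `T` to `*T` keeps the pointer method set intact but shrinks the value method set.
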
#### Channels

> A new channel type is compatible with an old one if
> 1. The element types correspond, and
> 2. Either the directions are the same, or the new type has no direction.

Other than assignment, the only ways to use values of a channel type are to send
and receive on them, to close them, and to use them as map keys. Changes to a
channel type cannot cause code that closes a channel or uses it as a map key to
fail to compile, so we need not consider those operations.

Rule 1 ensures that any operations on the values sent or received will compile.
Rule 2 captures the fact that any program that compiles with a directed channel
must use either only sends, or only receives, so allowing the other operation
by removing the channel direction cannot break any code.

#### Interfaces

> A new interface is compatible with an old one if and only if:
> 1. The old interface does not have an unexported method, and it corresponds
>    to the new interface (i.e. they have the same method set), or
> 2. The old interface has an unexported method and the new exported method set is a
>    superset of the old.

Other than assignment, the only ways to use an interface are to implement it,
embed it, or call one of its methods. (Interface values can also be used as map
keys, but that cannot cause a compile-time error.)

Certainly, removing an exported method from an interface could break a client
call, so neither rule allows it.

Rule 1 also disallows adding a method to an interface without an existing unexported
method. Such an interface can be implemented in client code. If adding a method
were allowed, a type that implements the old interface could fail to implement
the new one:

```
type I interface { M1() }         // old
type I interface { M1(); M2() }   // new

// client
type t struct{}
func (t) M1() {}
var i pkg.I = t{} // fails with new, because t lacks M2
```

Rule 2 is based on the observation that if an interface has an unexported
method, the only way a client can implement it is to embed it.
Adding a method is compatible in this case, because the embedding struct will
continue to implement the interface. Adding a method also cannot break any call
sites, since no program that compiles could have any such call sites.

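The two interface rules can be modeled without the type checker. The sketch below assumes a simplified, hypothetical representation in which a method set is a map from method name to a textual signature; `methodSet`, `hasUnexported`, `sameMethods`, `exportedSuperset` and `ifaceCompatible` are names invented for the illustration, and string equality stands in for signature correspondence.

```go
package main

import (
	"fmt"
	"go/ast"
)

// methodSet maps method names to a textual signature (a simplified model).
type methodSet map[string]string

func hasUnexported(ms methodSet) bool {
	for name := range ms {
		if !ast.IsExported(name) {
			return true
		}
	}
	return false
}

// sameMethods reports whether a and b contain exactly the same methods.
func sameMethods(a, b methodSet) bool {
	if len(a) != len(b) {
		return false
	}
	for name, sig := range a {
		if s, ok := b[name]; !ok || s != sig {
			return false
		}
	}
	return true
}

// exportedSuperset reports whether every exported method of oldMS is in newMS.
func exportedSuperset(newMS, oldMS methodSet) bool {
	for name, sig := range oldMS {
		if !ast.IsExported(name) {
			continue
		}
		if s, ok := newMS[name]; !ok || s != sig {
			return false
		}
	}
	return true
}

// ifaceCompatible applies rule 1 when the old interface has no unexported
// method, and rule 2 otherwise.
func ifaceCompatible(oldMS, newMS methodSet) bool {
	if !hasUnexported(oldMS) {
		return sameMethods(oldMS, newMS)
	}
	return exportedSuperset(newMS, oldMS)
}

func main() {
	open := methodSet{"M1": "func()"}
	sealed := methodSet{"M1": "func()", "private": "func()"}
	fmt.Println(ifaceCompatible(open, methodSet{"M1": "func()", "M2": "func()"}))                        // false
	fmt.Println(ifaceCompatible(sealed, methodSet{"M1": "func()", "M2": "func()", "private": "func()"})) // true
}
```
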
#### Structs

> A new struct is compatible with an old one if all of the following hold:
> 1. The new set of top-level exported fields is a superset of the old.
> 2. The new set of _selectable_ exported fields is a superset of the old.
> 3. If the old struct is comparable, so is the new one.

The set of selectable exported fields is the set of exported fields `F`
such that `x.F` is a valid selector expression for a value `x` of the struct
type. `F` may be at the top level of the struct, or it may be a field of an
embedded struct.

Two fields are the same if they have the same name and corresponding types.

Other than assignment, there are only four ways to use a struct: write a struct
literal, select a field, use a value of the struct as a map key, or compare two
values for equality. The first clause ensures that struct literals compile; the
second, that selections compile; and the third, that equality expressions and
map index expressions compile.

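Rules 1 and 3 for structs can be sketched using `types.Comparable` from `go/types`. The helpers `lookupType` and `structCompatible` are hypothetical; the sketch compares top-level field names only, ignores field types, and leaves out rule 2 (selectable fields) entirely.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// lookupType type-checks an import-free source file and returns the type
// of the package-level name.
func lookupType(src, name string) types.Type {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "p.go", src, 0)
	if err != nil {
		panic(err)
	}
	var conf types.Config
	pkg, err := conf.Check("p", fset, []*ast.File{f}, nil)
	if err != nil {
		panic(err)
	}
	return pkg.Scope().Lookup(name).Type()
}

// structCompatible checks rule 1 (top-level exported field names) and
// rule 3 (comparability). Field types and rule 2 are not examined.
func structCompatible(oldT, newT types.Type) bool {
	oldS, ok1 := oldT.Underlying().(*types.Struct)
	newS, ok2 := newT.Underlying().(*types.Struct)
	if !ok1 || !ok2 {
		return false
	}
	newFields := make(map[string]bool)
	for i := 0; i < newS.NumFields(); i++ {
		if f := newS.Field(i); f.Exported() {
			newFields[f.Name()] = true
		}
	}
	for i := 0; i < oldS.NumFields(); i++ {
		if f := oldS.Field(i); f.Exported() && !newFields[f.Name()] {
			return false // rule 1: an exported field disappeared
		}
	}
	return !types.Comparable(oldT) || types.Comparable(newT) // rule 3
}

func main() {
	oldT := lookupType("package p; type S struct{ X int }", "S")
	addY := lookupType("package p; type S struct{ X, Y int }", "S")
	addF := lookupType("package p; type S struct{ X int; F func() }", "S")
	fmt.Println(structCompatible(oldT, addY)) // true
	fmt.Println(structCompatible(oldT, addF)) // false: S is no longer comparable
}
```
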
#### Numeric Types

> A new numeric type is compatible with an old one if and only if they are
> both unsigned integers, both signed integers, both floats or both complex
> types, and the new one is at least as large as the old on both 32-bit and
> 64-bit architectures.

Other than in assignments, numeric types appear in arithmetic and comparison
expressions. Since all arithmetic operations but shifts (see below) require that
operand types be identical, and by assumption the old and new types underlie
defined types (see "Compatibility, Types and Names," below), there is no way for
client code to write an arithmetic expression that compiles with operands of the
old type but not the new.

Numeric types can also appear in type switches and type assertions. Again, since
the old and new types underlie defined types, type switches and type assertions
that compiled using the old defined type will continue to compile with the new
defined type.

Going from an unsigned to a signed integer type is an incompatible change for
the sole reason that, before Go 1.13, only an unsigned type could appear as the
right operand of a shift. For code that requires Go 1.13 or later, where this
rule is relaxed, changes from an unsigned type to a larger signed type would be
compatible. See [this issue](https://github.com/golang/go/issues/19113).

Only integer types can be used in bitwise and shift operations, and for indexing
slices and arrays. That is why switching from an integer to a floating-point
type--even one that can represent all values of the integer type--is an
incompatible change.

Conversions from floating-point to complex types or vice versa are not permitted
(the predeclared functions `real`, `imag`, and `complex` must be used instead). To
prevent valid floating-point or complex conversions from becoming invalid,
changing a floating-point type to a complex type or vice versa is considered an
incompatible change.

Although conversions between any two integer types are valid, assigning a
constant value to a variable of integer type that is too small to represent the
constant is not permitted. That is why the only compatible changes are to
a new type whose values are a superset of the old. The requirement that the new
set of values must include the old on both 32-bit and 64-bit machines allows
conversions from `int32` to `int` and from `int` to `int64`, but not the other
direction; and similarly for `uint`.

Changing a type to or from `uintptr` is considered an incompatible change. Since
its size is not specified, there is no way to know whether the new type's values
are a superset of the old type's.

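The numeric rule lends itself to a table-driven sketch. Everything named below (`class`, `numeric`, `numerics`, `numericCompatible`) is hypothetical and exists only for this illustration; the table lists each predeclared type's size in bits on 32-bit and 64-bit platforms and omits `uintptr` on purpose.

```go
package main

import "fmt"

type class int

const (
	signed class = iota
	unsigned
	float
	complexNum
)

// numeric describes a predeclared numeric type: its class and its size in
// bits on 32-bit and 64-bit architectures.
type numeric struct {
	class          class
	bits32, bits64 int
}

var numerics = map[string]numeric{
	"int8": {signed, 8, 8}, "int16": {signed, 16, 16},
	"int32": {signed, 32, 32}, "int64": {signed, 64, 64},
	"int":   {signed, 32, 64},
	"uint8": {unsigned, 8, 8}, "uint16": {unsigned, 16, 16},
	"uint32": {unsigned, 32, 32}, "uint64": {unsigned, 64, 64},
	"uint":    {unsigned, 32, 64},
	"float32": {float, 32, 32}, "float64": {float, 64, 64},
	"complex64": {complexNum, 64, 64}, "complex128": {complexNum, 128, 128},
	// uintptr is deliberately absent: changes to or from it are incompatible.
}

// numericCompatible reports whether changing the underlying type from
// oldName to newName is compatible under the rule above.
func numericCompatible(oldName, newName string) bool {
	if oldName == newName {
		return true // identical types always correspond
	}
	o, ok1 := numerics[oldName]
	n, ok2 := numerics[newName]
	if !ok1 || !ok2 {
		return false
	}
	return o.class == n.class && n.bits32 >= o.bits32 && n.bits64 >= o.bits64
}

func main() {
	fmt.Println(numericCompatible("int32", "int"))   // true
	fmt.Println(numericCompatible("int", "int64"))   // true
	fmt.Println(numericCompatible("int", "int32"))   // false: smaller on 64-bit
	fmt.Println(numericCompatible("uint8", "int16")) // false: signedness changes
	fmt.Println(numericCompatible("uint", "uintptr")) // false
}
```
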
## Whole-Package Compatibility

Some changes that are compatible for a single type are not compatible when the
package is considered as a whole. For example, if you remove an unexported
method on a defined type, it may no longer implement an interface of the
package. This can break client code:

```
// old
type T int
func (T) m() {}
type I interface { m() }

// new
type T int // no method m anymore

// client
var i pkg.I = pkg.T(0) // fails with new because T lacks m
```

Similarly, adding a method to an interface can cause defined types
in the package to stop implementing it.

The second clause in the definition for package compatibility handles these
cases. To repeat:

> 2. For every exposed type that implements an exposed interface in the old package,
>    its corresponding type should implement the corresponding interface in the new package.

Recall that a type is exposed if it is part of the package's API, even if it is
unexported.

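Clause 2 can be sketched with `types.Implements` from `go/types`. The helpers `check` and `implements` are hypothetical, and the sketch assumes that corresponding types and interfaces keep the same names in both versions, which sidesteps the general correspondence problem.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// check parses and type-checks one source file that imports nothing.
func check(src string) *types.Package {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "p.go", src, 0)
	if err != nil {
		panic(err)
	}
	var conf types.Config
	pkg, err := conf.Check("p", fset, []*ast.File{f}, nil)
	if err != nil {
		panic(err)
	}
	return pkg
}

// implements reports whether the named type implements the named interface,
// both looked up in pkg's package scope.
func implements(pkg *types.Package, typeName, ifaceName string) bool {
	t := pkg.Scope().Lookup(typeName).Type()
	iface, ok := pkg.Scope().Lookup(ifaceName).Type().Underlying().(*types.Interface)
	if !ok {
		return false
	}
	return types.Implements(t, iface)
}

func main() {
	oldPkg := check("package p; type T int; func (T) m() {}; type I interface{ m() }")
	newPkg := check("package p; type T int; type I interface{ m() }")
	// Clause 2 is violated when the old pair holds and the new pair does not.
	broken := implements(oldPkg, "T", "I") && !implements(newPkg, "T", "I")
	fmt.Println(broken) // true
}
```
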
Other incompatibilities that involve more than one type in the package can arise
whenever two types with identical underlying types exist in the old or new
package. Here, a change "splits" an identical underlying type into two, breaking
conversions:

```
// old
type B struct { X int }
type C struct { X int }

// new
type B struct { X int }
type C struct { X, Y int }

// client
var b pkg.B
_ = pkg.C(b) // fails with new: cannot convert B to C
```

Finally, changes that are compatible for the package in which they occur can
break downstream packages. That can happen even if they involve unexported
methods, thanks to embedding.

The definitions given here don't account for these sorts of problems.

## Compatibility, Types and Names

The above definitions state that the only types that can differ compatibly are
defined types and the types that underlie them. Changes to other type literals
are considered incompatible. For instance, it is considered an incompatible
change to add a field to the struct in this variable declaration:

```
var V struct { X int }
```

or this alias definition:

```
type T = struct { X int }
```

We make this choice to keep the definition of compatibility (relatively) simple.
A more precise definition could, for instance, distinguish between

```
func F(struct { X int })
```

where any changes to the struct are incompatible, and

```
func F(struct { X, u int })
```

where adding a field is compatible (since clients cannot write the signature,
and thus cannot assign `F` to a variable of the signature type). The definition
should then also allow other function signature changes that only require
call-site compatibility, like

```
func F(struct { X, u int }, ...int)
```

The result would be a much more complex definition with little benefit, since
the examples in this section rarely arise in practice.