# Checking Go API Compatibility

The `apidiff` tool in this directory determines whether two examples of a
package or module are compatible. The goal is to help the developer make an
informed choice of semantic version after they have changed the code of their
module.

`apidiff` reports two kinds of changes: incompatible ones, which require
incrementing the major part of the semantic version, and compatible ones, which
require a minor version increment. If no API changes are reported but there are
code changes that could affect client code, then the patch version should
be incremented.

`apidiff` may be used to display API differences between any two packages or
modules, not just different versions of the same thing. It does this by ignoring
the package import paths when directly comparing two packages, and
by ignoring module paths when comparing two modules. That is to say, when
comparing two modules, the package import paths **do** matter, but are compared
_relative_ to their respective module root.

## Compatibility Desiderata

Any tool that checks compatibility can offer only an approximation. No tool can
detect behavioral changes; and even if it could, whether a behavioral change is
a breaking change or not depends on many factors, such as whether it closes a
security hole or fixes a bug. Even a change that causes some code to fail to
compile may not be considered a breaking change by the developers or their
users. It may only affect code marked as experimental or unstable, for
example, or the break may only manifest in unlikely cases.

For a tool to be useful, its notion of compatibility must be relaxed enough to
allow reasonable changes, like adding a field to a struct, but strict enough to
catch significant breaking changes. A tool that is too lax will miss important
incompatibilities, and users will stop trusting it; one that is too strict may
generate so much noise that users will ignore it.

To a first approximation, this tool reports a change as incompatible if it could
cause client code to stop compiling. But `apidiff` ignores five ways in which
code may fail to compile after a change. Three of them are mentioned in the
[Go 1 Compatibility Guarantee](https://golang.org/doc/go1compat).

### Unkeyed Struct Literals

Code that uses an unkeyed struct literal would fail to compile if a field was
added to the struct, making any such addition an incompatible change. An example:

```
// old
type Point struct { X, Y int }

// new
type Point struct { X, Y, Z int }

// client
p := pkg.Point{1, 2} // fails in new because there are more fields than expressions
```
Here and below, we provide three snippets: the code in the old version of the
package, the code in the new version, and the code written in a client of the package,
which refers to it by the name `pkg`. The client code compiles against the old
code but not the new.

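The three-snippet pattern can be checked mechanically with the standard `go/types` package. The following is a minimal sketch, not part of `apidiff`: the helper names `check`, `clientCompiles` and `mapImporter` are hypothetical, and each snippet is wrapped in a complete single-file package so that it can be type-checked.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// mapImporter (hypothetical) resolves imports from an in-memory table.
type mapImporter map[string]*types.Package

func (m mapImporter) Import(path string) (*types.Package, error) {
	if p, ok := m[path]; ok {
		return p, nil
	}
	return nil, fmt.Errorf("package %q not found", path)
}

// check parses and type-checks a single-file package.
func check(path, src string, imp types.Importer) (*types.Package, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, path+".go", src, 0)
	if err != nil {
		return nil, err
	}
	conf := types.Config{Importer: imp}
	return conf.Check(path, fset, []*ast.File{f}, nil)
}

// clientCompiles reports whether clientSrc type-checks against pkgSrc,
// which the client imports under the path "pkg".
func clientCompiles(pkgSrc, clientSrc string) bool {
	pkg, err := check("pkg", pkgSrc, nil)
	if err != nil {
		return false
	}
	_, err = check("client", clientSrc, mapImporter{"pkg": pkg})
	return err == nil
}

const (
	oldSrc = `package pkg; type Point struct{ X, Y int }`
	newSrc = `package pkg; type Point struct{ X, Y, Z int }`
	client = `package client; import "pkg"; var p = pkg.Point{1, 2}`
)

func main() {
	fmt.Println("compiles against old:", clientCompiles(oldSrc, client))
	fmt.Println("compiles against new:", clientCompiles(newSrc, client))
}
```

The same helper could be pointed at the other client snippets in this document by swapping the three source strings.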
### Embedding and Shadowing

Adding an exported field to a struct can break code that embeds that struct,
because the newly added field may conflict with an identically named field
at the same struct depth. A selector referring to the latter would become
ambiguous and thus erroneous.

```
// old
type Point struct { X, Y int }

// new
type Point struct { X, Y, Z int }

// client
type z struct { Z int }

var v struct {
    pkg.Point
    z
}

_ = v.Z // fails in new
```
In the new version, the last line fails to compile because there are two embedded `Z`
fields at the same depth, one from `z` and one from `pkg.Point`.

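The ambiguity can be observed with `types.LookupFieldOrMethod`, which returns a nil object together with a non-nil index when a name appears more than once at the same embedding depth. This is a sketch under simplifying assumptions: `Point` is declared in the same package as the embedding type (rather than in a separate `pkg`), and the helper names are hypothetical.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// For brevity, Point lives next to the client types here.
const oldSrc = `package p
type Point struct{ X, Y int }
type z struct{ Z int }
type V struct { Point; z }`

const newSrc = `package p
type Point struct{ X, Y, Z int }
type z struct{ Z int }
type V struct { Point; z }`

// ambiguousZ reports whether the selector Z on a value of type V is ambiguous.
func ambiguousZ(src string) bool {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "p.go", src, 0)
	if err != nil {
		panic(err)
	}
	pkg, err := new(types.Config).Check("p", fset, []*ast.File{f}, nil)
	if err != nil {
		panic(err)
	}
	v := pkg.Scope().Lookup("V").Type()
	obj, index, _ := types.LookupFieldOrMethod(v, true, pkg, "Z")
	// No object but a non-nil index: the name was found more than once
	// at the same embedding level.
	return obj == nil && index != nil
}

func main() {
	fmt.Println("ambiguous in old:", ambiguousZ(oldSrc))
	fmt.Println("ambiguous in new:", ambiguousZ(newSrc))
}
```

Declaring `V` is legal in both versions; only selecting `Z` becomes an error.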
### Using an Identical Type Externally

If it is possible for client code to write a type expression representing the
underlying type of a defined type in a package, then external code can use it in
assignments involving the package type, making any change to that type incompatible.

```
// old
type Point struct { X, Y int }

// new
type Point struct { X, Y, Z int }

// client
var p struct { X, Y int } = pkg.Point{} // fails in new because of Point's extra field
```
Here, the external code could have used the provided name `Point`, but chose not
to. We will have more to say about this and related examples later.

### unsafe.Sizeof and Friends

Since calls to `unsafe.Sizeof`, `unsafe.Offsetof` and `unsafe.Alignof` are
constant expressions, they can be used in an array type literal:

```
// old
type S struct{ X int }

// new
type S struct{ X, y int }

// client
var a [unsafe.Sizeof(pkg.S{})]int = [8]int{} // fails in new because S's size is not 8
```
Use of these operations could make many changes to a type potentially incompatible.

### Type Switches

A package change that merges two different types (with the same underlying type)
into a single new type may break type switches in clients that refer to both
original types:

```
// old
type T1 int
type T2 int

// new
type T1 int
type T2 = T1

// client
switch x.(type) {
case pkg.T1:
case pkg.T2:
} // fails with new because two cases have the same type
```
This sort of incompatibility is sufficiently esoteric to ignore; the tool allows
merging types.

## First Attempt at a Definition

Our first attempt at defining compatibility captures the idea that all the
exported names in the old package must have compatible equivalents in the new
package.

A new package is compatible with an old one if and only if:
- For every exported package-level name in the old package, the same name is
  declared in the new at package level, and
- the names denote the same kind of object (e.g. both are variables), and
- the types of the objects are compatible.

We will work out the details (and make some corrections) below, but it is clear
already that we will need to determine what makes two types compatible. And
whatever the definition of type compatibility, it's certainly true that if two
types are the same, they are compatible. So we will need to decide what makes an
old and new type the same. We will call this sameness relation _correspondence_.

## Type Correspondence

Go already has a definition of when two types are the same:
[type identity](https://golang.org/ref/spec#Type_identity).
But identity isn't adequate for our purpose: it says that two defined
types are identical if they arise from the same definition, but it's unclear
what "same" means when talking about two different packages (or two versions of
a single package).

The obvious change to the definition of identity is to require that old and new
[defined types](https://golang.org/ref/spec#Type_definitions)
have the same name instead. But that doesn't work either, for two
reasons. First, type aliases can equate two defined types with different names:

```
// old
type E int

// new
type t int
type E = t
```
Second, an unexported type can be renamed:

```
// old
type u1 int
var V u1

// new
type u2 int
var V u2
```
Here, even though `u1` and `u2` are unexported, their exported fields and
methods are visible to clients, so they are part of the API. But since the name
`u1` is not visible to clients, it can be changed compatibly. We say that `u1`
and `u2` are _exposed_: a type is exposed if a client package can declare
variables of that type.

We will say that an old defined type _corresponds_ to a new one if they have the
same name, or one can be renamed to the other without otherwise changing the
API. In the first example above, old `E` and new `t` correspond. In the second,
old `u1` and new `u2` correspond.

Two or more old defined types can correspond to a single new type: we consider
"merging" two types into one to be a compatible change. As mentioned above,
code that uses both names in a type switch will fail, but we deliberately ignore
this case. However, a single old type can correspond to only one new type.

So far, we've explained what correspondence means for defined types. To extend
the definition to all types, we parallel the language's definition of type
identity. So, for instance, an old and a new slice type correspond if their
element types correspond.

## Definition of Compatibility

We can now present the definition of compatibility used by `apidiff`.

### Module Compatibility

> A new module is compatible with an old one if:
> 1. Each package present in the old module also appears in the new module,
>    with matching import paths relative to their respective module root, and
> 2. Each package present in both modules fulfills Package Compatibility as
>    defined below.
>
> Otherwise the modules are incompatible.

If a package is converted into a nested module of the original module then
comparing two versions of the module, before and after nested module creation,
will produce an incompatible package removal message. This removal message does
not necessarily mean that client code will need to change. If the package API
retains Package Compatibility after nested module creation, then only the
`go.mod` of the client code will need to change. Take the following example:

```
./
  go.mod
  go.sum
  foo.go
  bar/bar.go
```

Where `go.mod` is:

```
module example.com/foo

go 1.20
```

Where `bar/bar.go` is:

```
package bar

var V int
```

And `foo.go` is:

```
package foo

import "example.com/foo/bar"

var _ = bar.V
```

Creating a nested module with the package `bar` while retaining Package
Compatibility is _code_ compatible, because the import path of the package does
not change:

```
./
  go.mod
  go.sum
  foo.go
  bar/
    bar.go
    go.mod
    go.sum
```

Where `bar/go.mod` is:
```
module example.com/foo/bar

go 1.20
```

And the top-level `go.mod` becomes:
```
module example.com/foo

go 1.20

// New dependency on nested module.
require example.com/foo/bar v1.0.0
```

If, during nested module creation, either Package Compatibility is broken, like
so in `bar/bar.go`:

```
package bar

// Changed from V to T.
var T int
```

Or the nested module uses a name other than the original package's import path,
like so in `bar/go.mod`:

```
// Completely different module name
module example.com/qux

go 1.20
```

Then the move is backwards incompatible for client code.

### Package Compatibility

> A new package is compatible with an old one if:
> 1. Each exported name in the old package's scope also appears in the new
>    package's scope, and the object (constant, variable, function or type) denoted
>    by that name in the old package is compatible with the object denoted by the
>    name in the new package, and
> 2. For every exposed type that implements an exposed interface in the old package,
>    its corresponding type should implement the corresponding interface in the new package.
>
> Otherwise the packages are incompatible.

As an aside, the tool also finds exported names in the new package that are not
exported in the old, and marks them as compatible changes.

Clause 2 is discussed further in "Whole-Package Compatibility."

### Object Compatibility

This section provides compatibility rules for constants, variables, functions
and types.

#### Constants

> A new exported constant is compatible with an old one of the same name if and only if
> 1. Their types correspond, and
> 2. Their values are identical.

It is tempting to allow changing a typed constant to an untyped one. That may
seem harmless, but it can break code like this:

```
// old
const C int64 = 1

// new
const C = 1

// client
var x = C       // old type is int64, new is int
var y int64 = x // fails with new: different types in assignment
```

A change to the value of a constant can break compatibility if the value is used
in an array type:

```
// old
const C = 1

// new
const C = 2

// client
var a [C]int = [1]int{} // fails with new because [2]int and [1]int are different types
```
Changes to constant values are rare, and determining whether they are compatible
or not is better left to the user, so the tool reports them.

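The typed-versus-untyped difference is easy to see in a single program. In this small sketch the names `typedC` and `untypedC` are stand-ins for the old and new declarations of `C`:

```go
package main

import "fmt"

const typedC int64 = 1 // like the old declaration of C
const untypedC = 1     // like the new declaration of C

// inferredTypes returns the types a client's `var x = C` would get
// with the old and the new declaration, respectively.
func inferredTypes() (string, string) {
	x := typedC   // takes the constant's declared type
	y := untypedC // takes the default type of an untyped integer constant
	return fmt.Sprintf("%T", x), fmt.Sprintf("%T", y)
}

func main() {
	oldType, newType := inferredTypes()
	fmt.Println(oldType, newType) // prints "int64 int"
}
```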
#### Variables

> A new exported variable is compatible with an old one of the same name if and
> only if their types correspond.

Correspondence doesn't look past names, so this rule does not prevent adding a
field to `MyStruct` if the package declares `var V MyStruct`. It does, however, mean that

```
var V struct { X int }
```
is incompatible with
```
var V struct { X, Y int }
```
We discuss this at length below in the section "Compatibility, Types and Names."

#### Functions

> A new exported function or variable is compatible with an old function of the
> same name if and only if their types (signatures) correspond.

This rule captures the fact that, although many signature changes are compatible
for all call sites, none are compatible for assignment:

```
var v func(int) = pkg.F
```
Here, `F` must be of type `func(int)` and not, for instance, `func(...int)` or `func(interface{})`.

Note that the rule permits changing a function to a variable. This is a common
practice, usually done for test stubbing, and cannot break any code at compile
time.

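The assignment restriction can be demonstrated at run time with `reflect`, whose `AssignableTo` follows the language's assignability rules. In this sketch, `want` and `assignable` are hypothetical names, and the function literals stand in for possible new declarations of `pkg.F`:

```go
package main

import (
	"fmt"
	"reflect"
)

// want is the type the client wrote in: var v func(int) = pkg.F
var want = reflect.TypeOf((func(int))(nil))

// assignable reports whether f could be assigned to a func(int) variable.
func assignable(f interface{}) bool {
	return reflect.TypeOf(f).AssignableTo(want)
}

func main() {
	fmt.Println(assignable(func(int) {}))         // same signature
	fmt.Println(assignable(func(...int) {}))      // callable as F(1), but not assignable
	fmt.Println(assignable(func(interface{}) {})) // callable as F(1), but not assignable
}
```

Both of the rejected signatures accept the call `F(1)`, which is why call-site compatibility alone is not enough.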
#### Exported Types

> A new exported type is compatible with an old one if and only if their
> names are the same and their types correspond.

This rule seems far too strict. But, ignoring aliases for the moment, it demands only
that the old and new _defined_ types correspond. Consider:
```
// old
type T struct { X int }

// new
type T struct { X, Y int }
```
The addition of `Y` is a compatible change, because this rule does not require
that the struct literals have to correspond, only that the defined types
denoted by `T` must correspond. (Remember that correspondence stops at type
names.)

If one type is an alias that refers to the corresponding defined type, the
situation is the same:

```
// old
type T struct { X int }

// new
type u struct { X, Y int }
type T = u
```
Here, the only requirement is that old `T` corresponds to new `u`, not that the
struct types correspond. (We can't tell from this snippet that the old `T` and
the new `u` do correspond; that depends on whether `u` replaces `T` throughout
the API.)

However, the following change is incompatible, because the names do not
denote corresponding types:

```
// old
type T = struct { X int }

// new
type T = struct { X, Y int }
```

### Type Literal Compatibility

Only five kinds of types can differ compatibly: defined types, structs,
interfaces, channels and numeric types. We only consider the compatibility of
the last four when they are the underlying type of a defined type. See
"Compatibility, Types and Names" for a rationale.

We justify the compatibility rules by enumerating all the ways a type
can be used, and by showing that the allowed changes cannot break any code that
uses values of the type in those ways.

Values of all types can be used in assignments (including argument passing and
function return), but we do not require that old and new types are assignment
compatible. That is because we assume that the old and new packages are never
used together: any given binary will link in either the old package or the new.
So in describing how a type can be used in the sections below, we omit
assignment.

Any type can also be used in a type assertion or conversion. The changes we allow
below may affect the run-time behavior of these operations, but they cannot affect
whether they compile. The only such breaking change would be to change
the type `T` in an assertion `x.(T)` so that it no longer implements the interface
type of `x`; but the rules for interfaces below disallow that.

> A new type is compatible with an old one if and only if they correspond, or
> one of the cases below applies.

#### Defined Types

Other than assignment, the only ways to use a defined type are to access its
methods, or to make use of the properties of its underlying type. Rule 2 below
covers the latter, and rules 3 and 4 cover the former.

> A new defined type is compatible with an old one if and only if all of the
> following hold:
> 1. They correspond.
> 2. Their underlying types are compatible.
> 3. The new exported value method set is a superset of the old.
> 4. The new exported pointer method set is a superset of the old.

An exported method set is a method set with all unexported methods removed.
When comparing methods of a method set, we require identical names and
corresponding signatures.

Removing an exported method is clearly a breaking change. But removing an
unexported one (or changing its signature) can be breaking as well, if it
results in the type no longer implementing an interface. See "Whole-Package
Compatibility," below.

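Rules 3 and 4 can be sketched with `types.NewMethodSet`, which computes the method set of `T` (value methods) or `*T` (value and pointer methods). The helper names below are hypothetical, and `types.Identical` is used in place of signature correspondence, which is only adequate here because the example signatures mention no defined types.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// mustCheck type-checks a single-file package without imports.
func mustCheck(src string) *types.Package {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "p.go", src, 0)
	if err != nil {
		panic(err)
	}
	pkg, err := new(types.Config).Check("p", fset, []*ast.File{f}, nil)
	if err != nil {
		panic(err)
	}
	return pkg
}

// exportedMethods maps exported method names in t's method set to signatures.
func exportedMethods(t types.Type) map[string]types.Type {
	out := make(map[string]types.Type)
	mset := types.NewMethodSet(t)
	for i := 0; i < mset.Len(); i++ {
		if m := mset.At(i).Obj(); m.Exported() {
			out[m.Name()] = m.Type()
		}
	}
	return out
}

// superset reports whether newT has every exported method of oldT.
func superset(oldT, newT types.Type) bool {
	newM := exportedMethods(newT)
	for name, sig := range exportedMethods(oldT) {
		if n, ok := newM[name]; !ok || !types.Identical(sig, n) {
			return false
		}
	}
	return true
}

// methodsCompatible applies rule 3 (value set) and rule 4 (pointer set) to type T.
func methodsCompatible(oldSrc, newSrc string) bool {
	oldT := mustCheck(oldSrc).Scope().Lookup("T").Type()
	newT := mustCheck(newSrc).Scope().Lookup("T").Type()
	return superset(oldT, newT) &&
		superset(types.NewPointer(oldT), types.NewPointer(newT))
}

const (
	oldSrc = "package p\ntype T int\nfunc (T) A() {}\nfunc (*T) B() {}"
	okSrc  = "package p\ntype T int\nfunc (T) A() {}\nfunc (T) B() {}\nfunc (T) C() {}"
	badSrc = "package p\ntype T int\nfunc (*T) A() {}\nfunc (*T) B() {}"
)

func main() {
	fmt.Println(methodsCompatible(oldSrc, okSrc))  // B moved to a value receiver, C added
	fmt.Println(methodsCompatible(oldSrc, badSrc)) // A left the value method set
}
```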
#### Channels

> A new channel type is compatible with an old one if
> 1. The element types correspond, and
> 2. Either the directions are the same, or the new type has no direction.

Other than assignment, the only ways to use values of a channel type are to send
and receive on them, to close them, and to use them as map keys. Changes to a
channel type cannot cause code that closes a channel or uses it as a map key to
fail to compile, so we need not consider those operations.

Rule 1 ensures that any operations on the values sent or received will compile.
Rule 2 captures the fact that any program that compiles with a directed channel
must use either only sends, or only receives, so allowing the other operation
by removing the channel direction cannot break any code.

#### Interfaces

> A new interface is compatible with an old one if and only if:
> 1. The old interface does not have an unexported method, and it corresponds
>    to the new interface (i.e. they have the same method set), or
> 2. The old interface has an unexported method and the new exported method set is a
>    superset of the old.

Other than assignment, the only ways to use an interface are to implement it,
embed it, or call one of its methods. (Interface values can also be used as map
keys, but that cannot cause a compile-time error.)

Certainly, removing an exported method from an interface could break a client
call, so neither rule allows it.

Rule 1 also disallows adding a method to an interface without an existing unexported
method. Such an interface can be implemented in client code. If adding a method
were allowed, a type that implements the old interface could fail to implement
the new one:

```
type I interface { M1() }         // old
type I interface { M1(); M2() }   // new

// client
type t struct{}
func (t) M1() {}
var i pkg.I = t{} // fails with new, because t lacks M2
```

Rule 2 is based on the observation that if an interface has an unexported
method, the only way a client can implement it is to embed it.
Adding a method is compatible in this case, because the embedding struct will
continue to implement the interface. Adding a method also cannot break any call
sites, since no program that compiles could have any such call sites.

#### Structs

> A new struct is compatible with an old one if all of the following hold:
> 1. The new set of top-level exported fields is a superset of the old.
> 2. The new set of _selectable_ exported fields is a superset of the old.
> 3. If the old struct is comparable, so is the new one.

The set of selectable exported fields is the set of exported fields `F`
such that `x.F` is a valid selector expression for a value `x` of the struct
type. `F` may be at the top level of the struct, or it may be a field of an
embedded struct.

Two fields are the same if they have the same name and corresponding types.

Other than assignment, there are only four ways to use a struct: write a struct
literal, select a field, use a value of the struct as a map key, or compare two
values for equality. The first clause ensures that struct literals compile; the
second, that selections compile; and the third, that equality expressions and
map index expressions compile.

#### Numeric Types

> A new numeric type is compatible with an old one if and only if they are
> both unsigned integers, both signed integers, both floats or both complex
> types, and the new one is at least as large as the old on both 32-bit and
> 64-bit architectures.

Other than in assignments, numeric types appear in arithmetic and comparison
expressions. Since all arithmetic operations but shifts (see below) require that
operand types be identical, and by assumption the old and new types underlie
defined types (see "Compatibility, Types and Names," below), there is no way for
client code to write an arithmetic expression that compiles with operands of the
old type but not the new.

Numeric types can also appear in type switches and type assertions. Again, since
the old and new types underlie defined types, type switches and type assertions
that compiled using the old defined type will continue to compile with the new
defined type.

Going from an unsigned to a signed integer type is an incompatible change for
the sole reason that only an unsigned type can appear as the right operand of a
shift. If this rule is relaxed, then changes from an unsigned type to a larger
signed type would be compatible. See [this
issue](https://github.com/golang/go/issues/19113).

Only integer types can be used in bitwise and shift operations, and for indexing
slices and arrays. That is why switching from an integer to a floating-point
type--even one that can represent all values of the integer type--is an
incompatible change.

Conversions from floating-point to complex types or vice versa are not permitted
(the predeclared functions real, imag, and complex must be used instead). To
prevent valid floating-point or complex conversions from becoming invalid,
changing a floating-point type to a complex type or vice versa is considered an
incompatible change.

Although conversions between any two integer types are valid, assigning a
constant value to a variable of integer type that is too small to represent the
constant is not permitted. That is why the only compatible changes are to
a new type whose values are a superset of the old. The requirement that the new
set of values must include the old on both 32-bit and 64-bit machines allows
conversions from `int32` to `int` and from `int` to `int64`, but not the other
direction; and similarly for `uint`.

Changing a type to or from `uintptr` is considered an incompatible change. Since
its size is not specified, there is no way to know whether the new type's values
are a superset of the old type's.

## Whole-Package Compatibility

Some changes that are compatible for a single type are not compatible when the
package is considered as a whole. For example, if you remove an unexported
method on a defined type, it may no longer implement an interface of the
package. This can break client code:

```
// old
type T int
func (T) m() {}
type I interface { m() }

// new
type T int // no method m anymore

// client
var i pkg.I = pkg.T(0) // fails with new because T lacks m
```

Similarly, adding a method to an interface can cause defined types
in the package to stop implementing it.

The second clause in the definition for package compatibility handles these
cases. To repeat:

> 2. For every exposed type that implements an exposed interface in the old package,
> its corresponding type should implement the corresponding interface in the new package.

Recall that a type is exposed if it is part of the package's API, even if it is
unexported.

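The example at the start of this section can be reproduced with `types.Implements`. The sketch below uses hypothetical helper names and checks only the single pair `T` and `I`; a real implementation of the clause would have to consider every exposed type and interface.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// mustCheck type-checks a single-file package without imports.
func mustCheck(src string) *types.Package {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "pkg.go", src, 0)
	if err != nil {
		panic(err)
	}
	pkg, err := new(types.Config).Check("pkg", fset, []*ast.File{f}, nil)
	if err != nil {
		panic(err)
	}
	return pkg
}

// tImplementsI reports whether type T implements interface I in src.
func tImplementsI(src string) bool {
	scope := mustCheck(src).Scope()
	t := scope.Lookup("T").Type()
	i := scope.Lookup("I").Type().Underlying().(*types.Interface)
	return types.Implements(t, i)
}

const (
	oldSrc = "package pkg\ntype T int\nfunc (T) m() {}\ntype I interface{ m() }"
	newSrc = "package pkg\ntype T int\ntype I interface{ m() }"
)

func main() {
	fmt.Println("old: T implements I:", tImplementsI(oldSrc))
	fmt.Println("new: T implements I:", tImplementsI(newSrc))
}
```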
Other incompatibilities that involve more than one type in the package can arise
whenever two types with identical underlying types exist in the old or new
package. Here, a change "splits" an identical underlying type into two, breaking
conversions:

```
// old
type B struct { X int }
type C struct { X int }

// new
type B struct { X int }
type C struct { X, Y int }

// client
var b pkg.B
_ = pkg.C(b) // fails with new: cannot convert B to C
```
Finally, changes that are compatible for the package in which they occur can
break downstream packages. That can happen even if they involve unexported
methods, thanks to embedding.

The definitions given here don't account for these sorts of problems.

## Compatibility, Types and Names

The above definitions state that the only types that can differ compatibly are
defined types and the types that underlie them. Changes to other type literals
are considered incompatible. For instance, it is considered an incompatible
change to add a field to the struct in this variable declaration:

```
var V struct { X int }
```
or this alias definition:
```
type T = struct { X int }
```

We make this choice to keep the definition of compatibility (relatively) simple.
A more precise definition could, for instance, distinguish between

```
func F(struct { X int })
```
where any changes to the struct are incompatible, and

```
func F(struct { X, u int })
```
where adding a field is compatible (since clients cannot write the signature,
and thus cannot assign `F` to a variable of the signature type). The definition
should then also allow other function signature changes that only require
call-site compatibility, like

```
func F(struct { X, u int }, ...int)
```
The result would be a much more complex definition with little benefit, since
the examples in this section rarely arise in practice.