Text file src/github.com/opencontainers/image-spec/layer.md

Documentation: github.com/opencontainers/image-spec

     1# Image Layer Filesystem Changeset
     2
     3This document describes how to serialize a filesystem and filesystem changes like removed files into a blob called a layer.
     4One or more layers are applied on top of each other to create a complete filesystem.
     5This document will use a concrete example to illustrate how to create and consume these filesystem layers.
     6
     7This section defines the `application/vnd.oci.image.layer.v1.tar`, `application/vnd.oci.image.layer.v1.tar+gzip`, `application/vnd.oci.image.layer.v1.tar+zstd`, `application/vnd.oci.image.layer.nondistributable.v1.tar`, `application/vnd.oci.image.layer.nondistributable.v1.tar+gzip`, and `application/vnd.oci.image.layer.nondistributable.v1.tar+zstd` [media types](media-types.md).
     8
     9## `+gzip` Media Types
    10
    11- The media type `application/vnd.oci.image.layer.v1.tar+gzip` represents an `application/vnd.oci.image.layer.v1.tar` payload which has been compressed with [gzip][rfc1952_2].
    12- The media type `application/vnd.oci.image.layer.nondistributable.v1.tar+gzip` represents an `application/vnd.oci.image.layer.nondistributable.v1.tar` payload which has been compressed with [gzip][rfc1952_2].
    13
    14## `+zstd` Media Types
    15
    16- The media type `application/vnd.oci.image.layer.v1.tar+zstd` represents an `application/vnd.oci.image.layer.v1.tar` payload which has been compressed with [zstd][rfc8478].
    17- The media type `application/vnd.oci.image.layer.nondistributable.v1.tar+zstd` represents an `application/vnd.oci.image.layer.nondistributable.v1.tar` payload which has been compressed with [zstd][rfc8478].
    18
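As an illustration of how a consumer might act on these media types, here is a minimal Python sketch that picks a `tarfile` read mode from the media type. The helper name `open_layer` and the lookup table are assumptions made for this example, not part of the specification, and the sketch leaves zstd out because it would need a separate zstd decompressor.

```python
import io
import tarfile

# Media types from the section above, mapped to tarfile read modes.
# "r:" reads an uncompressed tar stream, "r:gz" decompresses gzip first.
_TAR_MODES = {
    "application/vnd.oci.image.layer.v1.tar": "r:",
    "application/vnd.oci.image.layer.v1.tar+gzip": "r:gz",
    "application/vnd.oci.image.layer.nondistributable.v1.tar": "r:",
    "application/vnd.oci.image.layer.nondistributable.v1.tar+gzip": "r:gz",
}


def open_layer(blob: bytes, media_type: str) -> tarfile.TarFile:
    """Open a layer blob as a tar archive based on its media type (hypothetical helper)."""
    if media_type.endswith("+zstd"):
        # Assumption: zstd handling is out of scope for this sketch.
        raise NotImplementedError("zstd decompression is not covered by this sketch")
    try:
        mode = _TAR_MODES[media_type]
    except KeyError:
        raise ValueError(f"not a layer media type: {media_type}") from None
    return tarfile.open(fileobj=io.BytesIO(blob), mode=mode)
```
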
    19## Distributable Format
    20
    21- Layer Changesets for the [media type](media-types.md) `application/vnd.oci.image.layer.v1.tar` MUST be packaged in a [tar archive][tar-archive].
    22- Layer Changesets for the [media type](media-types.md) `application/vnd.oci.image.layer.v1.tar` MUST NOT include duplicate entries for file paths in the resulting [tar archive][tar-archive].
    23
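The duplicate-entry rule can be checked mechanically. The following Python sketch scans an archive and reports repeated paths; the function name `duplicate_paths` is hypothetical, and treating `./etc/x` and `etc/x` as the same path is an assumption of this example rather than something the text above states.

```python
import posixpath
import tarfile


def duplicate_paths(archive: tarfile.TarFile) -> list:
    """Return the paths that occur more than once in the archive."""
    seen = set()
    duplicates = []
    for member in archive:
        # Assumption: normalize so that "./etc/x" and "etc/x" compare equal.
        path = posixpath.normpath(member.name)
        if path in seen:
            duplicates.append(path)
        seen.add(path)
    return duplicates
```

An empty result means the archive has no duplicate entries under that normalization.
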
    24## Change Types
    25
    26Types of changes that can occur in a changeset are:
    27
    28- Additions
    29- Modifications
    30- Removals
    31
    32Additions and Modifications are represented the same way in the changeset tar archive.
    33
    34Removals are represented using "[whiteout](#whiteouts)" file entries (see [Representing Changes](#representing-changes)).
    35
    36### File Types
    37
    38Throughout this document, the words "files" and "entries" include the following, where supported:
    39
    40- regular files
    41- directories
    42- sockets
    43- symbolic links
    44- block devices
    45- character devices
    46- FIFOs
    47
    48### File Attributes
    49
    50Where supported, the file attributes that MUST be included for Additions and Modifications are:
    51
    52- Modification Time (`mtime`)
    53- User ID (`uid`)
    54  - User Name (`uname`) *secondary to `uid`*
    55- Group ID (`gid`)
    56  - Group Name (`gname`) *secondary to `gid`*
    57- Mode (`mode`)
    58- Extended Attributes (`xattrs`)
    59- Symlink reference (`linkname` + symbolic link type)
    60- [Hardlink](#hardlinks) reference (`linkname`)
    61
    62[Sparse files](https://en.wikipedia.org/wiki/Sparse_file) SHOULD NOT be used because they lack consistent support across tar implementations.
    63
    64#### Hardlinks
    65
    66- Hardlinks are a [POSIX concept](https://pubs.opengroup.org/onlinepubs/9699919799/functions/link.html) for having one or more directory entries for the same file on the same device.
    67- Not all filesystems support hardlinks (e.g. [FAT](https://en.wikipedia.org/wiki/File_Allocation_Table)).
    68- Hardlinks are possible with all [file types](#file-types) except `directories`.
    69- Non-directory files are considered "hardlinked" when their link count is greater than 1.
    70- Hardlinked files are on the same device (i.e. comparing Major:Minor pair) and have the same inode.
    71- The other files that share a link count greater than 1 may be outside the directory that the changeset is being produced from, in which case the `linkname` is not recorded in the changeset.
    72- Hardlinks are stored in a tar archive with a type flag of `1`, per the [GNU Basic Tar Format][gnu-tar-standard] and [libarchive tar(5)][libarchive-tar].
    73- While approaches to deriving new or changed hardlinks may vary, a possible approach is:
    74
    75```text
    76SET LinkMap to map[< Major:Minor String >]map[< inode integer >]< path string >
    77SET LinkNames to map[< src path string >]< dest path string >
    78FOR each path in root path
    79  IF path type is directory
    80    CONTINUE
    81  ENDIF
    82  SET filestat to stat(path)
    83  IF filestat num of links == 1
    84    CONTINUE
    85  ENDIF
    86  IF LinkMap[filestat device][filestat inode] is not empty
    87    SET LinkNames[path] to LinkMap[filestat device][filestat inode]
    88  ELSE
    89    SET LinkMap[filestat device][filestat inode] to path
    90  ENDIF
    91END FOR
    92```
    93
    94With this approach, the link map and link names of a directory could be compared against those of another directory to derive additions and changes to hardlinks.
    95
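The pseudocode above could be rendered in Python roughly as follows. This is a sketch under stated assumptions: the name `scan_hardlinks` is hypothetical, a `(device, inode)` tuple stands in for the nested map, `lstat` is used so that symbolic links are not followed, and symbolic links that point to directories are not examined.

```python
import os


def scan_hardlinks(root: str):
    """Return (link_map, link_names) for non-directory files under root."""
    link_map = {}    # (device, inode) -> first path seen with that inode
    link_names = {}  # later path -> first path sharing the same inode
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # walk in a deterministic order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # assumption: do not follow symlinks
            if st.st_nlink == 1:
                continue  # not hardlinked
            rel = os.path.relpath(path, root)
            key = (st.st_dev, st.st_ino)
            if key in link_map:
                link_names[rel] = link_map[key]
            else:
                link_map[key] = rel
    return link_map, link_names
```

Running this on two directories and comparing the returned `link_names` dictionaries is one way to spot added or changed hardlinks.
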
    96#### Platform-specific attributes
    97
    98Implementations on Windows MUST support these additional attributes, encoded in [PAX vendor
    99extensions](https://github.com/libarchive/libarchive/wiki/ManPageTar5#pax-interchange-format) as follows:
   100
   101- [Windows file attributes](https://msdn.microsoft.com/en-us/library/windows/desktop/gg258117(v=vs.85).aspx) (`MSWINDOWS.fileattr`)
   102- [Security descriptor](https://msdn.microsoft.com/en-us/library/cc230366.aspx) (`MSWINDOWS.rawsd`): base64-encoded self-relative binary security descriptor
   103- Mount points (`MSWINDOWS.mountpoint`): if present on a directory symbolic link, then the link should be created as a [directory junction](https://en.wikipedia.org/wiki/NTFS_junction_point)
   104- Creation time (`LIBARCHIVE.creationtime`)
   105
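For illustration, these PAX records can be read from a tar entry with Python's `tarfile` module, which exposes them through `TarInfo.pax_headers`. The helper `windows_attributes` and the keys of the dictionary it returns are assumptions of this sketch. Only the security descriptor is decoded, since it is the only value whose encoding (base64) is stated above; the other values are passed through as raw strings.

```python
import base64
import tarfile


def windows_attributes(member: tarfile.TarInfo) -> dict:
    """Collect the Windows-specific PAX records of a tar entry, if present."""
    headers = member.pax_headers
    attrs = {}
    if "MSWINDOWS.fileattr" in headers:
        attrs["fileattr"] = headers["MSWINDOWS.fileattr"]  # raw string, not parsed
    if "MSWINDOWS.rawsd" in headers:
        # base64-encoded self-relative binary security descriptor
        attrs["security_descriptor"] = base64.b64decode(headers["MSWINDOWS.rawsd"])
    # Presence of the record marks the link as a directory junction.
    attrs["is_mount_point"] = "MSWINDOWS.mountpoint" in headers
    if "LIBARCHIVE.creationtime" in headers:
        attrs["creation_time"] = headers["LIBARCHIVE.creationtime"]  # raw string
    return attrs
```
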
   106## Creating
   107
   108### Initial Root Filesystem
   109
   110The initial root filesystem is the base or parent layer.
   111
   112For this example, an image root filesystem has an initial state as an empty directory.
   113The name of the directory is not relevant to the layer itself, only for the purpose of producing comparisons.
   114
   115Here is an initial empty directory structure for a changeset, with a unique directory name `rootfs-c9d-v1`.
   116
   117```text
   118rootfs-c9d-v1/
   119```
   120
   121### Populate Initial Filesystem
   122
   123Files and directories are then created:
   124
   125```text
   126rootfs-c9d-v1/
   127  etc/
   128    my-app-config
   129  bin/
   130    my-app-binary
   131    my-app-tools
   132```
   133
   134The `rootfs-c9d-v1` directory is then packaged as a plain [tar archive][tar-archive] with paths relative to `rootfs-c9d-v1`.
   135The archive contains entries for the following files:
   136
   137```text
   138./
   139./etc/
   140./etc/my-app-config
   141./bin/
   142./bin/my-app-binary
   143./bin/my-app-tools
   144```
   145
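One way to produce such an archive is sketched below with Python's `tarfile` module. The function name `create_layer` is hypothetical. Passing `arcname="."` is what makes the entry names relative to the root filesystem directory rather than including the directory name itself.

```python
import io
import tarfile


def create_layer(rootfs: str) -> bytes:
    """Archive rootfs as a plain tar with entry names relative to rootfs."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as archive:
        # arcname="." roots every entry at "./" instead of "rootfs-c9d-v1/".
        archive.add(rootfs, arcname=".")
    return buf.getvalue()
```

For example, `create_layer("rootfs-c9d-v1")` would return the bytes of an uncompressed tar for the directory populated above.
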
   146### Populate a Comparison Filesystem
   147
   148Create a new directory and initialize it with a copy or snapshot of the prior root filesystem.
   149Example commands that can preserve [file attributes](#file-attributes) to make this copy are:
   150
   151- [cp(1)](https://linux.die.net/man/1/cp): `cp -a rootfs-c9d-v1/ rootfs-c9d-v1.s1/`
   152- [rsync(1)](https://linux.die.net/man/1/rsync):  `rsync -aHAX rootfs-c9d-v1/ rootfs-c9d-v1.s1/`
   153- [tar(1)](https://linux.die.net/man/1/tar): `mkdir rootfs-c9d-v1.s1 && tar --acls --xattrs -C rootfs-c9d-v1/ -c . | tar -C rootfs-c9d-v1.s1/ --acls --xattrs -x` (including `--selinux` where supported)
   154
   155Any [changes](#change-types) to the snapshot MUST NOT change or affect the directory it was copied from.
   156
   157For example, `rootfs-c9d-v1.s1` is an identical snapshot of `rootfs-c9d-v1`.
   158In this way `rootfs-c9d-v1.s1` is prepared for updates and alterations.
   159
   160**Implementor's Note**: *a copy-on-write or union filesystem can efficiently make directory snapshots*
   161
   162Initial layout of the snapshot:
   163
   164```text
   165rootfs-c9d-v1.s1/
   166  etc/
   167    my-app-config
   168  bin/
   169    my-app-binary
   170    my-app-tools
   171```
   172
   173See [Change Types](#change-types) for more details on changes.
   174
   175For example, add a directory at `/etc/my-app.d` containing a default config file, and remove the existing config file.
   176Also change the `./bin/my-app-tools` binary (in attribute or file content) to handle the config layout change.
   177
   178Following these changes, the `rootfs-c9d-v1.s1` directory looks like this:
   179
   180```text
   181rootfs-c9d-v1.s1/
   182  etc/
   183    my-app.d/
   184      default.cfg
   185  bin/
   186    my-app-binary
   187    my-app-tools
   188```
   189
   190### Determining Changes
   191
   192When two directories are compared, the relative root is the top-level directory.
   193The directories are compared, looking for files that have been [added, modified, or removed](#change-types).
   194
   195For this example, `rootfs-c9d-v1/` and `rootfs-c9d-v1.s1/` are recursively compared, each as a relative root path.
   196
   197The following changeset is found:
   198
   199```text
   200Added:      /etc/my-app.d/
   201Added:      /etc/my-app.d/default.cfg
   202Modified:   /bin/my-app-tools
   203Deleted:    /etc/my-app-config
   204```
   205
   206This reflects the removal of `/etc/my-app-config` and the creation of the directory `/etc/my-app.d/` and the file `/etc/my-app.d/default.cfg`.
   207`/bin/my-app-tools` has also been replaced with an updated version.
   208
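A comparison of this kind might be sketched in Python as follows. The names `compare`, `_scan` and `_differs` are hypothetical, and the definition of "modified" used here (mode, owner, link target, modification time, or content differs) is an assumption of the sketch. Directory modification times are ignored because they change whenever a child is added or removed, and only the topmost deleted path is reported.

```python
import filecmp
import os
import stat


def _scan(root: str) -> dict:
    """Map each path relative to root to its lstat() result."""
    found = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            found[rel] = os.lstat(path)
    return found


def _differs(old_path: str, new_path: str, old, new) -> bool:
    if (old.st_mode, old.st_uid, old.st_gid) != (new.st_mode, new.st_uid, new.st_gid):
        return True
    if stat.S_ISDIR(new.st_mode):
        return False  # assumption: ignore directory mtime
    if stat.S_ISLNK(new.st_mode):
        return os.readlink(old_path) != os.readlink(new_path)
    if stat.S_ISREG(new.st_mode):
        return (old.st_mtime_ns != new.st_mtime_ns
                or not filecmp.cmp(old_path, new_path, shallow=False))
    return False


def compare(old_root: str, new_root: str):
    """Return (added, modified, deleted) lists of relative paths."""
    old, new = _scan(old_root), _scan(new_root)
    added = sorted(p for p in new if p not in old)
    modified = sorted(
        p for p in new
        if p in old and _differs(os.path.join(old_root, p),
                                 os.path.join(new_root, p), old[p], new[p])
    )
    deleted = []
    for path in sorted(p for p in old if p not in new):
        # A deleted directory already covers everything below it.
        if not any(path.startswith(d + "/") for d in deleted):
            deleted.append(path)
    return added, modified, deleted
```
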
   209### Representing Changes
   210
   211A [tar archive][tar-archive] is then created which contains _only_ this changeset:
   212
   213- Added and modified files and directories in their entirety
   214- Deleted files or directories marked with a [whiteout file](#whiteouts)
   215
   216The resulting tar archive for `rootfs-c9d-v1.s1` has the following entries:
   217
   218```text
   219./etc/my-app.d/
   220./etc/my-app.d/default.cfg
   221./bin/my-app-tools
   222./etc/.wh.my-app-config
   223```
   224
   225To signify that the resource `./etc/my-app-config` MUST be removed when the changeset is applied, the basename of the entry is prefixed with `.wh.`.
   226
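The construction of such a changeset archive can be sketched in Python. The helpers `whiteout_path` and `write_changeset` are hypothetical names; the sketch assumes the caller already knows which relative paths were added or modified and which were deleted.

```python
import io
import os
import posixpath
import tarfile


def whiteout_path(deleted: str) -> str:
    """Prefix the basename with ".wh.", e.g. etc/my-app-config -> etc/.wh.my-app-config."""
    parent, base = posixpath.split(deleted.rstrip("/"))
    return posixpath.join(parent, ".wh." + base)


def write_changeset(snapshot_root: str, changed: list, deleted: list) -> bytes:
    """Write added/modified entries in full and deleted paths as whiteouts."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as archive:
        for rel in changed:
            # recursive=False: a directory is added as a single entry; its
            # changed children are expected to be listed separately.
            archive.add(os.path.join(snapshot_root, rel),
                        arcname="./" + rel, recursive=False)
        for rel in deleted:
            # A default TarInfo is an empty regular file, which is what a
            # whiteout entry needs to be.
            archive.addfile(tarfile.TarInfo("./" + whiteout_path(rel)))
    return buf.getvalue()
```
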
   227## Applying Changesets
   228
   229- Layer Changesets of [media type](media-types.md) `application/vnd.oci.image.layer.v1.tar` are _applied_, rather than simply extracted as tar archives.
   230- Applying a layer changeset requires special consideration for the [whiteout](#whiteouts) files.
   231- In the absence of any [whiteout](#whiteouts) files in a layer changeset, the archive is extracted like a regular tar archive.
   232
   233### Changeset over existing files
   234
   235This section specifies applying an entry from a layer changeset if the target path already exists.
   236
   237If the entry and the existing path are both directories, then the existing path's attributes MUST be replaced by those of the entry in the changeset.
   238In all other cases, the implementation MUST do the semantic equivalent of the following:
   239
   240- removing the file path (e.g. [`unlink(2)`](https://linux.die.net/man/2/unlink) on Linux systems)
   241- recreating the file path, based on the contents and attributes of the changeset entry
   242
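The rule above can be sketched as a small Python helper that runs before an entry is written. The name `clear_for_entry` and its return convention are assumptions of this example, as is the choice to remove a directory together with its contents when a non-directory entry replaces it.

```python
import os
import shutil
import stat


def clear_for_entry(target: str, entry_is_dir: bool) -> bool:
    """Prepare target for a changeset entry.

    Returns True when an existing directory is kept and only its attributes
    need replacing; returns False when the path is free to be recreated.
    """
    try:
        st = os.lstat(target)
    except FileNotFoundError:
        return False  # nothing exists at the target path
    if entry_is_dir and stat.S_ISDIR(st.st_mode):
        return True
    if stat.S_ISDIR(st.st_mode):
        shutil.rmtree(target)  # assumption: drop the directory and its contents
    else:
        os.unlink(target)
    return False
```
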
   243## Whiteouts
   244
   245- A whiteout file is an empty file with a special filename that signifies a path should be deleted.
   246- A whiteout filename consists of the prefix `.wh.` plus the basename of the path to be deleted.
   247- As files prefixed with `.wh.` are special whiteout markers, it is not possible to create a filesystem which has a file or directory with a name beginning with `.wh.`.
   248- Once a whiteout is applied, the whiteout itself MUST also be hidden.
   249- Whiteout files MUST only apply to resources in lower/parent layers.
   250- Files that are present in the same layer as a whiteout file can only be hidden by whiteout files in subsequent layers.
   251
   252The following is a base layer with several resources:
   253
   254```text
   255a/
   256a/b/
   257a/b/c/
   258a/b/c/bar
   259```
   260
   261When the next layer is created, the original `a/b` directory is deleted and recreated with `a/b/c/foo`:
   262
   263```text
   264a/
   265a/.wh..wh..opq
   266a/b/
   267a/b/c/
   268a/b/c/foo
   269```
   270
   271When processing the second layer, `a/.wh..wh..opq` is applied first, before creating the new version of `a/b`, regardless of the ordering in which the whiteout file was encountered.
   272For example, the following layer is equivalent to the layer above:
   273
   274```text
   275a/
   276a/b/
   277a/b/c/
   278a/b/c/foo
   279a/.wh..wh..opq
   280```
   281
   282Implementations SHOULD generate layers such that the whiteout files appear before sibling directory entries.
   283
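The order-independence described above can be modeled with two passes over a layer's entry names. The following Python sketch works on plain sets of paths rather than a real filesystem; `apply_layer` and the two constants are hypothetical names chosen for the example.

```python
import posixpath

WHITEOUT_PREFIX = ".wh."
OPAQUE_WHITEOUT = ".wh..wh..opq"


def apply_layer(lower: set, entries: list) -> set:
    """Apply a layer's entry names to the set of paths from lower layers."""
    result = set(lower)
    # First pass: whiteouts only affect lower layers, so handle them all
    # before adding anything from this layer, whatever the archive order.
    for entry in entries:
        parent, base = posixpath.split(entry.rstrip("/"))
        if base == OPAQUE_WHITEOUT:
            prefix = parent + "/" if parent else ""
            result = {p for p in result if not p.startswith(prefix)}
        elif base.startswith(WHITEOUT_PREFIX):
            victim = posixpath.join(parent, base[len(WHITEOUT_PREFIX):])
            result = {p for p in result
                      if p != victim and not p.startswith(victim + "/")}
    # Second pass: add this layer's own entries; whiteout markers stay hidden.
    for entry in entries:
        path = entry.rstrip("/")
        if not posixpath.basename(path).startswith(WHITEOUT_PREFIX):
            result.add(path)
    return result
```

Applying either ordering of the second layer shown above to the base layer gives the same set of paths.
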
   284### Opaque Whiteout
   285
   286- In addition to expressing that a single entry should be removed from a lower layer, layers may remove all of the children using an opaque whiteout entry.
   287- An opaque whiteout entry is a file with the name `.wh..wh..opq` indicating that all siblings are hidden in the lower layer.
   288
   289Let's take the following base layer as an example:
   290
   291```text
   292etc/
   293  my-app-config
   294bin/
   295  my-app-binary
   296  my-app-tools
   297  tools/
   298    my-app-tool-one
   299```
   300
   301If all children of `bin/` are removed, the next layer would have the following:
   302
   303```text
   304bin/
   305  .wh..wh..opq
   306```
   307
   308This is called _opaque whiteout_ format.
   309An _opaque whiteout_ file hides _all_ children of `bin/`, including subdirectories and all descendants.
   310Using _explicit whiteout_ files, this would be equivalent to the following:
   311
   312```text
   313bin/
   314  .wh.my-app-binary
   315  .wh.my-app-tools
   316  .wh.tools
   317```
   318
   319In this case, a unique whiteout file is generated for each entry.
   320If there were more children of `bin/` in the base layer, there would be an entry for each.
   321Note that this opaque file will apply to _all_ children, including sub-directories, other resources and all descendants.
   322
   323Implementations SHOULD generate layers using _explicit whiteout_ files, but MUST accept both.
   324
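Since explicit whiteouts are the preferred output, a producer needs to enumerate the direct children of the emptied directory in the lower layer. A minimal Python sketch, with the hypothetical helper `explicit_whiteouts` and the lower layer given as a list of paths:

```python
def explicit_whiteouts(lower_paths, directory: str) -> list:
    """List explicit whiteout entries equivalent to an opaque whiteout in directory."""
    prefix = directory.rstrip("/") + "/"
    children = set()
    for path in lower_paths:
        if path.startswith(prefix):
            # Only direct children need a whiteout; hiding a directory
            # hides its descendants as well.
            children.add(path[len(prefix):].split("/", 1)[0])
    return [prefix + ".wh." + name for name in sorted(children) if name]
```

For the base layer in the example above, calling it with `"bin/"` yields one whiteout each for `my-app-binary`, `my-app-tools` and `tools`.
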
   325Any given image is likely to be composed of several of these Image Filesystem Changeset tar archives.
   326
   327## Non-Distributable Layers
   328
   329> **NOTE**: Non-distributable layers are deprecated, and not recommended for future use.
   330> Implementations SHOULD NOT produce new non-distributable layers.
   331
   332Due to legal requirements, certain layers may not be regularly distributable.
   333Such "non-distributable" layers are typically downloaded directly from a distributor but never uploaded.
   334
   335Non-distributable layers SHOULD be tagged with an alternative media type of `application/vnd.oci.image.layer.nondistributable.v1.tar`.
   336Implementations SHOULD NOT upload layers tagged with this media type; however, such a media type SHOULD NOT affect whether an implementation downloads the layer.
   337
   338[Descriptors](descriptor.md) referencing non-distributable layers MAY include `urls` for downloading these layers directly; however, the presence of the `urls` field SHOULD NOT be used to determine whether or not a layer is non-distributable.
   339
   340[libarchive-tar]: https://github.com/libarchive/libarchive/wiki/ManPageTar5#POSIX_ustar_Archives
   341[gnu-tar-standard]: https://www.gnu.org/software/tar/manual/html_node/Standard.html
   342[rfc1952_2]: https://tools.ietf.org/html/rfc1952
   343[tar-archive]: https://en.wikipedia.org/wiki/Tar_(computing)
   344[rfc8478]: https://tools.ietf.org/html/rfc8478
