
Text file src/github.com/xi2/xz/testdata/xz-utils/README

Documentation: github.com/xi2/xz/testdata/xz-utils

.xz Test Files
----------------

0. Introduction

    This directory contains a bunch of files to test handling of .xz files
    in .xz decoder implementations. Many of the files have been created
    by hand with a hex editor, thus there is no better "source code" than
    the files themselves. All the test files (*.xz) and this README have
    been put into the public domain.


1. File Types

    Good files (good-*.xz) must decode successfully without requiring
    a lot of CPU time or RAM.

    Unsupported files (unsupported-*.xz) are good files, but their
    headers indicate features not supported by the current file format
    specification.

    Bad files (bad-*.xz) must cause the decoder to give an error. Like
    with the good files, these files must not require a lot of CPU time
    or RAM before they get detected to be broken.
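
    As an illustration only, the three prefixes can be mapped to the
    outcome a test harness should expect. The function name and the
    returned tuple are hypothetical, not part of any decoder API, and
    the sketch treats every unsupported file as decoder-dependent.

```python
from typing import Optional, Tuple


def expected_outcome(filename: str) -> Tuple[str, Optional[bool]]:
    """Return (category, must_decode) for a test file name.

    must_decode is True for good files, False for bad files, and None
    for unsupported files, where this sketch leaves the outcome to the
    decoder.
    """
    if filename.startswith("good-"):
        return ("good", True)
    if filename.startswith("unsupported-"):
        return ("unsupported", None)
    if filename.startswith("bad-"):
        return ("bad", False)
    raise ValueError(f"not a recognized test file name: {filename}")


for name in ("good-0-empty.xz", "unsupported-check.xz", "bad-1-vli-1.xz"):
    print(name, expected_outcome(name))
```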


2. Descriptions of Individual Files

2.1. Good Files

    good-0-empty.xz has one Stream with no Blocks.

    good-0pad-empty.xz has one Stream with no Blocks followed by
    four-byte Stream Padding.

    good-0cat-empty.xz has two zero-Block Streams concatenated without
    Stream Padding.

    good-0catpad-empty.xz has two zero-Block Streams concatenated with
    four-byte Stream Padding between the Streams.
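
    Concatenated Streams can be tried out with Python's standard lzma
    module. It is used here only as a convenient reference decoder; it
    is an assumption of this sketch, not the decoder these files were
    written for. The module documents that decompress() decodes every
    Stream in its input and joins the results.

```python
import lzma

# Two independent .xz Streams, concatenated without Stream Padding.
first = lzma.compress(b"hello, ", format=lzma.FORMAT_XZ)
second = lzma.compress(b"world", format=lzma.FORMAT_XZ)

combined = lzma.decompress(first + second)
print(combined)

# Two Streams that carry no data decode to an empty result.
empty = lzma.compress(b"", format=lzma.FORMAT_XZ)
print(lzma.decompress(empty + empty))
```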

    good-1-check-none.xz has one Stream with one Block with two
    uncompressed LZMA2 chunks and no integrity check.

    good-1-check-crc32.xz has one Stream with one Block with two
    uncompressed LZMA2 chunks and CRC32 check.

    good-1-check-crc64.xz is like good-1-check-crc32.xz but with CRC64.

    good-1-check-sha256.xz is like good-1-check-crc32.xz but with
    SHA256.
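
    The check type is stored in the Stream Flags, in the low four bits
    of the byte right after the six Header Magic Bytes and the first
    (reserved) flags byte. The sketch below reads it back from Streams
    produced by Python's lzma module; check_name() is a hypothetical
    helper, and the ID table lists only the four checks named above.

```python
import lzma

# Check IDs defined by the .xz format for the four checks listed above.
CHECK_NAMES = {0x00: "None", 0x01: "CRC32", 0x04: "CRC64", 0x0A: "SHA-256"}


def check_name(stream: bytes) -> str:
    """Return the name of the Check used by a Stream, read from its header."""
    check_id = stream[7] & 0x0F  # second Stream Flags byte, low nibble
    return CHECK_NAMES.get(check_id, f"unsupported (0x{check_id:02X})")


for check in (lzma.CHECK_NONE, lzma.CHECK_CRC32, lzma.CHECK_CRC64):
    data = lzma.compress(b"example", format=lzma.FORMAT_XZ, check=check)
    print(check_name(data))
```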

    good-2-lzma2.xz has one Stream with two Blocks with one uncompressed
    LZMA2 chunk in each Block.

    good-1-block_header-1.xz has both Compressed Size and Uncompressed
    Size in the Block Header. It also has four extra bytes of Header
    Padding.

    good-1-block_header-2.xz has known Compressed Size.

    good-1-block_header-3.xz has known Uncompressed Size.

    good-1-delta-lzma2.tiff.xz is an image file that compresses
    better with Delta+LZMA2 than with plain LZMA2.

    good-1-x86-lzma2.xz uses the x86 filter (BCJ) and LZMA2. The
    uncompressed file is compress_prepared_bcj_x86 found in the tests
    directory.

    good-1-sparc-lzma2.xz uses the SPARC filter and LZMA2. The
    uncompressed file is compress_prepared_bcj_sparc found in the tests
    directory.

    good-1-lzma2-1.xz has two LZMA2 chunks, of which the second sets
    new properties.

    good-1-lzma2-2.xz has two LZMA2 chunks, of which the second resets
    the state without specifying new properties.

    good-1-lzma2-3.xz has two LZMA2 chunks, of which the first is
    uncompressed and the second is LZMA. The first chunk resets dictionary
    and the second sets new properties.

    good-1-lzma2-4.xz has three LZMA2 chunks: First is LZMA, second is
    uncompressed with dictionary reset, and third is LZMA with new
    properties but without dictionary reset.

    good-1-lzma2-5.xz has an empty LZMA2 stream with only the end of
    payload marker. XZ Utils 5.0.1 and older incorrectly see this file
    as corrupt.

    good-1-3delta-lzma2.xz has three Delta filters and LZMA2.


2.2. Unsupported Files

    unsupported-check.xz uses Check ID 0x02 which isn't supported by
    the current version of the file format. It is implementation-defined
    how this file is handled (a decoder may reject it, or decode it,
    possibly with a warning).

    unsupported-block_header.xz has a non-null byte in Header Padding,
    which may indicate presence of a new unsupported field.

    unsupported-filter_flags-1.xz has unsupported Filter ID 0x7F.

    unsupported-filter_flags-2.xz specifies only Delta filter in the
    List of Filter Flags, but Delta isn't allowed as the last filter in
    the chain. It would be slightly more correct to detect this file as
    corrupt rather than unsupported, but reporting it as unsupported is
    simpler in the case of liblzma.

    unsupported-filter_flags-3.xz specifies two LZMA2 filters in the
    List of Filter Flags. LZMA2 is allowed only as the last filter in the
    chain. It would be slightly more correct to detect this file as
    corrupt rather than unsupported, but reporting it as unsupported is
    simpler in the case of liblzma.
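
    The filter chain rules behind the three filter_flags files can be
    sketched as a small validator. The Filter IDs are those of the
    .xz format (Delta 0x03, the BCJ filters 0x04 to 0x09, LZMA2 0x21);
    the function name and the returned strings are hypothetical, and
    "unsupported" follows the liblzma convention described above.

```python
DELTA, X86, POWERPC, IA64, ARM, ARMTHUMB, SPARC = range(0x03, 0x0A)
LZMA2 = 0x21
KNOWN_FILTERS = {DELTA, X86, POWERPC, IA64, ARM, ARMTHUMB, SPARC, LZMA2}


def classify_chain(filter_ids):
    """Classify a Block's filter chain as "ok" or "unsupported"."""
    # The Block Header can only encode one to four filters.
    if not 1 <= len(filter_ids) <= 4:
        raise ValueError("a chain has between one and four filters")
    if any(f not in KNOWN_FILTERS for f in filter_ids):
        return "unsupported"
    *head, last = filter_ids
    # LZMA2 must be the last filter and may not appear earlier.
    if last != LZMA2 or LZMA2 in head:
        return "unsupported"
    return "ok"


print(classify_chain([0x7F]))                        # like filter_flags-1
print(classify_chain([DELTA]))                       # like filter_flags-2
print(classify_chain([LZMA2, LZMA2]))                # like filter_flags-3
print(classify_chain([DELTA, DELTA, DELTA, LZMA2]))  # three Delta + LZMA2
```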


2.3. Bad Files

    bad-0pad-empty.xz has one Stream with no Blocks followed by
    five-byte Stream Padding. Stream Padding must be a multiple of four
    bytes, thus this file is corrupt.
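
    The padding rule is small enough to state as code. This is a
    minimal sketch with a hypothetical helper name; it assumes the
    caller has already isolated the bytes that follow a Stream Footer.

```python
def stream_padding_ok(padding: bytes) -> bool:
    """Stream Padding must be null bytes only, in a multiple of four."""
    return len(padding) % 4 == 0 and not any(padding)


print(stream_padding_ok(bytes(4)))  # four null bytes
print(stream_padding_ok(bytes(5)))  # five null bytes
```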

    bad-0catpad-empty.xz has two zero-Block Streams concatenated with
    five-byte Stream Padding between the Streams.

    bad-0cat-alone.xz is good-0-empty.xz concatenated with an empty
    LZMA_Alone file.

    bad-0cat-header_magic.xz is good-0cat-empty.xz but with one byte
    wrong in the Header Magic Bytes field of the second Stream. liblzma
    gives LZMA_DATA_ERROR for this. (LZMA_FORMAT_ERROR is used only if
    the first Stream of a file has invalid Header Magic Bytes.)

    bad-0-header_magic.xz is good-0-empty.xz but with one byte wrong
    in the Header Magic Bytes field. liblzma gives LZMA_FORMAT_ERROR for
    this.

    bad-0-footer_magic.xz is good-0-empty.xz but with one byte wrong
    in the Footer Magic Bytes field. liblzma gives LZMA_DATA_ERROR for
    this.

    bad-0-empty-truncated.xz is good-0-empty.xz without the last byte
    of the file.
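
    Similar corruptions can be reproduced on a freshly compressed
    Stream. The sketch below uses Python's lzma module as a stand-in
    decoder (an assumption, not the decoder under test) and does not
    claim the generated bytes match the test files. Python reports
    all three cases as lzma.LZMAError and does not expose the
    individual liblzma error codes named above.

```python
import lzma


def decodes(data: bytes) -> bool:
    """Return True if data decodes as .xz without an error."""
    try:
        lzma.decompress(data, format=lzma.FORMAT_XZ)
        return True
    except lzma.LZMAError:
        return False


good = lzma.compress(b"", format=lzma.FORMAT_XZ)

bad_header = bytearray(good)
bad_header[1] ^= 0xFF          # one wrong byte in the Header Magic Bytes

bad_footer = bytearray(good)
bad_footer[-1] ^= 0xFF         # one wrong byte in the Footer Magic Bytes

truncated = good[:-1]          # last byte of the file missing

for label, data in (("good", good), ("bad header", bytes(bad_header)),
                    ("bad footer", bytes(bad_footer)),
                    ("truncated", truncated)):
    print(label, decodes(data))
```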

    bad-0-nonempty_index.xz has no Blocks but Index claims that there is
    one Block.

    bad-0-backward_size.xz has wrong Backward Size in Stream Footer.

    bad-1-stream_flags-1.xz has different Stream Flags in Stream Header
    and Stream Footer.

    bad-1-stream_flags-2.xz has wrong CRC32 in Stream Header.

    bad-1-stream_flags-3.xz has wrong CRC32 in Stream Footer.
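
    The Stream Header and Stream Footer checks can be sketched with the
    standard struct and zlib modules. The layout follows the .xz
    format: a 12-byte header (magic, Stream Flags, CRC32) and a 12-byte
    footer (CRC32, Backward Size, Stream Flags, magic).
    check_stream_ends() is a hypothetical helper that assumes a single
    Stream with no Stream Padding.

```python
import lzma
import struct
import zlib

HEADER_MAGIC = b"\xfd7zXZ\x00"
FOOTER_MAGIC = b"YZ"


def check_stream_ends(stream: bytes) -> int:
    """Validate Stream Header and Footer; return the Index size in bytes."""
    if len(stream) < 24:
        raise ValueError("too short to hold a Stream Header and Footer")
    header, footer = stream[:12], stream[-12:]
    if header[:6] != HEADER_MAGIC or footer[10:] != FOOTER_MAGIC:
        raise ValueError("wrong magic bytes")
    header_flags = header[6:8]
    (header_crc,) = struct.unpack("<I", header[8:12])
    if zlib.crc32(header_flags) != header_crc:
        raise ValueError("wrong CRC32 in Stream Header")
    footer_crc, backward_size = struct.unpack("<II", footer[:8])
    # The footer CRC32 covers Backward Size and Stream Flags.
    if zlib.crc32(footer[4:10]) != footer_crc:
        raise ValueError("wrong CRC32 in Stream Footer")
    if footer[8:10] != header_flags:
        raise ValueError("Stream Flags differ between Header and Footer")
    # Backward Size stores the Index size as (size / 4) - 1.
    return (backward_size + 1) * 4


sample = lzma.compress(b"example", format=lzma.FORMAT_XZ)
print(check_stream_ends(sample))
```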

    bad-1-vli-1.xz has two-byte variable-length integer in the
    Uncompressed Size field in Block Header while one-byte would be enough
    for that value. It's important that the file gets rejected due to too
    big integer encoding instead of due to Uncompressed Size not matching
    the value stored in the Block Header. That is, the decoder must not
    try to decode the Compressed Data field.

    bad-1-vli-2.xz has ten-byte variable-length integer as Uncompressed
    Size in Block Header. It's important that the file gets rejected due
    to too big integer encoding instead of due to Uncompressed Size not
    matching the value stored in the Block Header. That is, the decoder
    must not try to decode the Compressed Data field.
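
    Both rejections fall out of a strict variable-length integer
    decoder. The sketch below follows the encoding used by the .xz
    format (seven payload bits per byte, least significant group first,
    high bit set on every byte except the last, at most nine bytes).
    decode_vli() is a hypothetical name, not a liblzma function.

```python
from typing import Tuple


def decode_vli(buf: bytes) -> Tuple[int, int]:
    """Decode one variable-length integer; return (value, bytes_used)."""
    value = 0
    for i, byte in enumerate(buf[:9]):
        # A zero byte after the first one means a shorter encoding
        # would have been enough, which is not allowed.
        if i > 0 and byte == 0x00:
            raise ValueError("integer is not in its shortest encoding")
        value |= (byte & 0x7F) << (7 * i)
        if byte & 0x80 == 0:
            return value, i + 1
    raise ValueError("integer is truncated or longer than nine bytes")


print(decode_vli(b"\x05"))      # one byte is enough for the value 5
print(decode_vli(b"\x80\x01"))  # 128 needs two bytes
```

    A two-byte encoding of a small value, such as b"\x85\x00", and any
    ten-byte encoding both raise ValueError before the decoder looks at
    the Compressed Data field.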

    bad-1-block_header-1.xz has Block Header that ends in the middle of
    the Filter Flags field.

    bad-1-block_header-2.xz has Block Header that has Compressed Size and
    Uncompressed Size but no List of Filter Flags field.

    bad-1-block_header-3.xz has wrong CRC32 in Block Header.

    bad-1-block_header-4.xz has too big Compressed Size in Block Header
    (2^63 - 1 bytes while maximum is a little less, because the whole
    Block must stay smaller than 2^63). It's important that the file
    gets rejected due to invalid Compressed Size value; the decoder
    must not try decoding the Compressed Data field.

    bad-1-block_header-5.xz has zero as Compressed Size in Block Header.

    bad-1-block_header-6.xz has corrupt Block Header which may crash
    xz -lvv in XZ Utils 5.0.3 and earlier. It was fixed in the commit
    c0297445064951807803457dca1611b3c47e7f0f.

    bad-2-index-1.xz has wrong Unpadded Sizes in Index.

    bad-2-index-2.xz has wrong Uncompressed Sizes in Index.

    bad-2-index-3.xz has non-null byte in Index Padding.

    bad-2-index-4.xz has wrong CRC32 in Index.

    bad-2-index-5.xz has zero as Unpadded Size. It is important that the
    file gets rejected specifically due to Unpadded Size having an invalid
    value.

    bad-2-compressed_data_padding.xz has non-null byte in the padding of
    the Compressed Data field of the first Block.

    bad-1-check-crc32.xz has wrong Check (CRC32).

    bad-1-check-crc64.xz has wrong Check (CRC64).

    bad-1-check-sha256.xz has wrong Check (SHA-256).

    bad-1-lzma2-1.xz has LZMA2 stream whose first chunk (uncompressed)
    doesn't reset the dictionary.

    bad-1-lzma2-2.xz has two LZMA2 chunks, of which the second chunk
    indicates dictionary reset, but the LZMA compressed data tries to
    repeat data from the previous chunk.

    bad-1-lzma2-3.xz sets new invalid properties (lc=8, lp=0, pb=0) in
    the middle of Block.
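
    The properties lc=8, lp=0, pb=0 are invalid because LZMA2 requires
    lc + lp to be at most 4. A minimal sketch of decoding the
    properties byte, which packs the three values as
    (pb * 5 + lp) * 9 + lc, looks like this; the function name is
    hypothetical.

```python
from typing import Tuple


def decode_lzma2_props(props: int) -> Tuple[int, int, int]:
    """Split an LZMA properties byte into (lc, lp, pb) and validate it."""
    if not 0 <= props <= (4 * 5 + 4) * 9 + 8:
        raise ValueError("properties byte out of range")
    pb, rest = divmod(props, 45)
    lp, lc = divmod(rest, 9)
    if lc + lp > 4:
        raise ValueError("LZMA2 requires lc + lp <= 4")
    return lc, lp, pb


print(decode_lzma2_props(0x5D))  # lc=3, lp=0, pb=2

try:
    decode_lzma2_props(8)        # lc=8, lp=0, pb=0
except ValueError as exc:
    print("rejected:", exc)
```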

    bad-1-lzma2-4.xz has two LZMA2 chunks, of which the first is
    uncompressed and the second is LZMA. The first chunk resets dictionary
    as it should, but the second chunk tries to reset state without
    specifying properties for LZMA.

    bad-1-lzma2-5.xz is like bad-1-lzma2-4.xz but doesn't try to reset
    anything in the header of the second chunk.

    bad-1-lzma2-6.xz has reserved LZMA2 control byte value (0x03).
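
    The chunk types mentioned throughout this section are all selected
    by the LZMA2 control byte that starts each chunk. The sketch below
    decodes that byte according to the LZMA2 chunk layout; the function
    name and the wording of the returned strings are hypothetical.

```python
LZMA_RESETS = (
    "no reset",
    "state reset",
    "state reset, new properties",
    "state reset, new properties, dictionary reset",
)


def describe_control(control: int) -> str:
    """Describe what an LZMA2 control byte announces."""
    if not 0 <= control <= 0xFF:
        raise ValueError("control must fit in one byte")
    if control == 0x00:
        return "end of LZMA2 stream"
    if control == 0x01:
        return "uncompressed chunk, dictionary reset"
    if control == 0x02:
        return "uncompressed chunk, no dictionary reset"
    if control < 0x80:
        raise ValueError(f"reserved control byte 0x{control:02X}")
    # For LZMA chunks, bits 5 and 6 select what gets reset.
    return "LZMA chunk, " + LZMA_RESETS[(control >> 5) & 0x03]


for value in (0x01, 0x80, 0xA0, 0xC0, 0xE0):
    print(f"0x{value:02X}: {describe_control(value)}")
```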

    bad-1-lzma2-7.xz has an end of payload marker (EOPM) at the LZMA
    level.

    bad-1-lzma2-8.xz is like good-1-lzma2-4.xz but doesn't set new
    properties in the third LZMA2 chunk.
