...

Package xz

import "github.com/xi2/xz"

Overview

Package xz implements XZ decompression natively in Go.

Usage

For ease of use, this package is designed to have a similar API to compress/gzip. See the examples for further details.

Implementation

This package is a translation from C to Go of XZ Embedded (http://tukaani.org/xz/embedded.html) with enhancements made so as to implement all mandatory and optional parts of the XZ file format specification v1.0.4. It supports all filters and block check types, supports multiple streams, and performs index verification using SHA-256 as recommended by the specification.

Speed

On the author's Intel Ivy Bridge i5, decompression speed is about half that of the standard XZ Utils (tested with a recent Linux kernel tarball).

Thanks

Thanks are due to Lasse Collin and Igor Pavlov, the authors of XZ Embedded, on whose code package xz is based. It would not exist without their decision to allow others to modify and reuse their code.

Bug reports

For bug reports relating to this package please contact the author through https://github.com/xi2/xz/issues, and not the authors of XZ Embedded.

Constants

DefaultDictMax is the default maximum dictionary size in bytes used by the decoder. This value is sufficient to decompress files created with XZ Utils "xz -9".

const DefaultDictMax = 1 << 26 // 64 MiB

Variables

Package-specific errors.

var (
    ErrUnsupportedCheck = errors.New("xz: integrity check type not supported")
    ErrMemlimit         = errors.New("xz: LZMA2 dictionary size exceeds max")
    ErrFormat           = errors.New("xz: file format not recognized")
    ErrOptions          = errors.New("xz: compression options not supported")
    ErrData             = errors.New("xz: data is corrupt")
    ErrBuf              = errors.New("xz: data is truncated or corrupt")
)
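
A caller will often want to react differently to each of these errors. The following is a minimal sketch of one way to do that with errors.Is. The lower-case error variables are local copies of the documented ones, declared only so that the sketch compiles without the package, and advise is a hypothetical helper, not part of the package.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// Local stand-ins that mirror the package's documented error variables so
// that this sketch compiles on its own. Real code would refer to
// xz.ErrMemlimit, xz.ErrData, and so on.
var (
	errUnsupportedCheck = errors.New("xz: integrity check type not supported")
	errMemlimit         = errors.New("xz: LZMA2 dictionary size exceeds max")
	errFormat           = errors.New("xz: file format not recognized")
	errOptions          = errors.New("xz: compression options not supported")
	errData             = errors.New("xz: data is corrupt")
	errBuf              = errors.New("xz: data is truncated or corrupt")
)

// advise is a hypothetical helper that maps a decompression error to a
// short hint for the caller.
func advise(err error) string {
	switch {
	case err == nil || errors.Is(err, io.EOF):
		return "ok"
	case errors.Is(err, errMemlimit):
		return "retry with a larger dictMax"
	case errors.Is(err, errFormat):
		return "input is not an XZ file"
	case errors.Is(err, errUnsupportedCheck), errors.Is(err, errOptions):
		return "file uses features the decoder does not support"
	case errors.Is(err, errData), errors.Is(err, errBuf):
		return "input is damaged or incomplete"
	default:
		return "unexpected error: " + err.Error()
	}
}

func main() {
	// errors.Is also matches errors that were wrapped with %w.
	wrapped := fmt.Errorf("reading archive: %w", errMemlimit)
	fmt.Println(advise(wrapped))
	fmt.Println(advise(errBuf))
}
```

Using errors.Is rather than == keeps the checks working if the error has been wrapped with additional context on its way up the call stack.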

type CheckID

CheckID is the type of the data integrity check in an XZ stream calculated from the uncompressed data.

type CheckID int
const (
    CheckNone   CheckID = 0x00
    CheckCRC32  CheckID = 0x01
    CheckCRC64  CheckID = 0x04
    CheckSHA256 CheckID = 0x0A
)

func (CheckID) String

func (id CheckID) String() string
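
Because CheckID has a String method it satisfies fmt.Stringer, so values print by name with fmt's %v and %s verbs. The sketch below shows the general pattern on a local copy of the type. The names it returns are assumptions made for illustration; the strings returned by the package's own method are not documented here and may differ.

```go
package main

import "fmt"

// checkID is a local copy of the documented CheckID type.
type checkID int

const (
	checkNone   checkID = 0x00
	checkCRC32  checkID = 0x01
	checkCRC64  checkID = 0x04
	checkSHA256 checkID = 0x0A
)

// String returns an illustrative name for id. These names are assumptions
// made for this sketch; the package's own String method may format values
// differently.
func (id checkID) String() string {
	switch id {
	case checkNone:
		return "None"
	case checkCRC32:
		return "CRC32"
	case checkCRC64:
		return "CRC64"
	case checkSHA256:
		return "SHA256"
	default:
		return fmt.Sprintf("Unknown(%d)", int(id))
	}
}

func main() {
	// fmt calls the String method automatically for %v.
	fmt.Printf("%v %v %v\n", checkCRC64, checkSHA256, checkID(0x02))
}
```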

type Header

An XZ stream contains a stream header which holds information about the stream. That information is exposed through Header, which is embedded in Reader, so its fields are accessible as fields of the Reader. Currently it contains only the stream's data integrity check type.

type Header struct {
    CheckType CheckID // type of the stream's data integrity check
}
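
To show where the check type comes from, here is a rough sketch that reads it out of the 12-byte stream header laid out in the XZ file format specification: six magic bytes, two stream-flag bytes, and a little-endian CRC32 of the flags. It is not the package's own code; parseCheckType and buildHeader are hypothetical helpers written for this illustration.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

// headerMagic is the six-byte magic that starts every XZ stream.
var headerMagic = []byte{0xFD, '7', 'z', 'X', 'Z', 0x00}

// parseCheckType is a hypothetical helper that extracts the check type
// from the first 12 bytes of an XZ stream.
func parseCheckType(hdr []byte) (int, error) {
	if len(hdr) < 12 {
		return 0, errors.New("stream header is truncated")
	}
	if !bytes.Equal(hdr[:6], headerMagic) {
		return 0, errors.New("bad magic bytes")
	}
	flags := hdr[6:8]
	// The CRC32 of the two stream-flag bytes is stored little-endian.
	if crc32.ChecksumIEEE(flags) != binary.LittleEndian.Uint32(hdr[8:12]) {
		return 0, errors.New("stream flags CRC32 mismatch")
	}
	// The first flag byte and the high four bits of the second are reserved.
	if flags[0] != 0 || flags[1]&0xF0 != 0 {
		return 0, errors.New("reserved stream flag bits are set")
	}
	// The low four bits of the second flag byte hold the check type.
	return int(flags[1] & 0x0F), nil
}

// buildHeader assembles a valid stream header for the given check type,
// which is handy for trying out parseCheckType.
func buildHeader(check byte) []byte {
	flags := []byte{0x00, check}
	var crc [4]byte
	binary.LittleEndian.PutUint32(crc[:], crc32.ChecksumIEEE(flags))
	hdr := append([]byte{}, headerMagic...)
	hdr = append(hdr, flags...)
	return append(hdr, crc[:]...)
}

func main() {
	id, err := parseCheckType(buildHeader(0x0A))
	fmt.Printf("check type 0x%02X, err=%v\n", id, err)
}
```

With the package itself none of this is needed: the value is available as the CheckType field of the Reader.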

type Reader

A Reader is an io.Reader that can be used to retrieve uncompressed data from an XZ file.

In general, an XZ file can be a concatenation of other XZ files. Reads from the Reader return the concatenation of the uncompressed data of each.

type Reader struct {
    Header
    // contains filtered or unexported fields
}

func NewReader

func NewReader(r io.Reader, dictMax uint32) (*Reader, error)

NewReader creates a new Reader reading from r. The decompressor will use an LZMA2 dictionary of up to dictMax bytes. Passing a value of zero sets dictMax to DefaultDictMax. If an individual XZ stream requires a dictionary larger than dictMax in order to decompress, Read will return ErrMemlimit.

If NewReader is passed a value of nil for r then a Reader is created such that all read attempts will return io.EOF. This is useful if you just want to allocate memory for a Reader which will later be initialized with Reset.

Due to internal buffering, the Reader may read more data than necessary from r.
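
The dictMax rule described above can be modelled in a few lines. This is only a sketch of the documented behaviour, not the package's implementation: checkDictSize is a hypothetical helper, and defaultDictMax and errMemlimit are local copies of DefaultDictMax and ErrMemlimit so that the code runs on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// Local copies of the documented constant and error, declared so that the
// sketch compiles without the package.
const defaultDictMax = 1 << 26 // 64 MiB

var errMemlimit = errors.New("xz: LZMA2 dictionary size exceeds max")

// checkDictSize is a hypothetical helper that models the documented rule:
// a dictMax of zero selects the default, and a stream that requires a
// larger dictionary than dictMax is rejected.
func checkDictSize(required, dictMax uint32) error {
	if dictMax == 0 {
		dictMax = defaultDictMax
	}
	if required > dictMax {
		return errMemlimit
	}
	return nil
}

func main() {
	const mib = 1 << 20
	fmt.Println(checkDictSize(64*mib, 0))     // fits the default limit
	fmt.Println(checkDictSize(64*mib, 8*mib)) // exceeds an 8 MiB limit
	fmt.Println(checkDictSize(128*mib, 0))    // exceeds the default limit
}
```

A smaller dictMax is a simple way to bound memory use when decompressing untrusted input, at the cost of rejecting files compressed with large dictionaries.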

Example

Code:

// load some XZ data into memory
data, err := ioutil.ReadFile(
    filepath.Join("testdata", "xz-utils", "good-1-check-sha256.xz"))
if err != nil {
    log.Fatal(err)
}
// create an xz.Reader to decompress the data
r, err := xz.NewReader(bytes.NewReader(data), 0)
if err != nil {
    log.Fatal(err)
}
// write the decompressed data to os.Stdout
_, err = io.Copy(os.Stdout, r)
if err != nil {
    log.Fatal(err)
}

Output:

Hello
World!

func (*Reader) Multistream

func (z *Reader) Multistream(ok bool)

Multistream controls whether the reader is operating in multistream mode.

If enabled (the default), the Reader expects the input to be a sequence of XZ streams, possibly interspersed with stream padding, which it reads one after another. The effect is that the concatenation of a sequence of XZ streams or XZ files is treated as equivalent to the compressed result of the concatenation of the sequence. This is standard behaviour for XZ readers.

Calling Multistream(false) disables this behaviour, which can be useful when reading file formats that distinguish individual XZ streams. In this mode, when the Reader reaches the end of the stream, Read returns io.EOF. To start the next stream, call z.Reset(nil) followed by z.Multistream(false). If there is no next stream, z.Reset(nil) will return io.EOF.

Example

Code:

// load some XZ data into memory
data, err := ioutil.ReadFile(
    filepath.Join("testdata", "xz-utils", "good-1-check-sha256.xz"))
if err != nil {
    log.Fatal(err)
}
// create a MultiReader that will read the data twice
mr := io.MultiReader(bytes.NewReader(data), bytes.NewReader(data))
// create an xz.Reader from the MultiReader
r, err := xz.NewReader(mr, 0)
if err != nil {
    log.Fatal(err)
}
// set Multistream mode to false
r.Multistream(false)
// decompress the first stream
_, err = io.Copy(os.Stdout, r)
if err != nil {
    log.Fatal(err)
}
fmt.Println("Read first stream")
// reset the XZ reader so it is ready to read the second stream
err = r.Reset(nil)
if err != nil {
    log.Fatal(err)
}
// set Multistream mode to false again
r.Multistream(false)
// decompress the second stream
_, err = io.Copy(os.Stdout, r)
if err != nil {
    log.Fatal(err)
}
fmt.Println("Read second stream")
// reset the XZ reader so it is ready to read further streams
err = r.Reset(nil)
// confirm that the second stream was the last one
if err == io.EOF {
    fmt.Println("No more streams")
}

Output:

Hello
World!
Read first stream
Hello
World!
Read second stream
No more streams
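
The example above handles exactly two streams by hand. For an unknown number of streams the same steps can be put in a loop, sketched below under some assumptions: eachStream is a hypothetical helper, streamReader is an interface listing the documented *xz.Reader methods it needs, and fakeReader is a stand-in used only so that the sketch runs without the package.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// streamReader lists the documented *xz.Reader methods this sketch uses.
type streamReader interface {
	io.Reader
	Multistream(ok bool)
	Reset(r io.Reader) error
}

// eachStream is a hypothetical helper that calls fn with the decompressed
// contents of each stream in turn and returns the number of streams read.
func eachStream(z streamReader, fn func(data []byte) error) (int, error) {
	count := 0
	for {
		z.Multistream(false) // must be set again after every Reset
		data, err := io.ReadAll(z)
		if err != nil {
			return count, err
		}
		count++
		if err := fn(data); err != nil {
			return count, err
		}
		// Advance to the next stream; io.EOF means there is none.
		if err := z.Reset(nil); err == io.EOF {
			return count, nil
		} else if err != nil {
			return count, err
		}
	}
}

// fakeReader is a stand-in used only so the sketch runs without the
// package: each string plays the role of one stream's decompressed data.
type fakeReader struct {
	rest []string
	cur  *strings.Reader
}

func newFake(first string, rest ...string) *fakeReader {
	return &fakeReader{rest: rest, cur: strings.NewReader(first)}
}

func (f *fakeReader) Read(p []byte) (int, error) { return f.cur.Read(p) }
func (f *fakeReader) Multistream(ok bool)        {}
func (f *fakeReader) Reset(r io.Reader) error {
	if len(f.rest) == 0 {
		return io.EOF
	}
	f.cur = strings.NewReader(f.rest[0])
	f.rest = f.rest[1:]
	return nil
}

func main() {
	n, err := eachStream(newFake("Hello\n", "World!\n"), func(data []byte) error {
		fmt.Printf("stream: %q\n", data)
		return nil
	})
	fmt.Println(n, err)
}
```

With the real package, the value returned by xz.NewReader would be passed in place of the fake.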

func (*Reader) Read

func (z *Reader) Read(p []byte) (n int, err error)

func (*Reader) Reset

func (z *Reader) Reset(r io.Reader) error

Reset, for non-nil values of io.Reader r, discards the Reader z's state and returns it to the state it had when NewReader created it, except that it reads from r. This permits reusing a Reader rather than allocating a new one.

To keep reading from the same r, use z.Reset(nil). This leaves r unchanged and preserves internal buffering. If the Reader was at the end of a stream, it is then ready to read any follow-on streams. If there are no follow-on streams, z.Reset(nil) returns io.EOF. If the Reader was not at the end of a stream, z.Reset(nil) does nothing.
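
As an illustration of reuse with a non-nil r, the sketch below pushes several inputs through a single reader by calling Reset before each one. It is written under assumptions: decompressAll and demo are hypothetical helpers, resetReader is an interface listing the documented *xz.Reader methods involved, and passthrough is a stand-in that returns its input unchanged so that the sketch runs without the package.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// resetReader lists the documented *xz.Reader methods this sketch uses.
type resetReader interface {
	io.Reader
	Reset(r io.Reader) error
}

// decompressAll is a hypothetical helper that decompresses several inputs
// through one reader, writing the results to w in order.
func decompressAll(z resetReader, w io.Writer, inputs ...io.Reader) (int64, error) {
	var total int64
	for _, in := range inputs {
		// Point the existing reader at the next input instead of
		// allocating a new one.
		if err := z.Reset(in); err != nil {
			return total, err
		}
		n, err := io.Copy(w, z)
		total += n
		if err != nil {
			return total, err
		}
	}
	return total, nil
}

// passthrough is a stand-in used only so the sketch runs without the
// package: it "decompresses" by returning its input unchanged.
type passthrough struct{ src io.Reader }

func (p *passthrough) Read(b []byte) (int, error) {
	if p.src == nil {
		return 0, io.EOF // like a Reader created with a nil source
	}
	return p.src.Read(b)
}

func (p *passthrough) Reset(r io.Reader) error { p.src = r; return nil }

// demo runs decompressAll over in-memory inputs with the stand-in reader.
func demo(inputs ...string) (string, int64, error) {
	readers := make([]io.Reader, len(inputs))
	for i, s := range inputs {
		readers[i] = strings.NewReader(s)
	}
	var out strings.Builder
	n, err := decompressAll(&passthrough{}, &out, readers...)
	return out.String(), n, err
}

func main() {
	out, n, err := demo("Hello\n", "World!\n")
	fmt.Printf("%d bytes, err=%v\n%s", n, err, out)
}
```

With the real package, the reader could be allocated once with xz.NewReader(nil, 0), as described under NewReader, and then passed to a helper of this shape.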