var (
	Int32EncoderTraits             int32EncoderTraits
	Int32DecoderTraits             int32DecoderTraits
	Int64EncoderTraits             int64EncoderTraits
	Int64DecoderTraits             int64DecoderTraits
	Int96EncoderTraits             int96EncoderTraits
	Int96DecoderTraits             int96DecoderTraits
	Float32EncoderTraits           float32EncoderTraits
	Float32DecoderTraits           float32DecoderTraits
	Float64EncoderTraits           float64EncoderTraits
	Float64DecoderTraits           float64DecoderTraits
	BooleanEncoderTraits           boolEncoderTraits
	BooleanDecoderTraits           boolDecoderTraits
	ByteArrayEncoderTraits         byteArrayEncoderTraits
	ByteArrayDecoderTraits         byteArrayDecoderTraits
	FixedLenByteArrayEncoderTraits fixedLenByteArrayEncoderTraits
	FixedLenByteArrayDecoderTraits fixedLenByteArrayDecoderTraits
)
func LevelEncodingMaxBufferSize(encoding parquet.Encoding, maxLvl int16, nbuffered int) int
LevelEncodingMaxBufferSize estimates the max number of bytes needed to encode data with the specified encoding given the max level and number of buffered values provided.
func NewDictConverter(dict TypedDecoder) utils.DictionaryConverter
NewDictConverter creates a dict converter of the appropriate type, using the passed in decoder as the decoder to decode the dictionary index.
BinaryMemoTable is an extension of the MemoTable interface adding extra methods for handling byte arrays/strings/fixed length byte arrays.
type BinaryMemoTable interface {
	MemoTable
	// ValuesSize returns the total number of bytes needed to copy all of the values
	// from this table.
	ValuesSize() int
	// CopyOffsets populates out with the start and end offsets of each value in the
	// table data. Out should be sized to Size()+1 to accommodate all of the offsets.
	CopyOffsets(out []int32)
	// CopyOffsetsSubset is like CopyOffsets but only gets a subset of the offsets
	// starting at the specified index.
	CopyOffsetsSubset(start int, out []int32)
	// CopyFixedWidthValues exists to cope with the fact that the table doesn't track
	// the fixed width when inserting the null value into the databuffer populating
	// a zero length byte slice for the null value (if found).
	CopyFixedWidthValues(start int, width int, out []byte)
	// VisitValues calls visitFn on each value in the table starting with the index specified
	VisitValues(start int, visitFn func([]byte))
	// Retain increases the reference count of the separately stored binary data that is
	// kept alongside the table which contains all of the values in the table. This is
	// safe to call simultaneously across multiple goroutines.
	Retain()
	// Release decreases the reference count by 1 of the separately stored binary data
	// kept alongside the table containing the values. When the reference count goes to
	// 0, the memory is freed. This is safe to call across multiple goroutines simultaneously.
	Release()
}
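The Size()+1 sizing asked of CopyOffsets follows from storing n values back to back, which yields n+1 boundaries. The following is a minimal, standard-library-only sketch of that layout. The binaryStore type and its methods are hypothetical names used for illustration and are not the package's implementation.

```go
package main

import "fmt"

// binaryStore is a hypothetical stand-in for the table's value storage:
// all values live back to back in data, and offsets marks their boundaries.
type binaryStore struct {
	data    []byte
	offsets []int32 // always size()+1 entries; offsets[0] == 0
}

func newBinaryStore() *binaryStore {
	return &binaryStore{offsets: []int32{0}}
}

// add appends a value and records the offset at which it ends.
func (s *binaryStore) add(v []byte) {
	s.data = append(s.data, v...)
	s.offsets = append(s.offsets, int32(len(s.data)))
}

// size returns the number of stored values.
func (s *binaryStore) size() int { return len(s.offsets) - 1 }

// value returns the i-th value as a sub-slice of data.
func (s *binaryStore) value(i int) []byte {
	return s.data[s.offsets[i]:s.offsets[i+1]]
}

func main() {
	s := newBinaryStore()
	for _, v := range []string{"foo", "", "barbaz"} {
		s.add([]byte(v))
	}
	// The caller sizes the destination to size()+1, as CopyOffsets expects.
	out := make([]int32, s.size()+1)
	copy(out, s.offsets)
	fmt.Println(out)                // [0 3 3 9]
	fmt.Println(string(s.value(2))) // barbaz
}
```

An empty value shows up as two equal neighbouring offsets, which is also how a zero length slot for a null value would look in this layout.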
func NewBinaryDictionary(mem memory.Allocator) BinaryMemoTable
NewBinaryDictionary returns a memotable interface for use with strings, byte slices, parquet.ByteArray and parquet.FixedLengthByteArray only.
func NewBinaryMemoTable(mem memory.Allocator) BinaryMemoTable
BooleanDecoder is the interface for all encoding types that implement decoding bool values.
type BooleanDecoder interface {
	TypedDecoder
	Decode([]bool) (int, error)
	DecodeSpaced([]bool, int, []byte, int64) (int, error)
}
BooleanEncoder is the interface for all encoding types that implement encoding bool values.
type BooleanEncoder interface {
	TypedEncoder
	Put([]bool)
	PutSpaced([]bool, []byte, int64)
}
Buffer is an interface used as a general interface for handling buffers regardless of the underlying implementation.
type Buffer interface {
	Len() int
	Buf() []byte
	Bytes() []byte
	Resize(int)
	Release()
}
BufferWriter is a utility type for building and writing to a memory.Buffer with a given allocator. It fulfills the io.Writer, io.WriterAt and io.Seeker interfaces while providing the ability to pre-allocate memory.
type BufferWriter struct {
// contains filtered or unexported fields
}
func NewBufferWriter(initial int, mem memory.Allocator) *BufferWriter
NewBufferWriter constructs a buffer with initially reserved/allocated memory.
func NewBufferWriterFromBuffer(b *memory.Buffer, mem memory.Allocator) *BufferWriter
NewBufferWriterFromBuffer wraps the provided buffer to allow it to fulfill these interfaces.
func (b *BufferWriter) Bytes() []byte
Bytes returns the current byte slice, whose length is Len().
func (b *BufferWriter) Cap() int
Cap returns the current capacity of the underlying buffer
func (b *BufferWriter) Finish() *memory.Buffer
Finish returns the current buffer and resets this writer so that it can be reused. The caller becomes responsible for releasing the returned buffer's memory.
func (b *BufferWriter) Len() int
Len provides the current Length of the byte slice
func (b *BufferWriter) Release()
Release releases the underlying buffer without allocating anything else. To reuse this writer, Reset() or Finish() should be called.
func (b *BufferWriter) Reserve(nbytes int)
Reserve ensures that there is at least enough capacity to write nbytes without another allocation. It may allocate more than that in order to reduce the number of allocations.
func (b *BufferWriter) Reset(initial int)
Reset will release any current memory and initialize it with the new allocated bytes.
func (b *BufferWriter) Seek(offset int64, whence int) (int64, error)
Seek fulfills the io.Seeker interface, returning its new position. The whence argument must be io.SeekStart, io.SeekCurrent or io.SeekEnd; any other value is ignored.
func (b *BufferWriter) SetOffset(offset int)
func (b *BufferWriter) Tell() int64
func (b *BufferWriter) Truncate()
func (b *BufferWriter) UnsafeWrite(buf []byte) (int, error)
UnsafeWrite does not check the capacity / length before writing.
func (b *BufferWriter) UnsafeWriteCopy(ncopies int, pattern []byte) (int, error)
func (b *BufferWriter) Write(buf []byte) (int, error)
func (b *BufferWriter) WriteAt(p []byte, offset int64) (n int, err error)
WriteAt writes the bytes from p into this buffer starting at offset.
Does not affect the internal position of the writer.
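The difference between Write and WriteAt matters when a header has to be patched after its payload is already written. The sketch below illustrates that contract with a hypothetical posWriter type built only on the standard library. It is not BufferWriter itself, and its growth strategy is an assumption made for the example.

```go
package main

import "fmt"

// posWriter is a hypothetical, simplified writer that tracks a write position.
type posWriter struct {
	buf []byte
	pos int
}

// Write copies p at the current position and advances the position.
func (w *posWriter) Write(p []byte) (int, error) {
	w.grow(w.pos + len(p))
	n := copy(w.buf[w.pos:], p)
	w.pos += n
	return n, nil
}

// WriteAt copies p at an absolute offset and leaves the position untouched.
func (w *posWriter) WriteAt(p []byte, off int64) (int, error) {
	if off < 0 {
		return 0, fmt.Errorf("negative offset %d", off)
	}
	w.grow(int(off) + len(p))
	return copy(w.buf[off:], p), nil
}

// grow extends buf with zero bytes so that it holds at least n bytes.
func (w *posWriter) grow(n int) {
	if n > len(w.buf) {
		w.buf = append(w.buf, make([]byte, n-len(w.buf))...)
	}
}

func main() {
	w := &posWriter{}
	w.Write([]byte("????body"))  // placeholder header followed by the payload
	w.WriteAt([]byte("HEAD"), 0) // patch the header in place
	w.Write([]byte("!"))         // continues right after "body"
	fmt.Println(string(w.buf), w.pos) // HEADbody! 9
}
```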
ByteArrayDecoder is the interface for all encoding types that implement decoding parquet.ByteArray values.
type ByteArrayDecoder interface {
	TypedDecoder
	Decode([]parquet.ByteArray) (int, error)
	DecodeSpaced([]parquet.ByteArray, int, []byte, int64) (int, error)
}
ByteArrayDictConverter is a helper for dictionary handling which is used for converting run length encoded indexes into the actual values that are stored in the dictionary index page.
type ByteArrayDictConverter struct {
// contains filtered or unexported fields
}
func (dc *ByteArrayDictConverter) Copy(out interface{}, vals []utils.IndexType) error
Copy populates the slice provided with the values in the dictionary at the indexes in the vals slice.
func (dc *ByteArrayDictConverter) Fill(out interface{}, val utils.IndexType) error
Fill populates the slice passed in entirely with the value at dictionary index indicated by val
func (dc *ByteArrayDictConverter) FillZero(out interface{})
FillZero populates the entire slice of out with the zero value for parquet.ByteArray
func (dc *ByteArrayDictConverter) IsValid(idxes ...utils.IndexType) bool
IsValid verifies that the set of indexes passed in are all valid indexes in the dictionary and if necessary decodes dictionary indexes up to the index requested.
ByteArrayEncoder is the interface for all encoding types that implement encoding parquet.ByteArray values.
type ByteArrayEncoder interface {
	TypedEncoder
	Put([]parquet.ByteArray)
	PutSpaced([]parquet.ByteArray, []byte, int64)
}
DecoderTraits provides an interface for more easily interacting with types to generate decoders for specific types.
type DecoderTraits interface {
	Decoder(e parquet.Encoding, descr *schema.Column, useDict bool, mem memory.Allocator) TypedDecoder
	BytesRequired(int) int
}
DeltaBitPackInt32Decoder decodes Int32 values which are packed using the Delta BitPacking algorithm.
type DeltaBitPackInt32Decoder struct {
// contains filtered or unexported fields
}
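As background for the delta bit packing types, the core idea can be sketched without any bit-level I/O: keep the first value, take successive differences, subtract the smallest difference so every stored number is non-negative, and pack those numbers with the smallest sufficient bit width. The deltaBlock type and both functions below are hypothetical. The sketch handles a single block of a non-empty input and leaves out the header, miniblocks and the actual bit packing.

```go
package main

import (
	"fmt"
	"math/bits"
)

// deltaBlock is a hypothetical, simplified view of one encoded block.
type deltaBlock struct {
	first    int32    // first value, stored as is
	minDelta int32    // smallest delta in the block
	adjusted []uint32 // each delta minus minDelta, always non-negative
	bitWidth int      // bits needed for the largest adjusted delta
}

// encodeDeltas builds a deltaBlock from vals, which must not be empty.
func encodeDeltas(vals []int32) deltaBlock {
	b := deltaBlock{first: vals[0]}
	if len(vals) < 2 {
		return b
	}
	deltas := make([]int32, len(vals)-1)
	for i := range deltas {
		deltas[i] = vals[i+1] - vals[i]
		if i == 0 || deltas[i] < b.minDelta {
			b.minDelta = deltas[i]
		}
	}
	b.adjusted = make([]uint32, len(deltas))
	for i, d := range deltas {
		b.adjusted[i] = uint32(d - b.minDelta)
		if w := bits.Len32(b.adjusted[i]); w > b.bitWidth {
			b.bitWidth = w
		}
	}
	return b
}

// decodeDeltas reverses encodeDeltas by re-adding minDelta and accumulating.
func decodeDeltas(b deltaBlock) []int32 {
	out := []int32{b.first}
	for _, adj := range b.adjusted {
		out = append(out, out[len(out)-1]+int32(adj)+b.minDelta)
	}
	return out
}

func main() {
	b := encodeDeltas([]int32{100, 103, 105, 110, 112})
	fmt.Println(b.first, b.minDelta, b.adjusted, b.bitWidth) // 100 2 [1 0 3 0] 2
	fmt.Println(decodeDeltas(b))                             // [100 103 105 110 112]
}
```

In this example four 32-bit values shrink to four 2-bit numbers plus the first value and the minimum delta, which is where the savings on slowly changing columns come from.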
func (d DeltaBitPackInt32Decoder) Allocator() memory.Allocator
func (d *DeltaBitPackInt32Decoder) Decode(out []int32) (int, error)
Decode retrieves min(remaining values, len(out)) values from the data and returns the number of values actually decoded and any errors encountered.
func (d *DeltaBitPackInt32Decoder) DecodeSpaced(out []int32, nullCount int, validBits []byte, validBitsOffset int64) (int, error)
DecodeSpaced is like Decode, but the result is spaced out appropriately based on the passed in bitmap
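To make "spaced out" concrete: the decoded values are dense, and the bitmap says which output slots they belong in. The sketch below shows that placement under the assumption that the bitmap uses least-significant-bit-first ordering within each byte, as Arrow validity bitmaps do. The bitIsSet and spaceOut helpers are hypothetical and return a fresh slice, whereas the real decoders write into the caller's slice.

```go
package main

import "fmt"

// bitIsSet reports whether bit i of bitmap is set, counting from the
// least significant bit of each byte.
func bitIsSet(bitmap []byte, i int64) bool {
	return bitmap[i/8]&(1<<(uint(i)%8)) != 0
}

// spaceOut places dense values into an n-slot result, filling only the
// slots whose validity bit is set. Null slots keep the zero value.
// dense must hold at least as many values as there are set bits.
func spaceOut(dense []int32, n int, validBits []byte, validBitsOffset int64) []int32 {
	out := make([]int32, n)
	next := 0
	for i := 0; i < n; i++ {
		if bitIsSet(validBits, validBitsOffset+int64(i)) {
			out[i] = dense[next]
			next++
		}
	}
	return out
}

func main() {
	// Bits 0, 2 and 4 are set: slots 1 and 3 are null.
	validBits := []byte{0b00010101}
	fmt.Println(spaceOut([]int32{10, 20, 30}, 5, validBits, 0)) // [10 0 20 0 30]
}
```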
func (d DeltaBitPackInt32Decoder) SetData(nvalues int, data []byte) error
SetData sets the bytes and the expected number of values to decode into the decoder, updating the decoder and allowing it to be reused.
func (DeltaBitPackInt32Decoder) Type() parquet.Type
Type returns the physical parquet type that this decoder decodes, in this case Int32
DeltaBitPackInt32Encoder is an encoder for the delta bitpacking encoding for int32 data.
type DeltaBitPackInt32Encoder struct {
// contains filtered or unexported fields
}
func (enc DeltaBitPackInt32Encoder) EstimatedDataEncodedSize() int64
EstimatedDataEncodedSize returns the current amount of data actually flushed out and written
func (enc DeltaBitPackInt32Encoder) FlushValues() (Buffer, error)
FlushValues flushes any remaining data and returns the finished encoded buffer or returns nil and any error encountered during flushing.
func (enc DeltaBitPackInt32Encoder) Put(in []int32)
Put writes the values from the provided slice of int32 to the encoder
func (enc DeltaBitPackInt32Encoder) PutSpaced(in []int32, validBits []byte, validBitsOffset int64)
PutSpaced takes a slice of int32 along with a bitmap that describes the nulls and an offset into the bitmap in order to write spaced data to the encoder.
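Writing spaced data amounts to dropping the null slots before the values reach the encoder. The sketch below shows that compaction step, including the effect of a non-zero bitmap offset. It assumes least-significant-bit-first bit ordering, and the bitIsSet and compact helpers are hypothetical names, not functions of this package.

```go
package main

import "fmt"

// bitIsSet reports whether bit i of bitmap is set, counting from the
// least significant bit of each byte.
func bitIsSet(bitmap []byte, i int64) bool {
	return bitmap[i/8]&(1<<(uint(i)%8)) != 0
}

// compact returns only the values of in whose validity bit is set.
// Bit validBitsOffset+i of validBits describes in[i].
func compact(in []int32, validBits []byte, validBitsOffset int64) []int32 {
	out := make([]int32, 0, len(in))
	for i, v := range in {
		if bitIsSet(validBits, validBitsOffset+int64(i)) {
			out = append(out, v)
		}
	}
	return out
}

func main() {
	// With an offset of 2, bits 2, 4 and 6 mark slots 0, 2 and 4 as valid.
	validBits := []byte{0b01010100}
	in := []int32{7, -1, 8, -1, 9} // -1 stands in for whatever sits in a null slot
	fmt.Println(compact(in, validBits, 2)) // [7 8 9]
}
```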
func (DeltaBitPackInt32Encoder) Type() parquet.Type
Type returns the underlying physical type this encoder works with, in this case Int32
DeltaBitPackInt64Decoder decodes a delta bit packed int64 column of data.
type DeltaBitPackInt64Decoder struct {
// contains filtered or unexported fields
}
func (d DeltaBitPackInt64Decoder) Allocator() memory.Allocator
func (d *DeltaBitPackInt64Decoder) Decode(out []int64) (int, error)
Decode retrieves min(remaining values, len(out)) values from the data and returns the number of values actually decoded and any errors encountered.
func (d DeltaBitPackInt64Decoder) DecodeSpaced(out []int64, nullCount int, validBits []byte, validBitsOffset int64) (int, error)
DecodeSpaced is like Decode, but the result is spaced out appropriately based on the passed in bitmap
func (d DeltaBitPackInt64Decoder) SetData(nvalues int, data []byte) error
SetData sets the bytes and the expected number of values to decode into the decoder, updating the decoder and allowing it to be reused.
func (DeltaBitPackInt64Decoder) Type() parquet.Type
Type returns the physical parquet type that this decoder decodes, in this case Int64
DeltaBitPackInt64Encoder is an encoder for the delta bitpacking encoding for int64 data.
type DeltaBitPackInt64Encoder struct {
// contains filtered or unexported fields
}
func (enc DeltaBitPackInt64Encoder) EstimatedDataEncodedSize() int64
EstimatedDataEncodedSize returns the current amount of data actually flushed out and written
func (enc DeltaBitPackInt64Encoder) FlushValues() (Buffer, error)
FlushValues flushes any remaining data and returns the finished encoded buffer or returns nil and any error encountered during flushing.
func (enc DeltaBitPackInt64Encoder) Put(in []int64)
Put writes the values from the provided slice of int64 to the encoder
func (enc DeltaBitPackInt64Encoder) PutSpaced(in []int64, validBits []byte, validBitsOffset int64)
PutSpaced takes a slice of int64 along with a bitmap that describes the nulls and an offset into the bitmap in order to write spaced data to the encoder.
func (DeltaBitPackInt64Encoder) Type() parquet.Type
Type returns the underlying physical type this encoder works with, in this case Int64
DeltaByteArrayDecoder is a decoder for a column of data encoded using incremental or prefix encoding.
type DeltaByteArrayDecoder struct {
	*DeltaLengthByteArrayDecoder
	// contains filtered or unexported fields
}
func (d *DeltaByteArrayDecoder) Allocator() memory.Allocator
func (d *DeltaByteArrayDecoder) Decode(out []parquet.ByteArray) (int, error)
Decode decodes byte arrays into the slice provided and returns the number of values actually decoded
func (d *DeltaByteArrayDecoder) DecodeSpaced(out []parquet.ByteArray, nullCount int, validBits []byte, validBitsOffset int64) (int, error)
DecodeSpaced is like Decode, but the result is spaced out based on the bitmap provided.
func (d DeltaByteArrayDecoder) Encoding() parquet.Encoding
Encoding returns the encoding type used by this decoder to decode the bytes.
func (d *DeltaByteArrayDecoder) SetData(nvalues int, data []byte) error
SetData expects the passed in data to be the prefix lengths, followed by the blocks of suffix data in order to initialize the decoder.
func (DeltaByteArrayDecoder) Type() parquet.Type
Type returns the underlying physical type this decoder operates on, in this case ByteArrays only
func (d DeltaByteArrayDecoder) ValuesLeft() int
ValuesLeft returns the number of remaining values that can be decoded
DeltaByteArrayEncoder is an encoder for writing byte arrays which are delta encoded. This is also known as incremental encoding or front compression. For each element in a sequence of strings, we store the length of the prefix it shares with the previous entry plus the remaining suffix. See https://en.wikipedia.org/wiki/Incremental_encoding for a longer description.
This is stored as a sequence of delta-encoded prefix lengths followed by the suffixes encoded as delta length byte arrays.
type DeltaByteArrayEncoder struct {
// contains filtered or unexported fields
}
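Incremental encoding is easiest to follow on a small sorted list. The sketch below splits each string into the length of the prefix shared with its predecessor and the remaining suffix, then rebuilds the originals. It works on Go strings with hypothetical helper names, and it leaves both streams unencoded, whereas the encoder described above delta encodes the prefix lengths and writes the suffixes as delta length byte arrays.

```go
package main

import "fmt"

// commonPrefixLen returns the number of leading bytes a and b share.
func commonPrefixLen(a, b string) int {
	n := 0
	for n < len(a) && n < len(b) && a[n] == b[n] {
		n++
	}
	return n
}

// frontCompress returns, for each value, the length of the prefix shared
// with the previous value and the suffix that follows that prefix.
func frontCompress(vals []string) (prefixLens []int, suffixes []string) {
	prev := ""
	for _, v := range vals {
		n := commonPrefixLen(prev, v)
		prefixLens = append(prefixLens, n)
		suffixes = append(suffixes, v[n:])
		prev = v
	}
	return prefixLens, suffixes
}

// frontExpand rebuilds the original values from prefix lengths and suffixes.
func frontExpand(prefixLens []int, suffixes []string) []string {
	out := make([]string, 0, len(suffixes))
	prev := ""
	for i, s := range suffixes {
		v := prev[:prefixLens[i]] + s
		out = append(out, v)
		prev = v
	}
	return out
}

func main() {
	vals := []string{"apple", "applesauce", "apply", "banana"}
	prefixLens, suffixes := frontCompress(vals)
	fmt.Println(prefixLens)                       // [0 5 4 0]
	fmt.Println(suffixes)                         // [apple sauce y banana]
	fmt.Println(frontExpand(prefixLens, suffixes)) // [apple applesauce apply banana]
}
```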
func (e *DeltaByteArrayEncoder) Allocator() memory.Allocator
func (e *DeltaByteArrayEncoder) Bytes() []byte
Bytes returns the current bytes that have been written to the encoder's buffer but doesn't transfer ownership.
func (e *DeltaByteArrayEncoder) Encoding() parquet.Encoding
func (enc *DeltaByteArrayEncoder) EstimatedDataEncodedSize() int64
func (enc *DeltaByteArrayEncoder) FlushValues() (Buffer, error)
FlushValues flushes any remaining data out and returns the finished encoded buffer, or returns nil and any error encountered during flushing.
func (enc *DeltaByteArrayEncoder) Put(in []parquet.ByteArray)
Put writes a slice of ByteArrays to the encoder
func (enc *DeltaByteArrayEncoder) PutSpaced(in []parquet.ByteArray, validBits []byte, validBitsOffset int64)
PutSpaced is like Put, but assumes the data is already spaced for nulls and uses the bitmap provided and offset to compress the data before writing it without the null slots.
func (e *DeltaByteArrayEncoder) Release()
func (e *DeltaByteArrayEncoder) ReserveForWrite(n int)
ReserveForWrite allocates n bytes so that the next n bytes written do not require new allocations.
func (e *DeltaByteArrayEncoder) Reset()
Reset drops the data currently in the encoder and resets for new use.
func (DeltaByteArrayEncoder) Type() parquet.Type
Type returns the underlying physical type this operates on, in this case ByteArrays only
DeltaLengthByteArrayDecoder is a decoder for handling data produced by the corresponding encoder which expects delta packed lengths followed by the bytes of data.
type DeltaLengthByteArrayDecoder struct {
// contains filtered or unexported fields
}
func (d *DeltaLengthByteArrayDecoder) Allocator() memory.Allocator
func (d *DeltaLengthByteArrayDecoder) Decode(out []parquet.ByteArray) (int, error)
Decode populates the passed in slice with data decoded until it hits the length of out or runs out of values in the column to decode, then returns the number of values actually decoded.
func (d *DeltaLengthByteArrayDecoder) DecodeSpaced(out []parquet.ByteArray, nullCount int, validBits []byte, validBitsOffset int64) (int, error)
DecodeSpaced is like Decode, but for spaced data using the provided bitmap to determine where the nulls should be inserted.
func (d *DeltaLengthByteArrayDecoder) Encoding() parquet.Encoding
Encoding returns the encoding type used by this decoder to decode the bytes.
func (d *DeltaLengthByteArrayDecoder) SetData(nvalues int, data []byte) error
SetData sets the expected data into the decoder, which should be nvalues delta packed lengths followed immediately by the rest of the byte array data.
func (DeltaLengthByteArrayDecoder) Type() parquet.Type
Type returns the underlying type which is handled by this decoder, ByteArrays only.
func (d *DeltaLengthByteArrayDecoder) ValuesLeft() int
ValuesLeft returns the number of remaining values that can be decoded
DeltaLengthByteArrayEncoder encodes data by taking all of the byte array lengths and encoding them up front using delta encoding, followed by all of the binary data concatenated back to back. The expected savings come from the reduced cost of encoding the lengths and from possibly better compression of the data, which is no longer interleaved with the lengths.
This encoding is always preferred over PLAIN for byte array columns where possible.
For example, if the data was "Hello", "World", "Foobar", "ABCDEF" the encoded data would be: DeltaEncoding(5, 5, 6, 6) "HelloWorldFoobarABCDEF"
type DeltaLengthByteArrayEncoder struct {
// contains filtered or unexported fields
}
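The layout for the "Hello", "World", "Foobar", "ABCDEF" data can be reproduced with a few lines of standard-library Go. In this sketch the lengths are kept as a plain slice rather than being delta encoded, and splitLengths and joinLengths are hypothetical helper names. The point is only the separation of lengths from concatenated bytes.

```go
package main

import "fmt"

// splitLengths separates the values into a list of lengths and a single
// buffer holding all of the bytes back to back.
func splitLengths(vals []string) (lengths []int32, data []byte) {
	for _, v := range vals {
		lengths = append(lengths, int32(len(v)))
		data = append(data, v...)
	}
	return lengths, data
}

// joinLengths walks data using lengths to recover the original values.
func joinLengths(lengths []int32, data []byte) []string {
	out := make([]string, 0, len(lengths))
	pos := int32(0)
	for _, n := range lengths {
		out = append(out, string(data[pos:pos+n]))
		pos += n
	}
	return out
}

func main() {
	lengths, data := splitLengths([]string{"Hello", "World", "Foobar", "ABCDEF"})
	fmt.Println(lengths, string(data))      // [5 5 6 6] HelloWorldFoobarABCDEF
	fmt.Println(joinLengths(lengths, data)) // [Hello World Foobar ABCDEF]
}
```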
func (e *DeltaLengthByteArrayEncoder) Allocator() memory.Allocator
func (e *DeltaLengthByteArrayEncoder) Bytes() []byte
Bytes returns the current bytes that have been written to the encoder's buffer but doesn't transfer ownership.
func (e *DeltaLengthByteArrayEncoder) Encoding() parquet.Encoding
func (e *DeltaLengthByteArrayEncoder) EstimatedDataEncodedSize() int64
func (enc *DeltaLengthByteArrayEncoder) FlushValues() (Buffer, error)
FlushValues flushes any remaining data and returns the final encoded buffer of data or returns nil and any error encountered.
func (enc *DeltaLengthByteArrayEncoder) Put(in []parquet.ByteArray)
Put writes the provided slice of byte arrays to the encoder
func (enc *DeltaLengthByteArrayEncoder) PutSpaced(in []parquet.ByteArray, validBits []byte, validBitsOffset int64)
PutSpaced is like Put, but the data is spaced out according to the bitmap provided and is compressed accordingly before it is written to drop the null data from the write.
func (e *DeltaLengthByteArrayEncoder) Release()
func (e *DeltaLengthByteArrayEncoder) ReserveForWrite(n int)
ReserveForWrite allocates n bytes so that the next n bytes written do not require new allocations.
func (e *DeltaLengthByteArrayEncoder) Reset()
Reset drops the data currently in the encoder and resets for new use.
func (DeltaLengthByteArrayEncoder) Type() parquet.Type
Type returns the underlying type which is handled by this encoder, ByteArrays only.
DictByteArrayDecoder is a decoder for decoding dictionary encoded data for parquet.ByteArray columns
type DictByteArrayDecoder struct {
// contains filtered or unexported fields
}
func (d *DictByteArrayDecoder) Decode(out []parquet.ByteArray) (int, error)
Decode populates the passed in slice with min(len(out), remaining values) values, decoding using the dictionary to get the actual values. Returns the number of values actually decoded and any error encountered.
func (d *DictByteArrayDecoder) DecodeIndices(numValues int, bldr array.Builder) (int, error)
func (d *DictByteArrayDecoder) DecodeIndicesSpaced(numValues, nullCount int, validBits []byte, offset int64, bldr array.Builder) (int, error)
func (d *DictByteArrayDecoder) DecodeSpaced(out []parquet.ByteArray, nullCount int, validBits []byte, validBitsOffset int64) (int, error)
DecodeSpaced is like Decode but will space out the data leaving slots for null values based on the provided bitmap.
func (d *DictByteArrayDecoder) InsertDictionary(bldr array.Builder) error
func (d *DictByteArrayDecoder) SetData(nvals int, data []byte) error
SetData sets the index value data into the decoder.
func (d *DictByteArrayDecoder) SetDict(dict TypedDecoder)
SetDict sets a decoder that can be used to decode the dictionary that is used for this column in order to return the proper values.
func (DictByteArrayDecoder) Type() parquet.Type
Type returns the underlying physical type that can be decoded with this decoder
DictByteArrayEncoder is an encoder for parquet.ByteArray data using dictionary encoding
type DictByteArrayEncoder struct {
// contains filtered or unexported fields
}
func (d *DictByteArrayEncoder) BitWidth() int
BitWidth returns the max bitwidth that would be necessary for encoding the index values currently in the dictionary based on the size of the dictionary index.
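The relationship between dictionary size and index bit width can be shown with math/bits: the largest index is the number of entries minus one, and the width is the number of bits needed to represent it. The indexBitWidth function below is a hypothetical sketch. Reporting one bit for a single-entry dictionary and zero for an empty one are assumptions of this example, not statements about the encoder's behavior.

```go
package main

import (
	"fmt"
	"math/bits"
)

// indexBitWidth returns the number of bits needed to represent every
// index in the range [0, numEntries).
func indexBitWidth(numEntries int) int {
	switch {
	case numEntries <= 0:
		return 0 // assumption: an empty dictionary needs no bits
	case numEntries == 1:
		return 1 // assumption: a lone index 0 still occupies one bit
	default:
		return bits.Len(uint(numEntries - 1))
	}
}

func main() {
	for _, n := range []int{2, 3, 4, 5, 256, 257} {
		fmt.Println(n, indexBitWidth(n))
	}
	// Prints: 2 1, 3 2, 4 2, 5 3, 256 8, 257 9 (one pair per line).
}
```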
func (d *DictByteArrayEncoder) DictEncodedSize() int
DictEncodedSize returns the current size of the encoded dictionary
func (d *DictByteArrayEncoder) EstimatedDataEncodedSize() int64
EstimatedDataEncodedSize returns the maximum number of bytes needed to store the RLE encoded indexes, not including the dictionary index in the computation.
func (d *DictByteArrayEncoder) FlushValues() (Buffer, error)
FlushValues dumps all the currently buffered indexes that would become the data page to a buffer and returns it or returns nil and any error encountered.
func (d *DictByteArrayEncoder) NumEntries() int
NumEntries returns the number of entries in the dictionary index for this encoder.
func (d *DictByteArrayEncoder) PreservedDictionary() arrow.Array
func (enc *DictByteArrayEncoder) Put(in []parquet.ByteArray)
Put takes a slice of ByteArrays to add and encode.
func (enc *DictByteArrayEncoder) PutByteArray(in parquet.ByteArray)
PutByteArray adds a single byte array to buffer, updating the dictionary and encoded size if it's a new value
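The bookkeeping described for PutByteArray can be sketched with a map: a value seen for the first time is appended to the dictionary and grows its encoded size, and every value, new or not, contributes one index to the data. The dictSketch type below is hypothetical and uses Go strings and a built-in map instead of the package's memo table. The size accounting assumes PLAIN byte arrays, where each value carries a 4-byte length prefix.

```go
package main

import "fmt"

// dictSketch is a hypothetical, simplified dictionary encoder state.
type dictSketch struct {
	index     map[string]int32 // value -> position in values
	values    []string         // dictionary entries in insertion order
	indices   []int32          // one dictionary index per value put
	dictBytes int              // size of the dictionary if written as PLAIN
}

func newDictSketch() *dictSketch {
	return &dictSketch{index: make(map[string]int32)}
}

// put records v, adding it to the dictionary only if it is new.
func (d *dictSketch) put(v string) {
	idx, ok := d.index[v]
	if !ok {
		idx = int32(len(d.values))
		d.index[v] = idx
		d.values = append(d.values, v)
		d.dictBytes += 4 + len(v) // 4-byte length prefix plus the bytes
	}
	d.indices = append(d.indices, idx)
}

func main() {
	d := newDictSketch()
	for _, v := range []string{"a", "b", "a", "c", "b"} {
		d.put(v)
	}
	fmt.Println(d.values, d.indices, d.dictBytes) // [a b c] [0 1 0 2 1] 15
}
```

Five inputs produce only three dictionary entries here. The indices slice is what a real encoder would go on to run length encode into the data page.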
func (enc *DictByteArrayEncoder) PutDictionary(values arrow.Array) error
PutDictionary allows pre-seeding a dictionary encoder with a dictionary from an Arrow Array.
The passed in array must not have any nulls and this can only be called on an empty encoder.
func (d *DictByteArrayEncoder) PutIndices(data arrow.Array) error
func (enc *DictByteArrayEncoder) PutSpaced(in []parquet.ByteArray, validBits []byte, validBitsOffset int64)
PutSpaced, as with the non-dict encoder, leaves out the values where the validBits bitmap is 0.
func (d *DictByteArrayEncoder) Release()
func (d *DictByteArrayEncoder) Reset()
Reset drops all the currently encoded values from the index and indexes from the data to allow restarting the encoding process.
func (enc *DictByteArrayEncoder) Type() parquet.Type
Type returns the underlying physical type that can be encoded with this encoder
func (enc *DictByteArrayEncoder) WriteDict(out []byte)
WriteDict writes the dictionary out to the provided slice, out should be at least DictEncodedSize() bytes
func (d *DictByteArrayEncoder) WriteIndices(out []byte) (int, error)
WriteIndices performs Run Length encoding on the indexes and then writes the encoded index value data to the provided byte slice, returning the number of bytes actually written. If any error is encountered, it will return -1 and the error.
DictDecoder is a special TypedDecoder which implements dictionary decoding
type DictDecoder interface {
	TypedDecoder
	// SetDict takes in a decoder which can decode the dictionary index to be used
	SetDict(TypedDecoder)
}
func NewDictDecoder(t parquet.Type, descr *schema.Column, mem memory.Allocator) DictDecoder
NewDictDecoder is like NewDecoder but for dictionary encodings, panics if type is bool.
If mem is nil, memory.DefaultAllocator will be used.
DictEncoder is a special kind of TypedEncoder which implements Dictionary encoding.
type DictEncoder interface {
	TypedEncoder
	// WriteIndices populates the byte slice with the final indexes of data and returns
	// the number of bytes written
	WriteIndices(out []byte) (int, error)
	// DictEncodedSize returns the current size of the encoded dictionary index.
	DictEncodedSize() int
	// BitWidth returns the bitwidth needed to encode all of the index values based
	// on the number of values in the dictionary index.
	BitWidth() int
	// WriteDict populates out with the dictionary index values, out should be sized to at least
	// as many bytes as DictEncodedSize
	WriteDict(out []byte)
	// NumEntries returns the number of values currently in the dictionary index.
	NumEntries() int
	// PutDictionary allows pre-seeding a dictionary encoder with
	// a dictionary from an Arrow Array.
	//
	// The passed in array must not have any nulls and this can only
	// be called on an empty encoder. The dictionary passed in will
	// be stored internally as a preserved dictionary, and will be
	// released when this encoder is reset or released.
	PutDictionary(arrow.Array) error
	// PreservedDictionary returns the currently stored preserved dict
	// from PutDictionary or nil.
	PreservedDictionary() arrow.Array
	// PutIndices adds the indices from the passed in integral array to
	// the column data. It is assumed that the indices are within the bounds
	// of [0,dictSize) and is not validated. Returns an error if a non-integral
	// array is passed.
	PutIndices(arrow.Array) error
}
DictFixedLenByteArrayDecoder is a decoder for decoding dictionary encoded data for parquet.FixedLenByteArray columns
type DictFixedLenByteArrayDecoder struct {
// contains filtered or unexported fields
}
func (d *DictFixedLenByteArrayDecoder) Decode(out []parquet.FixedLenByteArray) (int, error)
Decode populates the passed in slice with min(len(out), remaining values) values, decoding using the dictionary to get the actual values. Returns the number of values actually decoded and any error encountered.
func (d *DictFixedLenByteArrayDecoder) DecodeIndices(numValues int, bldr array.Builder) (int, error)
func (d *DictFixedLenByteArrayDecoder) DecodeIndicesSpaced(numValues, nullCount int, validBits []byte, offset int64, bldr array.Builder) (int, error)
func (d *DictFixedLenByteArrayDecoder) DecodeSpaced(out []parquet.FixedLenByteArray, nullCount int, validBits []byte, validBitsOffset int64) (int, error)
DecodeSpaced is like Decode but will space out the data leaving slots for null values based on the provided bitmap.
func (d *DictFixedLenByteArrayDecoder) SetData(nvals int, data []byte) error
SetData sets the index value data into the decoder.
func (d *DictFixedLenByteArrayDecoder) SetDict(dict TypedDecoder)
SetDict sets a decoder that can be used to decode the dictionary that is used for this column in order to return the proper values.
func (DictFixedLenByteArrayDecoder) Type() parquet.Type
Type returns the underlying physical type that can be decoded with this decoder
DictFixedLenByteArrayEncoder is an encoder for parquet.FixedLenByteArray data using dictionary encoding
type DictFixedLenByteArrayEncoder struct {
// contains filtered or unexported fields
}
func (d *DictFixedLenByteArrayEncoder) BitWidth() int
BitWidth returns the max bitwidth that would be necessary for encoding the index values currently in the dictionary based on the size of the dictionary index.
func (d *DictFixedLenByteArrayEncoder) DictEncodedSize() int
DictEncodedSize returns the current size of the encoded dictionary
func (d *DictFixedLenByteArrayEncoder) EstimatedDataEncodedSize() int64
EstimatedDataEncodedSize returns the maximum number of bytes needed to store the RLE encoded indexes, not including the dictionary index in the computation.
func (d *DictFixedLenByteArrayEncoder) FlushValues() (Buffer, error)
FlushValues dumps all the currently buffered indexes that would become the data page to a buffer and returns it or returns nil and any error encountered.
func (d *DictFixedLenByteArrayEncoder) NumEntries() int
NumEntries returns the number of entries in the dictionary index for this encoder.
func (d *DictFixedLenByteArrayEncoder) PreservedDictionary() arrow.Array
func (enc *DictFixedLenByteArrayEncoder) Put(in []parquet.FixedLenByteArray)
Put writes fixed length values to a dictionary encoded column
func (enc *DictFixedLenByteArrayEncoder) PutDictionary(values arrow.Array) error
PutDictionary allows pre-seeding a dictionary encoder with a dictionary from an Arrow Array.
The passed in array must not have any nulls and this can only be called on an empty encoder.
func (d *DictFixedLenByteArrayEncoder) PutIndices(data arrow.Array) error
func (enc *DictFixedLenByteArrayEncoder) PutSpaced(in []parquet.FixedLenByteArray, validBits []byte, validBitsOffset int64)
PutSpaced is like Put but leaves space for nulls
func (d *DictFixedLenByteArrayEncoder) Release()
func (d *DictFixedLenByteArrayEncoder) Reset()
Reset drops all the currently encoded values from the index and indexes from the data to allow restarting the encoding process.
func (enc *DictFixedLenByteArrayEncoder) Type() parquet.Type
Type returns the underlying physical type that can be encoded with this encoder
func (enc *DictFixedLenByteArrayEncoder) WriteDict(out []byte)
WriteDict overrides the embedded WriteDict function to call a specialized function for copying out the Fixed length values from the dictionary more efficiently.
func (d *DictFixedLenByteArrayEncoder) WriteIndices(out []byte) (int, error)
WriteIndices performs Run Length encoding on the indexes and then writes the encoded index value data to the provided byte slice, returning the number of bytes actually written. If any error is encountered, it will return -1 and the error.
DictFloat32Decoder is a decoder for decoding dictionary encoded data for float32 columns
type DictFloat32Decoder struct {
// contains filtered or unexported fields
}
func (d *DictFloat32Decoder) Decode(out []float32) (int, error)
Decode populates the passed in slice with min(len(out), remaining values) values, decoding using the dictionary to get the actual values. Returns the number of values actually decoded and any error encountered.
func (d *DictFloat32Decoder) DecodeIndices(numValues int, bldr array.Builder) (int, error)
func (d *DictFloat32Decoder) DecodeIndicesSpaced(numValues, nullCount int, validBits []byte, offset int64, bldr array.Builder) (int, error)
func (d *DictFloat32Decoder) DecodeSpaced(out []float32, nullCount int, validBits []byte, validBitsOffset int64) (int, error)
DecodeSpaced is like Decode but will space out the data leaving slots for null values based on the provided bitmap.
func (d *DictFloat32Decoder) SetData(nvals int, data []byte) error
SetData sets the index value data into the decoder.
func (d *DictFloat32Decoder) SetDict(dict TypedDecoder)
SetDict sets a decoder that can be used to decode the dictionary that is used for this column in order to return the proper values.
func (DictFloat32Decoder) Type() parquet.Type
Type returns the underlying physical type that can be decoded with this decoder
DictFloat32Encoder is an encoder for float32 data using dictionary encoding
type DictFloat32Encoder struct {
// contains filtered or unexported fields
}
func (d *DictFloat32Encoder) BitWidth() int
BitWidth returns the max bitwidth that would be necessary for encoding the index values currently in the dictionary based on the size of the dictionary index.
func (d *DictFloat32Encoder) DictEncodedSize() int
DictEncodedSize returns the current size of the encoded dictionary
func (d *DictFloat32Encoder) EstimatedDataEncodedSize() int64
EstimatedDataEncodedSize returns the maximum number of bytes needed to store the RLE encoded indexes, not including the dictionary index in the computation.
func (d *DictFloat32Encoder) FlushValues() (Buffer, error)
FlushValues dumps all the currently buffered indexes that would become the data page to a buffer and returns it or returns nil and any error encountered.
func (d *DictFloat32Encoder) NumEntries() int
NumEntries returns the number of entries in the dictionary index for this encoder.
func (d *DictFloat32Encoder) PreservedDictionary() arrow.Array
func (enc *DictFloat32Encoder) Put(in []float32)
Put encodes the values passed in, adding to the index as needed.
func (enc *DictFloat32Encoder) PutDictionary(values arrow.Array) error
PutDictionary allows pre-seeding a dictionary encoder with a dictionary from an Arrow Array.
The passed in array must not have any nulls and this can only be called on an empty encoder.
func (d *DictFloat32Encoder) PutIndices(data arrow.Array) error
func (enc *DictFloat32Encoder) PutSpaced(in []float32, validBits []byte, validBitsOffset int64)
PutSpaced is the same as Put but for when the data being encoded has slots open for null values, using the bitmap provided to skip values as needed.
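The idea of skipping null slots with a validity bitmap can be sketched without the package itself. The example below assumes an LSB-first bitmap (the layout Arrow uses for validity bitmaps) and uses the hypothetical helpers bitIsSet and compactValid; it only shows which values would reach the encoder, not how the encoder stores them.

```go
package main

import "fmt"

// bitIsSet reports whether bit i is set in an LSB-first bitmap.
func bitIsSet(bitmap []byte, i int64) bool {
	return bitmap[i/8]&(1<<(uint(i)%8)) != 0
}

// compactValid is a hypothetical helper that returns only the entries of in
// whose validity bit is set. Slot i of in corresponds to bit
// validBitsOffset+i of validBits.
func compactValid(in []float32, validBits []byte, validBitsOffset int64) []float32 {
	out := make([]float32, 0, len(in))
	for i, v := range in {
		if bitIsSet(validBits, validBitsOffset+int64(i)) {
			out = append(out, v)
		}
	}
	return out
}

func main() {
	// Slots 1 and 3 are null; their contents are ignored.
	in := []float32{1.5, 0, 2.5, 0, 3.5}
	validBits := []byte{0b00010101}
	fmt.Println(compactValid(in, validBits, 0)) // prints [1.5 2.5 3.5]
}
```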
func (d *DictFloat32Encoder) Release()
func (d *DictFloat32Encoder) Reset()
Reset drops all the currently encoded values from the index and indexes from the data to allow restarting the encoding process.
func (enc *DictFloat32Encoder) Type() parquet.Type
Type returns the underlying physical type that can be encoded with this encoder
func (enc *DictFloat32Encoder) WriteDict(out []byte)
WriteDict populates the byte slice with the dictionary index
func (d *DictFloat32Encoder) WriteIndices(out []byte) (int, error)
WriteIndices performs run length encoding on the indexes and then writes the encoded index value data to the provided byte slice, returning the number of bytes actually written. If any error is encountered, it will return -1 and the error.
DictFloat64Decoder is a decoder for decoding dictionary encoded data for float64 columns
type DictFloat64Decoder struct {
// contains filtered or unexported fields
}
func (d *DictFloat64Decoder) Decode(out []float64) (int, error)
Decode populates the passed in slice with min(len(out), remaining values) values, decoding using the dictionary to get the actual values. Returns the number of values actually decoded and any error encountered.
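The Decode contract described above (fill at most min(len(out), remaining values) slots by looking each index up in the dictionary) can be illustrated with a small standalone model. The dictDecoder type below is hypothetical and keeps the indexes already unpacked in a slice; the real decoder reads run length encoded indexes from the page data.

```go
package main

import (
	"errors"
	"fmt"
)

// dictDecoder is a hypothetical stand-in for a dictionary decoder. It holds
// the dictionary values and the not yet consumed indexes in plain slices.
type dictDecoder struct {
	dict    []float64
	indices []int32
	pos     int // next index to decode
}

// Decode fills out with min(len(out), remaining) dictionary values and
// returns how many values were written.
func (d *dictDecoder) Decode(out []float64) (int, error) {
	n := len(out)
	if remaining := len(d.indices) - d.pos; remaining < n {
		n = remaining
	}
	for i := 0; i < n; i++ {
		idx := d.indices[d.pos+i]
		if idx < 0 || int(idx) >= len(d.dict) {
			d.pos += i
			return i, errors.New("dictionary index out of range")
		}
		out[i] = d.dict[idx]
	}
	d.pos += n
	return n, nil
}

func main() {
	d := &dictDecoder{
		dict:    []float64{0.5, 1.5, 2.5},
		indices: []int32{2, 0, 0, 1, 2},
	}
	out := make([]float64, 3)
	n, err := d.Decode(out)
	fmt.Println(n, err, out[:n]) // first call decodes 3 values
	n, err = d.Decode(out)
	fmt.Println(n, err, out[:n]) // second call decodes the remaining 2
}
```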
func (d *DictFloat64Decoder) DecodeIndices(numValues int, bldr array.Builder) (int, error)
func (d *DictFloat64Decoder) DecodeIndicesSpaced(numValues, nullCount int, validBits []byte, offset int64, bldr array.Builder) (int, error)
func (d *DictFloat64Decoder) DecodeSpaced(out []float64, nullCount int, validBits []byte, validBitsOffset int64) (int, error)
DecodeSpaced is like Decode but spaces out the data, leaving slots for null values based on the provided bitmap.
func (d *DictFloat64Decoder) SetData(nvals int, data []byte) error
SetData sets the index value data into the decoder.
func (d *DictFloat64Decoder) SetDict(dict TypedDecoder)
SetDict sets a decoder that can be used to decode the dictionary that is used for this column in order to return the proper values.
func (DictFloat64Decoder) Type() parquet.Type
Type returns the underlying physical type that can be decoded with this decoder
DictFloat64Encoder is an encoder for float64 data using dictionary encoding
type DictFloat64Encoder struct {
// contains filtered or unexported fields
}
func (d *DictFloat64Encoder) BitWidth() int
BitWidth returns the max bitwidth that would be necessary for encoding the index values currently in the dictionary based on the size of the dictionary index.
func (d *DictFloat64Encoder) DictEncodedSize() int
DictEncodedSize returns the current size of the encoded dictionary
func (d *DictFloat64Encoder) EstimatedDataEncodedSize() int64
EstimatedDataEncodedSize returns the maximum number of bytes needed to store the RLE encoded indexes, not including the dictionary index in the computation.
func (d *DictFloat64Encoder) FlushValues() (Buffer, error)
FlushValues dumps all the currently buffered indexes that would become the data page to a buffer and returns it or returns nil and any error encountered.
func (d *DictFloat64Encoder) NumEntries() int
NumEntries returns the number of entries in the dictionary index for this encoder.
func (d *DictFloat64Encoder) PreservedDictionary() arrow.Array
func (enc *DictFloat64Encoder) Put(in []float64)
Put encodes the values passed in, adding to the index as needed.
func (enc *DictFloat64Encoder) PutDictionary(values arrow.Array) error
PutDictionary allows pre-seeding a dictionary encoder with a dictionary from an Arrow Array.
The passed in array must not have any nulls and this can only be called on an empty encoder.
func (d *DictFloat64Encoder) PutIndices(data arrow.Array) error
func (enc *DictFloat64Encoder) PutSpaced(in []float64, validBits []byte, validBitsOffset int64)
PutSpaced is the same as Put but for when the data being encoded has slots open for null values, using the bitmap provided to skip values as needed.
func (d *DictFloat64Encoder) Release()
func (d *DictFloat64Encoder) Reset()
Reset drops all the currently encoded values from the index and indexes from the data to allow restarting the encoding process.
func (enc *DictFloat64Encoder) Type() parquet.Type
Type returns the underlying physical type that can be encoded with this encoder
func (enc *DictFloat64Encoder) WriteDict(out []byte)
WriteDict populates the byte slice with the dictionary index
func (d *DictFloat64Encoder) WriteIndices(out []byte) (int, error)
WriteIndices performs run length encoding on the indexes and then writes the encoded index value data to the provided byte slice, returning the number of bytes actually written. If any error is encountered, it will return -1 and the error.
DictInt32Decoder is a decoder for decoding dictionary encoded data for int32 columns
type DictInt32Decoder struct {
// contains filtered or unexported fields
}
func (d *DictInt32Decoder) Decode(out []int32) (int, error)
Decode populates the passed in slice with min(len(out), remaining values) values, decoding using the dictionary to get the actual values. Returns the number of values actually decoded and any error encountered.
func (d *DictInt32Decoder) DecodeIndices(numValues int, bldr array.Builder) (int, error)
func (d *DictInt32Decoder) DecodeIndicesSpaced(numValues, nullCount int, validBits []byte, offset int64, bldr array.Builder) (int, error)
func (d *DictInt32Decoder) DecodeSpaced(out []int32, nullCount int, validBits []byte, validBitsOffset int64) (int, error)
DecodeSpaced is like Decode but spaces out the data, leaving slots for null values based on the provided bitmap.
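One common way to space out decoded data is to decode the non-null values densely into the front of the output slice and then move them, back to front, into the slots whose validity bit is set. The sketch below shows that technique under the assumption of an LSB-first bitmap; expandSpaced and bitIsSet are hypothetical names, and the text above does not confirm that the package uses this exact approach.

```go
package main

import (
	"errors"
	"fmt"
)

// bitIsSet reports whether bit i is set in an LSB-first bitmap.
func bitIsSet(bitmap []byte, i int64) bool {
	return bitmap[i/8]&(1<<(uint(i)%8)) != 0
}

// expandSpaced assumes the first len(buf)-nullCount entries of buf hold the
// densely decoded values. It moves them into the valid slots, working
// backwards so that no value is overwritten before it has been moved.
func expandSpaced(buf []int32, nullCount int, validBits []byte, validBitsOffset int64) (int, error) {
	src := len(buf) - nullCount - 1 // position of the last dense value
	for dst := len(buf) - 1; dst >= 0; dst-- {
		if !bitIsSet(validBits, validBitsOffset+int64(dst)) {
			buf[dst] = 0 // null slot; the zero is only a placeholder
			continue
		}
		if src < 0 {
			return 0, errors.New("bitmap has more valid slots than decoded values")
		}
		buf[dst] = buf[src]
		src--
	}
	return len(buf), nil
}

func main() {
	// Three decoded values, five slots, slots 1 and 3 are null.
	buf := []int32{10, 20, 30, 0, 0}
	n, err := expandSpaced(buf, 2, []byte{0b00010101}, 0)
	fmt.Println(n, err, buf) // prints 5 <nil> [10 0 20 0 30]
}
```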
func (d *DictInt32Decoder) SetData(nvals int, data []byte) error
SetData sets the index value data into the decoder.
func (d *DictInt32Decoder) SetDict(dict TypedDecoder)
SetDict sets a decoder that can be used to decode the dictionary that is used for this column in order to return the proper values.
func (DictInt32Decoder) Type() parquet.Type
Type returns the underlying physical type that can be decoded with this decoder
DictInt32Encoder is an encoder for int32 data using dictionary encoding
type DictInt32Encoder struct {
// contains filtered or unexported fields
}
func (d *DictInt32Encoder) BitWidth() int
BitWidth returns the max bitwidth that would be necessary for encoding the index values currently in the dictionary based on the size of the dictionary index.
func (d *DictInt32Encoder) DictEncodedSize() int
DictEncodedSize returns the current size of the encoded dictionary
func (d *DictInt32Encoder) EstimatedDataEncodedSize() int64
EstimatedDataEncodedSize returns the maximum number of bytes needed to store the RLE encoded indexes, not including the dictionary index in the computation.
func (d *DictInt32Encoder) FlushValues() (Buffer, error)
FlushValues dumps all the currently buffered indexes that would become the data page to a buffer and returns it or returns nil and any error encountered.
func (d *DictInt32Encoder) NumEntries() int
NumEntries returns the number of entries in the dictionary index for this encoder.
func (d *DictInt32Encoder) PreservedDictionary() arrow.Array
func (enc *DictInt32Encoder) Put(in []int32)
Put encodes the values passed in, adding to the index as needed.
func (enc *DictInt32Encoder) PutDictionary(values arrow.Array) error
PutDictionary allows pre-seeding a dictionary encoder with a dictionary from an Arrow Array.
The passed in array must not have any nulls and this can only be called on an empty encoder.
func (d *DictInt32Encoder) PutIndices(data arrow.Array) error
func (enc *DictInt32Encoder) PutSpaced(in []int32, validBits []byte, validBitsOffset int64)
PutSpaced is the same as Put but for when the data being encoded has slots open for null values, using the bitmap provided to skip values as needed.
func (d *DictInt32Encoder) Release()
func (d *DictInt32Encoder) Reset()
Reset drops all the currently encoded values from the index and indexes from the data to allow restarting the encoding process.
func (enc *DictInt32Encoder) Type() parquet.Type
Type returns the underlying physical type that can be encoded with this encoder
func (enc *DictInt32Encoder) WriteDict(out []byte)
WriteDict populates the byte slice with the dictionary index
func (d *DictInt32Encoder) WriteIndices(out []byte) (int, error)
WriteIndices performs run length encoding on the indexes and then writes the encoded index value data to the provided byte slice, returning the number of bytes actually written. If any error is encountered, it will return -1 and the error.
DictInt64Decoder is a decoder for decoding dictionary encoded data for int64 columns
type DictInt64Decoder struct {
// contains filtered or unexported fields
}
func (d *DictInt64Decoder) Decode(out []int64) (int, error)
Decode populates the passed in slice with min(len(out), remaining values) values, decoding using the dictionary to get the actual values. Returns the number of values actually decoded and any error encountered.
func (d *DictInt64Decoder) DecodeIndices(numValues int, bldr array.Builder) (int, error)
func (d *DictInt64Decoder) DecodeIndicesSpaced(numValues, nullCount int, validBits []byte, offset int64, bldr array.Builder) (int, error)
func (d *DictInt64Decoder) DecodeSpaced(out []int64, nullCount int, validBits []byte, validBitsOffset int64) (int, error)
DecodeSpaced is like Decode but spaces out the data, leaving slots for null values based on the provided bitmap.
func (d *DictInt64Decoder) SetData(nvals int, data []byte) error
SetData sets the index value data into the decoder.
func (d *DictInt64Decoder) SetDict(dict TypedDecoder)
SetDict sets a decoder that can be used to decode the dictionary that is used for this column in order to return the proper values.
func (DictInt64Decoder) Type() parquet.Type
Type returns the underlying physical type that can be decoded with this decoder
DictInt64Encoder is an encoder for int64 data using dictionary encoding
type DictInt64Encoder struct {
// contains filtered or unexported fields
}
func (d *DictInt64Encoder) BitWidth() int
BitWidth returns the max bitwidth that would be necessary for encoding the index values currently in the dictionary based on the size of the dictionary index.
func (d *DictInt64Encoder) DictEncodedSize() int
DictEncodedSize returns the current size of the encoded dictionary
func (d *DictInt64Encoder) EstimatedDataEncodedSize() int64
EstimatedDataEncodedSize returns the maximum number of bytes needed to store the RLE encoded indexes, not including the dictionary index in the computation.
func (d *DictInt64Encoder) FlushValues() (Buffer, error)
FlushValues dumps all the currently buffered indexes that would become the data page to a buffer and returns it or returns nil and any error encountered.
func (d *DictInt64Encoder) NumEntries() int
NumEntries returns the number of entries in the dictionary index for this encoder.
func (d *DictInt64Encoder) PreservedDictionary() arrow.Array
func (enc *DictInt64Encoder) Put(in []int64)
Put encodes the values passed in, adding to the index as needed.
func (enc *DictInt64Encoder) PutDictionary(values arrow.Array) error
PutDictionary allows pre-seeding a dictionary encoder with a dictionary from an Arrow Array.
The passed in array must not have any nulls and this can only be called on an empty encoder.
func (d *DictInt64Encoder) PutIndices(data arrow.Array) error
func (enc *DictInt64Encoder) PutSpaced(in []int64, validBits []byte, validBitsOffset int64)
PutSpaced is the same as Put but for when the data being encoded has slots open for null values, using the bitmap provided to skip values as needed.
func (d *DictInt64Encoder) Release()
func (d *DictInt64Encoder) Reset()
Reset drops all the currently encoded values from the index and indexes from the data to allow restarting the encoding process.
func (enc *DictInt64Encoder) Type() parquet.Type
Type returns the underlying physical type that can be encoded with this encoder
func (enc *DictInt64Encoder) WriteDict(out []byte)
WriteDict populates the byte slice with the dictionary index
func (d *DictInt64Encoder) WriteIndices(out []byte) (int, error)
WriteIndices performs run length encoding on the indexes and then writes the encoded index value data to the provided byte slice, returning the number of bytes actually written. If any error is encountered, it will return -1 and the error.
DictInt96Decoder is a decoder for decoding dictionary encoded data for parquet.Int96 columns
type DictInt96Decoder struct {
// contains filtered or unexported fields
}
func (d *DictInt96Decoder) Decode(out []parquet.Int96) (int, error)
Decode populates the passed in slice with min(len(out), remaining values) values, decoding using the dictionary to get the actual values. Returns the number of values actually decoded and any error encountered.
func (d *DictInt96Decoder) DecodeIndices(numValues int, bldr array.Builder) (int, error)
func (d *DictInt96Decoder) DecodeIndicesSpaced(numValues, nullCount int, validBits []byte, offset int64, bldr array.Builder) (int, error)
func (d *DictInt96Decoder) DecodeSpaced(out []parquet.Int96, nullCount int, validBits []byte, validBitsOffset int64) (int, error)
DecodeSpaced is like Decode but spaces out the data, leaving slots for null values based on the provided bitmap.
func (d *DictInt96Decoder) SetData(nvals int, data []byte) error
SetData sets the index value data into the decoder.
func (d *DictInt96Decoder) SetDict(dict TypedDecoder)
SetDict sets a decoder that can be used to decode the dictionary that is used for this column in order to return the proper values.
func (DictInt96Decoder) Type() parquet.Type
Type returns the underlying physical type that can be decoded with this decoder
DictInt96Encoder is an encoder for parquet.Int96 data using dictionary encoding
type DictInt96Encoder struct {
// contains filtered or unexported fields
}
func (d *DictInt96Encoder) BitWidth() int
BitWidth returns the max bitwidth that would be necessary for encoding the index values currently in the dictionary based on the size of the dictionary index.
func (d *DictInt96Encoder) DictEncodedSize() int
DictEncodedSize returns the current size of the encoded dictionary
func (d *DictInt96Encoder) EstimatedDataEncodedSize() int64
EstimatedDataEncodedSize returns the maximum number of bytes needed to store the RLE encoded indexes, not including the dictionary index in the computation.
func (d *DictInt96Encoder) FlushValues() (Buffer, error)
FlushValues dumps all the currently buffered indexes that would become the data page to a buffer and returns it or returns nil and any error encountered.
func (d *DictInt96Encoder) NumEntries() int
NumEntries returns the number of entries in the dictionary index for this encoder.
func (d *DictInt96Encoder) PreservedDictionary() arrow.Array
func (enc *DictInt96Encoder) Put(in []parquet.Int96)
Put encodes the values passed in, adding to the index as needed
func (enc *DictInt96Encoder) PutDictionary(arrow.Array) error
PutDictionary allows pre-seeding a dictionary encoder with a dictionary from an Arrow Array.
The passed in array must not have any nulls and this can only be called on an empty encoder.
func (d *DictInt96Encoder) PutIndices(data arrow.Array) error
func (enc *DictInt96Encoder) PutSpaced(in []parquet.Int96, validBits []byte, validBitsOffset int64)
PutSpaced is like Put but assumes the input has slots left open for null values, using the bitmap provided to skip them.
func (d *DictInt96Encoder) Release()
func (d *DictInt96Encoder) Reset()
Reset drops all the currently encoded values from the index and indexes from the data to allow restarting the encoding process.
func (enc *DictInt96Encoder) Type() parquet.Type
Type returns the underlying physical type that can be encoded with this encoder
func (enc *DictInt96Encoder) WriteDict(out []byte)
WriteDict populates the byte slice with the dictionary index
func (d *DictInt96Encoder) WriteIndices(out []byte) (int, error)
WriteIndices performs run length encoding on the indexes and then writes the encoded index value data to the provided byte slice, returning the number of bytes actually written. If any error is encountered, it will return -1 and the error.
EncoderTraits is an interface for the different types to make it more convenient to construct encoders for specific types.
type EncoderTraits interface { Encoder(format.Encoding, bool, *schema.Column, memory.Allocator) TypedEncoder }
FixedLenByteArrayDecoder is the interface for all encoding types that implement decoding parquet.FixedLenByteArray values.
type FixedLenByteArrayDecoder interface { TypedDecoder Decode([]parquet.FixedLenByteArray) (int, error) DecodeSpaced([]parquet.FixedLenByteArray, int, []byte, int64) (int, error) }
FixedLenByteArrayDictConverter is a helper for dictionary handling which is used for converting run length encoded indexes into the actual values that are stored in the dictionary index page.
type FixedLenByteArrayDictConverter struct {
// contains filtered or unexported fields
}
func (dc *FixedLenByteArrayDictConverter) Copy(out interface{}, vals []utils.IndexType) error
Copy populates the slice provided with the values in the dictionary at the indexes in the vals slice.
func (dc *FixedLenByteArrayDictConverter) Fill(out interface{}, val utils.IndexType) error
Fill populates the slice passed in entirely with the value at dictionary index indicated by val
func (dc *FixedLenByteArrayDictConverter) FillZero(out interface{})
FillZero populates the entire slice of out with the zero value for parquet.FixedLenByteArray
func (dc *FixedLenByteArrayDictConverter) IsValid(idxes ...utils.IndexType) bool
IsValid verifies that the set of indexes passed in are all valid indexes in the dictionary and if necessary decodes dictionary indexes up to the index requested.
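The three converter operations (Copy, Fill and IsValid) can be modelled in a few lines. The dictConverter type below is a hypothetical, simplified stand-in: it uses [][]byte for fixed length values and int32 for indexes instead of interface{} and utils.IndexType, and it assumes the whole dictionary is already decoded.

```go
package main

import (
	"errors"
	"fmt"
)

// dictConverter is a hypothetical model of a dictionary converter whose
// dictionary has already been fully decoded into dict.
type dictConverter struct {
	dict [][]byte
}

// IsValid reports whether every index refers to an entry in the dictionary.
func (dc *dictConverter) IsValid(idxes ...int32) bool {
	for _, idx := range idxes {
		if idx < 0 || int(idx) >= len(dc.dict) {
			return false
		}
	}
	return true
}

// Copy writes dict[vals[i]] into out[i] for every index in vals.
func (dc *dictConverter) Copy(out [][]byte, vals []int32) error {
	if len(out) < len(vals) {
		return errors.New("output slice too small")
	}
	if !dc.IsValid(vals...) {
		return errors.New("dictionary index out of range")
	}
	for i, idx := range vals {
		out[i] = dc.dict[idx]
	}
	return nil
}

// Fill writes the single value dict[val] into every slot of out, which is
// what a run of repeated indexes expands to.
func (dc *dictConverter) Fill(out [][]byte, val int32) error {
	if !dc.IsValid(val) {
		return errors.New("dictionary index out of range")
	}
	for i := range out {
		out[i] = dc.dict[val]
	}
	return nil
}

func main() {
	dc := &dictConverter{dict: [][]byte{[]byte("aa"), []byte("bb"), []byte("cc")}}
	out := make([][]byte, 3)
	fmt.Println(dc.Copy(out, []int32{2, 0, 1}), string(out[0]), string(out[1]), string(out[2]))
	fmt.Println(dc.Fill(out, 1), string(out[0]), string(out[1]), string(out[2]))
}
```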
FixedLenByteArrayEncoder is the interface for all encoding types that implement encoding parquet.FixedLenByteArray values.
type FixedLenByteArrayEncoder interface { TypedEncoder Put([]parquet.FixedLenByteArray) PutSpaced([]parquet.FixedLenByteArray, []byte, int64) }
Float32Decoder is the interface for all encoding types that implement decoding float32 values.
type Float32Decoder interface { TypedDecoder Decode([]float32) (int, error) DecodeSpaced([]float32, int, []byte, int64) (int, error) }
Float32DictConverter is a helper for dictionary handling which is used for converting run length encoded indexes into the actual values that are stored in the dictionary index page.
type Float32DictConverter struct {
// contains filtered or unexported fields
}
func (dc *Float32DictConverter) Copy(out interface{}, vals []utils.IndexType) error
Copy populates the slice provided with the values in the dictionary at the indexes in the vals slice.
func (dc *Float32DictConverter) Fill(out interface{}, val utils.IndexType) error
Fill populates the slice passed in entirely with the value at dictionary index indicated by val
func (dc *Float32DictConverter) FillZero(out interface{})
FillZero populates the entire slice of out with the zero value for float32
func (dc *Float32DictConverter) IsValid(idxes ...utils.IndexType) bool
IsValid verifies that the set of indexes passed in are all valid indexes in the dictionary and if necessary decodes dictionary indexes up to the index requested.
Float32Encoder is the interface for all encoding types that implement encoding float32 values.
type Float32Encoder interface { TypedEncoder Put([]float32) PutSpaced([]float32, []byte, int64) }
Float64Decoder is the interface for all encoding types that implement decoding float64 values.
type Float64Decoder interface { TypedDecoder Decode([]float64) (int, error) DecodeSpaced([]float64, int, []byte, int64) (int, error) }
Float64DictConverter is a helper for dictionary handling which is used for converting run length encoded indexes into the actual values that are stored in the dictionary index page.
type Float64DictConverter struct {
// contains filtered or unexported fields
}
func (dc *Float64DictConverter) Copy(out interface{}, vals []utils.IndexType) error
Copy populates the slice provided with the values in the dictionary at the indexes in the vals slice.
func (dc *Float64DictConverter) Fill(out interface{}, val utils.IndexType) error
Fill populates the slice passed in entirely with the value at dictionary index indicated by val
func (dc *Float64DictConverter) FillZero(out interface{})
FillZero populates the entire slice of out with the zero value for float64
func (dc *Float64DictConverter) IsValid(idxes ...utils.IndexType) bool
IsValid verifies that the set of indexes passed in are all valid indexes in the dictionary and if necessary decodes dictionary indexes up to the index requested.
Float64Encoder is the interface for all encoding types that implement encoding float64 values.
type Float64Encoder interface { TypedEncoder Put([]float64) PutSpaced([]float64, []byte, int64) }
Int32Decoder is the interface for all encoding types that implement decoding int32 values.
type Int32Decoder interface { TypedDecoder Decode([]int32) (int, error) DecodeSpaced([]int32, int, []byte, int64) (int, error) }
Int32DictConverter is a helper for dictionary handling which is used for converting run length encoded indexes into the actual values that are stored in the dictionary index page.
type Int32DictConverter struct {
// contains filtered or unexported fields
}
func (dc *Int32DictConverter) Copy(out interface{}, vals []utils.IndexType) error
Copy populates the slice provided with the values in the dictionary at the indexes in the vals slice.
func (dc *Int32DictConverter) Fill(out interface{}, val utils.IndexType) error
Fill populates the slice passed in entirely with the value at dictionary index indicated by val
func (dc *Int32DictConverter) FillZero(out interface{})
FillZero populates the entire slice of out with the zero value for int32
func (dc *Int32DictConverter) IsValid(idxes ...utils.IndexType) bool
IsValid verifies that the set of indexes passed in are all valid indexes in the dictionary and if necessary decodes dictionary indexes up to the index requested.
Int32Encoder is the interface for all encoding types that implement encoding int32 values.
type Int32Encoder interface { TypedEncoder Put([]int32) PutSpaced([]int32, []byte, int64) }
Int64Decoder is the interface for all encoding types that implement decoding int64 values.
type Int64Decoder interface { TypedDecoder Decode([]int64) (int, error) DecodeSpaced([]int64, int, []byte, int64) (int, error) }
Int64DictConverter is a helper for dictionary handling which is used for converting run length encoded indexes into the actual values that are stored in the dictionary index page.
type Int64DictConverter struct {
// contains filtered or unexported fields
}
func (dc *Int64DictConverter) Copy(out interface{}, vals []utils.IndexType) error
Copy populates the slice provided with the values in the dictionary at the indexes in the vals slice.
func (dc *Int64DictConverter) Fill(out interface{}, val utils.IndexType) error
Fill populates the slice passed in entirely with the value at dictionary index indicated by val
func (dc *Int64DictConverter) FillZero(out interface{})
FillZero populates the entire slice of out with the zero value for int64
func (dc *Int64DictConverter) IsValid(idxes ...utils.IndexType) bool
IsValid verifies that the set of indexes passed in are all valid indexes in the dictionary and if necessary decodes dictionary indexes up to the index requested.
Int64Encoder is the interface for all encoding types that implement encoding int64 values.
type Int64Encoder interface { TypedEncoder Put([]int64) PutSpaced([]int64, []byte, int64) }
Int96Decoder is the interface for all encoding types that implement decoding parquet.Int96 values.
type Int96Decoder interface { TypedDecoder Decode([]parquet.Int96) (int, error) DecodeSpaced([]parquet.Int96, int, []byte, int64) (int, error) }
Int96DictConverter is a helper for dictionary handling which is used for converting run length encoded indexes into the actual values that are stored in the dictionary index page.
type Int96DictConverter struct {
// contains filtered or unexported fields
}
func (dc *Int96DictConverter) Copy(out interface{}, vals []utils.IndexType) error
Copy populates the slice provided with the values in the dictionary at the indexes in the vals slice.
func (dc *Int96DictConverter) Fill(out interface{}, val utils.IndexType) error
Fill populates the slice passed in entirely with the value at dictionary index indicated by val
func (dc *Int96DictConverter) FillZero(out interface{})
FillZero populates the entire slice of out with the zero value for parquet.Int96
func (dc *Int96DictConverter) IsValid(idxes ...utils.IndexType) bool
IsValid verifies that the set of indexes passed in are all valid indexes in the dictionary and if necessary decodes dictionary indexes up to the index requested.
Int96Encoder is the interface for all encoding types that implement encoding parquet.Int96 values.
type Int96Encoder interface { TypedEncoder Put([]parquet.Int96) PutSpaced([]parquet.Int96, []byte, int64) }
LevelDecoder handles the decoding of repetition and definition levels from a parquet file supporting bit packed and run length encoded values.
type LevelDecoder struct {
// contains filtered or unexported fields
}
func (l *LevelDecoder) Decode(levels []int16) (int, int64)
Decode decodes the bytes that were set with SetData into the slice of levels returning the total number of levels that were decoded and the number of values which had a level equal to the max level, indicating how many physical values exist to be read.
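The second return value described above is a count of levels equal to the maximum level. A minimal standalone sketch of that counting step, using the hypothetical helper countPhysicalValues on levels that have already been decoded, looks like this:

```go
package main

import "fmt"

// countPhysicalValues is a hypothetical helper that counts how many levels
// equal maxLvl. For definition levels this is the number of non-null values
// physically stored in the page.
func countPhysicalValues(levels []int16, maxLvl int16) int64 {
	var n int64
	for _, lvl := range levels {
		if lvl == maxLvl {
			n++
		}
	}
	return n
}

func main() {
	// Definition levels for five slots of a column with max level 2.
	levels := []int16{2, 1, 2, 0, 2}
	fmt.Println(len(levels), countPhysicalValues(levels, 2)) // prints 5 3
}
```

Here five levels were decoded, but only three of them reach the maximum level, so only three physical values are available to read.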
func (l *LevelDecoder) SetData(encoding parquet.Encoding, maxLvl int16, nbuffered int, data []byte) (int, error)
SetData sets the data to be decoded by subsequent calls, specifying the encoding type, the maximum level (which determines the bit width), the number of values expected, and the raw bytes to decode. It returns the number of bytes expected to be decoded.
func (l *LevelDecoder) SetDataV2(nbytes int32, maxLvl int16, nbuffered int, data []byte) error
SetDataV2 is the same as SetData but only for DataPageV2 pages and only supports run length encoding.
LevelEncoder is for handling the encoding of Definition and Repetition levels to parquet files.
type LevelEncoder struct {
// contains filtered or unexported fields
}
func (l *LevelEncoder) Encode(lvls []int16) (nencoded int, err error)
Encode encodes the slice of definition or repetition levels based on the currently configured encoding type and returns the number of values that were encoded.
func (l *LevelEncoder) EncodeNoFlush(lvls []int16) (nencoded int, err error)
EncodeNoFlush encodes the provided levels in the encoder, but doesn't flush the buffer and return it yet, appending these encoded values. Returns the number of values encoded and any error encountered or nil. If err is not nil, nencoded will be the number of values encoded before the error was encountered
func (l *LevelEncoder) Flush()
Flush flushes out any encoded data to the underlying writer.
func (l *LevelEncoder) Init(encoding parquet.Encoding, maxLvl int16, w utils.WriterAtWithLen)
Init is called to set up the desired encoding type, max level and underlying writer for a level encoder to control where the resulting encoded buffer will end up.
func (l *LevelEncoder) Len() int
Len returns the number of bytes that were written as run length encoded levels. This is only valid for run length encoding and will panic if the deprecated bit packed encoding is in use.
func (l *LevelEncoder) Reset(maxLvl int16)
Reset resets the encoder allowing it to be reused and updating the maxlevel to the new specified value.
MemoTable is an interface that can be used to swap out implementations of the hash table used for handling dictionary encoding. Dictionary encoding is built against this interface to make code generation and changing implementations easy.
Implementations should remember the order in which values are inserted so that a valid dictionary index can be generated.
type MemoTable interface { // Reset drops everything in the table allowing it to be reused Reset() // Size returns the current number of unique values stored in the table // including whether or not a null value has been passed in using GetOrInsertNull Size() int // CopyValues populates out with the values currently in the table, out must // be a slice of the appropriate type for the table type. CopyValues(out interface{}) // CopyValuesSubset is like CopyValues but only copies a subset of values starting // at the indicated index. CopyValuesSubset(start int, out interface{}) WriteOut(out []byte) WriteOutSubset(start int, out []byte) // Get returns the index of the table the specified value is, and a boolean indicating // whether or not the value was found in the table. Will panic if val is not the appropriate // type for the underlying table. Get(val interface{}) (int, bool) // GetOrInsert is the same as Get, except if the value is not currently in the table it will // be inserted into the table. GetOrInsert(val interface{}) (idx int, existed bool, err error) // GetNull returns the index of the null value and whether or not it was found in the table GetNull() (int, bool) // GetOrInsertNull returns the index of the null value, if it didn't already exist in the table, // it is inserted. GetOrInsertNull() (idx int, existed bool) }
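To make the insertion-order requirement concrete, the sketch below models a memo table for int32 values with a map plus a slice. The int32Memo type is hypothetical and much simpler than the interface above (no interface{} arguments, no error return); storing a placeholder entry for null so that Size counts it is an assumption of this sketch.

```go
package main

import "fmt"

// int32Memo is a hypothetical, minimal memo table: index maps a value to its
// dictionary position and values keeps the insertion order.
type int32Memo struct {
	index   map[int32]int
	values  []int32
	nullIdx int // -1 until a null has been inserted
}

func newInt32Memo() *int32Memo {
	return &int32Memo{index: map[int32]int{}, nullIdx: -1}
}

// GetOrInsert returns the position of v, inserting it at the end if needed.
func (m *int32Memo) GetOrInsert(v int32) (int, bool) {
	if pos, ok := m.index[v]; ok {
		return pos, true
	}
	pos := len(m.values)
	m.values = append(m.values, v)
	m.index[v] = pos
	return pos, false
}

// GetOrInsertNull returns the position reserved for null, creating it with a
// placeholder value so that later positions stay stable.
func (m *int32Memo) GetOrInsertNull() (int, bool) {
	if m.nullIdx >= 0 {
		return m.nullIdx, true
	}
	m.nullIdx = len(m.values)
	m.values = append(m.values, 0)
	return m.nullIdx, false
}

// Size returns the number of entries, including the null entry if present.
func (m *int32Memo) Size() int { return len(m.values) }

func main() {
	m := newInt32Memo()
	for _, v := range []int32{7, 3, 7, 9} {
		pos, existed := m.GetOrInsert(v)
		fmt.Println(v, pos, existed)
	}
	fmt.Println(m.GetOrInsertNull())
	fmt.Println(m.Size(), m.values)
}
```

Because positions are handed out in insertion order and never change, the positions returned by GetOrInsert can be written directly as dictionary indexes.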
func NewFloat32Dictionary() MemoTable
NewFloat32Dictionary returns a memotable interface for use with Float32 values only
func NewFloat32MemoTable(memory.Allocator) MemoTable
func NewFloat64Dictionary() MemoTable
NewFloat64Dictionary returns a memotable interface for use with Float64 values only
func NewFloat64MemoTable(memory.Allocator) MemoTable
func NewInt32Dictionary() MemoTable
NewInt32Dictionary returns a memotable interface for use with Int32 values only
func NewInt32MemoTable(memory.Allocator) MemoTable
func NewInt64Dictionary() MemoTable
NewInt64Dictionary returns a memotable interface for use with Int64 values only
func NewInt64MemoTable(memory.Allocator) MemoTable
func NewInt96MemoTable(memory.Allocator) MemoTable
type NumericMemoTable interface { MemoTable // WriteOutLE writes the contents of the memo table out to the byteslice // but ensures the values are little-endian before writing them (converting // if on a big endian system). WriteOutLE(out []byte) // WriteOutSubsetLE writes the contents of the memo table out to the byteslice // starting with the index indicated by start, but ensures the values are little // endian before writing them (converting if on a big-endian system). WriteOutSubsetLE(start int, out []byte) }
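Writing values in little-endian order regardless of the host byte order can be done with encoding/binary from the standard library. The helper writeOutLE below is a hypothetical sketch for int32 values held in a plain slice; it is not the package's implementation.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// writeOutLE is a hypothetical helper that writes each value into out as four
// little-endian bytes. out must hold at least 4*len(values) bytes.
func writeOutLE(values []int32, out []byte) {
	for i, v := range values {
		binary.LittleEndian.PutUint32(out[i*4:], uint32(v))
	}
}

func main() {
	values := []int32{1, 258, -1}
	out := make([]byte, 4*len(values))
	writeOutLE(values, out)
	fmt.Printf("% x\n", out) // prints 01 00 00 00 02 01 00 00 ff ff ff ff
}
```

binary.LittleEndian produces the same bytes on little-endian and big-endian hosts, which is the property the LE variants above are described as guaranteeing.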
PlainBooleanDecoder is for the Plain Encoding type, there is no dictionary decoding for bools.
type PlainBooleanDecoder struct {
// contains filtered or unexported fields
}
func (dec *PlainBooleanDecoder) Decode(out []bool) (int, error)
Decode fills out with bools decoded from the data at the current point or until we reach the end of the data.
Returns the number of values decoded
func (dec *PlainBooleanDecoder) DecodeSpaced(out []bool, nullCount int, validBits []byte, validBitsOffset int64) (int, error)
DecodeSpaced is like Decode except it expands the values to leave spaces for null as determined by the validBits bitmap.
func (d *PlainBooleanDecoder) Encoding() parquet.Encoding
Encoding returns the encoding type used by this decoder to decode the bytes.
func (dec *PlainBooleanDecoder) SetData(nvals int, data []byte) error
func (PlainBooleanDecoder) Type() parquet.Type
Type for the PlainBooleanDecoder is parquet.Types.Boolean
func (d *PlainBooleanDecoder) ValuesLeft() int
ValuesLeft returns the number of remaining values that can be decoded
PlainBooleanEncoder encodes bools as a bitmap as per the Plain Encoding
type PlainBooleanEncoder struct {
// contains filtered or unexported fields
}
func (e *PlainBooleanEncoder) Allocator() memory.Allocator
func (e *PlainBooleanEncoder) Bytes() []byte
Bytes returns the current bytes that have been written to the encoder's buffer but doesn't transfer ownership.
func (e *PlainBooleanEncoder) Encoding() parquet.Encoding
func (enc *PlainBooleanEncoder) EstimatedDataEncodedSize() int64
EstimatedDataEncodedSize returns the current number of bytes that have been buffered so far
func (enc *PlainBooleanEncoder) FlushValues() (Buffer, error)
FlushValues returns the buffered data. The caller is responsible for releasing the buffer memory.
func (enc *PlainBooleanEncoder) Put(in []bool)
Put encodes the contents of in into the underlying data buffer.
func (enc *PlainBooleanEncoder) PutSpaced(in []bool, validBits []byte, validBitsOffset int64)
PutSpaced uses the validBits bitmap to determine which values in the slice are null, leaves those out, and encodes the remaining values.
func (e *PlainBooleanEncoder) Release()
func (e *PlainBooleanEncoder) ReserveForWrite(n int)
ReserveForWrite allocates n bytes so that the next n bytes written do not require new allocations.
func (e *PlainBooleanEncoder) Reset()
Reset drops the data currently in the encoder and resets for new use.
func (PlainBooleanEncoder) Type() parquet.Type
Type for the PlainBooleanEncoder is parquet.Types.Boolean
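To make "encodes bools as a bitmap" concrete, the following sketch packs a []bool into bytes, least significant bit first. packBools is a hypothetical helper written for this illustration and says nothing about how the encoder manages its internal buffer.

```go
package main

// packBools is a hypothetical helper: it stores in[i] in bit i%8 of byte i/8,
// so eight booleans fit in each output byte. Unused trailing bits stay zero.
func packBools(in []bool) []byte {
	out := make([]byte, (len(in)+7)/8)
	for i, v := range in {
		if v {
			out[i/8] |= 1 << uint(i%8)
		}
	}
	return out
}
```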
PlainByteArrayDecoder decodes a data chunk for bytearrays according to the plain encoding. The byte arrays will use slices to reference the data rather than copying it.
The Parquet spec defines the Plain encoding for ByteArrays as a 4-byte little-endian integer containing the length of the byte array, followed by that many bytes of raw data.
type PlainByteArrayDecoder struct {
// contains filtered or unexported fields
}
func (pbad *PlainByteArrayDecoder) Decode(out []parquet.ByteArray) (int, error)
Decode will populate the slice of bytearrays in full or until the number of values is consumed.
Returns the number of values that were decoded.
func (pbad *PlainByteArrayDecoder) DecodeSpaced(out []parquet.ByteArray, nullCount int, validBits []byte, validBitsOffset int64) (int, error)
DecodeSpaced is like Decode, but expands the slice out to leave empty values where the validBits bitmap has 0s
func (d *PlainByteArrayDecoder) Encoding() parquet.Encoding
Encoding returns the encoding type used by this decoder to decode the bytes.
func (d *PlainByteArrayDecoder) SetData(nvals int, data []byte) error
SetData sets the data for decoding into the decoder to update the available data bytes and number of values available.
func (PlainByteArrayDecoder) Type() parquet.Type
Type returns parquet.Types.ByteArray for this decoder
func (d *PlainByteArrayDecoder) ValuesLeft() int
ValuesLeft returns the number of remaining values that can be decoded
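The layout described for PlainByteArrayDecoder (a 4-byte little-endian length followed by the raw bytes, with results referencing the input rather than copying it) can be sketched with the standard library alone. decodePlainByteArrays is a hypothetical function that uses [][]byte in place of parquet.ByteArray; the error handling is an assumption of this sketch.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// decodePlainByteArrays is a hypothetical helper. It fills out with values
// read from data and returns the number of values decoded and the number of
// bytes consumed. Every returned slice aliases data; nothing is copied.
func decodePlainByteArrays(data []byte, out [][]byte) (int, int, error) {
	pos := 0
	for i := range out {
		if pos == len(data) {
			return i, pos, nil // input exhausted before out was full
		}
		if len(data)-pos < 4 {
			return i, pos, errors.New("truncated length prefix")
		}
		n := int(int32(binary.LittleEndian.Uint32(data[pos:])))
		pos += 4
		if n < 0 || n > len(data)-pos {
			return i, pos, errors.New("invalid value length")
		}
		// The three-index slice caps capacity so that an append by the
		// caller cannot overwrite the bytes of the following value.
		out[i] = data[pos : pos+n : pos+n]
		pos += n
	}
	return len(out), pos, nil
}
```

Because the results alias the input, the input buffer must stay alive and unmodified for as long as the decoded values are in use.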
PlainByteArrayEncoder encodes byte arrays according to the spec for the Plain encoding, writing the length as a 4-byte int32 followed by the bytes of the value.
type PlainByteArrayEncoder struct {
// contains filtered or unexported fields
}
func (e *PlainByteArrayEncoder) Allocator() memory.Allocator
func (e *PlainByteArrayEncoder) Bytes() []byte
Bytes returns the current bytes that have been written to the encoder's buffer but doesn't transfer ownership.
func (e *PlainByteArrayEncoder) Encoding() parquet.Encoding
func (e *PlainByteArrayEncoder) EstimatedDataEncodedSize() int64
func (e *PlainByteArrayEncoder) FlushValues() (Buffer, error)
FlushValues flushes any unwritten data to the buffer and returns the finished buffer of encoded data. It also clears the encoder. Ownership of the data passes to the caller, who should call Release on the returned Buffer when done.
func (enc *PlainByteArrayEncoder) Put(in []parquet.ByteArray)
Put writes out all of the values in this slice to the encoding sink
func (enc *PlainByteArrayEncoder) PutByteArray(val parquet.ByteArray)
PutByteArray writes out the 4 bytes for the length followed by the data
func (enc *PlainByteArrayEncoder) PutSpaced(in []parquet.ByteArray, validBits []byte, validBitsOffset int64)
PutSpaced uses the bitmap of validBits to leave out anything that is null according to the bitmap.
If validBits is nil, this is equivalent to calling Put
func (e *PlainByteArrayEncoder) Release()
func (e *PlainByteArrayEncoder) ReserveForWrite(n int)
ReserveForWrite allocates n bytes so that the next n bytes written do not require new allocations.
func (e *PlainByteArrayEncoder) Reset()
Reset drops the data currently in the encoder and resets for new use.
func (PlainByteArrayEncoder) Type() parquet.Type
Type returns parquet.Types.ByteArray for the bytearray encoder
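The PutByteArray and PutSpaced behavior described above, including the rule that a nil validBits is equivalent to Put, can be sketched as follows. putByteArray and putSpaced are hypothetical helpers that append to a plain []byte sink and use [][]byte in place of parquet.ByteArray.

```go
package main

import "encoding/binary"

// putByteArray is a hypothetical helper: it appends the 4-byte little-endian
// length of val followed by val itself.
func putByteArray(sink []byte, val []byte) []byte {
	var hdr [4]byte
	binary.LittleEndian.PutUint32(hdr[:], uint32(len(val)))
	sink = append(sink, hdr[:]...)
	return append(sink, val...)
}

// putSpaced appends only the entries whose bit is set in validBits, where the
// bit for in[i] sits at position validBitsOffset+i. With a nil bitmap every
// entry is written, which matches a plain Put.
func putSpaced(sink []byte, in [][]byte, validBits []byte, validBitsOffset int64) []byte {
	for i, v := range in {
		bit := validBitsOffset + int64(i)
		if validBits != nil && validBits[bit/8]&(1<<uint(bit%8)) == 0 {
			continue // null slot: nothing is written, not even a length
		}
		sink = putByteArray(sink, v)
	}
	return sink
}
```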
PlainFixedLenByteArrayDecoder is a plain encoding decoder for Fixed Length Byte Arrays
type PlainFixedLenByteArrayDecoder struct {
// contains filtered or unexported fields
}
func (pflba *PlainFixedLenByteArrayDecoder) Decode(out []parquet.FixedLenByteArray) (int, error)
Decode populates out with fixed length byte array values until either there are no more values to decode or out has been filled. It returns the total number of values decoded.
func (pflba *PlainFixedLenByteArrayDecoder) DecodeSpaced(out []parquet.FixedLenByteArray, nullCount int, validBits []byte, validBitsOffset int64) (int, error)
DecodeSpaced does the same as Decode but spaces out the resulting slice according to the bitmap leaving space for null values
func (d *PlainFixedLenByteArrayDecoder) Encoding() parquet.Encoding
Encoding returns the encoding type used by this decoder to decode the bytes.
func (d *PlainFixedLenByteArrayDecoder) SetData(nvals int, data []byte) error
SetData sets the data for decoding into the decoder to update the available data bytes and number of values available.
func (PlainFixedLenByteArrayDecoder) Type() parquet.Type
Type returns the physical type this decoder operates on, FixedLength Byte Arrays
func (d *PlainFixedLenByteArrayDecoder) ValuesLeft() int
ValuesLeft returns the number of remaining values that can be decoded
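Since fixed length byte arrays carry no length prefix, decoding amounts to slicing the input at a constant stride. The sketch below illustrates that idea; decodeFixedLen and its typeLength parameter handling are assumptions of this example, not the decoder's actual code.

```go
package main

import "errors"

// decodeFixedLen is a hypothetical helper. It fills out with consecutive
// typeLength-byte windows of data and returns how many values were produced,
// which is the smaller of len(out) and the number of whole values in data.
func decodeFixedLen(data []byte, typeLength int, out [][]byte) (int, error) {
	if typeLength <= 0 {
		return 0, errors.New("typeLength must be positive")
	}
	n := len(data) / typeLength
	if len(out) < n {
		n = len(out)
	}
	for i := 0; i < n; i++ {
		out[i] = data[i*typeLength : (i+1)*typeLength]
	}
	return n, nil
}
```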
PlainFixedLenByteArrayEncoder writes the raw bytes of the byte array always writing typeLength bytes for each value.
type PlainFixedLenByteArrayEncoder struct {
// contains filtered or unexported fields
}
func (e *PlainFixedLenByteArrayEncoder) Allocator() memory.Allocator
func (e *PlainFixedLenByteArrayEncoder) Bytes() []byte
Bytes returns the current bytes that have been written to the encoder's buffer but doesn't transfer ownership.
func (e *PlainFixedLenByteArrayEncoder) Encoding() parquet.Encoding
func (e *PlainFixedLenByteArrayEncoder) EstimatedDataEncodedSize() int64
func (e *PlainFixedLenByteArrayEncoder) FlushValues() (Buffer, error)
FlushValues flushes any unwritten data to the buffer and returns the finished buffer of encoded data. It also clears the encoder. Ownership of the data passes to the caller, who should call Release on the returned Buffer when done.
func (enc *PlainFixedLenByteArrayEncoder) Put(in []parquet.FixedLenByteArray)
Put writes the provided values to the encoder
func (enc *PlainFixedLenByteArrayEncoder) PutSpaced(in []parquet.FixedLenByteArray, validBits []byte, validBitsOffset int64)
PutSpaced is like Put but works with data that is spaced out according to the passed in bitmap
func (e *PlainFixedLenByteArrayEncoder) Release()
func (e *PlainFixedLenByteArrayEncoder) ReserveForWrite(n int)
ReserveForWrite allocates n bytes so that the next n bytes written do not require new allocations.
func (e *PlainFixedLenByteArrayEncoder) Reset()
Reset drops the data currently in the encoder and resets for new use.
func (PlainFixedLenByteArrayEncoder) Type() parquet.Type
Type returns the underlying physical type this encoder works with, Fixed Length byte arrays.
PlainFloat32Decoder is a decoder specifically for decoding Plain Encoding data of float32 type.
type PlainFloat32Decoder struct {
// contains filtered or unexported fields
}
func (dec *PlainFloat32Decoder) Decode(out []float32) (int, error)
Decode populates the given slice with values from the data to be decoded, decoding min(len(out), remaining values) values. It returns the number of values actually decoded and any error encountered.
func (dec *PlainFloat32Decoder) DecodeSpaced(out []float32, nullCount int, validBits []byte, validBitsOffset int64) (int, error)
DecodeSpaced is the same as Decode, except that it expands the data to leave spaces for null values as defined by the provided bitmap.
func (d *PlainFloat32Decoder) Encoding() parquet.Encoding
Encoding returns the encoding type used by this decoder to decode the bytes.
func (d *PlainFloat32Decoder) SetData(nvals int, data []byte) error
SetData sets the data for decoding into the decoder to update the available data bytes and number of values available.
func (PlainFloat32Decoder) Type() parquet.Type
Type returns the physical type this decoder is able to decode for
func (d *PlainFloat32Decoder) ValuesLeft() int
ValuesLeft returns the number of remaining values that can be decoded
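One way to realize the "expand to leave spaces for nulls" behavior of DecodeSpaced is to decode the non-null values densely into the front of out and then move them to their final slots from back to front. The sketch below shows that technique for float32; spacedExpand and decodeSpaced are hypothetical helpers, and the choices to zero the null slots and to return the count of non-null values are assumptions of this example.

```go
package main

import (
	"encoding/binary"
	"errors"
	"math"
)

// spacedExpand assumes the first len(out)-nullCount entries of out hold the
// dense values. Walking backwards, it moves each one to the next slot whose
// bit is set in validBits and zeroes the null slots.
func spacedExpand(out []float32, nullCount int, validBits []byte, validBitsOffset int64) {
	src := len(out) - nullCount - 1 // index of the last dense value
	for dst := len(out) - 1; dst >= 0; dst-- {
		bit := validBitsOffset + int64(dst)
		if src >= 0 && validBits[bit/8]&(1<<uint(bit%8)) != 0 {
			out[dst] = out[src]
			src--
		} else {
			out[dst] = 0
		}
	}
}

// decodeSpaced reads len(out)-nullCount little-endian float32 values from
// data and spreads them across out. It returns the number of non-null values.
func decodeSpaced(data []byte, out []float32, nullCount int, validBits []byte, validBitsOffset int64) (int, error) {
	n := len(out) - nullCount
	if n < 0 || len(data) < n*4 {
		return 0, errors.New("not enough data for the requested values")
	}
	for i := 0; i < n; i++ {
		out[i] = math.Float32frombits(binary.LittleEndian.Uint32(data[i*4:]))
	}
	spacedExpand(out, nullCount, validBits, validBitsOffset)
	return n, nil
}
```

Working from the back means a dense value is never overwritten before it has been moved, so no scratch buffer is needed.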
PlainFloat32Encoder is an encoder for float32 values using the Plain encoding, which in general just stores the values as raw bytes of the appropriate size.
type PlainFloat32Encoder struct {
// contains filtered or unexported fields
}
func (e *PlainFloat32Encoder) Allocator() memory.Allocator
func (e *PlainFloat32Encoder) Bytes() []byte
Bytes returns the current bytes that have been written to the encoder's buffer but doesn't transfer ownership.
func (e *PlainFloat32Encoder) Encoding() parquet.Encoding
func (e *PlainFloat32Encoder) EstimatedDataEncodedSize() int64
func (e *PlainFloat32Encoder) FlushValues() (Buffer, error)
FlushValues flushes any unwritten data to the buffer and returns the finished buffer of encoded data. It also clears the encoder. Ownership of the data passes to the caller, who should call Release on the returned Buffer when done.
func (enc *PlainFloat32Encoder) Put(in []float32)
Put encodes a slice of values into the underlying buffer
func (enc *PlainFloat32Encoder) PutSpaced(in []float32, validBits []byte, validBitsOffset int64)
PutSpaced encodes a slice of values that is spaced out with null slots, as defined by the validBits bitmap starting at the given bit offset. The null slots are removed to compact the values before they are written to the buffer.
func (e *PlainFloat32Encoder) Release()
func (e *PlainFloat32Encoder) ReserveForWrite(n int)
ReserveForWrite allocates n bytes so that the next n bytes written do not require new allocations.
func (e *PlainFloat32Encoder) Reset()
Reset drops the data currently in the encoder and resets for new use.
func (PlainFloat32Encoder) Type() parquet.Type
Type returns the underlying physical type this encoder is able to encode
PlainFloat64Decoder is a decoder specifically for decoding Plain Encoding data of float64 type.
type PlainFloat64Decoder struct {
// contains filtered or unexported fields
}
func (dec *PlainFloat64Decoder) Decode(out []float64) (int, error)
Decode populates the given slice with values from the data to be decoded, decoding min(len(out), remaining values) values. It returns the number of values actually decoded and any error encountered.
func (dec *PlainFloat64Decoder) DecodeSpaced(out []float64, nullCount int, validBits []byte, validBitsOffset int64) (int, error)
DecodeSpaced is the same as Decode, except that it expands the data to leave spaces for null values as defined by the provided bitmap.
func (d *PlainFloat64Decoder) Encoding() parquet.Encoding
Encoding returns the encoding type used by this decoder to decode the bytes.
func (d *PlainFloat64Decoder) SetData(nvals int, data []byte) error
SetData sets the data for decoding into the decoder to update the available data bytes and number of values available.
func (PlainFloat64Decoder) Type() parquet.Type
Type returns the physical type this decoder is able to decode for
func (d *PlainFloat64Decoder) ValuesLeft() int
ValuesLeft returns the number of remaining values that can be decoded
PlainFloat64Encoder is an encoder for float64 values using the Plain encoding, which in general just stores the values as raw bytes of the appropriate size.
type PlainFloat64Encoder struct {
// contains filtered or unexported fields
}
func (e *PlainFloat64Encoder) Allocator() memory.Allocator
func (e *PlainFloat64Encoder) Bytes() []byte
Bytes returns the current bytes that have been written to the encoder's buffer but doesn't transfer ownership.
func (e *PlainFloat64Encoder) Encoding() parquet.Encoding
func (e *PlainFloat64Encoder) EstimatedDataEncodedSize() int64
func (e *PlainFloat64Encoder) FlushValues() (Buffer, error)
FlushValues flushes any unwritten data to the buffer and returns the finished buffer of encoded data. It also clears the encoder. Ownership of the data passes to the caller, who should call Release on the returned Buffer when done.
func (enc *PlainFloat64Encoder) Put(in []float64)
Put encodes a slice of values into the underlying buffer
func (enc *PlainFloat64Encoder) PutSpaced(in []float64, validBits []byte, validBitsOffset int64)
PutSpaced encodes a slice of values that is spaced out with null slots, as defined by the validBits bitmap starting at the given bit offset. The null slots are removed to compact the values before they are written to the buffer.
func (e *PlainFloat64Encoder) Release()
func (e *PlainFloat64Encoder) ReserveForWrite(n int)
ReserveForWrite allocates n bytes so that the next n bytes written do not require new allocations.
func (e *PlainFloat64Encoder) Reset()
Reset drops the data currently in the encoder and resets for new use.
func (PlainFloat64Encoder) Type() parquet.Type
Type returns the underlying physical type this encoder is able to encode
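The two steps that PutSpaced describes for numeric types, dropping the null slots and then writing the remaining values as raw bytes, can be sketched for float64 as below. compressSpaced and putPlain are hypothetical helpers, and the little-endian byte order is the one the Parquet Plain encoding uses.

```go
package main

import (
	"encoding/binary"
	"math"
)

// compressSpaced is a hypothetical helper: it keeps in[i] only when bit
// validBitsOffset+i of validBits is set, preserving the original order.
func compressSpaced(in []float64, validBits []byte, validBitsOffset int64) []float64 {
	out := make([]float64, 0, len(in))
	for i, v := range in {
		bit := validBitsOffset + int64(i)
		if validBits[bit/8]&(1<<uint(bit%8)) != 0 {
			out = append(out, v)
		}
	}
	return out
}

// putPlain appends each value to sink as its 8-byte little-endian IEEE 754
// representation.
func putPlain(sink []byte, in []float64) []byte {
	var b [8]byte
	for _, v := range in {
		binary.LittleEndian.PutUint64(b[:], math.Float64bits(v))
		sink = append(sink, b[:]...)
	}
	return sink
}
```

A spaced write is then the composition putPlain(sink, compressSpaced(in, validBits, validBitsOffset)).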
PlainInt32Decoder is a decoder specifically for decoding Plain Encoding data of int32 type.
type PlainInt32Decoder struct {
// contains filtered or unexported fields
}
func (dec *PlainInt32Decoder) Decode(out []int32) (int, error)
Decode populates the given slice with values from the data to be decoded, decoding min(len(out), remaining values) values. It returns the number of values actually decoded and any error encountered.
func (dec *PlainInt32Decoder) DecodeSpaced(out []int32, nullCount int, validBits []byte, validBitsOffset int64) (int, error)
DecodeSpaced is the same as Decode, except that it expands the data to leave spaces for null values as defined by the provided bitmap.
func (d *PlainInt32Decoder) Encoding() parquet.Encoding
Encoding returns the encoding type used by this decoder to decode the bytes.
func (d *PlainInt32Decoder) SetData(nvals int, data []byte) error
SetData sets the data for decoding into the decoder to update the available data bytes and number of values available.
func (PlainInt32Decoder) Type() parquet.Type
Type returns the physical type this decoder is able to decode for
func (d *PlainInt32Decoder) ValuesLeft() int
ValuesLeft returns the number of remaining values that can be decoded
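The interplay between SetData, Decode and ValuesLeft can be sketched with a small stateful type. plainInt32Decoder below is a hypothetical, simplified decoder written for this illustration; in particular, rejecting short input in SetData is an assumption of the sketch.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// plainInt32Decoder is a hypothetical decoder that tracks the undecoded bytes
// and the number of values still available.
type plainInt32Decoder struct {
	data  []byte
	nvals int
}

// SetData replaces the current input with nvals values stored in data.
func (d *plainInt32Decoder) SetData(nvals int, data []byte) error {
	if nvals < 0 || len(data) < nvals*4 {
		return errors.New("data is too short for the stated number of values")
	}
	d.data, d.nvals = data, nvals
	return nil
}

// ValuesLeft reports how many values have not been decoded yet.
func (d *plainInt32Decoder) ValuesLeft() int { return d.nvals }

// Decode writes min(len(out), ValuesLeft()) values into out and advances.
func (d *plainInt32Decoder) Decode(out []int32) (int, error) {
	n := len(out)
	if d.nvals < n {
		n = d.nvals
	}
	for i := 0; i < n; i++ {
		out[i] = int32(binary.LittleEndian.Uint32(d.data[i*4:]))
	}
	d.data = d.data[n*4:]
	d.nvals -= n
	return n, nil
}
```

Calling Decode repeatedly with a small out slice drains the input in batches until ValuesLeft reaches zero.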
PlainInt32Encoder is an encoder for int32 values using the Plain encoding, which in general just stores the values as raw bytes of the appropriate size.
type PlainInt32Encoder struct {
// contains filtered or unexported fields
}
func (e *PlainInt32Encoder) Allocator() memory.Allocator
func (e *PlainInt32Encoder) Bytes() []byte
Bytes returns the current bytes that have been written to the encoder's buffer but doesn't transfer ownership.
func (e *PlainInt32Encoder) Encoding() parquet.Encoding
func (e *PlainInt32Encoder) EstimatedDataEncodedSize() int64
func (e *PlainInt32Encoder) FlushValues() (Buffer, error)
FlushValues flushes any unwritten data to the buffer and returns the finished buffer of encoded data. It also clears the encoder. Ownership of the data passes to the caller, who should call Release on the returned Buffer when done.
func (enc *PlainInt32Encoder) Put(in []int32)
Put encodes a slice of values into the underlying buffer
func (enc *PlainInt32Encoder) PutSpaced(in []int32, validBits []byte, validBitsOffset int64)
PutSpaced encodes a slice of values that is spaced out with null slots, as defined by the validBits bitmap starting at the given bit offset. The null slots are removed to compact the values before they are written to the buffer.
func (e *PlainInt32Encoder) Release()
func (e *PlainInt32Encoder) ReserveForWrite(n int)
ReserveForWrite allocates n bytes so that the next n bytes written do not require new allocations.
func (e *PlainInt32Encoder) Reset()
Reset drops the data currently in the encoder and resets for new use.
func (PlainInt32Encoder) Type() parquet.Type
Type returns the underlying physical type this encoder is able to encode
PlainInt64Decoder is a decoder specifically for decoding Plain Encoding data of int64 type.
type PlainInt64Decoder struct {
// contains filtered or unexported fields
}
func (dec *PlainInt64Decoder) Decode(out []int64) (int, error)
Decode populates the given slice with values from the data to be decoded, decoding min(len(out), remaining values) values. It returns the number of values actually decoded and any error encountered.
func (dec *PlainInt64Decoder) DecodeSpaced(out []int64, nullCount int, validBits []byte, validBitsOffset int64) (int, error)
DecodeSpaced is the same as Decode, except that it expands the data to leave spaces for null values as defined by the provided bitmap.
func (d *PlainInt64Decoder) Encoding() parquet.Encoding
Encoding returns the encoding type used by this decoder to decode the bytes.
func (d *PlainInt64Decoder) SetData(nvals int, data []byte) error
SetData sets the data for decoding into the decoder to update the available data bytes and number of values available.
func (PlainInt64Decoder) Type() parquet.Type
Type returns the physical type this decoder is able to decode for
func (d *PlainInt64Decoder) ValuesLeft() int
ValuesLeft returns the number of remaining values that can be decoded
PlainInt64Encoder is an encoder for int64 values using the Plain encoding, which in general just stores the values as raw bytes of the appropriate size.
type PlainInt64Encoder struct {
// contains filtered or unexported fields
}
func (e *PlainInt64Encoder) Allocator() memory.Allocator
func (e *PlainInt64Encoder) Bytes() []byte
Bytes returns the current bytes that have been written to the encoder's buffer but doesn't transfer ownership.
func (e *PlainInt64Encoder) Encoding() parquet.Encoding
func (e *PlainInt64Encoder) EstimatedDataEncodedSize() int64
func (e *PlainInt64Encoder) FlushValues() (Buffer, error)
FlushValues flushes any unwritten data to the buffer and returns the finished buffer of encoded data. It also clears the encoder. Ownership of the data passes to the caller, who should call Release on the returned Buffer when done.
func (enc *PlainInt64Encoder) Put(in []int64)
Put encodes a slice of values into the underlying buffer
func (enc *PlainInt64Encoder) PutSpaced(in []int64, validBits []byte, validBitsOffset int64)
PutSpaced encodes a slice of values that is spaced out with null slots, as defined by the validBits bitmap starting at the given bit offset. The null slots are removed to compact the values before they are written to the buffer.
func (e *PlainInt64Encoder) Release()
func (e *PlainInt64Encoder) ReserveForWrite(n int)
ReserveForWrite allocates n bytes so that the next n bytes written do not require new allocations.
func (e *PlainInt64Encoder) Reset()
Reset drops the data currently in the encoder and resets for new use.
func (PlainInt64Encoder) Type() parquet.Type
Type returns the underlying physical type this encoder is able to encode
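The encoder lifecycle described by ReserveForWrite, Put, EstimatedDataEncodedSize and FlushValues can be sketched as follows. plainInt64Encoder is a hypothetical type backed by a plain []byte; it returns the bytes directly instead of a reference-counted Buffer, so there is no Release step in this simplified version.

```go
package main

import "encoding/binary"

// plainInt64Encoder is a hypothetical encoder that buffers encoded bytes.
type plainInt64Encoder struct{ sink []byte }

// ReserveForWrite grows the buffer so the next n bytes need no allocation.
func (e *plainInt64Encoder) ReserveForWrite(n int) {
	if cap(e.sink)-len(e.sink) < n {
		grown := make([]byte, len(e.sink), len(e.sink)+n)
		copy(grown, e.sink)
		e.sink = grown
	}
}

// Put appends each value as 8 little-endian bytes.
func (e *plainInt64Encoder) Put(in []int64) {
	e.ReserveForWrite(len(in) * 8)
	var b [8]byte
	for _, v := range in {
		binary.LittleEndian.PutUint64(b[:], uint64(v))
		e.sink = append(e.sink, b[:]...)
	}
}

// EstimatedDataEncodedSize reports the number of bytes buffered so far.
func (e *plainInt64Encoder) EstimatedDataEncodedSize() int64 {
	return int64(len(e.sink))
}

// FlushValues hands the encoded bytes to the caller and leaves the encoder
// empty, so later Put calls cannot modify the returned slice.
func (e *plainInt64Encoder) FlushValues() []byte {
	out := e.sink
	e.sink = nil
	return out
}
```

Dropping the encoder's reference on flush is what makes the ownership transfer safe: the encoder starts a fresh buffer on its next write.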
PlainInt96Decoder is a decoder specifically for decoding Plain Encoding data of parquet.Int96 type.
type PlainInt96Decoder struct {
// contains filtered or unexported fields
}
func (dec *PlainInt96Decoder) Decode(out []parquet.Int96) (int, error)
Decode populates the given slice with values from the data to be decoded, decoding min(len(out), remaining values) values. It returns the number of values actually decoded and any error encountered.
func (dec *PlainInt96Decoder) DecodeSpaced(out []parquet.Int96, nullCount int, validBits []byte, validBitsOffset int64) (int, error)
DecodeSpaced is the same as Decode, except that it expands the data to leave spaces for null values as defined by the provided bitmap.
func (d *PlainInt96Decoder) Encoding() parquet.Encoding
Encoding returns the encoding type used by this decoder to decode the bytes.
func (d *PlainInt96Decoder) SetData(nvals int, data []byte) error
SetData sets the data for decoding into the decoder to update the available data bytes and number of values available.
func (PlainInt96Decoder) Type() parquet.Type
Type returns the physical type this decoder is able to decode for
func (d *PlainInt96Decoder) ValuesLeft() int
ValuesLeft returns the number of remaining values that can be decoded
PlainInt96Encoder is an encoder for parquet.Int96 values using the Plain encoding, which in general just stores the values as raw bytes of the appropriate size.
type PlainInt96Encoder struct {
// contains filtered or unexported fields
}
func (e *PlainInt96Encoder) Allocator() memory.Allocator
func (e *PlainInt96Encoder) Bytes() []byte
Bytes returns the current bytes that have been written to the encoder's buffer but doesn't transfer ownership.
func (e *PlainInt96Encoder) Encoding() parquet.Encoding
func (e *PlainInt96Encoder) EstimatedDataEncodedSize() int64
func (e *PlainInt96Encoder) FlushValues() (Buffer, error)
FlushValues flushes any unwritten data to the buffer and returns the finished buffer of encoded data. It also clears the encoder. Ownership of the data passes to the caller, who should call Release on the returned Buffer when done.
func (enc *PlainInt96Encoder) Put(in []parquet.Int96)
Put encodes a slice of values into the underlying buffer
func (enc *PlainInt96Encoder) PutSpaced(in []parquet.Int96, validBits []byte, validBitsOffset int64)
PutSpaced encodes a slice of values that is spaced out with null slots, as defined by the validBits bitmap starting at the given bit offset. The null slots are removed to compact the values before they are written to the buffer.
func (e *PlainInt96Encoder) Release()
func (e *PlainInt96Encoder) ReserveForWrite(n int)
ReserveForWrite allocates n bytes so that the next n bytes written do not require new allocations.
func (e *PlainInt96Encoder) Reset()
Reset drops the data currently in the encoder and resets for new use.
func (PlainInt96Encoder) Type() parquet.Type
Type returns the underlying physical type this encoder is able to encode
PooledBufferWriter uses buffers from the buffer pool to back it while implementing io.Writer and io.WriterAt interfaces
type PooledBufferWriter struct {
// contains filtered or unexported fields
}
func NewPooledBufferWriter(initial int) *PooledBufferWriter
NewPooledBufferWriter returns a new buffer with 'initial' bytes reserved and pre-allocated to guarantee that writing that many more bytes will not require another allocation.
func (b *PooledBufferWriter) Bytes() []byte
Bytes returns the current byte slice, whose length is Len().
func (b *PooledBufferWriter) Finish() Buffer
Finish returns the current buffer, leaving the caller responsible for releasing its memory, and resets this writer so it can be reused.
func (b *PooledBufferWriter) Len() int
Len returns the current length of the byte slice.
func (b *PooledBufferWriter) Reserve(nbytes int)
Reserve pre-allocates nbytes to ensure that the next write of that many bytes will not require another allocation.
func (b *PooledBufferWriter) Reset(initial int)
Reset releases any memory currently held and re-initializes the writer with initial bytes allocated.
func (b *PooledBufferWriter) SetOffset(offset int)
SetOffset sets an offset in the buffer which will ensure that all references to offsets and sizes in the buffer will be offset by this many bytes, allowing the writer to reserve space in the buffer.
func (b *PooledBufferWriter) Tell() int64
func (b *PooledBufferWriter) UnsafeWrite(buf []byte) (n int, err error)
UnsafeWrite does not check the capacity / length before writing.
func (b *PooledBufferWriter) UnsafeWriteCopy(ncopies int, pattern []byte) (int, error)
func (b *PooledBufferWriter) Write(buf []byte) (int, error)
func (b *PooledBufferWriter) WriteAt(p []byte, offset int64) (n int, err error)
WriteAt writes the bytes from p into this buffer starting at offset.
It does not affect the internal position of the writer.
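A common reason to pair sequential writes with WriteAt is to reserve space for a header and fill it in once the payload size is known. The sketch below shows that pattern with a writer backed by a sync.Pool of bytes.Buffer values. pooledWriter, newPooledWriter and release are hypothetical names, and unlike the real type this WriteAt only overwrites bytes that were already written.

```go
package main

import (
	"bytes"
	"errors"
	"sync"
)

// bufPool recycles buffers between writers (an assumption of this sketch).
var bufPool = sync.Pool{New: func() interface{} { return new(bytes.Buffer) }}

// pooledWriter is a hypothetical writer that borrows its storage from bufPool.
type pooledWriter struct{ buf *bytes.Buffer }

// newPooledWriter takes a buffer from the pool and reserves initial bytes.
func newPooledWriter(initial int) *pooledWriter {
	b := bufPool.Get().(*bytes.Buffer)
	b.Reset()
	b.Grow(initial)
	return &pooledWriter{buf: b}
}

// Write appends p at the current end of the buffer.
func (w *pooledWriter) Write(p []byte) (int, error) { return w.buf.Write(p) }

// WriteAt overwrites existing bytes at offset without moving the write position.
func (w *pooledWriter) WriteAt(p []byte, offset int64) (int, error) {
	if offset < 0 || offset+int64(len(p)) > int64(w.buf.Len()) {
		return 0, errors.New("write outside of the written region")
	}
	return copy(w.buf.Bytes()[offset:], p), nil
}

// Len reports the number of bytes written so far.
func (w *pooledWriter) Len() int { return w.buf.Len() }

// release returns the buffer to the pool; the writer must not be used afterwards.
func (w *pooledWriter) release() {
	bufPool.Put(w.buf)
	w.buf = nil
}
```

Writing four placeholder bytes, then the payload, then calling WriteAt at offset 0 yields a length-prefixed record without a second buffer.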
type RleBooleanDecoder struct {
// contains filtered or unexported fields
}
func (dec *RleBooleanDecoder) Decode(out []bool) (int, error)
func (dec *RleBooleanDecoder) DecodeSpaced(out []bool, nullCount int, validBits []byte, validBitsOffset int64) (int, error)
func (d *RleBooleanDecoder) Encoding() parquet.Encoding
Encoding returns the encoding type used by this decoder to decode the bytes.
func (dec *RleBooleanDecoder) SetData(nvals int, data []byte) error
func (RleBooleanDecoder) Type() parquet.Type
func (d *RleBooleanDecoder) ValuesLeft() int
ValuesLeft returns the number of remaining values that can be decoded
type RleBooleanEncoder struct {
// contains filtered or unexported fields
}
func (e *RleBooleanEncoder) Allocator() memory.Allocator
func (e *RleBooleanEncoder) Bytes() []byte
Bytes returns the current bytes that have been written to the encoder's buffer but doesn't transfer ownership.
func (e *RleBooleanEncoder) Encoding() parquet.Encoding
func (enc *RleBooleanEncoder) EstimatedDataEncodedSize() int64
func (enc *RleBooleanEncoder) FlushValues() (Buffer, error)
func (enc *RleBooleanEncoder) Put(in []bool)
func (enc *RleBooleanEncoder) PutSpaced(in []bool, validBits []byte, validBitsOffset int64)
func (e *RleBooleanEncoder) Release()
func (e *RleBooleanEncoder) ReserveForWrite(n int)
ReserveForWrite allocates n bytes so that the next n bytes written do not require new allocations.
func (e *RleBooleanEncoder) Reset()
Reset drops the data currently in the encoder and resets for new use.
func (RleBooleanEncoder) Type() parquet.Type
TypedDecoder is the general interface for all decoder types which can then be type asserted to a specific Type Decoder
type TypedDecoder interface { // SetData updates the data in the decoder with the passed in byte slice and the // stated number of values as expected to be decoded. SetData(buffered int, buf []byte) error // Encoding returns the encoding type that this decoder decodes data of Encoding() parquet.Encoding // ValuesLeft returns the number of remaining values to be decoded ValuesLeft() int // Type returns the physical type this can decode. Type() parquet.Type }
func NewDecoder(t parquet.Type, e parquet.Encoding, descr *schema.Column, mem memory.Allocator) TypedDecoder
NewDecoder constructs a decoder for a given type and encoding
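Because TypedDecoder exposes only the type-independent methods, callers narrow it with a type assertion before decoding values. The sketch below shows that pattern with local stand-ins: typedDecoder, int32Decoder, fakeInt32Decoder and opaqueDecoder are hypothetical types defined for this example, not the package's own.

```go
package main

import "errors"

// typedDecoder is a hypothetical, reduced copy of the general interface.
type typedDecoder interface {
	ValuesLeft() int
}

// int32Decoder adds the type-specific Decode method.
type int32Decoder interface {
	typedDecoder
	Decode(out []int32) (int, error)
}

// fakeInt32Decoder serves values from memory for demonstration purposes.
type fakeInt32Decoder struct{ vals []int32 }

func (f *fakeInt32Decoder) ValuesLeft() int { return len(f.vals) }

func (f *fakeInt32Decoder) Decode(out []int32) (int, error) {
	n := copy(out, f.vals)
	f.vals = f.vals[n:]
	return n, nil
}

// opaqueDecoder implements only the general interface.
type opaqueDecoder struct{}

func (opaqueDecoder) ValuesLeft() int { return 0 }

// readInt32s narrows dec to int32Decoder and drains it, or reports that the
// decoder does not handle int32 values.
func readInt32s(dec typedDecoder) ([]int32, error) {
	typed, ok := dec.(int32Decoder)
	if !ok {
		return nil, errors.New("decoder does not decode int32 values")
	}
	out := make([]int32, typed.ValuesLeft())
	n, err := typed.Decode(out)
	return out[:n], err
}
```

Using the two-value form of the assertion avoids a panic when the decoder was built for a different physical type.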
TypedEncoder is the general interface for all encoding types which can then be type asserted to a specific Type Encoder
type TypedEncoder interface { // Bytes returns the current slice of bytes that have been encoded but does not pass ownership Bytes() []byte // Reset resets the encoder and dumps all the data to let it be reused. Reset() // ReserveForWrite reserves n bytes in the buffer so that the next n bytes written will not // cause a memory allocation. ReserveForWrite(n int) // EstimatedDataEncodedSize returns the estimated number of bytes in the buffer // so far. EstimatedDataEncodedSize() int64 // FlushValues finishes up any unwritten data and returns the buffer of data passing // ownership to the caller, Release needs to be called on the Buffer to free the memory // if error is nil FlushValues() (Buffer, error) // Encoding returns the type of encoding that this encoder operates with Encoding() parquet.Encoding // Allocator returns the allocator that was used when creating this encoder Allocator() memory.Allocator // Type returns the underlying physical type this encodes. Type() parquet.Type Release() }
func NewEncoder(t parquet.Type, e parquet.Encoding, useDict bool, descr *schema.Column, mem memory.Allocator) TypedEncoder
NewEncoder will return the appropriately typed encoder for the requested physical type and encoding.
If mem is nil, memory.DefaultAllocator will be used.