# What is diskv?

Diskv (disk-vee) is a simple, persistent key-value store written in the Go
language. It starts with an incredibly simple API for storing arbitrary data on
a filesystem by key, and builds several layers of performance-enhancing
abstraction on top.  The end result is a conceptually simple, but highly
performant, disk-backed storage system.

[![Build Status][1]][2]

[1]: https://drone.io/github.com/peterbourgon/diskv/status.png
[2]: https://drone.io/github.com/peterbourgon/diskv/latest


# Installing

Install [Go 1][3], either [from source][4] or [with a prepackaged binary][5].
Then,

```bash
$ go get github.com/peterbourgon/diskv/v3
```

[3]: http://golang.org
[4]: http://golang.org/doc/install/source
[5]: http://golang.org/doc/install


# Usage

```go
package main

import (
	"fmt"
	"github.com/peterbourgon/diskv/v3"
)

func main() {
	// Simplest transform function: put all the data files into the base dir.
	flatTransform := func(s string) []string { return []string{} }

	// Initialize a new diskv store, rooted at "my-data-dir", with a 1MB cache.
	d := diskv.New(diskv.Options{
		BasePath:     "my-data-dir",
		Transform:    flatTransform,
		CacheSizeMax: 1024 * 1024,
	})

	// Write three bytes to the key "alpha".
	key := "alpha"
	d.Write(key, []byte{'1', '2', '3'})

	// Read the value back out of the store.
	value, _ := d.Read(key)
	fmt.Printf("%v\n", value)

	// Erase the key+value from the store (and the disk).
	d.Erase(key)
}
```

More complex examples can be found in the "examples" subdirectory.


# Theory

## Basic idea

At its core, diskv is a map of a key (`string`) to arbitrary data (`[]byte`).
The data is written to a single file on disk, with the same name as the key.
The key determines where that file will be stored, via a user-provided
`TransformFunc`, which takes a key and returns a slice (`[]string`)
corresponding to a path list where the key file will be stored. The simplest
TransformFunc,

```go
func SimpleTransform(key string) []string {
	return []string{}
}
```

will place all keys in the same base directory. The design is inspired by
[Redis diskstore][6]; a TransformFunc which emulates the default diskstore
behavior is available in the content-addressable-storage example.

[6]: http://groups.google.com/group/redis-db/browse_thread/thread/d444bc786689bde9?pli=1

**Note** that your TransformFunc should ensure that one valid key doesn't
transform to a subset of another valid key. That is, it shouldn't be possible
to construct valid keys that resolve to directory names. As a concrete example,
if your TransformFunc splits on every 3 characters, then

```go
d.Write("abcabc", val) // OK: written to <base>/abc/abc/abcabc
d.Write("abc", val)    // Error: attempted write to <base>/abc/abc, but it's a directory
```

This will be addressed in an upcoming version of diskv.

Probably the most important design principle behind diskv is that your data is
always flatly available on the disk. diskv will never do anything that would
prevent you from accessing, copying, backing up, or otherwise interacting with
your data via common UNIX command-line tools.

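The three-character split used in the note above can be sketched as follows. This is a minimal illustration rather than code taken from diskv: `blockTransform` and `blockSize` are hypothetical names, and dropping a trailing partial block is an assumption of this sketch.

```go
package main

import "fmt"

// blockSize is a hypothetical choice for this sketch.
const blockSize = 3

// blockTransform splits key into consecutive blockSize-byte directory names.
// A trailing partial block is dropped, so "abcd" maps to ["abc"].
func blockTransform(key string) []string {
	n := len(key) / blockSize
	path := make([]string, n)
	for i := 0; i < n; i++ {
		path[i] = key[i*blockSize : (i+1)*blockSize]
	}
	return path
}

func main() {
	for _, key := range []string{"abcabc", "abc", "ab"} {
		fmt.Printf("%q -> %q\n", key, blockTransform(key))
	}
}
```

A function of this shape could be passed as `Transform` in `diskv.Options`. The key "abcabc" then lives under `<base>/abc/abc/`, which is exactly the directory that the key "abc" would need as its file name.
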
## Advanced path transformation

If you need more control over the file name written to disk, or if you want to
support slashes or special characters in your keys, you can use the
AdvancedTransform property. You must supply a function that returns
a special PathKey structure, which is a breakdown of a path and a file name. Strings
returned must be clean of any slashes or special characters:

```go
package main

import (
	"fmt"
	"strings"

	"github.com/peterbourgon/diskv/v3"
)

func AdvancedTransformExample(key string) *diskv.PathKey {
	path := strings.Split(key, "/")
	last := len(path) - 1
	return &diskv.PathKey{
		Path:     path[:last],
		FileName: path[last] + ".txt",
	}
}

// If you provide an AdvancedTransform, you must also provide its
// inverse:

func InverseTransformExample(pathKey *diskv.PathKey) (key string) {
	if !strings.HasSuffix(pathKey.FileName, ".txt") {
		panic("Invalid file found in storage folder!")
	}
	name := strings.TrimSuffix(pathKey.FileName, ".txt")
	if len(pathKey.Path) == 0 {
		return name
	}
	// Rejoin the directories and the file name with "/" to recover the key.
	return strings.Join(pathKey.Path, "/") + "/" + name
}

func main() {
	d := diskv.New(diskv.Options{
		BasePath:          "my-data-dir",
		AdvancedTransform: AdvancedTransformExample,
		InverseTransform:  InverseTransformExample,
		CacheSizeMax:      1024 * 1024,
	})
	// Write some text to the key "alpha/beta/gamma".
	key := "alpha/beta/gamma"
	d.WriteString(key, "¡Hola!") // will be stored in "<basedir>/alpha/beta/gamma.txt"
	fmt.Println(d.ReadString("alpha/beta/gamma"))
}
```


## Adding a cache

An in-memory caching layer is provided by combining the BasicStore
functionality with a simple map structure, and keeping it up-to-date as
appropriate. Since the map structure in Go is not safe for concurrent use, it's
combined with an RWMutex to provide safe concurrent access.

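The map-plus-RWMutex pattern described above can be sketched with the standard library alone. The `cache` type and its methods are hypothetical names for this illustration and are not diskv's internal implementation; size limits and eviction are left out.

```go
package main

import (
	"fmt"
	"sync"
)

// cache is a hypothetical type for this sketch; it is not diskv's own.
type cache struct {
	mu   sync.RWMutex
	data map[string][]byte
}

func newCache() *cache {
	return &cache{data: make(map[string][]byte)}
}

// get takes the read lock, so many readers can proceed concurrently.
func (c *cache) get(key string) ([]byte, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	val, ok := c.data[key]
	return val, ok
}

// set takes the write lock, which excludes all other readers and writers.
func (c *cache) set(key string, val []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data[key] = val
}

// erase drops the key so the cache does not outlive the stored value.
func (c *cache) erase(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.data, key)
}

func main() {
	c := newCache()
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			c.set(fmt.Sprintf("key-%d", i), []byte{byte(i)})
		}(i)
	}
	wg.Wait()
	val, ok := c.get("key-2")
	fmt.Println(val, ok)
}
```

Reads take only the shared lock, so a read-heavy workload is not serialized behind a single mutex.
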
## Adding order

diskv is a key-value store and therefore inherently unordered. An ordering
system can be injected into the store by passing something which satisfies the
diskv.Index interface. (A default implementation, using Google's
[btree][7] package, is provided.) Basically, diskv keeps an ordered (by a
user-provided Less function) index of the keys, which can be queried.

[7]: https://github.com/google/btree

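To illustrate what an ordered key index driven by a user-provided Less function does, here is a small slice-backed sketch. It deliberately does not reproduce the diskv.Index interface or the btree-based default: `sortedIndex`, `insert`, and `keysFrom` are hypothetical names, and starting the result at the first key not less than `from` is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedIndex is a hypothetical, slice-backed stand-in for an ordered key index.
type sortedIndex struct {
	less func(a, b string) bool
	keys []string
}

// search returns the position of the first key that is not less than key.
func (ix *sortedIndex) search(key string) int {
	return sort.Search(len(ix.keys), func(i int) bool {
		return !ix.less(ix.keys[i], key)
	})
}

// insert adds key at its sorted position, ignoring exact duplicates.
func (ix *sortedIndex) insert(key string) {
	i := ix.search(key)
	if i < len(ix.keys) && ix.keys[i] == key {
		return
	}
	ix.keys = append(ix.keys, "")
	copy(ix.keys[i+1:], ix.keys[i:])
	ix.keys[i] = key
}

// keysFrom returns up to n keys in order, starting at the first key
// that is not less than from.
func (ix *sortedIndex) keysFrom(from string, n int) []string {
	i := ix.search(from)
	end := i + n
	if end > len(ix.keys) {
		end = len(ix.keys)
	}
	return append([]string(nil), ix.keys[i:end]...)
}

func main() {
	ix := &sortedIndex{less: func(a, b string) bool { return a < b }}
	for _, key := range []string{"gamma", "alpha", "beta"} {
		ix.insert(key)
	}
	fmt.Println(ix.keysFrom("", 10))
}
```

A slice keeps the sketch short, but each insert costs O(n); a tree-based structure such as the btree package mentioned above avoids that cost on large key sets.
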
## Adding compression

Something which implements the diskv.Compression interface may be passed
during store creation, so that all Writes and Reads are filtered through
a compression/decompression pipeline. Several default implementations,
using stdlib compression algorithms, are provided. Note that data is cached
compressed; the cost of decompression is borne with each Read.

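The round trip that such a pipeline performs can be sketched with the standard library's `compress/gzip` package. The helpers `compress` and `decompress` are hypothetical names for this illustration; they show the general technique and are not the diskv.Compression interface itself.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// compress gzips val, as a write path might do before data reaches the disk.
func compress(val []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(val); err != nil {
		return nil, err
	}
	// Close flushes any buffered data and writes the gzip footer.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decompress reverses compress, as a read path might do on each Read.
func decompress(stored []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(stored))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	val := bytes.Repeat([]byte("diskv "), 100)
	stored, err := compress(val)
	if err != nil {
		panic(err)
	}
	restored, err := decompress(stored)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(val), len(stored), bytes.Equal(val, restored))
}
```

Because the compressed form is what gets stored and cached, every read pays for a call like `decompress`, which is the trade-off the paragraph above describes.
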
## Streaming

diskv also now provides ReadStream and WriteStream methods, to allow very large
data to be handled efficiently.


# Future plans

 * Needs plenty of robust testing: huge datasets, etc...
 * More thorough benchmarking
 * Your suggestions for use-cases I haven't thought of


# Credits and contributions

Original idea, design and implementation: [Peter Bourgon](https://github.com/peterbourgon)
Other collaborations: [Javier Peletier](https://github.com/jpeletier) ([Epic Labs](https://www.epiclabs.io))
