# What is diskv?

Diskv (disk-vee) is a simple, persistent key-value store written in the Go
language. It starts with an incredibly simple API for storing arbitrary data on
a filesystem by key, and builds several layers of performance-enhancing
abstraction on top. The end result is a conceptually simple, but highly
performant, disk-backed storage system.

[![Build Status][1]][2]

[1]: https://drone.io/github.com/peterbourgon/diskv/status.png
[2]: https://drone.io/github.com/peterbourgon/diskv/latest

# Installing

Install [Go 1][3], either [from source][4] or [with a prepackaged binary][5].
Then,

```bash
$ go get github.com/peterbourgon/diskv
```

[3]: http://golang.org
[4]: http://golang.org/doc/install/source
[5]: http://golang.org/doc/install

# Usage

```go
package main

import (
	"fmt"
	"github.com/peterbourgon/diskv"
)

func main() {
	// Simplest transform function: put all the data files into the base dir.
	flatTransform := func(s string) []string { return []string{} }

	// Initialize a new diskv store, rooted at "my-data-dir", with a 1MB cache.
	d := diskv.New(diskv.Options{
		BasePath:     "my-data-dir",
		Transform:    flatTransform,
		CacheSizeMax: 1024 * 1024,
	})

	// Write three bytes to the key "alpha".
	key := "alpha"
	d.Write(key, []byte{'1', '2', '3'})

	// Read the value back out of the store.
	value, _ := d.Read(key)
	fmt.Printf("%v\n", value)

	// Erase the key+value from the store (and the disk).
	d.Erase(key)
}
```

More complex examples can be found in the "examples" subdirectory.

# Theory

## Basic idea

At its core, diskv is a map of a key (`string`) to arbitrary data (`[]byte`).
The data is written to a single file on disk, with the same name as the key.
The key determines where that file will be stored, via a user-provided
`TransformFunc`, which takes a key and returns a slice (`[]string`)
corresponding to a path list where the key file will be stored. The simplest
TransformFunc,

```go
func SimpleTransform(key string) []string {
	return []string{}
}
```

will place all keys in the same base directory. The design is inspired by
[Redis diskstore][6]; a TransformFunc which emulates the default diskstore
behavior is available in the content-addressable-storage example.

[6]: http://groups.google.com/group/redis-db/browse_thread/thread/d444bc786689bde9?pli=1

**Note** that your TransformFunc should ensure that one valid key doesn't
transform to a subset of another valid key. That is, it shouldn't be possible
to construct valid keys that resolve to directory names. As a concrete example,
if your TransformFunc splits on every 3 characters, then

```go
d.Write("abcabc", val) // OK: written to <base>/abc/abc/abcabc
d.Write("abc", val)    // Error: attempted write to <base>/abc/abc, but it's a directory
```

This will be addressed in an upcoming version of diskv.

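To make the collision in the note concrete, here is a minimal sketch of a TransformFunc that splits keys into 3-character blocks. The name `blockTransform` is hypothetical and not part of diskv, and the byte-based slicing assumes ASCII keys.

```go
package main

import "fmt"

// blockTransform is a hypothetical TransformFunc (not part of diskv) that
// splits a key into directory names of blockSize bytes each and drops any
// short remainder. It assumes ASCII keys.
func blockTransform(key string) []string {
	const blockSize = 3
	path := make([]string, 0, len(key)/blockSize)
	for i := 0; i+blockSize <= len(key); i += blockSize {
		path = append(path, key[i:i+blockSize])
	}
	return path
}

func main() {
	// "abcabc" maps to the directories abc/abc, so its file is <base>/abc/abc/abcabc.
	fmt.Println(blockTransform("abcabc"))
	// "abc" maps to the directory abc, so its file would be <base>/abc/abc,
	// which the previous key already turned into a directory.
	fmt.Println(blockTransform("abc"))
}
```

One way to sidestep the problem with a transform like this is to restrict valid keys to a fixed length, so that no key can equal a directory name produced by another key.
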
Probably the most important design principle behind diskv is that your data is
always flatly available on the disk. diskv will never do anything that would
prevent you from accessing, copying, backing up, or otherwise interacting with
your data via common UNIX command-line tools.

## Adding a cache

An in-memory caching layer is provided by combining the BasicStore
functionality with a simple map structure, and keeping it up to date as
appropriate. Since the map structure in Go is not thread-safe, it's combined
with an RWMutex to provide safe concurrent access.

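The map-plus-RWMutex pattern described above can be sketched as follows. The `cache` type and its methods are illustrative names only, not diskv's actual internals, and the sketch leaves out size limits and the disk-backed store.

```go
package main

import (
	"fmt"
	"sync"
)

// cache is an illustrative stand-in for an in-memory layer; it is not
// diskv's actual type.
type cache struct {
	mu   sync.RWMutex
	data map[string][]byte
}

func newCache() *cache {
	return &cache{data: make(map[string][]byte)}
}

// get takes the read lock, so any number of readers may proceed at once.
func (c *cache) get(key string) ([]byte, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	val, ok := c.data[key]
	return val, ok
}

// set takes the write lock, excluding all readers and other writers.
func (c *cache) set(key string, val []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data[key] = val
}

// evict removes a key, for example after the value is erased from disk.
func (c *cache) evict(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.data, key)
}

func main() {
	c := newCache()
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			c.set(fmt.Sprintf("key-%d", i), []byte{byte(i)})
		}(i)
	}
	wg.Wait()

	_, ok := c.get("key-2")
	fmt.Println("key-2 cached:", ok)
}
```
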
## Adding order

diskv is a key-value store and therefore inherently unordered. An ordering
system can be injected into the store by passing something which satisfies the
`diskv.Index` interface. (A default implementation, using Google's
[btree][7] package, is provided.) In short, diskv keeps an index of the keys,
ordered by a user-provided Less function, which can be queried.

[7]: https://github.com/google/btree

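As a rough illustration of an index ordered by a user-provided Less function, the sketch below keeps keys in a sorted slice. It deliberately swaps the btree for the standard library's `sort` package, and `sliceIndex`, `insert`, and `keysFrom` are hypothetical names that do not reproduce the `diskv.Index` interface.

```go
package main

import (
	"fmt"
	"sort"
)

// lessFunc reports whether key a sorts before key b.
type lessFunc func(a, b string) bool

// sliceIndex is a hypothetical ordered key index backed by a sorted slice.
// A btree scales better; a slice keeps the sketch short.
type sliceIndex struct {
	less lessFunc
	keys []string
}

// insert places key at its sorted position and ignores duplicates.
func (ix *sliceIndex) insert(key string) {
	i := sort.Search(len(ix.keys), func(i int) bool { return !ix.less(ix.keys[i], key) })
	if i < len(ix.keys) && !ix.less(key, ix.keys[i]) {
		return // key is already indexed
	}
	ix.keys = append(ix.keys, "")
	copy(ix.keys[i+1:], ix.keys[i:])
	ix.keys[i] = key
}

// keysFrom returns up to n keys that sort at or after from.
func (ix *sliceIndex) keysFrom(from string, n int) []string {
	if n <= 0 {
		return nil
	}
	i := sort.Search(len(ix.keys), func(i int) bool { return !ix.less(ix.keys[i], from) })
	end := i + n
	if end > len(ix.keys) {
		end = len(ix.keys)
	}
	return append([]string(nil), ix.keys[i:end]...)
}

func main() {
	ix := &sliceIndex{less: func(a, b string) bool { return a < b }}
	for _, k := range []string{"charlie", "alpha", "bravo"} {
		ix.insert(k)
	}
	fmt.Println(ix.keysFrom("b", 2)) // prints [bravo charlie]
}
```
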
## Adding compression

Something which implements the `diskv.Compression` interface may be passed
during store creation, so that all Writes and Reads are filtered through
a compression/decompression pipeline. Several default implementations,
using stdlib compression algorithms, are provided. Note that data is cached
in compressed form, so the cost of decompression is borne with each Read.

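The compress-on-write, decompress-on-read idea can be sketched with the standard library's `compress/gzip`. The `compress` and `decompress` helpers are hypothetical and do not reproduce the `diskv.Compression` interface; they only show the round trip that every value would go through.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// compress gzips data in memory, as a write path might before storing it.
func compress(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return nil, err
	}
	// Close flushes the remaining compressed bytes and the gzip footer.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decompress reverses compress, as a read path would on every Read.
func decompress(data []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	packed, err := compress([]byte("123"))
	if err != nil {
		panic(err)
	}
	plain, err := decompress(packed)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s\n", plain)
}
```
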
## Streaming

diskv also provides ReadStream and WriteStream methods, so that very large
values can be handled efficiently.

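The benefit of a streaming API is that a value never has to be held in memory all at once. The sketch below shows the idea with plain files and `io.Copy`; `writeStream`, `readStream`, and `roundTrip` are hypothetical helpers and do not reproduce the signatures of diskv's ReadStream and WriteStream.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// writeStream copies r into the file at path in chunks, without loading
// the whole value into memory.
func writeStream(path string, r io.Reader) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	if _, err := io.Copy(f, r); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

// readStream opens the file at path for reading; the caller must close it.
func readStream(path string) (io.ReadCloser, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	return f, nil
}

// roundTrip writes value through writeStream into a temporary directory
// and reads it back through readStream.
func roundTrip(value string) (string, error) {
	dir, err := os.MkdirTemp("", "stream-sketch")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "alpha")
	if err := writeStream(path, strings.NewReader(value)); err != nil {
		return "", err
	}
	rc, err := readStream(path)
	if err != nil {
		return "", err
	}
	defer rc.Close()

	var sb strings.Builder
	if _, err := io.Copy(&sb, rc); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	got, err := roundTrip("123")
	if err != nil {
		panic(err)
	}
	fmt.Println(got)
}
```
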
# Future plans

* Needs plenty of robust testing: huge datasets, etc.
* More thorough benchmarking
* Your suggestions for use-cases I haven't thought of