Because S3 doesn't support hard links, we store duplicated blobs
as empty files. When the original blob is deleted, its content is
moved to the next duplicated blob, and so on.
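
A minimal sketch of this scheme, using a local directory to stand in
for S3; DedupeBlob, DeleteBlob, and the in-memory cache map are
illustrative names, not the actual storage-driver API:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// cache maps a digest to every path stored for it, in insertion order.
// The first entry holds the real content; the rest are empty files.
var cache = map[string][]string{}

// DedupeBlob writes content the first time a digest is seen; later
// uploads of the same digest become zero-byte placeholder files.
func DedupeBlob(digest, dst string, content []byte) error {
	if err := os.MkdirAll(filepath.Dir(dst), 0o755); err != nil {
		return err
	}
	if len(cache[digest]) > 0 {
		content = nil // duplicate: no hard links on S3, store an empty file
	}
	if err := os.WriteFile(dst, content, 0o644); err != nil {
		return err
	}
	cache[digest] = append(cache[digest], dst)
	return nil
}

// DeleteBlob removes one path for a digest; if that path held the real
// content, the content is first moved into the next duplicate.
func DeleteBlob(digest, path string) error {
	paths := cache[digest]
	if len(paths) > 1 && paths[0] == path {
		data, err := os.ReadFile(path)
		if err != nil {
			return err
		}
		if err := os.WriteFile(paths[1], data, 0o644); err != nil {
			return err
		}
	}
	for i, p := range paths {
		if p == path {
			cache[digest] = append(paths[:i:i], paths[i+1:]...)
			break
		}
	}
	return os.Remove(path)
}

func main() {
	dir, _ := os.MkdirTemp("", "s3dedupe")
	defer os.RemoveAll(dir)

	d := "sha256:abc"
	p1 := filepath.Join(dir, "repo1", "blob")
	p2 := filepath.Join(dir, "repo2", "blob")
	_ = DedupeBlob(d, p1, []byte("layer data"))
	_ = DedupeBlob(d, p2, []byte("layer data")) // stored empty
	_ = DeleteBlob(d, p1)                       // content moves to p2

	data, _ := os.ReadFile(p2)
	fmt.Printf("%q\n", data) // "layer data"
}
```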
Signed-off-by: Petu Eusebiu <peusebiu@cisco.com>
The directory created by `T.TempDir` is automatically removed when the
test and all its subtests complete.
Reference: https://pkg.go.dev/testing#T.TempDir
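
For illustration, a test written in this style (the test name and
blob layout below are made up):

```go
package storage_test

import (
	"os"
	"path/filepath"
	"testing"
)

func TestBlobStore(t *testing.T) {
	// t.TempDir returns a fresh directory that the testing package
	// removes automatically once the test and its subtests complete,
	// replacing manual ioutil.TempDir + defer os.RemoveAll pairs.
	rootDir := t.TempDir()

	blobPath := filepath.Join(rootDir, "blobs", "sha256", "abc")
	if err := os.MkdirAll(filepath.Dir(blobPath), 0o755); err != nil {
		t.Fatal(err)
	}
	if err := os.WriteFile(blobPath, []byte("layer"), 0o600); err != nil {
		t.Fatal(err)
	}
	// No cleanup code needed here.
}
```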
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
In a production use case we found that the rootdir can be moved.
Currently, dedupe cache entries record the full blob path, which
breaks in the move use case.
Record relative blob paths, but only for dedupe cache entries.
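
A minimal sketch of the intended path handling; the helper names
are hypothetical:

```go
package main

import (
	"fmt"
	"path/filepath"
)

// relativeCacheValue converts an absolute blob path to one relative to
// rootDir before it is written into the dedupe cache.
func relativeCacheValue(rootDir, blobPath string) (string, error) {
	return filepath.Rel(rootDir, blobPath)
}

// absoluteBlobPath rebuilds a usable path from a cache entry using
// whatever rootDir the server is currently configured with.
func absoluteBlobPath(rootDir, cached string) string {
	return filepath.Join(rootDir, cached)
}

func main() {
	rel, _ := relativeCacheValue("/var/lib/registry",
		"/var/lib/registry/repo/blobs/sha256/abc")
	fmt.Println(rel) // repo/blobs/sha256/abc

	// After the rootdir is moved, the same cache entry still resolves:
	fmt.Println(absoluteBlobPath("/mnt/new-root", rel))
}
```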
As the number of repos and layers grows, so does the probability
that layers are duplicated. We dedupe using hard links when content
is the same. This is intended to be purely a storage-layer
optimization; access control, when available, is orthogonal to it.
Add a durable cache to help speed up layer lookups.
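
A minimal sketch of hard-link dedupe under these assumptions;
writeBlob is an illustrative name, and the in-memory cache map
stands in for the durable cache (a real implementation would
persist the digest-to-path entries so lookups survive restarts):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// cache maps digest -> path of the first blob seen with that content.
var cache = map[string]string{}

// writeBlob stores a blob, hard-linking to an existing copy when the
// digest is already known. Purely a storage optimization: every repo
// still sees its own blob path, so access control is unaffected.
func writeBlob(digest, dst string, content []byte) error {
	if err := os.MkdirAll(filepath.Dir(dst), 0o755); err != nil {
		return err
	}
	if orig, ok := cache[digest]; ok {
		return os.Link(orig, dst) // same inode, no extra space used
	}
	if err := os.WriteFile(dst, content, 0o644); err != nil {
		return err
	}
	cache[digest] = dst
	return nil
}

func main() {
	dir, _ := os.MkdirTemp("", "dedupe")
	defer os.RemoveAll(dir)

	_ = writeBlob("sha256:abc", filepath.Join(dir, "repo1", "layer"), []byte("layer data"))
	_ = writeBlob("sha256:abc", filepath.Join(dir, "repo2", "layer"), []byte("layer data"))

	fi1, _ := os.Stat(filepath.Join(dir, "repo1", "layer"))
	fi2, _ := os.Stat(filepath.Join(dir, "repo2", "layer"))
	fmt.Println(fi1.Size(), fi2.Size()) // both paths report the full size
}
```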
Update README.
Add more unit tests.