The storage layer is protected with read-write locks.
However, we may be holding the locks over unnecessarily large critical
sections.
The typical workflow is that a blob is first uploaded via a per-client
private session-id meaning the blob is not publicly visible yet. When
the blob being uploaded is very large, the transfer takes a long time
while holding the lock.
Private session-id based uploads don't really need locks, and hold locks
only when blobs are published after the upload is complete.
dstRecord :- blob path stored in cache.
dst :- blob path that is trying to be uploaded.
Currently, if the actual blob on disk may have been removed by GC/delete, during syncing the cache dst is being passed to DeleteBlob function and retry section is being continuously called because DeleteBlob function never deletes dst path (doesn't exist in db), dstRecord should be passed into DeleteBlob function because dstRecord is actual blob path stored in db.
If dst and dstRecord path value is same then this issue will not be produced and DeleteBlob method will delete the blob info from cache but if both are different then DeleteBlob method will try to delete dst path which is not in cache.
Note:- boltdb delete method return nil even when value doesn't exist (https://godoc.org/github.com/boltdb/bolt#Bucket.Delete)
We perform inline garbage collection of orphan blobs. However, the
dist-spec poses a problem because blobs begin their life as orphan blobs
and then a manifest is add which refers to these blobs.
We use umoci's GC() to perform garbage collection and policy support
has been added recently which can control whether a blob can be skipped
for GC.
In this patch, we use a time-based policy to skip blobs.
Go version changed to 1.14.4
Golangci-lint changed to 1.26.0
Bazel version changed to 3.0.0
Bazel rules_go version changed to 0.23.3
Bazel gazelle version changed to v0.21.0
Bazel build tools version changed to 0.25.1
Bazel skylib version changed to 1.0.2
An existing manifest descriptor in index.json can be updated with
different manifest contents for the same/existing tag. We were updating
the digest but not the size field causing GC to report an error.
Add a unit test case to cover this.
Add logs.
In a production use case we found that the actual rootdir can be moved.
Currently, cache entries for dedupe record the full blob path which
doesn't work in the move use case.
Only for dedupe cache entries, record relative blob paths.
Since we want to conform to dist-spec, sometimes the gc and dedupe
optimizations conflict with the conformance tests that are being run.
So allow them to be turned off via configuration params.
Upstream conformance tests are being updated, so we need to align along
with our internal GC and dedupe features.
Add a new example config file which plays nice with conformance tests.
DeleteImageManifest() updated to deal with the case where the same
manifest can be created with multiple tags and deleted with the same
digest - so all entries must be deleted.
DeleteBlob() delete the digest key (bucket) when last reference is
dropped
As the number of repos and layers increases, the greater the probability
that layers are duplicated. We dedupe using hard links when content is
the same. This is intended to be purely a storage layer optimization.
Access control when available is orthogonal this optimization.
Add a durable cache to help speed up layer lookups.
Update README.
Add more unit tests.
Clients today expect the repo to clean up if there are unused blobs, not to
manually delete things they think are unused. Let's do that, and use
umoci's code to do it since it's tested and works.
v2: also run GC on update as well as delete
v3: fix up error return paths needing two args
Signed-off-by: Serge Hallyn <shallyn@cisco.com>
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Fixes issue #67.
As per dist spec, DELETE of a image manifest can only be done with
digest as <reference> param. Previously, tags were being allowed as
well. This is not conformant to the spec.
dist-spec compliance tests are now becoming a part of dist-spec repo
itself - we want to be compliant
pkg/api/regex.go:
* revert uppercasing in repository names
pkg/api/routes.go:
* ListTags() should support the URL params 'n' and 'last'
for pagination
* s/uuid/session_id/g to use the dist-spec's naming
* Fix off-by-one error in GetBlobUpload()'s http response "Range" header
* DeleteManifest() success status code is 202
* Fix PatchBlobUpload() to account for "streamed" use case
where neither "Content-Length" nor "Content-Range" headers are set
pkg/storage/storage.go:
* Add a "streamed" version of PutBlobChunk() called PutBlobChunkStreamed()
pkg/compliance/v1_0_0/check.go:
* fix unit tests to account for changed response status codes