mirror of https://github.com/project-zot/zot.git synced 2024-12-16 21:56:37 -05:00
zot/examples
LaurentiuNiculae f408df0dac
feat(repodb): Implement RepoDB for image specific information using boltdb/dynamodb (#979)
2023-01-09 12:37:44 -08:00
..
cluster fix(s3): remove tracking multipart uploads (#883) 2022-10-20 09:36:58 -07:00
metrics build: move build artifacts into build/ (#986) 2022-11-10 12:09:39 -08:00
config-all-remote.json feat(repodb): Implement RepoDB for image specific information using boltdb/dynamodb (#979) 2023-01-09 12:37:44 -08:00
config-allextensions.json add enable/disable option for scrub extension (#827) 2022-09-27 18:06:50 -07:00
config-anonymous-authz.json Remove AllowReadOnly and ReadOnly 2022-08-10 14:27:21 -07:00
config-bearer-auth.json ext: use distribution spec route prefix for extension api 2022-05-22 16:35:16 -07:00
config-bench.json Remove AllowReadOnly and ReadOnly 2022-08-10 14:27:21 -07:00
config-boltdb.json refactor(cache): rewrote/refactored cachedb functionality to use interface (#667) 2022-11-02 15:53:08 -07:00
config-commit.json Remove AllowReadOnly and ReadOnly 2022-08-10 14:27:21 -07:00
config-conformance.json ext: use distribution spec route prefix for extension api 2022-05-22 16:35:16 -07:00
config-cve.json ext: use distribution spec route prefix for extension api 2022-05-22 16:35:16 -07:00
config-dynamodb.json feat(repodb): Implement RepoDB for image specific information using boltdb/dynamodb (#979) 2023-01-09 12:37:44 -08:00
config-example.json Remove AllowReadOnly and ReadOnly 2022-08-10 14:27:21 -07:00
config-example.yaml Remove AllowReadOnly and ReadOnly 2022-08-10 14:27:21 -07:00
config-gc-periodic.json ext: use distribution spec route prefix for extension api 2022-05-22 16:35:16 -07:00
config-gc.json Remove AllowReadOnly and ReadOnly 2022-08-10 14:27:21 -07:00
config-lint.json fix(config): make all extension config consistent (#888) 2022-10-21 15:33:54 +03:00
config-metrics.json ext: use distribution spec route prefix for extension api 2022-05-22 16:35:16 -07:00
config-minimal.json Remove AllowReadOnly and ReadOnly 2022-08-10 14:27:21 -07:00
config-multiple-cve.json Remove AllowReadOnly and ReadOnly 2022-08-10 14:27:21 -07:00
config-multiple.json Remove AllowReadOnly and ReadOnly 2022-08-10 14:27:21 -07:00
config-policy.json fix(storage): deleting manifests with identical digests (#951) 2022-11-18 09:35:28 -08:00
config-ratelimit.json Remove AllowReadOnly and ReadOnly 2022-08-10 14:27:21 -07:00
config-s3.json feat(repodb): Implement RepoDB for image specific information using boltdb/dynamodb (#979) 2023-01-09 12:37:44 -08:00
config-scrub.json add enable/disable option for scrub extension (#827) 2022-09-27 18:06:50 -07:00
config-search.json feat(GraphQL): playground, served by zot in specific binary (#753) 2022-10-05 12:56:41 -07:00
config-sync-localhost.json Manage builds with different combinations of extensions 2022-06-30 09:53:52 -07:00
config-sync.json ext: use distribution spec route prefix for extension api 2022-05-22 16:35:16 -07:00
config-test.json ext: use distribution spec route prefix for extension api 2022-05-22 16:35:16 -07:00
config-tls.json graphql: Apply authorization on /_search endpoint 2022-08-26 21:31:26 +03:00
README.md fix(storage): deleting manifests with identical digests (#951) 2022-11-18 09:35:28 -08:00
sync-auth-filepath.json Changed sync behaviour, it used to copy images over http interface 2021-11-15 09:32:43 -08:00
zot.service move references to zotregistry.io and project-zot 2021-12-05 10:52:27 -08:00

The behavior of the zot registry is controlled via its configuration file, which can be either a JSON file (used in the examples below) or a YAML file.

zot serve <config-file>

A candidate configuration file can be verified via:

zot verify <config-file>

Examples of working configurations for various use cases are available in this directory.

Configuration Parameters

Network

Configure network params with:

"http": {

Configure address and port to listen on with:

        "address": "127.0.0.1",
        "port": "5000",

Additionally, TLS configuration can be specified with:

        "tls": {
            "cert":"test/data/server.cert",
            "key":"test/data/server.key"
        },

Storage

Configure storage with:

"storage": {

Configure storage root directory with:

        "rootDirectory": "/tmp/zot",

Container images often share layers and blobs. On filesystems that support hard links, inline deduplication can be enabled with:

        "dedupe": true,

Deleting an image (either by tag or by reference) can leave orphaned blobs that waste storage. Background garbage collection can be enabled with:

        "gc": true,

It is also possible to store and serve images from multiple filesystems, each with its own repository paths, dedupe and garbage collection settings, with:

        "subPaths": {
            "/a": {
                "rootDirectory": "/tmp/zot1",
                "dedupe": true,
                "gc": true
            },
            "/b": {
                "rootDirectory": "/tmp/zot2",
                "dedupe": true
            },
            "/c": {
                "rootDirectory": "/tmp/zot3",
                "dedupe": false
            }
        }
    },
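
As a rough illustration of how the subPaths example above partitions repositories, the sketch below picks a root directory from the leading path component of a repository name. The function name and the matching rule (an exact route or a route followed by "/") are assumptions made for this example, not zot's actual routing code.

```python
from typing import Dict

# Values copied from the storage example above.
DEFAULT_ROOT = "/tmp/zot"
SUB_PATHS: Dict[str, str] = {
    "/a": "/tmp/zot1",
    "/b": "/tmp/zot2",
    "/c": "/tmp/zot3",
}


def root_for_repo(repo: str) -> str:
    """Return the root directory assumed to hold `repo` (hypothetical helper)."""
    path = "/" + repo.strip("/")
    for route, root in SUB_PATHS.items():
        # Assumption: a repo belongs to a subpath when its name starts with that route.
        if path == route or path.startswith(route + "/"):
            return root
    return DEFAULT_ROOT
```

Under these assumptions, root_for_repo("a/alpine") selects /tmp/zot1, while a repository such as "alpine" falls back to the top-level rootDirectory.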

Authentication

TLS mutual authentication and passphrase-based authentication are supported.

TLS Mutual Authentication

Apart from the server cert and key specified under network configuration, specifying the cacert field enables TLS mutual authentication:

"http": {
    "tls": {
      "cert":"test/data/server.cert",
      "key":"test/data/server.key",
      "cacert":"test/data/cacert.cert"
    },

Passphrase Authentication

Local authentication is supported via htpasswd file with:

  "http": {
    "auth": {
      "htpasswd": {
        "path": "test/data/htpasswd"
      },

LDAP authentication can be configured with:

  "http": {
    "auth": {
      "ldap": {
        "address":"ldap.example.org",
        "port":389,
        "startTLS":false,
        "baseDN":"ou=Users,dc=example,dc=org",
        "userAttribute":"uid",
        "bindDN":"cn=ldap-searcher,ou=Users,dc=example,dc=org",
        "bindPassword":"ldap-searcher-password",
        "skipVerify":false,
        "subtreeSearch":true
      },

NOTE: When both htpasswd and LDAP configuration are specified, LDAP authentication is given preference.

OAuth2 authentication (client credentials grant type) is supported via a Bearer token, configured with:

  "http": {
    "auth": {
      "bearer": {
        "realm": "https://auth.myreg.io/auth/token",
        "service": "myauth",
        "cert": "/etc/zot/auth.crt"
      }

Authentication Failures

Should authentication fail, to prevent automated attacks, a delayed response can be configured with:

  "http": {
    "auth": {
      "failDelay": 5

Identity-based Authorization

Allowing actions on one or more repository paths can be tied to user identities. Two additional per-repository policies can be specified for identities not in the whitelist:

  • anonymousPolicy - applied for unauthenticated users.
  • defaultPolicy - applied for authenticated users.

Furthermore, a global admin policy can also be specified which can override per-repository policies.

Glob patterns can also be used as repository paths.

Authorization is granted based on the longest matching path. For example, the repos2/repo repository matches both the "**" and "repos2/repo" keys; in that case the repos2/repo policy is used because its key is longer.

Because longest path matching is used, a separate mechanism is needed for a global policy that overrides all the others. A policy under "**" matches all repositories, but any other matching policy takes precedence over it because its key is longer. This is why an adminPolicy can be specified.

Basically '**' means repositories not matched by any other per-repository policy.
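
The longest-match rule can be sketched as follows. This is a minimal illustration, assuming that "**" matches across "/" and that a single "*" stays within one path component; the helper names and the hand-written glob translation are assumptions for this example and not the matcher zot uses.

```python
import re
from typing import Iterable, Optional


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob where '**' crosses '/' and '*' does not (assumed semantics)."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def longest_match(repo: str, patterns: Iterable[str]) -> Optional[str]:
    """Return the longest pattern key that matches `repo`, or None."""
    matches = [p for p in patterns if glob_to_regex(p).fullmatch(repo)]
    return max(matches, key=len, default=None)
```

With the illustrative keys "**", "tmp/**", "infra/*" and "repos2/repo", the repository repos2/repo resolves to the "repos2/repo" key, and a repository that matches nothing more specific resolves to "**".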

Method-based action list:

  • "read" - list/pull images
  • "create" - push images (needs "read")
  • "update" - overwrite tags (needs "read" and "create")
  • "delete" - delete images (needs "read")

Behaviour-based action list:

  • "detectManifestCollision" - delete manifest by digest will throw an error if multiple manifests have the same digest (needs "read" and "delete")
"accessControl": {
    "**": {                                                    # matches all repos (which are not matched by any other per-repository policy)
      "policies": [                                            # user based policies
        {
          "users": ["charlie"],
          "actions": ["read", "create", "update"]
        }
      ],
      "defaultPolicy": ["read", "create", "delete", "detectManifestCollision"], # default policy applied to authenticated users other than "charlie", so these users can read/create/delete repositories and also detect manifest collisions
      "anonymousPolicy": ["read"]                               # anonymous policy which is applied for unauthenticated users => so they can read repositories
    },
    "tmp/**": {                                                # matches all repos under tmp/ recursively
      "defaultPolicy": ["read", "create", "update"]            # so all users have read/create/update on all repos under tmp/ eg: tmp/infra/repo
    },
    "infra/*": {                                               # matches all repos directly under infra/ (not recursively)
        "policies": [
          {
              "users": ["alice", "bob"],
              "actions": ["create", "read", "update", "delete"]
          },
          {
              "users": ["mallory"],
              "actions": ["create", "read"]
          }
        ],
        "defaultPolicy": ["read"]
    },
    "repos2/repo": {                                           # matches only repos2/repo repository
        "policies": [
          {
              "users": ["bob"],
              "actions": ["read", "create"]
          },
          {
              "users": ["mallory"],
              "actions": ["create", "read"]
          }
        ],
        "defaultPolicy": ["read"]
    },
    "adminPolicy": {                                            # global admin policy (overrides per-repo policy)
        "users": ["admin"],
        "actions": ["read", "create", "update", "delete"]
    }
}
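
To make the example above concrete, the sketch below evaluates a request against the "infra/*" policy and the global adminPolicy. The evaluation order (anonymous policy, then admin policy, then user policies, then default policy for users not listed anywhere) is an assumption inferred from the description in this section; the function name is hypothetical and this is not zot's implementation.

```python
from typing import Optional

# Policy for "infra/*" and the global admin policy, copied from the example above.
REPO_POLICY = {
    "policies": [
        {"users": ["alice", "bob"], "actions": ["create", "read", "update", "delete"]},
        {"users": ["mallory"], "actions": ["create", "read"]},
    ],
    "defaultPolicy": ["read"],
}
ADMIN_POLICY = {"users": ["admin"], "actions": ["read", "create", "update", "delete"]}


def is_allowed(user: Optional[str], action: str) -> bool:
    """Decide whether `user` may perform `action` (assumed evaluation order)."""
    if user is None:
        # Unauthenticated request: only anonymousPolicy applies ("infra/*" has none).
        return action in REPO_POLICY.get("anonymousPolicy", [])
    if user in ADMIN_POLICY["users"] and action in ADMIN_POLICY["actions"]:
        return True
    listed = [p for p in REPO_POLICY["policies"] if user in p["users"]]
    if listed:
        # Users named in a policy get exactly the actions listed for them.
        return any(action in p["actions"] for p in listed)
    # Authenticated users not named in any policy fall back to defaultPolicy.
    return action in REPO_POLICY.get("defaultPolicy", [])
```

Under these assumptions, alice may delete images under infra/*, mallory may push but not delete, any other authenticated user may only read, and unauthenticated requests are rejected.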

Logging

Enable and configure logging with:

"log":{

Set log level with:

    "level":"debug",

Set output file (default is stdout) with:

    "output":"/tmp/zot.log",

Enable audit logs and set output file with:

    "audit": "/tmp/zot-audit.log"
  }
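
The http, storage and log fragments shown so far are pieces of a single JSON document. The sketch below assembles them into one object and renders it, to show how the fragments nest. The helper name is hypothetical, and whether zot requires further top-level keys is not covered here, so the result should still be checked with zot verify.

```python
import json

# Values copied from the fragments above.
config = {
    "http": {
        "address": "127.0.0.1",
        "port": "5000",
        "tls": {
            "cert": "test/data/server.cert",
            "key": "test/data/server.key",
        },
    },
    "storage": {
        "rootDirectory": "/tmp/zot",
        "dedupe": True,
        "gc": True,
    },
    "log": {
        "level": "debug",
        "output": "/tmp/zot.log",
        "audit": "/tmp/zot-audit.log",
    },
}


def render_config(cfg: dict) -> str:
    """Serialize the configuration as indented JSON text."""
    return json.dumps(cfg, indent=4)


text = render_config(config)
```

The rendered text can be saved to a file and passed to zot verify <config-file> before serving.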

Metrics

Enable and configure metrics with:

"metrics":{
    "enable": true,

Set server path on which metrics will be exposed:

    "prometheus": {
      "path": "/metrics"
    }
}

In order to test the Metrics feature locally in a Kind cluster, follow this guide.

Storage Drivers

Besides the filesystem storage backend, zot also supports an S3 storage backend. See the link below for how to configure it:

  • s3: A driver storing objects in an Amazon Simple Storage Service (S3) bucket.

For an s3 zot configuration with multiple storage drivers see: s3-config.

zot also supports different storage drivers for each subpath.

Specifying S3 credentials

  • Config file:
    "storage": {
        "rootDirectory": "/tmp/zot",  # local path used to store dedupe cache database
        "dedupe": true,
        "storageDriver": {
            "name": "s3",
            "rootdirectory": "/zot",  # this is a prefix that is applied to all S3 keys to allow you to segment data in your bucket if necessary.
            "region": "us-east-2",
            "bucket": "zot-storage",
            "secure": true,
            "skipverify": false,
            "accesskey": "<YOUR_ACCESS_KEY_ID>",
            "secretkey": "<YOUR_SECRET_ACCESS_KEY>"
        }

There are multiple ways to specify S3 credentials besides the config file:

  • Environment variables:

The SDK looks for credentials in the following environment variables:

    AWS_ACCESS_KEY_ID
    AWS_SECRET_ACCESS_KEY
    AWS_SESSION_TOKEN (optional)
  • Credentials file:

A credentials file is a plaintext file that contains your access keys. The file must be on the same machine on which you're running your application. The file must be named credentials and located in the .aws/ folder in your home directory.

    [default]
    aws_access_key_id = <YOUR_DEFAULT_ACCESS_KEY_ID>
    aws_secret_access_key = <YOUR_DEFAULT_SECRET_ACCESS_KEY>

    [test-account]
    aws_access_key_id = <YOUR_TEST_ACCESS_KEY_ID>
    aws_secret_access_key = <YOUR_TEST_SECRET_ACCESS_KEY>

    [prod-account]
    ; work profile
    aws_access_key_id = <YOUR_PROD_ACCESS_KEY_ID>
    aws_secret_access_key = <YOUR_PROD_SECRET_ACCESS_KEY>

The [default] heading defines credentials for the default profile, which the SDK will use unless you configure it to use another profile.

To specify a profile, use the AWS_PROFILE environment variable:

AWS_PROFILE=test-account

For more details see https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/configuring-sdk.html#specifying-credentials
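
The lookup described above (environment variables first, then a profile from the credentials file) can be sketched as follows. This is only an illustration of the order, with placeholder key values and a hypothetical function name; the real resolution is done by the AWS SDK, and credentials given in the zot config file are not modeled here.

```python
import configparser
from typing import Dict, Mapping, Optional

# Placeholder content in the same layout as the credentials file shown above.
SAMPLE_CREDENTIALS = """\
[default]
aws_access_key_id = DEFAULT_ID
aws_secret_access_key = DEFAULT_SECRET

[test-account]
aws_access_key_id = TEST_ID
aws_secret_access_key = TEST_SECRET
"""


def resolve_credentials(env: Mapping[str, str], credentials_text: str) -> Optional[Dict[str, str]]:
    """Return credentials from the environment, else from the selected profile."""
    # 1. Explicit environment variables win.
    if "AWS_ACCESS_KEY_ID" in env and "AWS_SECRET_ACCESS_KEY" in env:
        creds = {
            "access_key": env["AWS_ACCESS_KEY_ID"],
            "secret_key": env["AWS_SECRET_ACCESS_KEY"],
        }
        if "AWS_SESSION_TOKEN" in env:  # optional
            creds["session_token"] = env["AWS_SESSION_TOKEN"]
        return creds
    # 2. Otherwise read the profile named by AWS_PROFILE (or "default").
    parser = configparser.ConfigParser()
    parser.read_string(credentials_text)
    profile = env.get("AWS_PROFILE", "default")
    if not parser.has_section(profile):
        return None
    section = parser[profile]
    return {
        "access_key": section["aws_access_key_id"],
        "secret_key": section["aws_secret_access_key"],
    }
```

For example, passing {"AWS_PROFILE": "test-account"} as the environment selects the [test-account] section, and an empty environment selects [default].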

Sync

Enable and configure sync with:

		"sync": {

Configure credentials for upstream registries:

			"credentialsFile": "./examples/sync-auth-filepath.json",

Configure each registry sync:

			"registries": [{
				"urls": ["https://registry1:5000"],
				"onDemand": false,                  # pull any image which the local registry doesn't have
				"pollInterval": "6h",               # polling interval; if not set, periodic polling will not run
				"tlsVerify": true,                  # whether or not to verify tls (default is true)
				"certDir": "/home/user/certs",      # use certificates at certDir path, if not specified then use the default certs dir
				"maxRetries": 5,                    # maxRetries in case of temporary errors (default: no retries)
				"retryDelay": "10m",                # delay between retries; retry options apply to both on-demand and periodic sync, and retryDelay is mandatory when using maxRetries
				"onlySigned": true,                 # sync only signed images (either notary or cosign)
				"content":[                         # content to pull periodically, also used to filter on-demand images; if not set, periodic polling will not run
					{
						"prefix":"/repo1/repo",         # pull image repo1/repo
						"tags":{                        # filter by tags
							"regex":"4.*",                # filter tags by regex
							"semver":true                 # filter tags by semver compliance
						}
					},
					{
						"prefix":"/repo2/repo*"         # pull all images that match repo2/repo*
					},
					{
						"prefix":"/repo3/**"            # pull all images under repo3/ (matches recursively all repos under repo3/)
					},
					{
						"prefix":"/repo1/repo",         # pull /repo1/repo
						"destination":"/localrepo",     # put /repo1/repo under /localrepo
						"stripPrefix":true              # strip the path specified in "prefix": if true the result is /localrepo, if false the result is /localrepo/repo1/repo
					},
					{
						"prefix":"/repo1/**",           # pull all images under repo1/ (matches recursively all repos under repo1/)
						"destination":"/localrepo",     # put all images found under /localrepo
						"stripPrefix":true              # strip the path specified in "prefix" up to meta-characters like "**"; if we match /repo1/repo, the local repo will be /localrepo/repo
					}
				]
			},
			{
				"urls": ["https://registry2:5000", "https://registry3:5000"], # specify multiple URLs in case the first encounters an error
				"pollInterval": "12h",
				"tlsVerify": false,
				"onDemand": false,
				"content":[
					{
						"prefix":"/repo2",
						"tags":{
							"semver":true
						}
					}
				]
			},
			{
				"urls": ["https://docker.io/library"],
				"onDemand": true,                     # doesn't have content, don't periodically pull, pull just on demand.
				"tlsVerify": true,
				"maxRetries": 3,                      
				"retryDelay": "15m"
			}
		]
		}

Prefixes can be strings that exactly match repositories or they can be glob patterns.
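
The effect of "destination" and "stripPrefix" described in the comments above can be sketched as a small name-mapping function. The function name and the exact stripping rule (cut the prefix at the last "/" before the first glob character) are assumptions chosen to reproduce the documented results, not zot's actual code.

```python
def local_repo(upstream: str, prefix: str, destination: str, strip_prefix: bool) -> str:
    """Map an upstream repository name to its assumed local name."""
    remainder = upstream
    if strip_prefix:
        # Position of the first glob meta-character in the prefix, if any.
        hits = [i for i in (prefix.find(c) for c in "*?[") if i != -1]
        if hits:
            literal = prefix[: min(hits)]
            cut = literal.rfind("/")
            literal = literal[:cut] if cut != -1 else ""
        else:
            literal = prefix.rstrip("/")
        # Drop the literal part of the prefix from the upstream name.
        if upstream.startswith(literal):
            remainder = upstream[len(literal):]
    return destination.rstrip("/") + remainder
```

With prefix "/repo1/repo" and destination "/localrepo", the upstream repository /repo1/repo maps to /localrepo when stripPrefix is true and to /localrepo/repo1/repo when it is false. With prefix "/repo1/**", it maps to /localrepo/repo.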