2021-12-06 02:19:28 -05:00
|
|
|
// Copyright 2020 The Gitea Authors. All rights reserved.
|
2022-11-27 13:20:29 -05:00
|
|
|
// SPDX-License-Identifier: MIT
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
|
|
|
|
package archiver
|
|
|
|
|
|
|
|
import (
|
2021-12-06 02:19:28 -05:00
|
|
|
"context"
|
2021-06-23 16:12:38 -05:00
|
|
|
"errors"
|
|
|
|
"fmt"
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
"io"
|
|
|
|
"os"
|
|
|
|
"strings"
|
2021-12-06 02:19:28 -05:00
|
|
|
"time"
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
|
2021-09-19 06:49:59 -05:00
|
|
|
"code.gitea.io/gitea/models/db"
|
2021-12-06 02:19:28 -05:00
|
|
|
repo_model "code.gitea.io/gitea/models/repo"
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
"code.gitea.io/gitea/modules/git"
|
|
|
|
"code.gitea.io/gitea/modules/graceful"
|
|
|
|
"code.gitea.io/gitea/modules/log"
|
2022-01-19 18:26:57 -05:00
|
|
|
"code.gitea.io/gitea/modules/process"
|
2021-06-23 16:12:38 -05:00
|
|
|
"code.gitea.io/gitea/modules/queue"
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
"code.gitea.io/gitea/modules/setting"
|
2021-06-23 16:12:38 -05:00
|
|
|
"code.gitea.io/gitea/modules/storage"
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
)
|
|
|
|
|
|
|
|
// ArchiveRequest defines the parameters of an archive request, which notably
|
|
|
|
// includes the specific repository being archived as well as the commit, the
|
|
|
|
// name by which it was requested, and the kind of archive being requested.
|
|
|
|
// This is entirely opaque to external entities, though, and mostly used as a
|
|
|
|
// handle elsewhere.
|
|
|
|
type ArchiveRequest struct {
|
2021-06-23 16:12:38 -05:00
|
|
|
RepoID int64
|
|
|
|
refName string
|
|
|
|
Type git.ArchiveType
|
|
|
|
CommitID string
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
}
|
|
|
|
|
2021-11-17 14:47:35 -05:00
|
|
|
// ErrUnknownArchiveFormat request archive format is not supported
|
|
|
|
type ErrUnknownArchiveFormat struct {
|
|
|
|
RequestFormat string
|
|
|
|
}
|
|
|
|
|
|
|
|
// Error implements error
|
|
|
|
func (err ErrUnknownArchiveFormat) Error() string {
|
|
|
|
return fmt.Sprintf("unknown format: %s", err.RequestFormat)
|
|
|
|
}
|
|
|
|
|
|
|
|
// Is implements error
|
|
|
|
func (ErrUnknownArchiveFormat) Is(err error) bool {
|
|
|
|
_, ok := err.(ErrUnknownArchiveFormat)
|
|
|
|
return ok
|
|
|
|
}
|
|
|
|
|
2022-08-29 04:45:20 -05:00
|
|
|
// RepoRefNotFoundError is returned when a requested reference (commit, tag) was not found.
|
|
|
|
type RepoRefNotFoundError struct {
|
|
|
|
RefName string
|
|
|
|
}
|
|
|
|
|
|
|
|
// Error implements error.
|
|
|
|
func (e RepoRefNotFoundError) Error() string {
|
|
|
|
return fmt.Sprintf("unrecognized repository reference: %s", e.RefName)
|
|
|
|
}
|
|
|
|
|
|
|
|
func (e RepoRefNotFoundError) Is(err error) bool {
|
|
|
|
_, ok := err.(RepoRefNotFoundError)
|
|
|
|
return ok
|
|
|
|
}
|
|
|
|
|
2021-06-23 16:12:38 -05:00
|
|
|
// NewRequest creates an archival request, based on the URI. The
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
// resulting ArchiveRequest is suitable for being passed to ArchiveRepository()
|
|
|
|
// if it's determined that the request still needs to be satisfied.
|
2021-06-23 16:12:38 -05:00
|
|
|
func NewRequest(repoID int64, repo *git.Repository, uri string) (*ArchiveRequest, error) {
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
r := &ArchiveRequest{
|
2021-06-23 16:12:38 -05:00
|
|
|
RepoID: repoID,
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
}
|
|
|
|
|
2021-06-23 16:12:38 -05:00
|
|
|
var ext string
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
switch {
|
|
|
|
case strings.HasSuffix(uri, ".zip"):
|
2021-06-23 16:12:38 -05:00
|
|
|
ext = ".zip"
|
|
|
|
r.Type = git.ZIP
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
case strings.HasSuffix(uri, ".tar.gz"):
|
2021-06-23 16:12:38 -05:00
|
|
|
ext = ".tar.gz"
|
|
|
|
r.Type = git.TARGZ
|
2021-08-24 11:47:09 -05:00
|
|
|
case strings.HasSuffix(uri, ".bundle"):
|
|
|
|
ext = ".bundle"
|
|
|
|
r.Type = git.BUNDLE
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
default:
|
2021-11-17 14:47:35 -05:00
|
|
|
return nil, ErrUnknownArchiveFormat{RequestFormat: uri}
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
}
|
|
|
|
|
2021-06-23 16:12:38 -05:00
|
|
|
r.refName = strings.TrimSuffix(uri, ext)
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
|
|
|
|
// Get corresponding commit.
|
2023-12-13 16:02:00 -05:00
|
|
|
commitID, err := repo.ConvertToGitID(r.refName)
|
|
|
|
if err != nil {
|
2022-08-29 04:45:20 -05:00
|
|
|
return nil, RepoRefNotFoundError{RefName: r.refName}
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
}
|
|
|
|
|
2023-12-13 16:02:00 -05:00
|
|
|
r.CommitID = commitID.String()
|
2021-06-23 16:12:38 -05:00
|
|
|
return r, nil
|
|
|
|
}
|
|
|
|
|
|
|
|
// GetArchiveName returns the name of the caller, based on the ref used by the
|
|
|
|
// caller to create this request.
|
|
|
|
func (aReq *ArchiveRequest) GetArchiveName() string {
|
|
|
|
return strings.ReplaceAll(aReq.refName, "/", "-") + "." + aReq.Type.String()
|
|
|
|
}
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
|
2022-08-29 04:45:20 -05:00
|
|
|
// Await awaits the completion of an ArchiveRequest. If the archive has
|
|
|
|
// already been prepared the method returns immediately. Otherwise an archiver
|
|
|
|
// process will be started and its completion awaited. On success the returned
|
|
|
|
// RepoArchiver may be used to download the archive. Note that even if the
|
|
|
|
// context is cancelled/times out a started archiver will still continue to run
|
|
|
|
// in the background.
|
|
|
|
func (aReq *ArchiveRequest) Await(ctx context.Context) (*repo_model.RepoArchiver, error) {
|
|
|
|
archiver, err := repo_model.GetRepoArchiver(ctx, aReq.RepoID, aReq.Type, aReq.CommitID)
|
|
|
|
if err != nil {
|
2022-10-24 14:29:17 -05:00
|
|
|
return nil, fmt.Errorf("models.GetRepoArchiver: %w", err)
|
2022-08-29 04:45:20 -05:00
|
|
|
}
|
|
|
|
|
|
|
|
if archiver != nil && archiver.Status == repo_model.ArchiverReady {
|
|
|
|
// Archive already generated, we're done.
|
|
|
|
return archiver, nil
|
|
|
|
}
|
|
|
|
|
|
|
|
if err := StartArchive(aReq); err != nil {
|
2022-10-24 14:29:17 -05:00
|
|
|
return nil, fmt.Errorf("archiver.StartArchive: %w", err)
|
2022-08-29 04:45:20 -05:00
|
|
|
}
|
|
|
|
|
|
|
|
poll := time.NewTicker(time.Second * 1)
|
|
|
|
defer poll.Stop()
|
|
|
|
|
|
|
|
for {
|
|
|
|
select {
|
|
|
|
case <-graceful.GetManager().HammerContext().Done():
|
|
|
|
// System stopped.
|
|
|
|
return nil, graceful.GetManager().HammerContext().Err()
|
|
|
|
case <-ctx.Done():
|
|
|
|
return nil, ctx.Err()
|
|
|
|
case <-poll.C:
|
|
|
|
archiver, err = repo_model.GetRepoArchiver(ctx, aReq.RepoID, aReq.Type, aReq.CommitID)
|
|
|
|
if err != nil {
|
2022-10-24 14:29:17 -05:00
|
|
|
return nil, fmt.Errorf("repo_model.GetRepoArchiver: %w", err)
|
2022-08-29 04:45:20 -05:00
|
|
|
}
|
|
|
|
if archiver != nil && archiver.Status == repo_model.ArchiverReady {
|
|
|
|
return archiver, nil
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2023-10-10 23:24:07 -05:00
|
|
|
func doArchive(ctx context.Context, r *ArchiveRequest) (*repo_model.RepoArchiver, error) {
|
|
|
|
txCtx, committer, err := db.TxContext(ctx)
|
2020-11-27 21:42:08 -05:00
|
|
|
if err != nil {
|
2021-06-23 16:12:38 -05:00
|
|
|
return nil, err
|
2020-11-27 21:42:08 -05:00
|
|
|
}
|
2021-09-19 06:49:59 -05:00
|
|
|
defer committer.Close()
|
2022-01-19 18:26:57 -05:00
|
|
|
ctx, _, finished := process.GetManager().AddContext(txCtx, fmt.Sprintf("ArchiveRequest[%d]: %s", r.RepoID, r.GetArchiveName()))
|
|
|
|
defer finished()
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
|
2021-12-06 02:19:28 -05:00
|
|
|
archiver, err := repo_model.GetRepoArchiver(ctx, r.RepoID, r.Type, r.CommitID)
|
2020-11-27 21:42:08 -05:00
|
|
|
if err != nil {
|
2021-06-23 16:12:38 -05:00
|
|
|
return nil, err
|
2020-11-27 21:42:08 -05:00
|
|
|
}
|
2021-06-23 16:12:38 -05:00
|
|
|
|
|
|
|
if archiver != nil {
|
|
|
|
// FIXME: If another process are generating it, we think it's not ready and just return
|
|
|
|
// Or we should wait until the archive generated.
|
2021-12-06 02:19:28 -05:00
|
|
|
if archiver.Status == repo_model.ArchiverGenerating {
|
2021-06-23 16:12:38 -05:00
|
|
|
return nil, nil
|
|
|
|
}
|
|
|
|
} else {
|
2021-12-06 02:19:28 -05:00
|
|
|
archiver = &repo_model.RepoArchiver{
|
2021-06-23 16:12:38 -05:00
|
|
|
RepoID: r.RepoID,
|
|
|
|
Type: r.Type,
|
|
|
|
CommitID: r.CommitID,
|
2021-12-06 02:19:28 -05:00
|
|
|
Status: repo_model.ArchiverGenerating,
|
2021-06-23 16:12:38 -05:00
|
|
|
}
|
2023-12-25 15:25:29 -05:00
|
|
|
if err := db.Insert(ctx, archiver); err != nil {
|
2021-06-23 16:12:38 -05:00
|
|
|
return nil, err
|
|
|
|
}
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
}
|
|
|
|
|
2022-08-29 04:45:20 -05:00
|
|
|
rPath := archiver.RelativePath()
|
2021-06-23 16:12:38 -05:00
|
|
|
_, err = storage.RepoArchives.Stat(rPath)
|
|
|
|
if err == nil {
|
2021-12-06 02:19:28 -05:00
|
|
|
if archiver.Status == repo_model.ArchiverGenerating {
|
|
|
|
archiver.Status = repo_model.ArchiverReady
|
|
|
|
if err = repo_model.UpdateRepoArchiverStatus(ctx, archiver); err != nil {
|
2021-09-25 16:29:25 -05:00
|
|
|
return nil, err
|
|
|
|
}
|
2021-06-23 16:12:38 -05:00
|
|
|
}
|
2021-09-25 16:29:25 -05:00
|
|
|
return archiver, committer.Commit()
|
2021-06-23 16:12:38 -05:00
|
|
|
}
|
|
|
|
|
|
|
|
if !errors.Is(err, os.ErrNotExist) {
|
2022-10-24 14:29:17 -05:00
|
|
|
return nil, fmt.Errorf("unable to stat archive: %w", err)
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
}
|
2021-06-23 16:12:38 -05:00
|
|
|
|
|
|
|
rd, w := io.Pipe()
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
defer func() {
|
2021-06-23 16:12:38 -05:00
|
|
|
w.Close()
|
|
|
|
rd.Close()
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
}()
|
2022-04-26 18:22:26 -05:00
|
|
|
done := make(chan error, 1) // Ensure that there is some capacity which will ensure that the goroutine below can always finish
|
2022-12-02 21:48:26 -05:00
|
|
|
repo, err := repo_model.GetRepositoryByID(ctx, archiver.RepoID)
|
2021-06-23 16:12:38 -05:00
|
|
|
if err != nil {
|
2022-10-24 14:29:17 -05:00
|
|
|
return nil, fmt.Errorf("archiver.LoadRepo failed: %w", err)
|
2021-06-23 16:12:38 -05:00
|
|
|
}
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
|
2022-03-29 14:13:41 -05:00
|
|
|
gitRepo, err := git.OpenRepository(ctx, repo.RepoPath())
|
2021-06-23 16:12:38 -05:00
|
|
|
if err != nil {
|
|
|
|
return nil, err
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
}
|
2021-06-23 16:12:38 -05:00
|
|
|
defer gitRepo.Close()
|
|
|
|
|
2021-12-06 02:19:28 -05:00
|
|
|
go func(done chan error, w *io.PipeWriter, archiver *repo_model.RepoArchiver, gitRepo *git.Repository) {
|
2021-06-23 16:12:38 -05:00
|
|
|
defer func() {
|
|
|
|
if r := recover(); r != nil {
|
|
|
|
done <- fmt.Errorf("%v", r)
|
|
|
|
}
|
|
|
|
}()
|
|
|
|
|
2021-08-24 11:47:09 -05:00
|
|
|
if archiver.Type == git.BUNDLE {
|
|
|
|
err = gitRepo.CreateBundle(
|
2022-01-19 18:26:57 -05:00
|
|
|
ctx,
|
2021-08-24 11:47:09 -05:00
|
|
|
archiver.CommitID,
|
|
|
|
w,
|
|
|
|
)
|
|
|
|
} else {
|
|
|
|
err = gitRepo.CreateArchive(
|
2022-01-19 18:26:57 -05:00
|
|
|
ctx,
|
2021-08-24 11:47:09 -05:00
|
|
|
archiver.Type,
|
|
|
|
w,
|
|
|
|
setting.Repository.PrefixArchiveFiles,
|
|
|
|
archiver.CommitID,
|
|
|
|
)
|
|
|
|
}
|
2021-06-23 16:12:38 -05:00
|
|
|
_ = w.CloseWithError(err)
|
|
|
|
done <- err
|
|
|
|
}(done, w, archiver, gitRepo)
|
|
|
|
|
|
|
|
// TODO: add lfs data to zip
|
|
|
|
// TODO: add submodule data to zip
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
|
2021-06-23 16:12:38 -05:00
|
|
|
if _, err := storage.RepoArchives.Save(rPath, rd, -1); err != nil {
|
2022-10-24 14:29:17 -05:00
|
|
|
return nil, fmt.Errorf("unable to write archive: %w", err)
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
}
|
2021-06-23 16:12:38 -05:00
|
|
|
|
|
|
|
err = <-done
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
if err != nil {
|
2021-06-23 16:12:38 -05:00
|
|
|
return nil, err
|
|
|
|
}
|
|
|
|
|
2021-12-06 02:19:28 -05:00
|
|
|
if archiver.Status == repo_model.ArchiverGenerating {
|
|
|
|
archiver.Status = repo_model.ArchiverReady
|
|
|
|
if err = repo_model.UpdateRepoArchiverStatus(ctx, archiver); err != nil {
|
2021-06-23 16:12:38 -05:00
|
|
|
return nil, err
|
|
|
|
}
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
}
|
|
|
|
|
2021-09-19 06:49:59 -05:00
|
|
|
return archiver, committer.Commit()
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
}
|
|
|
|
|
|
|
|
// ArchiveRepository satisfies the ArchiveRequest being passed in. Processing
|
|
|
|
// will occur in a separate goroutine, as this phase may take a while to
|
|
|
|
// complete. If the archive already exists, ArchiveRepository will not do
|
|
|
|
// anything. In all cases, the caller should be examining the *ArchiveRequest
|
|
|
|
// being returned for completion, as it may be different than the one they passed
|
|
|
|
// in.
|
2023-10-10 23:24:07 -05:00
|
|
|
func ArchiveRepository(ctx context.Context, request *ArchiveRequest) (*repo_model.RepoArchiver, error) {
|
|
|
|
return doArchive(ctx, request)
|
2021-06-23 16:12:38 -05:00
|
|
|
}
|
|
|
|
|
Rewrite queue (#24505)
# ⚠️ Breaking
Many deprecated queue config options are removed (actually, they should
have been removed in 1.18/1.19).
If you see the fatal message when starting Gitea: "Please update your
app.ini to remove deprecated config options", please follow the error
messages to remove these options from your app.ini.
Example:
```
2023/05/06 19:39:22 [E] Removed queue option: `[indexer].ISSUE_INDEXER_QUEUE_TYPE`. Use new options in `[queue.issue_indexer]`
2023/05/06 19:39:22 [E] Removed queue option: `[indexer].UPDATE_BUFFER_LEN`. Use new options in `[queue.issue_indexer]`
2023/05/06 19:39:22 [F] Please update your app.ini to remove deprecated config options
```
Many options in `[queue]` are are dropped, including:
`WRAP_IF_NECESSARY`, `MAX_ATTEMPTS`, `TIMEOUT`, `WORKERS`,
`BLOCK_TIMEOUT`, `BOOST_TIMEOUT`, `BOOST_WORKERS`, they can be removed
from app.ini.
# The problem
The old queue package has some legacy problems:
* complexity: I doubt few people could tell how it works.
* maintainability: Too many channels and mutex/cond are mixed together,
too many different structs/interfaces depends each other.
* stability: due to the complexity & maintainability, sometimes there
are strange bugs and difficult to debug, and some code doesn't have test
(indeed some code is difficult to test because a lot of things are mixed
together).
* general applicability: although it is called "queue", its behavior is
not a well-known queue.
* scalability: it doesn't seem easy to make it work with a cluster
without breaking its behaviors.
It came from some very old code to "avoid breaking", however, its
technical debt is too heavy now. It's a good time to introduce a better
"queue" package.
# The new queue package
It keeps using old config and concept as much as possible.
* It only contains two major kinds of concepts:
* The "base queue": channel, levelqueue, redis
* They have the same abstraction, the same interface, and they are
tested by the same testing code.
* The "WokerPoolQueue", it uses the "base queue" to provide "worker
pool" function, calls the "handler" to process the data in the base
queue.
* The new code doesn't do "PushBack"
* Think about a queue with many workers, the "PushBack" can't guarantee
the order for re-queued unhandled items, so in new code it just does
"normal push"
* The new code doesn't do "pause/resume"
* The "pause/resume" was designed to handle some handler's failure: eg:
document indexer (elasticsearch) is down
* If a queue is paused for long time, either the producers blocks or the
new items are dropped.
* The new code doesn't do such "pause/resume" trick, it's not a common
queue's behavior and it doesn't help much.
* If there are unhandled items, the "push" function just blocks for a
few seconds and then re-queue them and retry.
* The new code doesn't do "worker booster"
* Gitea's queue's handlers are light functions, the cost is only the
go-routine, so it doesn't make sense to "boost" them.
* The new code only use "max worker number" to limit the concurrent
workers.
* The new "Push" never blocks forever
* Instead of creating more and more blocking goroutines, return an error
is more friendly to the server and to the end user.
There are more details in code comments: eg: the "Flush" problem, the
strange "code.index" hanging problem, the "immediate" queue problem.
Almost ready for review.
TODO:
* [x] add some necessary comments during review
* [x] add some more tests if necessary
* [x] update documents and config options
* [x] test max worker / active worker
* [x] re-run the CI tasks to see whether any test is flaky
* [x] improve the `handleOldLengthConfiguration` to provide more
friendly messages
* [x] fine tune default config values (eg: length?)
## Code coverage:
![image](https://user-images.githubusercontent.com/2114189/236620635-55576955-f95d-4810-b12f-879026a3afdf.png)
2023-05-08 06:49:59 -05:00
|
|
|
var archiverQueue *queue.WorkerPoolQueue[*ArchiveRequest]
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
|
Improve queue and logger context (#24924)
Before there was a "graceful function": RunWithShutdownFns, it's mainly
for some modules which doesn't support context.
The old queue system doesn't work well with context, so the old queues
need it.
After the queue refactoring, the new queue works with context well, so,
use Golang context as much as possible, the `RunWithShutdownFns` could
be removed (replaced by RunWithCancel for context cancel mechanism), the
related code could be simplified.
This PR also fixes some legacy queue-init problems, eg:
* typo : archiver: "unable to create codes indexer queue" => "unable to
create repo-archive queue"
* no nil check for failed queues, which causes unfriendly panic
After this PR, many goroutines could have better display name:
![image](https://github.com/go-gitea/gitea/assets/2114189/701b2a9b-8065-4137-aeaa-0bda2b34604a)
![image](https://github.com/go-gitea/gitea/assets/2114189/f1d5f50f-0534-40f0-b0be-f2c9daa5fe92)
2023-05-26 02:31:55 -05:00
|
|
|
// Init initializes archiver
|
2023-10-10 23:24:07 -05:00
|
|
|
func Init(ctx context.Context) error {
|
Rewrite queue (#24505)
# ⚠️ Breaking
Many deprecated queue config options are removed (actually, they should
have been removed in 1.18/1.19).
If you see the fatal message when starting Gitea: "Please update your
app.ini to remove deprecated config options", please follow the error
messages to remove these options from your app.ini.
Example:
```
2023/05/06 19:39:22 [E] Removed queue option: `[indexer].ISSUE_INDEXER_QUEUE_TYPE`. Use new options in `[queue.issue_indexer]`
2023/05/06 19:39:22 [E] Removed queue option: `[indexer].UPDATE_BUFFER_LEN`. Use new options in `[queue.issue_indexer]`
2023/05/06 19:39:22 [F] Please update your app.ini to remove deprecated config options
```
Many options in `[queue]` are are dropped, including:
`WRAP_IF_NECESSARY`, `MAX_ATTEMPTS`, `TIMEOUT`, `WORKERS`,
`BLOCK_TIMEOUT`, `BOOST_TIMEOUT`, `BOOST_WORKERS`, they can be removed
from app.ini.
# The problem
The old queue package has some legacy problems:
* complexity: I doubt few people could tell how it works.
* maintainability: Too many channels and mutex/cond are mixed together,
too many different structs/interfaces depends each other.
* stability: due to the complexity & maintainability, sometimes there
are strange bugs and difficult to debug, and some code doesn't have test
(indeed some code is difficult to test because a lot of things are mixed
together).
* general applicability: although it is called "queue", its behavior is
not a well-known queue.
* scalability: it doesn't seem easy to make it work with a cluster
without breaking its behaviors.
It came from some very old code to "avoid breaking", however, its
technical debt is too heavy now. It's a good time to introduce a better
"queue" package.
# The new queue package
It keeps using old config and concept as much as possible.
* It only contains two major kinds of concepts:
* The "base queue": channel, levelqueue, redis
* They have the same abstraction, the same interface, and they are
tested by the same testing code.
* The "WokerPoolQueue", it uses the "base queue" to provide "worker
pool" function, calls the "handler" to process the data in the base
queue.
* The new code doesn't do "PushBack"
* Think about a queue with many workers, the "PushBack" can't guarantee
the order for re-queued unhandled items, so in new code it just does
"normal push"
* The new code doesn't do "pause/resume"
* The "pause/resume" was designed to handle some handler's failure: eg:
document indexer (elasticsearch) is down
* If a queue is paused for long time, either the producers blocks or the
new items are dropped.
* The new code doesn't do such "pause/resume" trick, it's not a common
queue's behavior and it doesn't help much.
* If there are unhandled items, the "push" function just blocks for a
few seconds and then re-queue them and retry.
* The new code doesn't do "worker booster"
* Gitea's queue's handlers are light functions, the cost is only the
go-routine, so it doesn't make sense to "boost" them.
* The new code only use "max worker number" to limit the concurrent
workers.
* The new "Push" never blocks forever
* Instead of creating more and more blocking goroutines, return an error
is more friendly to the server and to the end user.
There are more details in code comments: eg: the "Flush" problem, the
strange "code.index" hanging problem, the "immediate" queue problem.
Almost ready for review.
TODO:
* [x] add some necessary comments during review
* [x] add some more tests if necessary
* [x] update documents and config options
* [x] test max worker / active worker
* [x] re-run the CI tasks to see whether any test is flaky
* [x] improve the `handleOldLengthConfiguration` to provide more
friendly messages
* [x] fine tune default config values (eg: length?)
## Code coverage:
![image](https://user-images.githubusercontent.com/2114189/236620635-55576955-f95d-4810-b12f-879026a3afdf.png)
2023-05-08 06:49:59 -05:00
|
|
|
handler := func(items ...*ArchiveRequest) []*ArchiveRequest {
|
|
|
|
for _, archiveReq := range items {
|
2021-06-23 16:12:38 -05:00
|
|
|
log.Trace("ArchiverData Process: %#v", archiveReq)
|
2023-10-10 23:24:07 -05:00
|
|
|
if _, err := doArchive(ctx, archiveReq); err != nil {
|
Rewrite queue (#24505)
# ⚠️ Breaking
Many deprecated queue config options are removed (actually, they should
have been removed in 1.18/1.19).
If you see the fatal message when starting Gitea: "Please update your
app.ini to remove deprecated config options", please follow the error
messages to remove these options from your app.ini.
Example:
```
2023/05/06 19:39:22 [E] Removed queue option: `[indexer].ISSUE_INDEXER_QUEUE_TYPE`. Use new options in `[queue.issue_indexer]`
2023/05/06 19:39:22 [E] Removed queue option: `[indexer].UPDATE_BUFFER_LEN`. Use new options in `[queue.issue_indexer]`
2023/05/06 19:39:22 [F] Please update your app.ini to remove deprecated config options
```
Many options in `[queue]` are are dropped, including:
`WRAP_IF_NECESSARY`, `MAX_ATTEMPTS`, `TIMEOUT`, `WORKERS`,
`BLOCK_TIMEOUT`, `BOOST_TIMEOUT`, `BOOST_WORKERS`, they can be removed
from app.ini.
# The problem
The old queue package has some legacy problems:
* complexity: I doubt few people could tell how it works.
* maintainability: Too many channels and mutex/cond are mixed together,
too many different structs/interfaces depends each other.
* stability: due to the complexity & maintainability, sometimes there
are strange bugs and difficult to debug, and some code doesn't have test
(indeed some code is difficult to test because a lot of things are mixed
together).
* general applicability: although it is called "queue", its behavior is
not a well-known queue.
* scalability: it doesn't seem easy to make it work with a cluster
without breaking its behaviors.
It came from some very old code to "avoid breaking", however, its
technical debt is too heavy now. It's a good time to introduce a better
"queue" package.
# The new queue package
It keeps using old config and concept as much as possible.
* It only contains two major kinds of concepts:
* The "base queue": channel, levelqueue, redis
* They have the same abstraction, the same interface, and they are
tested by the same testing code.
* The "WokerPoolQueue", it uses the "base queue" to provide "worker
pool" function, calls the "handler" to process the data in the base
queue.
* The new code doesn't do "PushBack"
* Think about a queue with many workers, the "PushBack" can't guarantee
the order for re-queued unhandled items, so in new code it just does
"normal push"
* The new code doesn't do "pause/resume"
* The "pause/resume" was designed to handle some handler's failure: eg:
document indexer (elasticsearch) is down
* If a queue is paused for long time, either the producers blocks or the
new items are dropped.
* The new code doesn't do such "pause/resume" trick, it's not a common
queue's behavior and it doesn't help much.
* If there are unhandled items, the "push" function just blocks for a
few seconds and then re-queue them and retry.
* The new code doesn't do "worker booster"
* Gitea's queue's handlers are light functions, the cost is only the
go-routine, so it doesn't make sense to "boost" them.
* The new code only use "max worker number" to limit the concurrent
workers.
* The new "Push" never blocks forever
* Instead of creating more and more blocking goroutines, return an error
is more friendly to the server and to the end user.
There are more details in code comments: eg: the "Flush" problem, the
strange "code.index" hanging problem, the "immediate" queue problem.
Almost ready for review.
TODO:
* [x] add some necessary comments during review
* [x] add some more tests if necessary
* [x] update documents and config options
* [x] test max worker / active worker
* [x] re-run the CI tasks to see whether any test is flaky
* [x] improve the `handleOldLengthConfiguration` to provide more
friendly messages
* [x] fine tune default config values (eg: length?)
## Code coverage:
![image](https://user-images.githubusercontent.com/2114189/236620635-55576955-f95d-4810-b12f-879026a3afdf.png)
2023-05-08 06:49:59 -05:00
|
|
|
log.Error("Archive %v failed: %v", archiveReq, err)
|
2021-06-23 16:12:38 -05:00
|
|
|
}
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
}
|
2022-01-22 16:22:14 -05:00
|
|
|
return nil
|
2021-06-23 16:12:38 -05:00
|
|
|
}
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
|
Improve queue and logger context (#24924)
Before there was a "graceful function": RunWithShutdownFns, it's mainly
for some modules which doesn't support context.
The old queue system doesn't work well with context, so the old queues
need it.
After the queue refactoring, the new queue works with context well, so,
use Golang context as much as possible, the `RunWithShutdownFns` could
be removed (replaced by RunWithCancel for context cancel mechanism), the
related code could be simplified.
This PR also fixes some legacy queue-init problems, eg:
* typo : archiver: "unable to create codes indexer queue" => "unable to
create repo-archive queue"
* no nil check for failed queues, which causes unfriendly panic
After this PR, many goroutines could have better display name:
![image](https://github.com/go-gitea/gitea/assets/2114189/701b2a9b-8065-4137-aeaa-0bda2b34604a)
![image](https://github.com/go-gitea/gitea/assets/2114189/f1d5f50f-0534-40f0-b0be-f2c9daa5fe92)
2023-05-26 02:31:55 -05:00
|
|
|
archiverQueue = queue.CreateUniqueQueue(graceful.GetManager().ShutdownContext(), "repo-archive", handler)
|
2021-06-23 16:12:38 -05:00
|
|
|
if archiverQueue == nil {
|
Improve queue and logger context (#24924)
Before there was a "graceful function": RunWithShutdownFns, it's mainly
for some modules which doesn't support context.
The old queue system doesn't work well with context, so the old queues
need it.
After the queue refactoring, the new queue works with context well, so,
use Golang context as much as possible, the `RunWithShutdownFns` could
be removed (replaced by RunWithCancel for context cancel mechanism), the
related code could be simplified.
This PR also fixes some legacy queue-init problems, eg:
* typo : archiver: "unable to create codes indexer queue" => "unable to
create repo-archive queue"
* no nil check for failed queues, which causes unfriendly panic
After this PR, many goroutines could have better display name:
![image](https://github.com/go-gitea/gitea/assets/2114189/701b2a9b-8065-4137-aeaa-0bda2b34604a)
![image](https://github.com/go-gitea/gitea/assets/2114189/f1d5f50f-0534-40f0-b0be-f2c9daa5fe92)
2023-05-26 02:31:55 -05:00
|
|
|
return errors.New("unable to create repo-archive queue")
|
2021-06-23 16:12:38 -05:00
|
|
|
}
|
Improve queue and logger context (#24924)
Before there was a "graceful function": RunWithShutdownFns, it's mainly
for some modules which doesn't support context.
The old queue system doesn't work well with context, so the old queues
need it.
After the queue refactoring, the new queue works with context well, so,
use Golang context as much as possible, the `RunWithShutdownFns` could
be removed (replaced by RunWithCancel for context cancel mechanism), the
related code could be simplified.
This PR also fixes some legacy queue-init problems, eg:
* typo : archiver: "unable to create codes indexer queue" => "unable to
create repo-archive queue"
* no nil check for failed queues, which causes unfriendly panic
After this PR, many goroutines could have better display name:
![image](https://github.com/go-gitea/gitea/assets/2114189/701b2a9b-8065-4137-aeaa-0bda2b34604a)
![image](https://github.com/go-gitea/gitea/assets/2114189/f1d5f50f-0534-40f0-b0be-f2c9daa5fe92)
2023-05-26 02:31:55 -05:00
|
|
|
go graceful.GetManager().RunWithCancel(archiverQueue)
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
|
2021-06-23 16:12:38 -05:00
|
|
|
return nil
|
|
|
|
}
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
|
2021-06-23 16:12:38 -05:00
|
|
|
// StartArchive push the archive request to the queue
|
|
|
|
func StartArchive(request *ArchiveRequest) error {
|
|
|
|
has, err := archiverQueue.Has(request)
|
|
|
|
if err != nil {
|
|
|
|
return err
|
|
|
|
}
|
|
|
|
if has {
|
|
|
|
return nil
|
|
|
|
}
|
|
|
|
return archiverQueue.Push(request)
|
[RFC] Make archival asynchronous (#11296)
* Make archival asynchronous
The prime benefit being sought here is for large archives to not
clog up the rendering process and cause unsightly proxy timeouts.
As a secondary benefit, archive-in-progress is moved out of the
way into a /tmp file so that new archival requests for the same
commit will not get fulfilled based on an archive that isn't yet
finished.
This asynchronous system is fairly primitive; request comes in, we'll
spawn off a new goroutine to handle it, then we'll mark it as done.
Status requests will see if the file exists in the final location,
and report the archival as done when it exists.
Fixes #11265
* Archive links: drop initial delay to three-quarters of a second
Some, or perhaps even most, archives will not take all that long to archive.
The archive process starts as soon as the download button is initially
clicked, so in theory they could be done quite quickly. Drop the initial
delay down to three-quarters of a second to make it more responsive in the
common case of the archive being quickly created.
* archiver: restructure a little bit to facilitate testing
This introduces two sync.Cond pointers to the archiver package. If they're
non-nil when we go to process a request, we'll wait until signalled (at all)
to proceed. The tests will then create the sync.Cond so that it can signal
at-will and sanity-check the state of the queue at different phases.
The author believes that nil-checking these two sync.Cond pointers on every
archive processing will introduce minimal overhead with no impact on
maintainability.
* gofmt nit: no space around binary + operator
* services: archiver: appease golangci-lint, lock queueMutex
Locking/unlocking the queueMutex is allowed, but not required, for
Cond.Signal() and Cond.Broadcast(). The magic at play here is just a little
too much for golangci-lint, as we take the address of queueMutex and this is
mostly used in archiver.go; the variable still gets flagged as unused.
* archiver: tests: fix several timing nits
Once we've signaled a cond var, it may take some small amount of time for
the goroutines released to hit the spot we're wanting them to be at. Give
them an appropriate amount of time.
* archiver: tests: no underscore in var name, ungh
* archiver: tests: Test* is run in a separate context than TestMain
We must setup the mutex/cond variables at the beginning of any test that's
going to use it, or else these will be nil when the test is actually ran.
* archiver: tests: hopefully final tweak
Things got shuffled around such that we carefully build up and release
requests from the queue, so we can validate the state of the queue at each
step. Fix some assertions that no longer hold true as fallout.
* repo: Download: restore some semblance of previous behavior
When archival was made async, the GET endpoint was only useful if a previous
POST had initiated the download. This commit restores the previous behavior,
to an extent; we'll now submit the archive request there and return a
"202 Accepted" to indicate that it's processing if we didn't manage to
complete the request within ~2 seconds of submission.
This lets a client directly GET the archive, and gives them some indication
that they may attempt to GET it again at a later time.
* archiver: tests: simplify a bit further
We don't need to risk failure and use time.ParseDuration to get 2 *
time.Second.
else if isn't really necessary if the conditions are simple enough and lead
to the same result.
* archiver: tests: resolve potential source of flakiness
Increase all timeouts to 10 seconds; these aren't hard-coded sleeps, so
there's no guarantee we'll actually take that long. If we need longer to
not have a false-positive, then so be it.
While here, various assert.{Not,}Equal arguments are flipped around so that
the wording in error output reflects reality, where the expected argument is
second and actual third.
* archiver: setup infrastructure for notifying consumers of completion
This API will *not* allow consumers to subscribe to specific requests being
completed, just *any* request being completed. The caller is responsible for
determining if their request is satisfied and waiting again if needed.
* repo: archive: make GET endpoint synchronous again
If the request isn't complete, this endpoint will now submit the request and
wait for completion using the new API. This may still be susceptible to
timeouts for larger repos, but other endpoints now exist that the web
interface will use to negotiate its way through larger archive processes.
* archiver: tests: amend test to include WaitForCompletion()
This is a trivial one, so go ahead and include it.
* archiver: tests: fix test by calling NewContext()
The mutex is otherwise uninitialized, so we need to ensure that we're
actually initializing it if we plan to test it.
* archiver: tests: integrate new WaitForCompletion a little better
We can use this to wait for archives to come in, rather than spinning and
hoping with a timeout.
* archiver: tests: combine numQueued declaration with next-instruction assignment
* routers: repo: reap unused archiving flag from DownloadStatus()
This had some planned usage before, indicating whether this request
initiated the archival process or not. After several rounds of refactoring,
this use was deemed not necessary for much of anything and got boiled down
to !complete in all cases.
* services: archiver: restructure to use a channel
We now offer two forms of waiting for a request:
- WaitForCompletion: wait for completion with no timeout
- TimedWaitForCompletion: wait for completion with timeout
In both cases, we wait for the given request's cchan to close; in the latter
case, we do so with the caller-provided timeout. This completely removes the
need for busy-wait loops in Download/InitiateDownload, as it's fairly clean
to wait on a channel with timeout.
* services: archiver: use defer to unlock now that we can
This previously carried the lock into the goroutine, but an intermediate
step just added the request to archiveInProgress outside of the new
goroutine and removed the need for the goroutine to start out with it.
* Revert "archiver: tests: combine numQueued declaration with next-instruction assignment"
This reverts commit bcc52140238e16680f2e05e448e9be51372afdf5.
Revert "archiver: tests: integrate new WaitForCompletion a little better"
This reverts commit 9fc8bedb5667d24d3a3c7843dc28a229efffb1e6.
Revert "archiver: tests: fix test by calling NewContext()"
This reverts commit 709c35685eaaf261ebbb7d3420e3376a4ee8e7f2.
Revert "archiver: tests: amend test to include WaitForCompletion()"
This reverts commit 75261f56bc05d1fa8ff7e81dcbc0ccd93fdc9d50.
* archiver: tests: first attempt at WaitForCompletion() tests
* archiver: tests: slight improvement, less busy-loop
Just wait for the requests to complete in order, instead of busy-waiting
with a timeout. This is slightly less fragile.
While here, reverse the arguments of a nearby assert.Equal() so that
expected/actual are correct in any test output.
* archiver: address lint nits
* services: archiver: only close the channel once
* services: archiver: use a struct{} for the wait channel
This makes it obvious that the channel is only being used as a signal,
rather than anything useful being piped through it.
* archiver: tests: fix expectations
Move the close of the channel into doArchive() itself; notably, before these
goroutines move on to waiting on the Release cond.
The tests are adjusted to reflect that we can't WaitForCompletion() after
they've already completed, as WaitForCompletion() doesn't indicate that
they've been released from the queue yet.
* archiver: tests: set cchan to nil for comparison
* archiver: move ctx.Error's back into the route handlers
We shouldn't be setting this in a service, we should just be validating the
request that we were handed.
* services: archiver: use regex to match a hash
This makes sure we don't try and use refName as a hash when it's clearly not
one, e.g. heads/pull/foo.
* routers: repo: remove the weird /archive/status endpoint
We don't need to do this anymore, we can just continue POSTing to the
archive/* endpoint until we're told the download's complete. This avoids a
potential naming conflict, where a ref could start with "status/"
* archiver: tests: bump reasonable timeout to 15s
* archiver: tests: actually release timedReq
* archiver: tests: run through inFlight instead of manually checking
While we're here, add a test for manually re-processing an archive that's
already been complete. Re-open the channel and mark it incomplete, so that
doArchive can just mark it complete again.
* initArchiveLinks: prevent default behavior from clicking
* archiver: alias gitea's context, golang context import pending
* archiver: simplify logic, just reconstruct slices
While the previous logic was perhaps slightly more efficient, the
new variant's readability is much improved.
* archiver: don't block shutdown on waiting for archive
The technique established launches a goroutine to do the wait,
which will close a wait channel upon termination. For the timeout
case, we also send back a value indicating whether the timeout was
hit or not.
The timeouts are expected to be relatively small, but still a multi-
second delay to shutdown due to this could be unfortunate.
* archiver: simplify shutdown logic
We can just grab the shutdown channel from the graceful manager instead of
constructing a channel to halt the caller and/or pass a result back.
* Style issues
* Fix mis-merge
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-11-07 15:27:28 -05:00
|
|
|
}
|
2021-12-06 02:19:28 -05:00
|
|
|
|
|
|
|
func deleteOldRepoArchiver(ctx context.Context, archiver *repo_model.RepoArchiver) error {
|
2023-12-25 15:25:29 -05:00
|
|
|
if _, err := db.DeleteByID[repo_model.RepoArchiver](ctx, archiver.ID); err != nil {
|
2021-12-06 02:19:28 -05:00
|
|
|
return err
|
|
|
|
}
|
2022-08-29 04:45:20 -05:00
|
|
|
p := archiver.RelativePath()
|
2021-12-06 02:19:28 -05:00
|
|
|
if err := storage.RepoArchives.Delete(p); err != nil {
|
|
|
|
log.Error("delete repo archive file failed: %v", err)
|
|
|
|
}
|
|
|
|
return nil
|
|
|
|
}
|
|
|
|
|
|
|
|
// DeleteOldRepositoryArchives deletes old repository archives.
|
|
|
|
func DeleteOldRepositoryArchives(ctx context.Context, olderThan time.Duration) error {
|
|
|
|
log.Trace("Doing: ArchiveCleanup")
|
|
|
|
|
|
|
|
for {
|
2023-09-16 09:39:12 -05:00
|
|
|
archivers, err := repo_model.FindRepoArchives(ctx, repo_model.FindRepoArchiversOption{
|
2021-12-06 02:19:28 -05:00
|
|
|
ListOptions: db.ListOptions{
|
|
|
|
PageSize: 100,
|
|
|
|
Page: 1,
|
|
|
|
},
|
|
|
|
OlderThan: olderThan,
|
|
|
|
})
|
|
|
|
if err != nil {
|
|
|
|
log.Trace("Error: ArchiveClean: %v", err)
|
|
|
|
return err
|
|
|
|
}
|
|
|
|
|
|
|
|
for _, archiver := range archivers {
|
|
|
|
if err := deleteOldRepoArchiver(ctx, archiver); err != nil {
|
|
|
|
return err
|
|
|
|
}
|
|
|
|
}
|
|
|
|
if len(archivers) < 100 {
|
|
|
|
break
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
log.Trace("Finished: ArchiveCleanup")
|
|
|
|
return nil
|
|
|
|
}
|
|
|
|
|
|
|
|
// DeleteRepositoryArchives deletes all repositories' archives.
|
|
|
|
func DeleteRepositoryArchives(ctx context.Context) error {
|
2023-09-16 09:39:12 -05:00
|
|
|
if err := repo_model.DeleteAllRepoArchives(ctx); err != nil {
|
2021-12-06 02:19:28 -05:00
|
|
|
return err
|
|
|
|
}
|
|
|
|
return storage.Clean(storage.RepoArchives)
|
|
|
|
}
|