The main issue was the long running gc operation that
affects storage objects with deduplication. The long running
transacion ends locking some storage object rows which collaterally
made operations like import-binfile become blocked indefinitelly
because of the same rows (because of deduplication).
The solution used in this commit is split operations on small
chunks so we no longer use long running transactions that holds
too many locks. With this approach we will make a window to work
concurrently all operarate the distinct operations that requires
locks on the same rows.
Relevant changes:
- Add the ability to create migration in both directions, defaulting
to identity if not provided
- Move the version attribute to file table column for to make it more
accessible (previously it was on data blob)
- Reduce db update operations on file-update rpc method
The climit previously of this commit is heavily used inside a
transactions, so in heavy contention operation such that file thumbnail
creation can cause a db pool exhaust.
This commit fixes this issue setting up a better resource limiting
mechanism that works outside the transactions so, contention will
no longer hold an open connection/transaction.
It also adds general improvement to the traceability to the climit
mechanism: it now properly logs the profile-id that is currently
cause some contention on specific resources.
It also add a general/root climit that is applied to all requests
so if someone start making abussive requests, we can clearly detect
it.
The main objective is prevent deletion of objects that can leave
unreachable orphan objects which we are unable to correctly track.
Additionally, this commit includes:
1. Properly implement safe cascade deletion of all participating
tables on soft deletion in the objects-gc task;
2. Make the file thumbnail related tables also participate in the
touch/refcount mechanism applyign to the same safety checks;
3. Add helper for db query lazy iteration using PostgreSQL support
for server side cursors;
4. Fix efficiency issues on gc related task using server side
cursors instead of custom chunked iteration for processing data.
The problem resided when a large chunk of rows that has identical
value on the deleted_at column and the chunk size is small (the
default); when the custom chunked iteration only reads a first N
items and skip the rest of the set to the next run.
This has caused many objects to remain pending to be eliminated,
taking up space for longer than expected. The server side cursor
based iteration does not has this problem and iterates correctly
over all objects.
5. Fix refcount issues on font variant deletion RPC methods