GC leaves residual data when there is a large amount of artifact trash #20711

Open
chlins opened this issue Jul 8, 2024 · 0 comments · May be fixed by #20735

chlins commented Jul 8, 2024

How can we help you?

Scenario

When Harbor holds a large amount of artifact trash data and uses an external database, filtering the artifact trash during GC can take a long time. If a user deletes an artifact while that filtering is still running, the blobs of that artifact are deleted early, but its artifact trash record can no longer be deleted. As a result, the artifact_blob records and the distribution manifest and its revisions cannot be cleaned up.

These resources will be left behind:

  • artifact_trash
  • artifact_blob
  • manifests and their revisions (distribution)

Explanation

arts, err := gc.deletedArt(ctx)
This step can take a long time. If a user deletes an artifact during this period, a new record is added to artifact_trash, but it is not included in the result this call returns. The blob referenced by that artifact, however, is still picked up by the following step:
blobs, err := gc.uselessBlobs(ctx)
Because the blob belonging to that artifact is then deleted, there is no later chance to delete the artifact_trash record. The cleanup only runs for blobs that pass this check:
if _, exist := gc.trashedArts[blob.Digest]; exist && blob.IsManifest() {
