Blob sync takes 18 minutes when replicating from Guangzhou to Shanghai #20704
Comments
If replication fails frequently because the blob is large, consider using replication by chunk, which splits the blob into smaller chunks for transmission to improve the success rate. FYI: https://goharbor.io/docs/2.7.0/administration/configuring-replication/create-replication-rules/
It seems slower after switching to replication by chunk. @chlins, what is the best-practice chunkSize value? 100M?
I have read the source code, and it seems each chunk of a blob is pulled, processed, and pushed in sequence, one after another. Why not use goroutines? Is this based on some consideration? Thanks @chlins
There is no uniform best practice for this value; it depends entirely on the environment and needs to be tuned to suit the network conditions.
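To make the trade-off a bit more concrete, here is a rough sketch of how the chunk size translates into the number of sequential transfers for one blob. It assumes only what this thread describes (chunks are pulled and pushed one after another); the helper name `chunk_count` and the sizes are illustrative and not part of Harbor.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def chunk_count(blob_size: int, chunk_size: int) -> int:
    """Return how many chunks a blob of blob_size bytes is split into."""
    if blob_size < 0 or chunk_size <= 0:
        raise ValueError("blob_size must be >= 0 and chunk_size must be > 0")
    # Integer ceiling division: a partial last chunk still counts as one transfer.
    return (blob_size + chunk_size - 1) // chunk_size


if __name__ == "__main__":
    blob = 3 * GIB  # roughly the blob size reported in this issue
    for size in (10 * MIB, 100 * MIB, 500 * MIB):
        print(f"chunk size {size // MIB} MiB -> {chunk_count(blob, size)} sequential chunks")
```

Smaller chunks mean less data to resend after a connection reset but more sequential round trips, which is one plausible reason chunked replication can feel slower on a stable link.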
This is defined in the distribution spec (https://github.com/opencontainers/distribution-spec/blob/main/spec.md): all chunks must be uploaded in order, because the location for the next chunk relies on the previous chunk's response header, so goroutines cannot be used to push chunks in parallel.
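The ordering constraint can be illustrated with a small simulation. This is only a sketch: `FakeRegistry` is a hypothetical in-memory stand-in, not a real registry client or Harbor code, and its `/uploads/demo?step=N` locations are invented for the example. It mimics the one behavior discussed here, namely that each chunk must be sent to the location returned by the previous response.

```python
class FakeRegistry:
    """Hypothetical in-memory stand-in for a registry's chunked upload session."""

    def __init__(self):
        self.data = bytearray()
        self._step = 0
        self._valid_location = None

    def start_upload(self) -> str:
        # Opening a session returns the location for the first chunk.
        self._valid_location = f"/uploads/demo?step={self._step}"
        return self._valid_location

    def patch(self, location: str, start: int, chunk: bytes) -> str:
        # Reject stale locations and chunks that do not continue the received data.
        if location != self._valid_location or start != len(self.data):
            raise ValueError("416: chunk out of order")
        self.data.extend(chunk)
        self._step += 1
        # Each response hands out a new location for the next chunk.
        self._valid_location = f"/uploads/demo?step={self._step}"
        return self._valid_location


def upload_in_chunks(registry: FakeRegistry, blob: bytes, chunk_size: int) -> str:
    """Push blob chunk by chunk, feeding each returned location into the next call."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be > 0")
    location = registry.start_upload()
    for start in range(0, len(blob), chunk_size):
        chunk = blob[start:start + chunk_size]
        # The next request cannot be built until this one has returned.
        location = registry.patch(location, start, chunk)
    return location


if __name__ == "__main__":
    registry = FakeRegistry()
    blob = bytes(range(256)) * 10
    upload_in_chunks(registry, blob, 1000)
    print(bytes(registry.data) == blob)
```

Because `location` for chunk N+1 only exists once chunk N has been acknowledged, pushes within a single upload session are inherently serial in this model.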
In my scenario, most blobs are under 1 GB. I have now added a new feature to Harbor: replication by chunk is used only when the blob is too large (the default threshold is 2 GB), and the threshold can be overridden by setting the env REPLICATION_CHUNK_BLOB_SIZE in the jobservice. It works well in the prod env. Thanks very much @chlins
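The approach described above could look roughly like the following sketch. The actual change was made in Harbor's jobservice and is not shown in this thread, so everything except the env name REPLICATION_CHUNK_BLOB_SIZE and the 2 GB default is an assumption: the value is assumed to be a plain byte count, 2 GB is taken as 2 GiB, and the function names are hypothetical.

```python
import os

# Assumed default: 2 GiB in bytes (the thread only says "2g").
DEFAULT_CHUNK_THRESHOLD = 2 * 1024 ** 3


def chunk_threshold(env=None) -> int:
    """Read the size threshold from the environment, falling back to the default."""
    env = os.environ if env is None else env
    raw = env.get("REPLICATION_CHUNK_BLOB_SIZE", "").strip()
    if not raw:
        return DEFAULT_CHUNK_THRESHOLD
    try:
        value = int(raw)  # assumption: the value is a plain number of bytes
    except ValueError:
        return DEFAULT_CHUNK_THRESHOLD
    return value if value > 0 else DEFAULT_CHUNK_THRESHOLD


def should_replicate_by_chunk(blob_size: int, env=None) -> bool:
    """Use chunked replication only for blobs larger than the threshold."""
    return blob_size > chunk_threshold(env)


if __name__ == "__main__":
    for size_gib in (1, 3):
        size = size_gib * 1024 ** 3
        print(f"{size_gib} GiB blob, by chunk: {should_replicate_by_chunk(size, {})}")
```

With this kind of switch, small blobs keep the faster single-request path and only large blobs pay the overhead of sequential chunks.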
I have deployed two Harbor instances (version 2.7.3), one in Guangzhou and the other in Shanghai, and use the replication feature to sync images.
In the sync task log (replicating from Guangzhou to Shanghai), I see that a blob of about 3 GB takes 18 minutes to replicate from Guangzhou to Shanghai,
and a connection reset error occurs three times. Is there any way to avoid the connection reset error?