
garbage collect has deleted all my images #19308

Closed
badsmoke opened this issue Sep 5, 2023 · 12 comments
@badsmoke

badsmoke commented Sep 5, 2023

If you are reporting a problem, please make sure the following information is provided:

Expected behavior and actual behavior:
GC only deletes unnecessary blobs/tags/artifacts.

GC has deleted everything, yet you can still see the tags in the UI.

Steps to reproduce the problem:

Update to Harbor 2.9.0, then perform a manual garbage collection via the UI.

Versions:
Please specify the versions of following systems.

  • harbor version: [2.9.0]
  • docker engine version: [24.0.5]
  • docker-compose version: [1.29.2]
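For anyone trying to reproduce this safely, a minimal sketch of triggering a manual GC run through Harbor's REST API with the dry-run flag enabled first, so nothing is actually deleted. The endpoint path and the `dry_run` / `delete_untagged` job parameters are assumptions based on Harbor's v2.0 API; verify them against the API docs for your Harbor version. The URL and credentials are placeholders.

```shell
# Build the GC job request payload: a one-off ("Manual") run with
# dry_run=true so the job only reports what it would delete.
HARBOR_URL="https://backhub.domain.com"
PAYLOAD='{"schedule":{"type":"Manual"},"parameters":{"delete_untagged":false,"dry_run":true}}'
echo "$PAYLOAD"

# Uncomment to actually submit the job (placeholder credentials):
# curl -u admin:password -H "Content-Type: application/json" \
#   -X POST "$HARBOR_URL/api/v2.0/system/gc/schedule" -d "$PAYLOAD"
```

Comparing the dry-run report against a real run is one way to see whether GC intends to delete blobs that still belong to tagged images.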

Additional context:

Harbor config files:

harbor.yml
# Configuration file of Harbor

# The IP address or hostname to access admin UI and registry service.
# DO NOT use localhost or 127.0.0.1, because Harbor needs to be accessed by external clients.
hostname: backhub.domain.com

# http related config
http:
  # port for http, default is 80. If https enabled, this port will redirect to https port
  port: 8081
  relativeurls: true
# https related config
# https:
#   # https port for harbor, default is 443
#   port: 443
#   # The path of cert and key files for nginx
#   certificate: /your/certificate/path
#   private_key: /your/private/key/path

# # Uncomment following will enable tls communication between all harbor components
# internal_tls:
#   # set enabled to true means internal tls is enabled
#   enabled: true
#   # put your cert and key files on dir
#   dir: /etc/harbor/tls/internal

# Uncomment external_url if you want to enable external proxy
# When it is enabled, the hostname will no longer be used
external_url: https://backhub.domain.com

# The initial password of Harbor admin
# It only works in first time to install harbor
# Remember Change the admin password from UI after launching Harbor.
harbor_admin_password: asd

# Harbor DB configuration
database:
  # The password for the root user of Harbor DB. Change this before any production use.
  password: asd
  # The maximum number of connections in the idle connection pool. If it <=0, no idle connections are retained.
  max_idle_conns: 50
  # The maximum number of open connections to the database. If it <= 0, then there is no limit on the number of open connections.
  # Note: the default number of connections is 1024 for postgres of harbor.
  max_open_conns: 900
  # The maximum amount of time a connection may be reused. Expired connections may be closed lazily before reuse. If it <= 0, connections are not closed due to a connection's age.
  # The value is a duration string. A duration string is a possibly signed sequence of decimal numbers, each with optional fraction and a unit suffix, such as "300ms", "-1.5h" or "2h45m". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".
  conn_max_lifetime: 5m
  # The maximum amount of time a connection may be idle. Expired connections may be closed lazily before reuse. If it <= 0, connections are not closed due to a connection's idle time.
  # The value is a duration string. A duration string is a possibly signed sequence of decimal numbers, each with optional fraction and a unit suffix, such as "300ms", "-1.5h" or "2h45m". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".
  conn_max_idle_time: 0

# The default data volume
data_volume: /srv/harbor/data

# Harbor storage settings use the /data dir on the local filesystem by default
# Uncomment the storage_service setting if you want to use external storage
# storage_service:
#   # ca_bundle is the path to the custom root ca certificate, which will be injected into the truststore
#   # of registry's and chart repository's containers.  This is usually needed when the user hosts a internal storage with self signed certificate.
#   ca_bundle:

#   # storage backend, default is filesystem, options include filesystem, azure, gcs, s3, swift and oss
#   # for more info about this configuration please refer https://docs.docker.com/registry/configuration/
#   filesystem:
#     maxthreads: 100
#   # set disable to true when you want to disable registry redirect
#   redirect:
#     disabled: false

# Clair configuration
#clair:
  # The interval of clair updaters, the unit is hour, set to 0 to disable the updaters.
#  updaters_interval: 12

# Trivy configuration
#
# Trivy DB contains vulnerability information from NVD, Red Hat, and many other upstream vulnerability databases.
# It is downloaded by Trivy from the GitHub release page https://github.com/aquasecurity/trivy-db/releases and cached
# in the local file system. In addition, the database contains the update timestamp so Trivy can detect whether it
# should download a newer version from the Internet or use the cached one. Currently, the database is updated every
# 12 hours and published as a new release to GitHub.
trivy:
  # ignoreUnfixed The flag to display only fixed vulnerabilities
  ignore_unfixed: false
  # skipUpdate The flag to enable or disable Trivy DB downloads from GitHub
  #
  # You might want to enable this flag in test or CI/CD environments to avoid GitHub rate limiting issues.
  # If the flag is enabled you have to download the `trivy-offline.tar.gz` archive manually, extract `trivy.db` and
  # `metadata.json` files and mount them in the `/home/scanner/.cache/trivy/db` path.
  skip_update: false
  #
  # The offline_scan option prevents Trivy from sending API requests to identify dependencies.
  # Scanning JAR files and pom.xml may require Internet access for better detection, but this option tries to avoid it.
  # For example, the offline mode will not try to resolve transitive dependencies in pom.xml when the dependency doesn't
  # exist in the local repositories. It means a number of detected vulnerabilities might be fewer in offline mode.
  # It would work if all the dependencies are in local.
  # This option doesn't affect DB download. You need to specify "skip-update" as well as "offline-scan" in an air-gapped environment.
  offline_scan: false
  #
  # Comma-separated list of what security issues to detect. Possible values are `vuln`, `config` and `secret`. Defaults to `vuln`.
  security_check: vuln
  #
  # insecure The flag to skip verifying registry certificate
  insecure: false
  #
  # Anonymous downloads from GitHub are subject to the limit of 60 requests per hour. Normally such rate limit is enough
  # for production operations. If, for any reason, it's not enough, you could increase the rate limit to 5000
  # requests per hour by specifying the GitHub access token. For more details on GitHub rate limiting please consult
  # https://developer.github.com/v3/#rate-limiting
  #
  # You can create a GitHub token by following the instuctions in
  # https://help.github.com/en/github/authenticating-to-github/creating-a-personal-access-token-for-the-command-line
  #
  # github_token: xxx

jobservice:
  # Maximum number of job workers in job service
  max_job_workers: 10
  # The jobLoggers backend name, only support "STD_OUTPUT", "FILE" and/or "DB"
  job_loggers:
    - STD_OUTPUT
    - FILE
    # - DB
  # The jobLogger sweeper duration (ignored if `jobLogger` is `stdout`)
  logger_sweeper_duration: 1 #days

notification:
  # Maximum retry count for webhook job
  webhook_job_max_retry: 3
  # HTTP client timeout for webhook job
  webhook_job_http_client_timeout: 3 #seconds

chart:
  # Set absolute_url to enabled to enable absolute URLs in chart
  absolute_url: disabled

# Log configurations
log:
  # options are debug, info, warning, error, fatal
  level: info
  # configs for logs in local storage
  local:
    # Log files are rotated log_rotate_count times before being removed. If count is 0, old versions are removed rather than rotated.
    rotate_count: 50
    # Log files are rotated only if they grow bigger than log_rotate_size bytes. If size is followed by k, the size is assumed to be in kilobytes.
    # If the M is used, the size is in megabytes, and if G is used, the size is in gigabytes. So size 100, size 100k, size 100M and size 100G
    # are all valid.
    rotate_size: 200M
    # The directory on your host that store log
    location: /var/log/harbor
    # Uncomment following lines to enable external syslog endpoint.
    # external_endpoint:
    #   # protocol used to transmit log to external endpoint, options is tcp or udp
    #   protocol: tcp
    #   # The host of external endpoint
    #   host: localhost
    #   # Port of external endpoint
    #   port: 5140


#This attribute is for migrator to detect the version of the .cfg file, DO NOT MODIFY!
_version: 2.0.0
# Uncomment external_database if using external database.
# external_database:
#   harbor:
#     host: harbor_db_host
#     port: harbor_db_port
#     db_name: harbor_db_name
#     username: harbor_db_username
#     password: harbor_db_password
#     ssl_mode: disable
#   clair:
#     host: clair_db_host
#     port: clair_db_port
#     db_name: clair_db_name
#     username: clair_db_username
#     password: clair_db_password
#     ssl_mode: disable
#   notary_signer:
#     host: notary_signer_db_host
#     port: notary_signer_db_port
#     db_name: notary_signer_db_name
#     username: notary_signer_db_username
#     password: notary_signer_db_password
#     ssl_mode: disable
#   notary_server:
#     host: notary_server_db_host
#     port: notary_server_db_port
#     db_name: notary_server_db_name
#     username: notary_server_db_username
#     password: notary_server_db_password
#     ssl_mode: disable

# Uncomment external_redis if using an external Redis server
# external_redis:
#   host: redis
#   port: 6379
#   password:
#   # db_index 0 is for core, it's unchangeable
#   registry_db_index: 1
#   jobservice_db_index: 2
#   chartmuseum_db_index: 3
#   clair_db_index: 4
#   trivy_db_index: 5
#   idle_timeout_seconds: 30

# Uncomment uaa for trusting the certificate of uaa instance that is hosted via self-signed cert.
# uaa:
#   ca_file: /path/to/ca


# Global proxy
# Config http proxy for components, e.g. http://my.proxy.com:3128
# Components don't need to connect to each other via the http proxy.
# Remove a component from the `components` array if you want to disable
# the proxy for it. If you want to use the proxy for replication, you MUST
# enable the proxy for core and jobservice, and set `http_proxy` and `https_proxy`.
# Add a domain to the `no_proxy` field when you want to disable the proxy
# for some special registry.
proxy:
  http_proxy: 
  https_proxy: 
  no_proxy: 
  components:
    - core
    - jobservice
    - trivy


metric:
  enabled: true
  port: 9090
  path: /metrics

# Enable purge _upload directories
upload_purging:
  enabled: true
  # remove files in _upload directories which exist for a period of time, default is one week.
  age: 168h
  # the interval of the purge operations
  interval: 24h
  dryrun: false
cache:
  # not enabled by default
  enabled: true
  # keep cache for one day by default
  expire_hours: 48


Log files:

gc logs
…delete blob from storage: sha256:c5c0f9c27c95961a34ec1a129fe96b477c30b82e61e7d87427a31d9e8a02090a
2023-09-05T07:15:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:453]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][65629/87485] delete blob record from database: 65209, sha256:c5c0f9c27c95961a34ec1a129fe96b477c30b82e61e7d87427a31d9e8a02090a
2023-09-05T07:15:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:424]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][65630/87485] delete blob from storage: sha256:e9ac97d76ca463997e1be13dfd47b2f2a25d8a6f08581d5f9f19f499999168c5
2023-09-05T07:15:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:453]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][65630/87485] delete blob record from database: 68275, sha256:e9ac97d76ca463997e1be13dfd47b2f2a25d8a6f08581d5f9f19f499999168c5
2023-09-05T07:15:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:424]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][65631/87485] delete blob from storage: sha256:bf3fb65910144c1db58f2f07580cdd75ee666a88bd16191b4c1697037ba5302a
2023-09-05T07:15:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:453]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][65631/87485] delete blob record from database: 65253, sha256:bf3fb65910144c1db58f2f07580cdd75ee666a88bd16191b4c1697037ba5302a
2023-09-05T07:15:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:424]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][65632/87485] delete blob from storage: sha256:c44a81397b19a751eb6ac657ac2ab10c2cbd967d0eb0d4cf07c0b3cb445b6bbb
2023-09-05T07:15:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:453]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][65632/87485] delete blob record from database: 67731, sha256:c44a81397b19a751eb6ac657ac2ab10c2cbd967d0eb0d4cf07c0b3cb445b6bbb
2023-09-05T07:15:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:424]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][65633/87485] delete blob from storage: sha256:3bab412bf8a0986590fbb3e5d63046525da023d7d629966d7a9b7e5ea1a04c6e
2023-09-05T07:15:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:453]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][65633/87485] delete blob record from database: 68172, sha256:3bab412bf8a0986590fbb3e5d63046525da023d7d629966d7a9b7e5ea1a04c6e
2023-09-05T07:15:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:424]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][65634/87485] delete blob from storage: sha256:8b6d016f0401b44993a4c3c77febd0196b0d29ad456b9378ce02859b2bb58012
2023-09-05T07:15:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:453]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][65634/87485] delete blob record from database: 68274, sha256:8b6d016f0401b44993a4c3c77febd0196b0d29ad456b9378ce02859b2bb58012
2023-09-05T07:15:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:424]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][65635/87485] delete blob from storage: sha256:e04e7077f439190286120a3714bd12d8ae1ce10053e329993e662a489091cab1
2023-09-05T07:15:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:453]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][65635/87485] delete blob record from database: 68283, sha256:e04e7077f439190286120a3714bd12d8ae1ce10053e329993e662a489091cab1
2023-09-05T07:15:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:424]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][65636/87485] delete blob from storage: sha256:dc9de5fbf4da52e866d2845f631e9da0471fffcd2c8130b12104f46682587ea8
2023-09-05T07:15:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:453]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][65636/87485] delete blob record from database: 67746, sha256:dc9de5fbf4da52e866d2845f631e9da0471fffcd2c8130b12104f46682587ea8
2023-09-05T07:15:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:424]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][65637/87485] delete blob from storage: sha256:04015007468013506b5147774d3615a75b0c64b4e265cbcf7d6775d79465b446
2023-09-05T07:15:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:453]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][65637/87485] delete blob record from database: 65257, sha256:04015007468013506b5147774d3615a75b0c64b4e265cbcf7d6775d79465b446
2023-09-05T07:15:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:424]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][65638/87485] delete blob from storage: sha256:a94c0cd029574df83e5d5b7b7fb27dd918b44ca0ac02f35c533e40b8dbd00c3a
2023-09-05T07:15:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:453]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][65638/87485] delete blob record from database: 65263, sha256:a94c0cd029574df83e5d5b7b7fb27dd918b44ca0ac02f35c533e40b8dbd00c3a
2023-09-05T07:15:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:424]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][65639/87485] delete blob from storage: sha256:24d02847d1abc15f4bc1264c2c6f3e3c94bbf03adbaeaeeb3439fda5ba30e49c
2023-09-05T07:15:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:453]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][65639/87485] delete blob record from database: 65268, sha256:24d02847d1abc15f4bc1264c2c6f3e3c94bbf03adbaeaeeb3439fda5ba30e49c
2023-09-05T07:15:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:424]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][65640/87485] delete blob from storage: sha256:3d28e9e317ca4fd7b957dbd1ef60ef9a454bc8de7e4c4a6fe14708ad67a1eede
2023-09-05T07:15:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:453]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][65640/87485] delete blob record from database: 65295, sha256:3d28e9e317ca4fd7b957dbd1ef60ef9a454bc8de7e4c4a6fe14708ad67a1eede
2023-09-05T07:15:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:424]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][65641/87485] delete blob from storage: sha256:876b3ef21a12bd4a75aab1185898e1736404f55e04a788bbb6dbad09ec93d0cd
2023-09-05T07:15:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:453]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][65641/87485] delete blob record from database: 66334, sha256:876b3ef21a12bd4a75aab1185898e1736404f55e04a788bbb6dbad09ec93d0cd
2023-09-05T07:15:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:424]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][65642/87485] delete blob from storage: sha256:7228729be8bfcf6585c7a77a71762df02b358514c98df2b01f1b792b621a0723
.
.
.
2023-09-05T07:18:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:424]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][87485/87485] delete blob from storage: sha256:a2f02aac7fabb4d567b6f713c51a016db7b0bcbdff5cbca5c3a37fe2ae2d536a
2023-09-05T07:18:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:453]: [ca53a4ad-dd6f-4e78-ae7b-f9d28514803a][87485/87485] delete blob record from database: 51704, sha256:a2f02aac7fabb4d567b6f713c51a016db7b0bcbdff5cbca5c3a37fe2ae2d536a
2023-09-05T07:18:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:482]: 73605 blobs and 13880 manifests are actually deleted
2023-09-05T07:18:44Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:483]: The GC job actual frees up 613326 MB space.
2023-09-05T07:18:50Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:520]: cache clean up completed
2023-09-05T07:18:50Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:200]: success to run gc in job.
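As a quick sanity check, the scale of a GC run can be gauged by counting the "delete blob from storage" entries in the job log. A minimal sketch, using a fabricated sample file that mimics the log format above (point `grep` at your real GC log instead):

```shell
# Write a small sample in the same format as the GC job log above.
cat > /tmp/gc-sample.log <<'EOF'
2023-09-05T07:15:44Z [INFO] [...garbage_collection.go:424]: [job][65629/87485] delete blob from storage: sha256:c5c0f9c2...
2023-09-05T07:15:44Z [INFO] [...garbage_collection.go:453]: [job][65629/87485] delete blob record from database: 65209, sha256:c5c0f9c2...
2023-09-05T07:15:44Z [INFO] [...garbage_collection.go:424]: [job][65630/87485] delete blob from storage: sha256:e9ac97d7...
EOF

# Count only storage deletions (the "record from database" lines don't match).
deleted=$(grep -c 'delete blob from storage' /tmp/gc-sample.log)
echo "blobs deleted from storage: $deleted"
```

On the log above this would count 73,605 storage deletions, matching the job's own summary line.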
@wy65701436
Contributor

Are these images tagged? Have you enabled untagged deletion in the GC page?

@badsmoke
Author

All images had/have tags.

Untagged deletion was not enabled.
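One way to check what Harbor's database still considers present is to list a repository's artifacts with their tags via the API. A minimal sketch; the endpoint path is assumed from Harbor's v2.0 API, and the project/repository names below are placeholders:

```shell
# Build the artifact-listing URL for a given project/repository.
# with_tag=true asks Harbor to include tag information per artifact.
HARBOR_URL="https://backhub.domain.com"
PROJECT="library"   # placeholder project name
REPO="myimage"      # placeholder repository name
url="$HARBOR_URL/api/v2.0/projects/$PROJECT/repositories/$REPO/artifacts?with_tag=true"
echo "$url"

# Uncomment to query the instance (placeholder credentials):
# curl -u admin:password "$url"
```

If tags are listed here but pulls fail with missing blobs, the database and the storage backend have diverged, which matches the symptom reported in this issue.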

@HuKuToH

HuKuToH commented Sep 13, 2023

I have the same problem.
GC deleted many blobs from existing images.
Version: v2.7.2-a051373e

@stepanovmm1992

stepanovmm1992 commented Sep 13, 2023

@wy65701436
We have the same problem with many blobs.
We run a large Harbor instance with 10 TB of data; after upgrading to v2.7.1 and running GC, a lot of blobs from tagged images were deleted.

@stepanovmm1992

stepanovmm1992 commented Sep 13, 2023

We found related unresolved issues:
#15445
#12975

@wy65701436
Contributor

@badsmoke can you reproduce this problem in your harbor?

@HuKuToH @stepanovmm1992 did you see any failure in the GC log?

@badsmoke
Author

Yes.

The peculiarity of my setup is that this is a replica instance: it is regularly synced from my main instance via a replication job.

But currently only about 80 GB of my nearly 600 GB arrives.

If we want to try to reproduce it, I'd gladly do so this week.

@HuKuToH

HuKuToH commented Sep 19, 2023

@wy65701436
My log file:

gc.log.tgz
I didn't see any critical errors.

@dmitry-g

Not exactly the same case, but also related to data loss during GC: #19401


This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

github-actions bot added the Stale label Nov 27, 2023

This issue was closed because it has been stalled for 30 days with no activity. If this issue is still relevant, please open a new issue.

github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) Dec 28, 2023
@volker-raschek

Hi, I've the same issue. Are there any news to this topic?

6 participants