Connection to Postgres database closed #5458

Closed
jabbrwcky opened this issue Nov 1, 2024 · 6 comments · Fixed by #5488

Comments

@jabbrwcky

Description

After running for some time, the registry's database connection to Postgres is closed, and the registry apparently cannot recover from it.
Enabling TCP keepalive in the JDBC URL does not help either:

jdbc:postgresql://postgres.apicurio.svc.cluster.local/apicurio?tcpKeepAlive=true

Registry Version: 3.0.3
Persistence type: sql

Environment

Kubernetes 1.30
Postgres 12 (deployed as a StatefulSet)

Steps to Reproduce

apiVersion: apps/v1
kind: Deployment
metadata:
  name: registry
  namespace: apicurio
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  selector:
    matchLabels:
      app: registry
  template:
    metadata:
      labels:
        app: registry
    spec:
      containers:
      - env:
        - name: APICURIO_STORAGE_KIND
          value: sql
        - name: APICURIO_STORAGE_SQL_KIND
          value: postgresql
        - name: APICURIO_DATASOURCE_URL
          value: jdbc:postgresql://postgres.apicurio.svc.cluster.local/apicurio
        - name: JAVA_OPTIONS
          value: |
            -Dquarkus.http.host=0.0.0.0 -Djava.util.logging.manager=org.jboss.logmanager.LogManager -Xmx1024m
        envFrom:
        - secretRef:
            name: registry-db
        image: apicurio/apicurio-registry:3.0.3
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /health/live
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 30
          periodSeconds: 5
          successThreshold: 1
          timeoutSeconds: 4
        name: registry
        ports:
        - containerPort: 8080
          name: http
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /health/ready
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 30
          periodSeconds: 5
          successThreshold: 1
          timeoutSeconds: 4
        resources:
          limits:
            cpu: "2"
            memory: 1Gi
          requests:
            cpu: 100m
            memory: 150Mi
        startupProbe:
          failureThreshold: 20
          httpGet:
            path: /health/live
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 30
          periodSeconds: 15
          successThreshold: 1
          timeoutSeconds: 14

Expected vs Actual Behaviour

I expected the application not to drop the connection or at least to recover from it.

Logs

2024-11+00-01 10:54:22 ERROR [io.apicurio.registry.storage.impl.sql.AbstractHandleFactory] (executor-thread-50) Could not release database connection/transaction: org.postgresql.util.PSQLException: This connection has been closed.
    at org.postgresql.jdbc.PgConnection.checkClosed(PgConnection.java:1009)
    at org.postgresql.jdbc.PgConnection.rollback(PgConnection.java:1016)
    at io.agroal.pool.wrapper.ConnectionWrapper.rollback(ConnectionWrapper.java:208)
    at io.apicurio.registry.storage.impl.sql.AbstractHandleFactory.withHandle(AbstractHandleFactory.java:71)
    at io.apicurio.registry.storage.impl.sql.AbstractHandleFactory.withHandleNoException(AbstractHandleFactory.java:100)
    at io.apicurio.registry.storage.impl.sql.HandleFactoryProducer_ProducerMethod_produceHandleFactory_NU1A3ot6FkfWU5W0o_6TCeQ4KO4_ClientProxy.withHandleNoException(Unknown Source)
    at io.apicurio.registry.storage.impl.sql.AbstractSqlRegistryStorage.getGlobalRules(AbstractSqlRegistryStorage.java:2073)
    at io.apicurio.registry.storage.impl.sql.SqlRegistryStorage_Subclass.getGlobalRules$$superforward(Unknown Source)
    at io.apicurio.registry.storage.impl.sql.SqlRegistryStorage_Subclass$$function$$58.apply(Unknown Source)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:73)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext$NextAroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:97)
    at io.apicurio.common.apps.logging.LoggingInterceptor.logMethodEntry(LoggingInterceptor.java:53)
    at io.apicurio.common.apps.logging.LoggingInterceptor_Bean.intercept(Unknown Source)
    at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:70)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext$NextAroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:97)
    at io.apicurio.registry.metrics.health.readiness.PersistenceTimeoutReadinessInterceptor.intercept(PersistenceTimeoutReadinessInterceptor.java:29)
    at io.apicurio.registry.metrics.health.readiness.PersistenceTimeoutReadinessInterceptor_Bean.intercept(Unknown Source)
    at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:70)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext$NextAroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:97)
    at io.apicurio.registry.metrics.health.liveness.PersistenceExceptionLivenessInterceptor.intercept(PersistenceExceptionLivenessInterceptor.java:25)
    at io.apicurio.registry.metrics.health.liveness.PersistenceExceptionLivenessInterceptor_Bean.intercept(Unknown Source)
    at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:70)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:62)
    at io.apicurio.registry.metrics.StorageMetricsInterceptor.intercept(StorageMetricsInterceptor.java:41)
    at io.apicurio.registry.metrics.StorageMetricsInterceptor_Bean.intercept(Unknown Source)
    at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext.perform(AroundInvokeInvocationContext.java:30)
    at io.quarkus.arc.impl.InvocationContexts.performAroundInvoke(InvocationContexts.java:27)
    at io.apicurio.registry.storage.impl.sql.SqlRegistryStorage_Subclass.getGlobalRules(Unknown Source)
    at io.apicurio.registry.storage.impl.sql.AbstractSqlRegistryStorage.isAlive(AbstractSqlRegistryStorage.java:441)
    at io.apicurio.registry.storage.impl.sql.SqlRegistryStorage_Subclass.isAlive$$superforward(Unknown Source)
    at io.apicurio.registry.storage.impl.sql.SqlRegistryStorage_Subclass$$function$$1.apply(Unknown Source)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:73)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext$NextAroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:97)
    at io.apicurio.common.apps.logging.LoggingInterceptor.logMethodEntry(LoggingInterceptor.java:53)
    at io.apicurio.common.apps.logging.LoggingInterceptor_Bean.intercept(Unknown Source)
    at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:70)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext$NextAroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:97)
    at io.apicurio.registry.metrics.health.readiness.PersistenceTimeoutReadinessInterceptor.intercept(PersistenceTimeoutReadinessInterceptor.java:29)
    at io.apicurio.registry.metrics.health.readiness.PersistenceTimeoutReadinessInterceptor_Bean.intercept(Unknown Source)
    at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:70)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext$NextAroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:97)
    at io.apicurio.registry.metrics.health.liveness.PersistenceExceptionLivenessInterceptor.intercept(PersistenceExceptionLivenessInterceptor.java:25)
    at io.apicurio.registry.metrics.health.liveness.PersistenceExceptionLivenessInterceptor_Bean.intercept(Unknown Source)
    at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:70)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:62)
    at io.apicurio.registry.metrics.StorageMetricsInterceptor.intercept(StorageMetricsInterceptor.java:41)
    at io.apicurio.registry.metrics.StorageMetricsInterceptor_Bean.intercept(Unknown Source)
    at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext.perform(AroundInvokeInvocationContext.java:30)
    at io.quarkus.arc.impl.InvocationContexts.performAroundInvoke(InvocationContexts.java:27)
    at io.apicurio.registry.storage.impl.sql.SqlRegistryStorage_Subclass.isAlive(Unknown Source)
    at io.apicurio.registry.storage.impl.sql.SqlRegistryStorage_ClientProxy.isAlive(Unknown Source)
    at io.apicurio.registry.storage.decorator.RegistryStorageDecoratorReadOnlyBase.isAlive(RegistryStorageDecoratorReadOnlyBase.java:74)
    at io.apicurio.registry.config.RegistryStorageConfigCache_ClientProxy.isAlive(Unknown Source)
    at io.apicurio.registry.storage.decorator.RegistryStorageDecoratorReadOnlyBase.isAlive(RegistryStorageDecoratorReadOnlyBase.java:74)
    at io.apicurio.registry.limits.RegistryStorageLimitsEnforcer_ClientProxy.isAlive(Unknown Source)
    at io.apicurio.registry.storage.decorator.RegistryStorageDecoratorReadOnlyBase.isAlive(RegistryStorageDecoratorReadOnlyBase.java:74)
    at io.apicurio.registry.storage.decorator.ReadOnlyRegistryStorageDecorator_ClientProxy.isAlive(Unknown Source)
    at io.apicurio.registry.storage.RegistryStorageProducer_ProducerMethod_current_KUuvkPhY_4l3q6EULlF1GPXbKes_ClientProxy.isAlive(Unknown Source)
    at io.apicurio.registry.metrics.health.liveness.StorageLivenessCheck.call(StorageLivenessCheck.java:23)
    at io.apicurio.registry.metrics.health.liveness.StorageLivenessCheck_ClientProxy.call(Unknown Source)
    at io.smallrye.context.impl.wrappers.SlowContextualSupplier.get(SlowContextualSupplier.java:21)
    at io.smallrye.mutiny.operators.uni.builders.UniCreateFromItemSupplier.subscribe(UniCreateFromItemSupplier.java:28)
    at io.smallrye.mutiny.operators.AbstractUni.subscribe(AbstractUni.java:36)
    at io.smallrye.mutiny.operators.uni.UniOnFailureFlatMap.subscribe(UniOnFailureFlatMap.java:31)
    at io.smallrye.mutiny.operators.AbstractUni.subscribe(AbstractUni.java:36)
    at io.smallrye.mutiny.operators.uni.UniOnItemTransform.subscribe(UniOnItemTransform.java:22)
    at io.smallrye.mutiny.operators.AbstractUni.subscribe(AbstractUni.java:36)
    at io.smallrye.mutiny.operators.uni.UniRunSubscribeOn.lambda$subscribe$0(UniRunSubscribeOn.java:27)
    at io.smallrye.mutiny.vertx.MutinyHelper.lambda$blockingExecutor$6(MutinyHelper.java:62)
    at io.vertx.core.impl.ContextImpl.lambda$executeBlocking$1(ContextImpl.java:191)
    at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:279)
    at io.vertx.core.impl.ContextImpl.lambda$internalExecuteBlocking$2(ContextImpl.java:210)
    at io.quarkus.vertx.core.runtime.VertxCoreRecorder$14.runWith(VertxCoreRecorder.java:599)
    at org.jboss.threads.EnhancedQueueExecutor$Task.doRunWith(EnhancedQueueExecutor.java:2516)
    at org.jboss.threads.EnhancedQueueExecutor$Task.run(EnhancedQueueExecutor.java:2495)
    at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1521)
    at org.jboss.threads.DelegatingRunnable.run(DelegatingRunnable.java:11)
    at org.jboss.threads.ThreadLocalResettingRunnable.run(ThreadLocalResettingRunnable.java:11)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Thread.java:840)
@apicurio-bot

apicurio-bot bot commented Nov 1, 2024

Thank you for reporting an issue!

Pinging @jsenko to respond or triage.

@wolfchimneyrock
Contributor

I suspect you need to set something like quarkus.datasource.jdbc.max-lifetime to a value lower than your Postgres instance's idle timeout.
When the project migrated to v3, the way the Agroal datasource is instantiated changed, so you can no longer pass Quarkus configs directly. They now have to be explicitly listed in this file and passed into the datasource constructor:

AgroalDataSource datasource = AgroalDataSource
.from(new AgroalPropertiesReader().readProperties(props).get());

@EricWittmann
Member

And if you're wondering why we did that, it's so we can support multiple storage configurations from a single container image. Now with 3.0 we support the following storages but only have a single back-end container image:

  • Postgresql
  • SQL Server
  • MySQL
  • Kafka

Much easier for everyone hopefully. Thanks @wolfchimneyrock for suggesting a fix. @jabbrwcky can you confirm if it worked for you?

@Yoni-Weisberg
Contributor

Hi, I also encountered this issue and already have a fix for it: #5507

@carlesarnal
Member

As Eric said, this is mainly due to supporting multiple SQL variants within the same container image. In previous versions, Quarkus did not have the ability to deactivate a datasource at runtime. This capability was introduced relatively recently, and I'm planning to explore it to see whether we can go back to using standard Quarkus properties, so that any standard JDBC config can be added without having to configure it manually in the producer. I'll get back to this as soon as I can get past some CI issues we're having right now.

@carlesarnal carlesarnal moved this from Backlog to Done in Registry 3.0 Nov 14, 2024
@carlesarnal carlesarnal linked a pull request Nov 14, 2024 that will close this issue
@carlesarnal
Member

This has been addressed as part of the PR I have attached. The maximum lifetime for the datasource (and any other property, really) can be configured using Quarkus standards (see this for more information).
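
As a rough illustration of the suggestion in this thread, the pool's maximum connection lifetime could be set through an environment variable added to the container's env list in the Deployment from the issue description. This is only a sketch under assumptions: it presumes a registry release that includes the linked fix, and it presumes the registry uses the default (unnamed) Quarkus datasource. With a named datasource, the name would appear in the key, for example quarkus.datasource."name".jdbc.max-lifetime. The 5-minute value is a placeholder, not a recommendation from the maintainers.

```yaml
# Hypothetical extra entry for the `env:` list of the registry container.
# Assumption: the running image includes the fix linked to this issue, so
# standard Quarkus datasource properties are honored.
# Assumption: the default (unnamed) datasource is used; a named datasource
# would need its name in the key.
- name: QUARKUS_DATASOURCE_JDBC_MAX_LIFETIME
  # ISO-8601 duration (5 minutes). Choose a value shorter than the idle
  # timeout enforced by Postgres or by anything between the pod and the database.
  value: PT5M
```

The idea is that the pool retires connections before the database side drops them, so the registry does not hand out a connection that has already been closed.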
