Certificate for catch-all site is used for requests to other sites #5933

Closed
SimJoSt opened this issue Nov 7, 2023 · 20 comments · Fixed by #6712
Labels
discussion 💬 The right solution needs to be found

Comments

@SimJoSt

SimJoSt commented Nov 7, 2023

The issue is that the certificate loaded from files via the tls directive in the second site block (website2), which contains a catch-all, is used for all requests to Caddy. Only the certificate is affected.
The content is still served from the requested domain/site block. That is, each request returns the expected content from the requested application; only the certificate comes from a completely different site block.

For example, https://subdomain.website1.com returns certificate 2, when certificate 1 is expected.
Expected behaviour would be:

  • https://subdomain.website1.com -> certificate 1
  • https://www.website1.com -> certificate 2
  • https://www.website2.com -> certificate 2
  • https://www.example.com -> certificate 2
  • https://test.website2.com -> certificate 2

However, certificate 2 is returned every time.

Caddy is only hit with https requests. http requests or http to https redirects don't play a role.

The following docs led me to believe that I configured this correctly and as intended:

Things that were tried:

  • replacing :443 with https://
  • switching the import order in the main Caddyfile

Main Caddyfile:

import /home/deployer/website1/Caddyfile
import /home/deployer/website2/Caddyfile

website1 Caddyfile (uses automatic https, no tls directive is configured):

subdomain.website1.com {
    ...
}

website2 Caddyfile:

www.website2.com,
*.website2.com,
:443 {
    tls cert.pem key.pem
    ...
}

Caddy version: v2.7.5
Modules:

@SimJoSt
Author

SimJoSt commented Nov 7, 2023

Had an oversight and missed a debugging step. Will close for now, until more accurate information is available.

SimJoSt closed this as not planned Nov 7, 2023
francislavoie added the "invalid ❓ This doesn't seem right" label Nov 7, 2023
SimJoSt changed the title from "Catch-all site captures all requests, even more specific site selectors are present" to "Certificate for catch-all site is used for requests to other sites" Nov 7, 2023
@SimJoSt
Author

SimJoSt commented Nov 7, 2023

Finished up the missing debugging and adjusted the content, information, explanation, and title of the issue. Hopefully, I expressed this edge-case clearly and in a helpful way.
@francislavoie it would be lovely if you could remove the "invalid" tag 🙏

SimJoSt reopened this Nov 7, 2023
@SimJoSt
Author

SimJoSt commented Nov 7, 2023

I also tried to add a simple tls directive to the site blocks that didn't have any before:

tls email@example.com

Unfortunately, this didn't resolve the issue.

mholt removed the "invalid ❓ This doesn't seem right" label Nov 7, 2023
@mholt
Member

mholt commented Nov 9, 2023

Thanks, will look into this soon! (A little busy with new baby, heh)

@SimJoSt
Author

SimJoSt commented Nov 12, 2023

Sounds great! Both of the statements. So, congratulations.
And I hope I am not wasting your time with a configuration mistake. If you need any additional information, tests or debugging, I am happy to do so.
I can add that the wildcard site uses a Cloudflare origin server certificate, while the others are on automatic tls from Caddy.

@francislavoie
Member

You might need to play with the auto_https ignore_loaded_certs global option (https://caddyserver.com/docs/caddyfile/options#auto-https); it might solve your problem.

@Matthieu-LAURENT39

Matthieu-LAURENT39 commented Nov 19, 2023

Same issue here, i have my Caddyfile configured like this

{
        admin "unix//run/caddy/admin.socket"
        email contact@example.com
        auto_https disable_redirects
}


subdomain.example.com {
    reverse_proxy localhost:4500
}


*.example.com {
        tls internal

        @somesite host somesite.example.com
        handle @somesite {
                reverse_proxy 127.0.0.1:48767
        }

        handle {
                respond "That's a non-existant page!"
        }
}

anothersubdomain.example.com {
    reverse_proxy localhost:8080
}

:2020 {
    root * /opt/webservice
    file_server
}

Unfortunately, subdomain.example.com ends up being served with the local self-signed certificate.
Commenting out tls internal in the wildcard block makes it work, but obviously that's not a viable solution, as I need the wildcard to be self-signed.
Even weirder, though: if I comment out the :2020 block (even with tls internal still in place), it also starts using the correct certificate again for subdomain.example.com.

This happens on 2.7.5 and on all older versions I tested (down to 2.5.2).

mholt added the "help wanted 🆘 Extra attention is needed" and "bug 🐞 Something isn't working" labels Nov 20, 2023
@SimJoSt
Author

SimJoSt commented Nov 22, 2023

> You might need to play with the auto_https ignore_loaded_certs global option (caddyserver.com/docs/caddyfile/options#auto-https); it might solve your problem.

I tried out auto_https ignore_loaded_certs as a global option, validated the config, reloaded it and tried again. No dice.
Seems like the issue is deeply rooted.

Maybe it's also because the Subject of the Cloudflare Origin Server Certificate is

CN = CloudFlare Origin Certificate
OU = CloudFlare Origin CA
O = CloudFlare, Inc.

That is, the Subject is not the main domain, as is usually the case.
The subject alternative names are

website2.com
*.website2.com

@SimJoSt
Author

SimJoSt commented Feb 15, 2024

As a workaround, we switched all hosted sites to Cloudflare's Origin Certificates. The one site that isn't using them doesn't have strict certificate checking enabled, so it also works.
However, it is a real shame that this case is broken this badly. If we didn't use Cloudflare as a proxy in front of Caddy, we wouldn't have any recourse.

Edit: As long as we know the domains that need to be accepted by the site before the first request, we could also use the REST API to add them on the fly. This is the case for us, as customers register them in the application and we send them to Cloudflare for SaaS via the API.
Not every setup might have this advantage, though.

@mholt
Member

mholt commented Feb 15, 2024

Thanks for the update. Sorry I've been very behind. Turns out an early baby can set you back a few months :) Someone else is welcome to tackle this for a faster resolution.

What I'd suggest doing is getting the JSON config, examining it to see if there's an obvious reason why this might be happening. If not, then sprinkle some log.Println() lines in relevant paths in the caddytls package or in CertMagic (handshake.go would be a good place to start). CertMagic actually powers the certificate cache and selection logic.

@SimJoSt
Author

SimJoSt commented Nov 15, 2024

Thanks for giving this issue some visibility in the feature discussion #6146 (comment) @polarathene.

@mholt family comes first, I know this too well. Thanks for responding and providing some starting points for people with the skills and will to tackle this, even though nobody picked it up for now.
Do you see a chance that this issue might be resolved by the referenced changes or other advancements in the last updates?
We've been keeping up-to-date with the current version and the issue still persists for now. In most cases, we could luckily solve it with explicit Cloudflare Origin certificates, which still work.

@mholt
Member

mholt commented Nov 15, 2024

Reproducing the behavior

Caddyfile config

{
	debug
	#auto_https ignore_loaded_certs
}

subdomain.example.localhost {
	respond "Subdomain!"
}

*.example.localhost {
	tls cert.pem key.pem
	respond "Wildcard!"
}

Generate key pair

Created a cert/key pair: openssl req -nodes -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -sha256 -days 365 (for *.example.localhost)

Running Caddy, no certificate is obtained for subdomain.example.localhost, because it is seeing the loaded wildcard certificate. The wildcard certificate is served in response to curl -v -k "https://subdomain.example.localhost".

Validating ignore_loaded_certs

Un-commenting the auto_https line and running again, Caddy indeed obtains two certificates (one wildcard, one subdomain) -- it can do this without a DNS challenge configured because the .localhost TLD is internal-only so it uses its own root CA:

2024/11/15 16:30:14.024 INFO    tls.obtain      obtaining certificate   {"identifier": "subdomain.example.localhost"}
2024/11/15 16:30:14.024 INFO    tls.obtain      obtaining certificate   {"identifier": "*.example.localhost"}
2024/11/15 16:30:14.024 DEBUG   events  event   {"name": "cert_obtaining", "id": "39c4c0c5-1df1-44ed-b88c-9853618dd6f8", "origin": "tls", "data": {"identifier":"subdomain.example.localhost"}}
2024/11/15 16:30:14.024 DEBUG   events  event   {"name": "cert_obtaining", "id": "57ade136-3efc-4ed6-9d11-f453733bb72a", "origin": "tls", "data": {"identifier":"*.example.localhost"}}
2024/11/15 16:30:14.024 DEBUG   tls.obtain      trying issuer 1/1       {"issuer": "local"}
2024/11/15 16:30:14.024 DEBUG   tls.obtain      trying issuer 1/1       {"issuer": "local"}
2024/11/15 16:30:14.025 DEBUG   pki.ca.local    using intermediate signer       {"serial": "222743116402370213372907501073948934027", "not_before": "2024-11-12 20:43:18 +0000 UTC", "not_after": "2024-11-19 20:43:18 +0000 UTC"}
2024/11/15 16:30:14.025 DEBUG   pki.ca.local    using intermediate signer       {"serial": "222743116402370213372907501073948934027", "not_before": "2024-11-12 20:43:18 +0000 UTC", "not_after": "2024-11-19 20:43:18 +0000 UTC"}
2024/11/15 16:30:14.025 INFO    tls.obtain      certificate obtained successfully       {"identifier": "subdomain.example.localhost", "issuer": "local"}
2024/11/15 16:30:14.025 INFO    tls.obtain      certificate obtained successfully       {"identifier": "*.example.localhost", "issuer": "local"}
2024/11/15 16:30:14.025 DEBUG   events  event   {"name": "cert_obtained", "id": "6cbf835d-d79b-4dc9-92b4-1f61e11b7910", "origin": "tls", "data": {"certificate_path":"certificates/local/subdomain.example.localhost/subdomain.example.localhost.crt","csr_pem":"LS0tLS1CRUdJTiBDRVJUSUZJQ0FURSBSRVFVRVNULS0tLS0KTUlIME1JR2JBZ0VBTUFBd1dUQVRCZ2NxaGtqT1BRSUJCZ2dxaGtqT1BRTUJCd05DQUFSOENnYUNPUkZwQ3JjVwp5RUMvSnlKYmtVKzRhblZhZzF6SlpVRUMxNXJDVUp6UW12T3ovTjFrUW1HcjdJQ2pCaVAxa0x0dWdGanhIN3F5ClhzQXBOMzJIb0Rrd053WUpLb1pJaHZjTkFRa09NU293S0RBbUJnTlZIUkVFSHpBZGdodHpkV0prYjIxaGFXNHUKWlhoaGJYQnNaUzVzYjJOaGJHaHZjM1F3Q2dZSUtvWkl6ajBFQXdJRFNBQXdSUUloQVBNOWUvVUpGYXQ1OW1NaApZdUpkODRzTThkZXUwNEhzWlFiVmYwNjBXdExRQWlBWlFLdWQrT2ZDZFBPL2lqTmhYY1M5b1d4eU5XRXYvcTJyCmNOQTlNeEUrckE9PQotLS0tLUVORCBDRVJUSUZJQ0FURSBSRVFVRVNULS0tLS0K","identifier":"subdomain.example.localhost","issuer":"local","metadata_path":"certificates/local/subdomain.example.localhost/subdomain.example.localhost.json","private_key_path":"certificates/local/subdomain.example.localhost/subdomain.example.localhost.key","renewal":false,"storage_path":"certificates/local/subdomain.example.localhost"}}
2024/11/15 16:30:14.025 INFO    tls.obtain      releasing lock  {"identifier": "subdomain.example.localhost"}
2024/11/15 16:30:14.025 DEBUG   events  event   {"name": "cert_obtained", "id": "cad3b61a-a7b2-4f52-b06c-0ef9ce071ed4", "origin": "tls", "data": {"certificate_path":"certificates/local/wildcard_.example.localhost/wildcard_.example.localhost.crt","csr_pem":"LS0tLS1CRUdJTiBDRVJUSUZJQ0FURSBSRVFVRVNULS0tLS0KTUlIdE1JR1RBZ0VBTUFBd1dUQVRCZ2NxaGtqT1BRSUJCZ2dxaGtqT1BRTUJCd05DQUFUcXdLajBYd29IZjVOTQpyZ09KMXFwVE1SYitiakdGVnhmd2xxTE1rSHdBWGdUOHZxcWNOYm1wNTUwMWc1UjNuUnRGbUNYbG43ZW1hTnk4CnBKb1Bpd0J4b0RFd0x3WUpLb1pJaHZjTkFRa09NU0l3SURBZUJnTlZIUkVFRnpBVmdoTXFMbVY0WVcxd2JHVXUKYkc5allXeG9iM04wTUFvR0NDcUdTTTQ5QkFNQ0Ewa0FNRVlDSVFDaFBMQ1dDZmszSjJtWTE4WndqUG1LbzJ5TApDYi93RXNZWnkzYzIvZWw1Q2dJaEFNYXJLS3NHMUd5RmpSSGFrck1Kb2Y5UHRvN2tnZFQwS0pMc0Ezcm84bDhLCi0tLS0tRU5EIENFUlRJRklDQVRFIFJFUVVFU1QtLS0tLQo=","identifier":"*.example.localhost","issuer":"local","metadata_path":"certificates/local/wildcard_.example.localhost/wildcard_.example.localhost.json","private_key_path":"certificates/local/wildcard_.example.localhost/wildcard_.example.localhost.key","renewal":false,"storage_path":"certificates/local/wildcard_.example.localhost"}}
2024/11/15 16:30:14.025 INFO    tls.obtain      releasing lock  {"identifier": "*.example.localhost"}
2024/11/15 16:30:14.025 WARN    tls     stapling OCSP   {"error": "no OCSP stapling for [subdomain.example.localhost]: no OCSP server specified in certificate", "identifiers": ["subdomain.example.localhost"]}
2024/11/15 16:30:14.025 WARN    tls     stapling OCSP   {"error": "no OCSP stapling for [*.example.localhost]: no OCSP server specified in certificate", "identifiers": ["*.example.localhost"]}
2024/11/15 16:30:14.025 DEBUG   tls.cache       added certificate to cache      {"subjects": ["subdomain.example.localhost"], "expiration": "2024/11/16 04:30:15.000", "managed": true, "issuer_key": "local", "hash": "ea030d22ee0884df69a1e3d1da764b8a1549945aba5c26b985f3a498e3e08e65", "cache_size": 2, "cache_capacity": 10000}
2024/11/15 16:30:14.025 DEBUG   events  event   {"name": "cached_managed_cert", "id": "b534a07e-3b4b-4810-820a-8e7d5ca8f52f", "origin": "tls", "data": {"sans":["subdomain.example.localhost"]}}
2024/11/15 16:30:14.025 DEBUG   tls.cache       added certificate to cache      {"subjects": ["*.example.localhost"], "expiration": "2024/11/16 04:30:15.000", "managed": true, "issuer_key": "local", "hash": "f10701095a194cded7b9eeca4756606088b69f8c21e2fc03e4b121c18a27d0fa", "cache_size": 3, "cache_capacity": 10000}
2024/11/15 16:30:14.025 DEBUG   events  event   {"name": "cached_managed_cert", "id": "b54fd602-5a32-4901-8a15-a2c193da16f5", "origin": "tls", "data": {"sans":["*.example.localhost"]}}
2024/11/15 16:30:17.004 DEBUG   events  event   {"name": "tls_get_certificate", "id": "2c67b506-70d8-438e-9bfc-376ac0b1d8f2", "origin": "tls", "data": {"client_hello":{"CipherSuites":[4866,4867,4865,49196,49200,159,52393,52392,52394,49195,49199,158,49188,49192,107,49187,49191,103,49162,49172,57,49161,49171,51,157,156,61,60,53,47],"ServerName":"subdomain.example.localhost","SupportedCurves":[29,23,30,25,24,256,257,258,259,260],"SupportedPoints":"AAEC","SignatureSchemes":[1027,1283,1539,2055,2056,2074,2075,2076,2057,2058,2059,2052,2053,2054,1025,1281,1537,771,769,770,1026,1282,1538],"SupportedProtos":["h2","http/1.1"],"SupportedVersions":[772,771],"RemoteAddr":{"IP":"::1","Port":55876,"Zone":""},"LocalAddr":{"IP":"::1","Port":2443,"Zone":""}}}}
2024/11/15 16:30:17.004 DEBUG   tls.handshake   choosing certificate    {"identifier": "subdomain.example.localhost", "num_choices": 1}
2024/11/15 16:30:17.004 DEBUG   tls.handshake   custom certificate selection results    {"error": "no certificates matched custom selection policy", "identifier": "subdomain.example.localhost", "subjects": [], "managed": false, "issuer_key": "", "hash": ""}
2024/11/15 16:30:17.004 DEBUG   tls.handshake   choosing certificate    {"identifier": "*.example.localhost", "num_choices": 2}
2024/11/15 16:30:17.004 DEBUG   tls.handshake   custom certificate selection results    {"identifier": "*.example.localhost", "subjects": ["*.example.localhost"], "managed": false, "issuer_key": "", "hash": "95968652858165e5de09caa8f30a3413e2e22b9ebdd9efaad5965b92278822e6"}
2024/11/15 16:30:17.004 DEBUG   tls.handshake   matched certificate in cache    {"remote_ip": "::1", "remote_port": "55876", "subjects": ["*.example.localhost"], "managed": false, "expiration": "2025/11/15 16:28:22.000", "hash": "95968652858165e5de09caa8f30a3413e2e22b9ebdd9efaad5965b92278822e6"}

This is expected because ignore_loaded_certs means that Caddy should obtain certificates regardless of the certificates loaded manually.

Adapted config

Then running: curl -v -k "https://subdomain.example.localhost" again serves the wildcard certificate that is loaded manually. This is because, when you see the adapted Caddyfile as JSON (truncated here for brevity):

...
          "tls_connection_policies": [
            {
              "match": {
                "sni": [
                  "*.example.localhost"
                ]
              },
              "certificate_selection": {
                "any_tag": [
                  "cert0"
                ]
              }
            },
            {}
          ],
          "automatic_https": {
            "ignore_loaded_certificates": true
          }
        }
      }
    },
    "tls": {
      "certificates": {
        "load_files": [
          {
            "certificate": "cert.pem",
            "key": "key.pem",
            "tags": [
              "cert0"
            ]
          }
        ]
      }
    }
  }
}

you can see the tls_connection_policies; the first one is applied for any SNI that matches *.example.localhost, which includes subdomain.example.localhost, and that policy is programmed to use the loaded certificate (tagged "cert0").

The second policy is empty/default, meaning "use automated certificates to handle everything else".

Analysis

auto_https ignore_loaded_certs is working as intended, because Caddy does indeed manage certificates regardless of the loaded certificates.

The issue may be unexpected behavior coming from the adapted JSON config, which programs Caddy to use the manually-loaded cert first, then try automated certs later.

Technically, I don't think there is a bug here. But I can see how it is unexpected. I am not sure I see a clear/obvious solution though. @francislavoie do you have any ideas?

mholt added the "discussion 💬 The right solution needs to be found" label and removed the "bug 🐞 Something isn't working" and "help wanted 🆘 Extra attention is needed" labels Nov 15, 2024
@SimJoSt
Author

SimJoSt commented Nov 17, 2024

I am familiar with JSON in general, but I have never configured Caddy that way, and I am not skilled in reading or interpreting it. The config options of the Caddyfile have always been sufficient for me.

My assumption was that the most specific configuration will "win" and a specific site's configuration will be used over a wildcard one.
If I need to add config options to force that behavior, I would love some advice. Though it sounds like there isn't an easy fix.

@francislavoie
Member

francislavoie commented Nov 17, 2024

Yeah IMO that config is working as intended. But we should probably add an option in the Caddyfile which forces issuance & adds a connection policy for a particular domain, which would take priority over other loaded certs which might overlap.

I had started an implementation of that a couple months ago but never finished it (sidetracked, life happened, worked on other things). I don't think I pushed that branch anywhere yet. I had set up the Caddyfile wiring but I didn't finish applying it to the output JSON correctly yet. I'll see if I can do that in the coming days.

But anyway iirc the config looked something like tls force_automate which is a special keyword for this, similar to how internal is a keyword as a shortcut to using Caddy's internal CA for issuing a cert for that domain.

@SimJoSt
Author

SimJoSt commented Nov 27, 2024

Thank you, and great to hear that there is already something in the works to get this working.
I played around with the global option auto_https ignore_loaded_certs as well and can confirm Matt's results. Even though certificates are automatically being issued, they are not used for the domains they are meant for.
Either forcing automatic certs over the loaded ones globally or on the site level would be very helpful.

I think the default behaviour of using loaded certs generally makes sense, but not if they have not been loaded for that specific site.
If example.com doesn't have a certificate loaded with the tls directive, I assume it would use the automatic cert for this domain.
Is my assumption wrong?

@francislavoie
Member

If you run caddy adapt -p on your config (like Matt showed above) you'll see more clearly what's going on. Basically the *.example.localhost connection policy takes precedence over the fallback policy (i.e. the empty object {}) so any more specific sites in the Caddyfile config don't take precedence over a loaded wildcard. My proposal is to allow users to opt-in to forcibly put a policy above that wildcard one which will let it use an automated cert rather than the loaded one.

@francislavoie
Member

Okay I have an implementation now: #6712, I'd appreciate if it could be tested out. Basically, just add tls force_automate in any/all sites that should not use the wildcard cert.
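
As a minimal sketch of that suggestion, here is the reproduction config from earlier in this thread with tls force_automate added to the more specific site. It assumes a Caddy build that includes #6712; the hostnames and the cert.pem/key.pem file names are taken from the reproduction above and are placeholders for a real setup.

```caddyfile
# Specific site: opt in to an automated certificate, even though
# the loaded wildcard certificate below also covers this hostname.
subdomain.example.localhost {
	tls force_automate
	respond "Subdomain!"
}

# Wildcard site: keeps using the manually loaded certificate files.
*.example.localhost {
	tls cert.pem key.pem
	respond "Wildcard!"
}
```

Running caddy adapt -p on a config like this should make it possible to check where the connection policy for the specific hostname ends up relative to the wildcard policy.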

@SimJoSt
Author

SimJoSt commented Nov 28, 2024

@francislavoie thank you for that. I have already built and tested it with success: #6712 (comment)
While I welcome any approach that restores the use of automatic certificates in a configuration with a wildcard site, I still find it unintuitive that loaded certificates from one site take precedence over another site that is set up with automated certificates. I would expect no setting in one site block to interfere with any other site block.

To get a better understanding of the resulting JSON config, I ran caddy adapt -p.
I couldn't find the following element you talked about:

> fallback policy (i.e. the empty object {})

A curious thing I saw was that only the domains of "app2" are listed in tls -> certificates -> automate, even though the tls directive for that site (and all others) is set to load certificate files instead of automating the certificates:

"tls": {
	"certificates": {
		"automate": [
			"app2.com",
			"*.app2.com"
		],

At the same time, all domains from all sites are listed in automation -> policies -> subjects:

"automation": {
	"policies": [
		{
			"subjects": [
				"*.app2.com",
				"app2.com",
				"app1.com"

Here is a reduced (2 sites instead of 15) and redacted (PII removed) excerpt: https://gist.github.com/SimJoSt/fdeb88cc35333952d1e2b75a2c3a4658
And another one with the changed config and with the force_automate option of the tls directive: https://gist.github.com/SimJoSt/4df205f5c76197cd0e17e97a13f28d91

@arpitjindal97

@SimJoSt How did you test it? I just tested it and it didn't work for me.

See my issue for context #6694

@polarathene

polarathene commented Jan 13, 2025

I've responded with a full example at #6694 that matches the scenario @arpitjindal97 described in their issue. I was unable to reproduce the problem, so it is probably user error (EDIT: it was due to an exact match for the site address in the same certificate as the wildcard, so that certificate was selected instead of a new one being provisioned, despite tls force_automate).

Mentioning here to save time for anyone else 😅 With that example and this other lengthy one (effectively the same) the feature should be covered fairly well 👍
