-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Check revoked certificates with OCSP (#2) #3
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking great, thanks !
I've ran this locally against all updown.io URLs (using the rake crosstest:ssl
I made) to see how many do we have and verify if they are correct. I didn't get saw any revoked cert which is sad and good at the same time ;)
What I noticed is that hundreds of them failed with "SSL certificate test failed: Net::ReadTimeout" (or "end of file reached" or "Connection reset by peer" or "execution expired") so I guessed this is either because the OCSP endpoint is not working, or because I'm being throttled. By testing some of them again alone it worked so I think most of them are throttling. That's probably ok because in production the tests wouldn't run this fast and from 8 IPs.
One thing though is that it would be good to know that it's the OCSP test which failed and not the website which did a timeout, so maybe change the message a bit so we can know (with a grep for example) if it's an revocation test fail or a website fail.
And also there's the question of what to do when the OCSP endpoint doesn't work (not throttling). Because currently the certificate would always be considered untested (nil) and return this test error. I guess we'll have to see if that's a common case or not. And either tell the issuer to fix their OCSP, fix our code, or maybe hardcode that it's not working to ignore the OCSP test for them ☹. We'll check that once we have production numbers to back it up.
Nice work 👏 and welcome to the dirty/legacy world of PKI (Private Key Infrastructures)
I see. Maybe we should just not fail at all if the OCSP test isn't conclusive and keep going. It feels kinda lame to not be able to detect other problems when we can just because we can't see whether or not the cert was revoked. But as you said, we can check first if it's a common case or not. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes if it happens too much it's ok to skip OCSP when it fails, but it's lame so we'll try the optimistic way first ;)
But as you said, we can check first if it's a common case or not.
Even if it's not though, slow queries and downtimes happen, and it doesn't feel right to not account for those cases.
Indeed, but if it's temporary it's ok, updown handles going from state valid to unknown to valid without notifications. If it's permanent that's another story.
Also one risk (for updown) if we do ignore OCSP result when down, is that if the cert is revoked (updown sends alert) and then the OCSP endpoint die, we'll say the certificate is valid and send another alert, which means false positive and a lot of spam if the OCSP endpoints flaps.
Oh and by the way if we could support OCSP stapling that would be great (and reduce the number of requests). OCSP stapling basically attach the OCSP response server side during the TLS handshake so you don't have to query the OCSP server, you can just validate the embedded OCSP signature. It's enabled on https://updown.io for example. It's a TLS extension field so we may be able to find it somewhere in the |
Co-Authored-By: Adrien Rey-Jarthon <jobs@adrienjarthon.com>
In the current version of the code, we'll only explicitly state that the OCSP test failed if there was a connection error. |
I couldn't find a way to support it. This seems to say that it doesn't. |
Ok fine
Well that's good to start with but I still want to be able to monitor this. Here is an idea: maybe we can return the validation as ok but still return a message with "OCSP error..." (should say where/why it failed). So when valid = true the message can be considered a notice or warning by the user. In my case updown could instrument this message in kibana first to measure the number of occurrences of each error (and on which domain) and maybe later display this message to users as a warning without considering the certificate invalid. WDYT? |
Sounds reasonable. As long as we're not hard-failing just when we can't check the OCSP, I'm ok with it. So in the current state:
Google and Updown are already websites in the last category. Here's a comparison with also a revoked cert and a test without any problem: irb(main):001:0> SSLTest.test("https://google.com/")
=> [true, "OCSP test couldn't be performed: OCSP response failed: Your request is unauthorized", #<OpenSSL::X509::Certificate: subject=#<OpenSSL::X509::Name CN=*.google.com,O=Google LLC,L=Mountain View,ST=California,C=US>, issuer=#<OpenSSL::X509::Name CN=GTS CA 1O1,O=Google Trust Services,C=US>, serial=#<OpenSSL::BN:0x0000000007714920>, not_before="2020-04-07T09:35:04.000Z", not_after="2020-06-30T09:35:04.000Z">]
irb(main):002:0> SSLTest.test("https://updown.io/")
=> [true, "OCSP test couldn't be performed: OCSP response failed: Your request is unauthorized", #<OpenSSL::X509::Certificate: subject=#<OpenSSL::X509::Name CN=updown.io>, issuer=#<OpenSSL::X509::Name CN=Let's Encrypt Authority X3,O=Let's Encrypt,C=US>, serial=#<OpenSSL::BN:0x0000000007752018>, not_before="2020-03-29T20:39:27.000Z", not_after="2020-06-27T20:39:27.000Z">]
irb(main):003:0> SSLTest.test("https://github.com/")
=> [true, nil, #<OpenSSL::X509::Certificate: subject=#<OpenSSL::X509::Name CN=github.com,O=GitHub\, Inc.,L=San Francisco,ST=California,C=US,serialNumber=5157550,jurisdictionST=Delaware,jurisdictionC=US,businessCategory=Private Organization>, issuer=#<OpenSSL::X509::Name CN=DigiCert SHA2 Extended Validation Server CA,OU=www.digicert.com,O=DigiCert Inc,C=US>, serial=#<OpenSSL::BN:0x000000000776f208>, not_before="2018-05-08T00:00:00.000Z", not_after="2020-06-03T12:00:00.000Z">]
irb(main):004:0> SSLTest.test("https://revoked.badssl.com/")
=> [false, "SSL certificate revoked: The certificate was revoked for an unknown reason (revocation date: 2019-10-07 20:30:39 UTC)", #<OpenSSL::X509::Certificate: subject=#<OpenSSL::X509::Name CN=revoked.badssl.com,O=Lucas Garron Torres,L=Walnut Creek,ST=California,C=US>, issuer=#<OpenSSL::X509::Name CN=DigiCert SHA2 Secure Server CA,O=DigiCert Inc,C=US>, serial=#<OpenSSL::BN:0x000000000778b020>, not_before="2019-10-04T00:00:00.000Z", not_after="2021-10-08T12:00:00.000Z">] I think it's coming from the fact that the chain has 3 certs:
And currently, I'm asking the root OCSP responder to check if updown's certificate was revoked. But they don't know it. I probably need to ask Let's Encrypt OCSP responder to check if updown's certificate was revoked and then the root OCSP responder to check if Let's Encrypt's certificate was revoked (handled in 1511899). |
Thanks, for the error case (
(reproduced that by spamming a bit ^^) I've ran my So looking pretty good so far. One thing I don't like is that's we're gonna spam the endpoints even more with intermediate certificates, which will always be the same, that's really a case where caching would help :/ OCSP is made to be cached heavily, and that's why they include the values you called [#<OpenSSL::OCSP::CertificateId:0x00007f7791e86fd8>, 0, 21996, nil, "2020-04-20T04:43:55.000Z", "2020-04-27T04:43:55.000Z", []] We can see they say they cache for 7 days, which means that we should be able to cache the result until 2020-04-27T04:43:55.000Z without loosing any speed. IMO we should do that at least for intermediary certs (to keep the cache small) but why no for first certs later. Something really simple in-memory with a Ruby hash is enough IMO, the key would be a hash |
Ok for returning the cert when the check fails (66f6e59). The cache is handled in b91d3a6 and is also there for the first cert. Unless you really think it's better to only do it for the intermediate certs to preserve the memory? I'm not 100% convinced but it'd probably be quick to remove anyway so up to you ( |
- Avoid performing useless operation when request cached
lib/ssl-test.rb
Outdated
chain[0..-2].each_with_index do |current_checked_cert, i| | ||
# https://tools.ietf.org/html/rfc5280#section-4.1.2.2 | ||
# The serial number [...] MUST be unique for each certificate issued by a given CA (i.e., the issuer name and serial number identify a unique certificate) | ||
unicity_key = [current_checked_cert.issuer.to_s, current_checked_cert.serial.to_s] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't sure but had the feeling that constructing an array here as hash key would be less efficient than concatenating the strings so I checked and it's the case:
require 'benchmark'
str1 = "yolo"
str2 = "yata"
N = 1_000_000
Benchmark.bm do |x|
x.report("array") {
h = {}
N.times {
key = [str1, str2]
h[key] = {}
}
}
x.report("string") {
h = {}
N.times {
key = "#{str1}/#{str2}"
h[key] = {}
}
}
end
# $ ruby bench.rb
# user system total real
# array 0.714008 0.000000 0.714008 ( 0.714285)
# string 0.180115 0.000000 0.180115 ( 0.180154)
Not a big deal but better using a single string here
Thanks!
I think it's fine, I measured it using crosstest with this between each test: GC.start
cache = SSLTest.instance_variable_get('@ocsp_response_cache')
pid, size = `ps ax -o pid,rss | grep -E "^[[:space:]]*#{$$}"`.strip.split.map(&:to_i)
puts "Cache size: #{cache&.size} / Process: #{size} kB" I confirmed with the version before caching that there's no leak, the process quickly rose to ~141MB and stayed there. Now testing the version with caching with 1000 SSL tests (some of them using the same domain) gave a process starting at ~149MB (not sure why, sometimes it's more sometimes less, ruby 🤷), but what matters is the increase and after 1000 sites tested, cache size of 366, the memory used by the process was still 149MB 🤔 so weird but it means it's totally OK this way (I saw it rising some times but decreasing after so I'm guessing the memory used by mongoid largely outrank this little hash) 👍 |
I think after that little key fix, if you could find a couple tests with other revokation cases (ideally one with the intermediary revoked instead of the first cert) that would be great and we can try this in production :) |
I had the same feeling but I didn't benchmark and went for the array option because it felt cleaner. Also it was before knowing that the cert serial had to be an integer. If it was free, it might have caused collisions because we couldn't know where the name ended and where the cert serial started. But in the end it's not an issue. I made some internal methods
Unfortunately I couldn't 🙁 All the ones I found were "revoked for an unknown reason" (never at the intermediate cert level) or were even expired so the revocation didn't have the time to show up. I wanted to fake the different cases, but it looks like minitest needs another gem to do this... The one about "returns no error on valid SNI website" is now flapping on OCSP and I fixed another one because the error changed slightly. "returns error on untrusted root" is actually returning "error code 19: self signed certificate in certificate chain" now too, but I didn't change this one. |
All right ! I prepared a branch on updown side (
They do not seem to flap so you should be able to reproduce them easily with the URL. |
Definitely! Come on, a Covid-19 site as the first one 😂 You can't tell me that it was random! So the errors were about missing info to get the OCSP responder. Most of them were the OCSP URI which was missing, for one of them the it was the extension containing it ( irb(main):004:0> SSLTest.test("https://covid19.reserve-civique.gouv.fr/")
=> [true, "OCSP test couldn't be performed: Missing OCSP URI in authorityInfoAccess extension", #<OpenSSL::X509::Certificate: subject=#<OpenSSL::X509::Name serialNumber=S15762003,CN=covid19.reserve-civique.gouv.fr,OU=0002 13000680200016,O=SG MIN AFFAIRES SOC,L=PARIS SP 07,C=FR>, issuer=#<OpenSSL::X509::Name CN=Certigna Services CA,2.5.4.97=#0C144E545246522D3438313436333038313030303336,OU=0002 48146308100036,O=DHIMYOTIS,C=FR>, serial=#<OpenSSL::BN:0x0000000007bba240>, not_before="2020-03-19T17:20:17.000Z", not_after="2022-03-19T17:20:17.000Z">]
irb(main):006:0> SSLTest.test("https://www.anonymisation.gov.pf/code-anonymat/avec-dn")
=> [true, "OCSP test couldn't be performed: Missing authorityInfoAccess extension", #<OpenSSL::X509::Certificate: subject=#<OpenSSL::X509::Name CN=www.anonymisation.gov.pf,serialNumber=35785DAN953,OU=RC-PF 002238,2.5.4.97=#0C0C4E545250462D303032323338,O=Service de l'informatique du territoire,L=Papeete,C=FR>, issuer=#<OpenSSL::X509::Name CN=Certinomis - AA et Agents,OU=0002 433998903,O=Certinomis,C=FR>, serial=#<OpenSSL::BN:0x0000000007bcc4b8>, not_before="2018-07-05T10:19:00.000Z", not_after="2020-07-04T10:19:00.000Z">]
irb(main):007:0> ["https://www.demarches-simplifiees.fr", "https://histovec.interieur.gouv.fr", "https://www.i-mata.gov.pf/", "https://doc.projet.gov.pf/", "https://www.paraoa.gov.pf/", "https://www.hotufina.gov.pf/", "https://www.service-public.pf/", "https://www.otia.gov.pf/", "https://www.service-public.pf/sipf/wp-admin/", "https://gestion.projet.gov.pf", "https://www.tiarama.gov.pf/", "https://www.turama.gov.pf/", "https://www.anonymisation.gov.pf/code-anonymat/avec-dn", "https://auth.gov.pf/auth/", "https://www.mes-demarches.gov.pf/", "https://www.oini.gov.pf", "https://www.tefenua.gov.pf/"].map {|u| SSLTest.test(u) }.map {|r| r[1]}.uniq
=> ["OCSP test couldn't be performed: Missing OCSP URI in authorityInfoAccess extension", "OCSP test couldn't be performed: Missing authorityInfoAccess extension"] |
(#2)
This PR adds a test for revoked certificates using OCSP.
It does not use CRL.
The main inspiration was from the ruby doc.
I used https://www.digicert.com/digicert-root-certificates.htm to test it on several certificates.