Add missing intermediate SSL certificates fetching support #3106

Closed
Jerska opened this issue Feb 7, 2019 · 2 comments

Jerska commented Feb 7, 2019

Hi! Thanks a ton for your hard work on this awesome library (with which I'd be happy to help if you're looking for contributors).

Summary

We had this error on a specific website:

RequestError: Error: unable to verify the first certificate

Digging into it, we found out that this was because the website was badly configured.
Indeed, what this error means is that the server only provided its own (leaf) certificate, and did not provide, as required by the standard, the intermediate certificates that complete the chain.
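
To see which certificates a server actually sends, something like the sketch below may help. It assumes the object shape returned by Node's `tlsSocket.getPeerCertificate(true)`, where each certificate links to its issuer through `issuerCertificate` and a self-signed root points back at itself. `describeChain` and `fetchChain` are hypothetical helper names, and `rejectUnauthorized: false` is used here for diagnosis only, never for real requests.

```javascript
const tls = require("tls");

// Walk the issuerCertificate links and return the subject CNs, leaf first.
// A badly configured server is expected to yield a single entry (the leaf).
function describeChain(peerCert) {
  const seen = new Set();
  const names = [];
  let cert = peerCert;
  // Stop on a missing or empty certificate, or when a self-signed root loops.
  while (cert && Object.keys(cert).length > 0 && !seen.has(cert)) {
    seen.add(cert);
    names.push((cert.subject && cert.subject.CN) || "(no CN)");
    cert = cert.issuerCertificate;
  }
  return names;
}

// Diagnostic only: verification is disabled so the handshake completes
// even when the chain is incomplete.
function fetchChain(host, port = 443) {
  return new Promise((resolve, reject) => {
    const socket = tls.connect(
      { host, port, servername: host, rejectUnauthorized: false },
      () => {
        const names = describeChain(socket.getPeerCertificate(true));
        socket.end();
        resolve(names);
      }
    );
    socket.on("error", reject);
  });
}

// Usage (needs network access):
// fetchChain("incomplete-chain.badssl.com").then(console.log);
```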

Reproduction

Reproduction code is easy:

const request = require("request-promise-native");

(async () => {
  try {
    await request({ url: "https://incomplete-chain.badssl.com/" });
    console.log("passed");
  } catch (e) {
    console.log(e);
  }
})();

Expected behavior (looking at other systems)

The reason this was not an issue for them before is that browsers handle this situation fine. From what I understood, it seems that support was first added in IE7 and other vendors have since implemented the same logic.
curl is also fetching those intermediate certificates and has no issue processing those websites.

$ firefox https://incomplete-chain.badssl.com/ # => OK
$ curl -v https://incomplete-chain.badssl.com/ # => OK

If I understood correctly, it seems that there is an X.509 certificate extension called AIA (Authority Information Access), which lets a certificate provide a link from which to fetch its issuer's (intermediate) certificate, and that's what browsers, curl and other languages' libs are using.
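
As a rough sketch of the first step of such a lookup, the helper below pulls the issuer URLs out of a peer certificate. It assumes the shape I have seen Node return from `getPeerCertificate()`, where `infoAccess` is an object keyed by strings such as `"CA Issuers - URI"`; treat that key and the helper name `caIssuerUrls` as assumptions rather than a documented contract.

```javascript
// Return the "CA Issuers" URLs advertised in a certificate's AIA extension.
// `peerCert` is assumed to look like the result of getPeerCertificate().
function caIssuerUrls(peerCert) {
  const info = (peerCert && peerCert.infoAccess) || {};
  const urls = info["CA Issuers - URI"] || [];
  // Keep only http(s) URLs; other schemes (e.g. ldap) are ignored here.
  return urls.filter((u) => /^https?:\/\//i.test(u));
}

// Example with a hand-written certificate-like object (not real data):
// caIssuerUrls({ infoAccess: { "CA Issuers - URI": ["http://ca.example/int.crt"] } });
```

The URL typically serves the issuer's certificate in DER form, so a real implementation would still need to download it, convert it, and validate the rebuilt chain against the trusted roots.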

Current behavior

The reproduction code above gives this output:

{ RequestError: Error: unable to verify the first certificate
    at new RequestError (/Users/jerska/algolia/repro-top-tls/node_modules/request-promise-core/lib/errors.js:14:15)
    at Request.plumbing.callback (/Users/jerska/algolia/repro-top-tls/node_modules/request-promise-core/lib/plumbing.js:87:29)
    at Request.RP$callback [as _callback] (/Users/jerska/algolia/repro-top-tls/node_modules/request-promise-core/lib/plumbing.js:46:31)
    at self.callback (/Users/jerska/algolia/repro-top-tls/node_modules/request/request.js:185:22)
    at Request.emit (events.js:189:13)
    at Request.onRequestError (/Users/jerska/algolia/repro-top-tls/node_modules/request/request.js:881:8)
    at ClientRequest.emit (events.js:189:13)
    at TLSSocket.socketErrorListener (_http_client.js:392:9)
    at TLSSocket.emit (events.js:189:13)
    at emitErrorNT (internal/streams/destroy.js:82:8)
  name: 'RequestError',
  message: 'Error: unable to verify the first certificate',
  cause:
   { Error: unable to verify the first certificate
       at TLSSocket.onConnectSecure (_tls_wrap.js:1051:34)
       at TLSSocket.emit (events.js:189:13)
       at TLSSocket._finishInit (_tls_wrap.js:633:8) code: 'UNABLE_TO_VERIFY_LEAF_SIGNATURE' },
  error:
   { Error: unable to verify the first certificate
       at TLSSocket.onConnectSecure (_tls_wrap.js:1051:34)
       at TLSSocket.emit (events.js:189:13)
       at TLSSocket._finishInit (_tls_wrap.js:633:8) code: 'UNABLE_TO_VERIFY_LEAF_SIGNATURE' },
  options:
   { url: 'https://incomplete-chain.badssl.com/',
     callback: [Function: RP$callback],
     transform: undefined,
     simple: true,
     resolveWithFullResponse: false,
     transform2xxOnly: false },
  response: undefined }

Context

We've been building a general-purpose crawling system using request in my company.
It's meant to be able to crawl any website.
node w/ request was a great choice thanks to its asynchronicity and ease of use.

We've asked our users to fix their website. We haven't heard back and don't know if they'll be able to do it (as we're not talking to their infra team).
Another solution for us is to download the intermediate cert and declare it as trusted using agentOptions.ca, but this hardly scales.

It also leads us to poor UX, as any new user in this situation would have to:

  1. go through their internal org to get this fixed
  2. or wait for an update & deploy of our system to use it
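
For reference, the `agentOptions.ca` workaround mentioned above could look roughly like this. Passing `ca` replaces Node's default trust store, so this sketch adds the bundled roots back via `tls.rootCertificates`, which only exists in Node 12.3 and later (on older versions the list would contain the extra certificate alone). `withExtraCa` and the file path in the usage comment are hypothetical.

```javascript
const tls = require("tls");

// Return a copy of the request options that trusts one extra PEM
// certificate in addition to Node's bundled root CAs.
function withExtraCa(requestOptions, extraPem) {
  // tls.rootCertificates is unavailable before Node 12.3; fall back to [].
  const roots = tls.rootCertificates || [];
  return {
    ...requestOptions,
    agentOptions: {
      ...(requestOptions.agentOptions || {}),
      ca: [...roots, extraPem]
    }
  };
}

// Usage (hypothetical path, needs network access):
// const pem = require("fs").readFileSync("./intermediate.pem", "utf8");
// await request(withExtraCa({ url: "https://incomplete-chain.badssl.com/" }, pem));
```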

Why am I asking you?

An issue was opened on this specific topic in the node repo: nodejs/node#16336 some time ago.
The response there was simple:

The reason I don't think this is a valid feature request is that it means node.js starts doing too much work ("magic") on behalf of the user, and inflexibly at that. That's okay for an end user product like a browser or a module like request but not for node.js core.
-- @bnoordhuis

Is this really an issue?

I wanted to know how often we would risk getting these. I made a small script that goes over the top 10K Alexa websites and looks for this error.
If this error happens, I then use https://whatsmychaincert.com/ to check if it's able to generate a full chain for them.

Out of the top 10K Alexa websites, there are 80 domains which raised this error.
Out of those 80, 23 are really badly configured.
However, 57 of them have a valid chain, which whatsmychaincert was able to rebuild.
So we're talking about roughly 0.6% of websites (57 out of 10,000) which are not securely fetchable using node.

Code used:

// Raise the default listener limit for all emitters, since many requests run in parallel
require("events").EventEmitter.defaultMaxListeners = 100;

const fs = require("fs");
const colors = require("colors/safe");
const request = require("request-promise-native");
const csvParse = require("csv-parse/lib/sync");

const HOSTS_AMOUNT = 10000;
const PARALLEL = 150;
const TIMEOUT = 10000;

class Queue {
  constructor(poolSize = 10) {
    this.poolSize = poolSize;
    this.pool = [];
    this.id = 0;

    // Bind methods
    this.push = this.push.bind(this);
    this.remove = this.remove.bind(this);
    this.wait = this.wait.bind(this);
  }

  get promises() {
    return this.pool.map(({ promise }) => promise);
  }

  async push(job) {
    if (this.pool.length >= this.poolSize) {
      // Wait for one job to be complete
      await Promise.race(this.promises);
    }

    const id = this.id++;
    const promise = (async () => {
      try {
        await job();
        this.remove(id);
      } catch (e) {
        this.remove(id);
        throw e;
      }
    })();

    this.pool.push({ id, promise });
    return id;
  }

  remove(id) {
    const i = this.pool.findIndex(({ id: _id }) => id === _id);
    delete this.pool[i].id;
    delete this.pool[i].promise;
    this.pool.splice(i, 1);
  }

  async wait() {
    await Promise.all(this.promises);
  }
}

async function processDomain(erroredDomains, domain, key) {
  try {
    await request({
      url: `https://www.${domain}/`,
      timeout: TIMEOUT,
      simple: false
    });
    console.log(`${colors.cyan(key)} - ${domain}`);
  } catch (e) {
    if (e.message.match(/unable to verify the first certificate/)) {
      erroredDomains.push(domain);
      console.error(
        `${colors.cyan(key)} - ${colors.red(erroredDomains.length)} - ${domain}`
      );
    }
    console.log(`${colors.cyan(key)} - ${domain}`);
  }
}

async function checkIfSecure(secureDomains, insecureDomains, domain) {
  try {
    await request({
      url: `https://whatsmychaincert.com/generate?host=www.${domain}`,
      simple: true,
      timeout: TIMEOUT * 3,
      resolveWithFullResponse: true
    });
    secureDomains.push(domain);
    console.log(colors.green(domain));
  } catch (e) {
    // 500s or timeouts
    insecureDomains.push(domain);
    console.log(colors.red(domain));
  }
}

process.on("warning", e => console.warn(e.stack));

(async () => {
  const content = fs.readFileSync("./top-1m.csv");
  const top = csvParse(content, { skip_empty_lines: true })
    .slice(0, HOSTS_AMOUNT)
    .map(([, domain]) => domain);

  const erroredDomains = [];
  let queue = new Queue(PARALLEL);

  console.log(`Checking ${HOSTS_AMOUNT} websites:`);
  for (let i = 0; i < top.length; ++i) {
    await queue.push(processDomain.bind(null, erroredDomains, top[i], i + 1));
  }
  await queue.wait();

  // Hitting a single server, let's go slow
  queue = new Queue(2);

  const secureDomains = [];
  const insecureDomains = [];

  console.log(`Checking whether failing websites are secure:`);
  for (let i = 0; i < erroredDomains.length; ++i) {
    await queue.push(
      checkIfSecure.bind(
        null,
        secureDomains,
        insecureDomains,
        erroredDomains[i]
      )
    );
  }
  await queue.wait();

  console.log("\n");
  console.log(`Looked at ${HOSTS_AMOUNT} with PARALLEL = ${PARALLEL}`);
  console.log(`${erroredDomains.length} domains raised the error`);
  console.log(`Insecure ${insecureDomains.length}:`);
  insecureDomains.forEach(domain =>
    console.log(colors.red(`  - https://www.${domain}`))
  );
  console.log(`Secure ${secureDomains.length}:`);
  secureDomains.forEach(domain =>
    console.log(colors.green(`  - https://www.${domain}`))
  );
})();

Their tool to rebuild the chain is here: https://github.com/SSLMate/mkcertchain (in perl).
The chromium AIA rebuild seems to be implemented here: https://github.com/chromium/chromium/blob/b5b1e845e5eafe787991955d804fc100c9bb6ce0/net/cert/internal/cert_issuer_source_aia.cc .

Your Environment

software                 version
request                  2.88.0
request-promise-native   1.0.5
node                     10.15.1

@arvind-agarwal

A possible workaround till intermediate certificate download is implemented:

node_extra_ca_certs_mozilla_bundle

It generates a PEM file that includes all root and intermediate certificates trusted by Mozilla, for use with the following environment variable:

NODE_EXTRA_CA_CERTS

You can install the module using:

npm install --save node_extra_ca_certs_mozilla_bundle

and then launch your node script with an environment variable.

NODE_EXTRA_CA_CERTS=node_modules/node_extra_ca_certs_mozilla_bundle/ca_bundle/ca_intermediate_root_bundle.pem node your_script.js
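
Node only reads NODE_EXTRA_CA_CERTS when the process starts, so setting it from inside a running script has no effect on that process. A small startup check along these lines (a sketch; `extraCaStatus` is a hypothetical helper, not part of the module) can confirm the variable points at a readable bundle:

```javascript
const fs = require("fs");

// Report whether NODE_EXTRA_CA_CERTS is set, whether the file can be read,
// and how many PEM certificate blocks it contains.
function extraCaStatus(env = process.env) {
  const file = env.NODE_EXTRA_CA_CERTS;
  if (!file) {
    return { configured: false, readable: false, certificates: 0 };
  }
  let pem;
  try {
    pem = fs.readFileSync(file, "utf8");
  } catch (e) {
    return { configured: true, readable: false, certificates: 0 };
  }
  const count = (pem.match(/-----BEGIN CERTIFICATE-----/g) || []).length;
  return { configured: true, readable: true, certificates: count };
}

// Usage at the top of your script:
// console.log(extraCaStatus());
```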

Other ways to use the generated PEM file are available at:
https://github.com/arvind-agarwal/node_extra_ca_certs_mozilla_bundle

NOTE: I am the author of the above module.

stale bot commented Dec 25, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
