Skip to content
This repository has been archived by the owner on Jun 16, 2021. It is now read-only.

MRG: update version of OpenSSL on Manylinux and Windows #8

Closed
wants to merge 19 commits into from

Conversation

matthew-brett
Copy link
Contributor

@njs - do I understand right, from your email on the distutils list,
that this should be a full fix?

@dvarrazzo - is there a good way for me to test the fix?

@njs - do I understand right, from your email on the distutils list,
that this should be a full fix?

@dvarrazzo - is there a good way for me to test the fix?
@dvarrazzo
Copy link
Member

I've never been able to reproduce the issue, so I wouldn't know. I guess it would be a combination of multithread database connections and https requests.

@matthew-brett
Copy link
Contributor Author

Am I right in thinking that the error here is due to something being encrypted to the main wheels repo? And therefore that pull request will always fail like this, if made from another repo, like a fork?

$ openssl aes-256-cbc -K $encrypted_bd656d6a5753_key -iv $encrypted_bd656d6a5753_iv -in id_rsa-travis-upload.enc -out /tmp/id_rsa-travis-upload -d
iv undefined
The command "openssl aes-256-cbc -K $encrypted_bd656d6a5753_key -iv $encrypted_bd656d6a5753_iv -in id_rsa-travis-upload.enc -out /tmp/id_rsa-travis-upload -d" failed and exited with 1 during .

https://travis-ci.org/psycopg/psycopg2-wheels/jobs/516942233#L469

@dvarrazzo
Copy link
Member

That's uploading the artifacts to our servers, and requires and encrypted private key. You can comment out that line, I can make sure that change won't be merged.

@njsmith
Copy link

njsmith commented Apr 8, 2019

@njs - do I understand right, from your email on the distutils list,
that this should be a full fix?

It's @njsmith, but otherwise yes, I believe this is correct :-).

In openssl 1.0.2 and earlier, by default openssl is not thread safe at all, and you have to do some special setup to use it from multiple threads. psycopg2 doesn't do that setup, so of course you'd expect weird crashes when working with encrypted connections from multiple threads simultaneously. In openssl 1.1.0 and later, openssl does its own locking by default, so the user isn't required to do anything special to make threads work. There's some discussion in this blog post:

https://www.openssl.org/blog/blog/2017/02/21/threads/

Technically it's possible that there are multiple bugs, and this just fixes one of them. We won't know for 100% certain until it's released and people either report bugs or not. It would probably be a good idea to try to recruit some of the folks in the original issue threads who were hitting the issue, and asking them to try out the new wheels and confirm they fix their issues.

But the bug pattern of only happening with the wheels and not with source builds is very strange, and is exactly what you'd expect from this openssl locking thing, so I'm pretty confident that this will help a lot.

@matthew-brett
Copy link
Contributor Author

OK - builds are working now. As you can see I:

  • updated OpenSSL to 1.1.1b
  • updated OpenLDAP to 2.4.47 (to allow compilation against OpenSSL 1.1.1b)
  • removed --with-krb5-flavor=MIT from OpenSSL configure; OpenSSL dropped all the Kerberos configure stuff for 1.1 see this mailing list thread.

@njsmith - the Mac builds are, I guess, using the system SSL; is that going to be a problem?

@dvarrazzo
Copy link
Member

dvarrazzo commented Apr 9, 2019

Hi Matthew, thank you for this.

Could you please make sure the tests connect to the database using ssh? This wouldn't prove the issue is fixed but at least that some basic work is there.

It should be just a matter of setting the env var PGSSLMODE=require before running the tests.

According to psycopg/psycopg2#836 the issue is also on Windows. Would you be able to fix appveyor to build openssh 1.1 there too? @jerickso is usually the window person, I can try stabbing in the dark too... Just let me know if you have interest in working on it.

@matthew-brett
Copy link
Contributor Author

I'm not nearly as comfortable on Windows, strange to relate. It also looks as if @jerickso has thought about this already: https://github.com/psycopg/psycopg2-wheels/blob/master/appveyor.yml#L162 - is it possible the fix is just to uncomment these lines?

@matthew-brett
Copy link
Contributor Author

Is this enough to run the ssh / Postgres test? https://travis-ci.org/psycopg/psycopg2-wheels/jobs/517776358#L1121

The test count is the same as it was before, at 761.

@dvarrazzo
Copy link
Member

is it possible the fix is just to uncomment these lines?

Maybe it is, as I said for me too is a bit stabbing in the dark. My approach would be to mung that appveyor file until it gives up and works, or I give up and decide I have better things to do :)

@dvarrazzo
Copy link
Member

Is this enough to run the ssh

I think it is, all the db communication in the test suite should have happened on ssh.

Unfortunately that's not enough to prove the ssh concurrency problems are fixed, but it's what I'd release for next wheels, if we get window on the same library too (and, honestly, without that, too)

@matthew-brett matthew-brett force-pushed the update-openssl branch 2 times, most recently from 9ed0980 to f9efca0 Compare April 9, 2019 14:58
@matthew-brett
Copy link
Contributor Author

Windows looks like it's working now:

https://ci.appveyor.com/project/matthew-brett/psycopg2-wheels

I had to patch setup.py to find the OpenSSL 1.1 libraries; they have different names:

https://github.com/psycopg/psycopg2-wheels/pull/8/files#diff-b41f560b111a0ae2ecc9843bcb0b50a3

@matthew-brett
Copy link
Contributor Author

Well - yes - but then psycopg2 won't build against OpenSSL 1.0.x. Maybe that's a feature ...

@dvarrazzo
Copy link
Member

I think it's fine. What we release is those binary packages and maybe the exe which Jason publishes (or used to) on his website. The psycopg2 project builds the library for Windows to run the tests and it seems wise to use the same ssh libraries we deploy; those artifacts are not distributed. And that's it, I don't think there are other uses.

I would considering a non-segfaulting library a nice feature to have, yes 😁

@matthew-brett
Copy link
Contributor Author

Ok - do you want me to do a PR to the main repo for the library name changes?

@dvarrazzo
Copy link
Member

I would say, unless @jerickso thinks what we are doing is a bad idea, this is what I would do.

Comments gave instructions for building 1.1.x, but we're already doing
that.
@njsmith
Copy link

njsmith commented Apr 10, 2019

In case anyone else was confused reading this comment thread, I think every time someone said "ssh" they actually meant "ssl"?

@jerickso
Copy link
Member

Looking over the appveyor yml, I did have comments on the steps I thought needed for building OpenSSL 1.1.0, but as @matthew-brett noted, the 1.1.x OpenSSL library names are different than the 1.0.x and got side tracked with life.

Since @njsmith mentioned that OpenSSL 1.1.x is more thread safe because it does some locking for us, I think requiring OpenSSL 1.1.x is a sensible requirement.

@matthew-brett matthew-brett changed the title WIP: do not merge: update version of OpenSSL MRG: update version of OpenSSL on Manylinux and Windows Apr 10, 2019
@matthew-brett
Copy link
Contributor Author

OK - PR to master, for library name change, at psycopg/psycopg2#894

Maybe build wheels for current version using patch to setup.py here, and then revert a728c13, ready for next version?

@matthew-brett
Copy link
Contributor Author

@njsmith - any thoughts on the Mac / SSL threading?

@njsmith
Copy link

njsmith commented Apr 10, 2019

By "system SSL" do you mean the openssl that ships with macOS? You want to get off that ASAP regardless of any threading issues. I'm surprised you can even build anything against it. Are you sure you aren't actually getting openssl through homebrew or something?

@matthew-brett
Copy link
Contributor Author

Ah yes, the wheel build picks up a homebrew postgresql install:

https://travis-ci.org/psycopg/psycopg2-wheels/jobs/518044937#L384

The build uses xcode9.2, which corresponds to macOS 10.12: https://docs.travis-ci.com/user/reference/osx

I have macOS 10.14 on my machines; the homebrew versions somewhat depend on the macOS version. Even on 10.14, installing postgresql now:

$ otool -L /usr/local/Cellar/postgresql/11.2/lib/libpq.dylib 
/usr/local/Cellar/postgresql/11.2/lib/libpq.dylib:
	/usr/local/opt/postgresql/lib/libpq.5.dylib (compatibility version 5.0.0, current version 5.11.0)
	/usr/local/opt/openssl/lib/libssl.1.0.0.dylib (compatibility version 1.0.0, current version 1.0.0)
	/usr/local/opt/openssl/lib/libcrypto.1.0.0.dylib (compatibility version 1.0.0, current version 1.0.0)
	/System/Library/Frameworks/Kerberos.framework/Versions/A/Kerberos (compatibility version 5.0.0, current version 6.0.0)
	/System/Library/Frameworks/LDAP.framework/Versions/A/LDAP (compatibility version 1.0.0, current version 2.4.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.200.5)

What do you think? Are we going to have to rebuild everything, until homebrew catches up?

@njsmith
Copy link

njsmith commented Apr 10, 2019

You know infinitely more about building wheels on macOS than I do :-).

The pyca/cryptography project definitely builds wheels that use the latest openssl on macOS... maybe @reaperhulk or @alex could say something about how they manage it?

@alex
Copy link

alex commented Apr 10, 2019

I'm not totally sure I understand the question, but I think https://github.com/pyca/cryptography/blob/master/.jenkins/Jenkinsfile-Update-Homebrew-OpenSSL is the relevant bit -- we rebuild openssl from homebrew.

@reaperhulk
Copy link

We build against the openssl@1.1 formula artifacts (and statically link!). It's not actually necessary to rebuild the way we do if you're on a new enough macOS to have bottles available. We do our rebuild process for both legacy and older macOS CI reasons.

@matthew-brett
Copy link
Contributor Author

As a data point, running the tests on my macOS 10.14 laptop, with export PGSSLMODE=require, and an install from psycopg2-binary, does not generate an error. So I assume that test isn't very sensitive to psycopg/psycopg2#543

@dvarrazzo
Copy link
Member

dvarrazzo commented Apr 11, 2019

Hi @matthew-brett,

I have applied your patch to psycopg2 setup.py to the main project (psycopg/psycopg2@72fe91c), and I've changed your branch (into our branch update-openssl) to make use of it (plus some minimal rebasing and re-enabling artifacts upload).

I understand everything is ready for openssl 1.1 on Windows and Linux. Do you think it's ok to leave OSX on 1.0 for the moment?

Thank you very much!

@dvarrazzo
Copy link
Member

I have also added a commit to master to skip upload if the keys are missing, in case you want to rebase your branch on ours.

@njsmith
Copy link

njsmith commented Apr 11, 2019

I understand everything is ready for openssl 1.1 on Windows and Linux. Do you think it's ok to leave OSX on 1.0 for the moment?

Presumably you wouldn't want to upload psycopg2 wheels for OSX until OSX gets the upgrade to 1.1, but there's no reason you can't upload Window and Linux wheels while waiting for OSX to catch up.

@dvarrazzo
Copy link
Member

Ok @njsmith, I'm building all the packages as of the current state then, to be pushed to testpypi, and merge to branch if there are no obvious problems.

@dvarrazzo
Copy link
Member

I have created test packages: could you please test if they work correctly?

pip install -i https://test.pypi.org/simple/ psycopg2-binary==2.8.2.dev0

I've asked to run some tests for windows on psycopg/psycopg2#836 because appveyor doesn't seem supporting ssl on the server. Once I have a go on windows about basic ssl functionality I can release 2.8.2

Thank you everyone!

@dvarrazzo
Copy link
Member

Merged to master after further adjustments by @jerickso. Thank you @matthew-brett!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants