Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New source creation uses Sequoia #6892

Merged
merged 13 commits into from
Oct 2, 2023
Merged

New source creation uses Sequoia #6892

merged 13 commits into from
Oct 2, 2023

Conversation

legoktm
Copy link
Member

@legoktm legoktm commented Jul 21, 2023

Status

Ready for review

Description of Changes

At a high-level, this PR includes:

  • Storing journalist public key out of the GPG keyring - including a hopefully not too hacky postinst migration
  • DB schema change, adding 3 columns to the source table
  • Have redwood encrypt streams instead of files
  • Changing all the EncryptionManager.encrypt_* functions to use redwood, and sometimes the decrypt function
  • Actually switching the new source creation to Sequoia
  • A bunch of fixes to tests

Each commit message should give a detailed explanation.

Fixes #6799.

Testing

How should the reviewer test this PR?

This test plan focuses on:

  • key generation and storage
  • GPG<-->PGP interop points
  • journalist pub key migration

I'm intentionally skipping manual testing of GPG-created sources because that makes sense to test during the PRs for public key and private key migrations since the behavior will be different. For the purposes of this PR, the automated test coverage should be sufficient.

In the dev environment

  • Spinning up a dev environment with make dev works
  • In the running container, select * from sources and see that public and private keys of the pre-generated sources are stored in the database
  • In the running container, echo "select pgp_secret_key from sources limit 1" | sqlite3 /var/lib/securedrop/db.sqlite, copy the private key to a separate file, test.asc, and then run sq inspect test.asc and observe that it says: "Secret key: Encrypted" for both the key and the encryption subkey. It should also say its a 4096-bit RSA key.
  • Create a new source through the SI (key generation works via web)
  • Log in as a source, submit a file and a message
  • Log into the JI and download said file + message and decrypt them using GPG
  • Send a reply to the source from the JI. Switch back to the SI and verify you're able to read it (it decrypts properly)
  • Back to the JI, download your own reply, and verify you can decrypt it using GPG (it was correctly encrypted to both the source and the journalist)

On a staging instance

  • Set up a 2.6.0 prod/staging instance
  • Copy securedrop/tests/files/test_journalist_key.pub from the develop branch and import it into the keyring (sudo -u www-data gpg --homedir /var/lib/securedrop/keys --import test_journalist_key.pub). This will make sure the non-SHA-1 version of the key is being used.
  • Run sudo -u www-data gpg --homedir /var/lib/securedrop/keys -k, it should show the public journalist key (and possibly a few source keys)
  • Check out this PR and run make build-debs
  • Copy and install the securedrop-app-code deb onto the app server.
  • Verify /var/lib/securedrop/journalist.pub now exists. It should be owned by root, with group www-data, and 640 permissions (root gets read-write, www-data is read-only, world has nothing).
  • Copy journalist.pub to somewhere that has sq installed and pass it to sq inspect - the fingerprint should match the gpg output from earlier and what's specified in your config.py.
  • Create a new source and successfully submit a message
  • Log in as a journalist and submit a reply

SDW integration

  • Point an SDW setup at the staging/prod instance you created
  • Log in as the journalist, verify you can decrypt/read both the source message and the journalist reply
  • Submit a reply to the source
  • Switch back to the SI, verify you can read the reply sent from SDW as the source.

Deployment

Any special considerations for deployment? Yes, the DB migration is one way.

Checklist

  • Linting (make lint) and tests (make test) pass in the development container
  • Configuration tests pass
  • I have written a test plan and validated it for this PR
  • These changes do not require documentation

@legoktm legoktm force-pushed the sequoia-initial branch 2 times, most recently from b17b3b0 to 595b64b Compare July 25, 2023 02:16
@legoktm legoktm marked this pull request as ready for review July 25, 2023 12:50
@legoktm legoktm requested a review from a team as a code owner July 25, 2023 12:50
Copy link
Contributor

@nabla-c0d3 nabla-c0d3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a quick look and left a few comments, most of them pretty minor but hopefully helpful! Nice work on the new Rust encryption library.

securedrop/encryption.py Outdated Show resolved Hide resolved
securedrop/encryption.py Outdated Show resolved Hide resolved
securedrop/encryption.py Outdated Show resolved Hide resolved
@legoktm legoktm requested a review from nabla-c0d3 August 1, 2023 23:00
@legoktm legoktm force-pushed the sequoia-initial branch 2 times, most recently from a314d3e to 66adcc9 Compare August 2, 2023 20:21
Copy link
Contributor

@nabla-c0d3 nabla-c0d3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good.
As for imports, absolute imports are usually more readable and straightforward than relative imports, so I would recommend using absolute imports everywhere.

securedrop/encryption.py Show resolved Hide resolved
securedrop/encryption.py Show resolved Hide resolved
@cfm
Copy link
Member

cfm commented Aug 8, 2023

@legoktm, would you mind rebasing and resolving conflicts from develop after #6884 and #6891? I'm starting review today commit by commit, but I'd like to read the overall diff tomorrow before starting in on the test plan.

@legoktm legoktm force-pushed the sequoia-initial branch 2 times, most recently from 09dd63c to 9a4850a Compare August 9, 2023 03:59
@legoktm
Copy link
Member Author

legoktm commented Aug 9, 2023

Done, now the parent PRs have been merged, if it would making reviewing easier, some parts of this could be split off into separate PRs - let me know.

Copy link
Member

@cfm cfm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@legoktm, I've read through the implementation, and I think this is fabulous work. I've left some comments inline: let me know what you think, but they're not blocking. Tomorrow I will:

  1. step through your test plan; and
  2. read through the tests.

securedrop/debian/securedrop-app-code.postinst Outdated Show resolved Hide resolved
securedrop/encryption.py Show resolved Hide resolved
securedrop/encryption.py Show resolved Hide resolved
securedrop/encryption.py Outdated Show resolved Hide resolved
Copy link
Member

@cfm cfm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test plan checks out in the development environment. In the staging environment, there are some upgrade glitches, detailed below and inline. I've started to diagnose and sketch fixes for these. @legoktm, next week I'll propose fixup commits and continue with the test plan while you're away.

In the dev environment

  • Spinning up a dev environment with make dev works
  • In the running container, select * from sources and see that public and private keys of the pre-generated sources are stored in the database
  • In the running container, echo "select pgp_secret_key from sources limit 1" | sqlite3 /var/lib/securedrop/db.sqlite, copy the private key to a separate file, test.asc, and then run sq inspect test.asc and observe that it says: "Secret key: Encrypted" for both the key and the encryption subkey. It should also say its a 4096-bit RSA key.
  • Create a new source through the SI (key generation works via web)
  • Log in as a source, submit a file and a message
  • Log into the JI and download said file + message and decrypt them using GPG
  • Send a reply to the source from the JI. Switch back to the SI and verify you're able to read it (it decrypts properly)
  • Back to the JI, download your own reply, and verify you can decrypt it using GPG (it was correctly encrypted to both the source and the journalist)

On a staging instance

  • Set up a 2.6.0 prod/staging instance
  • Run sudo -u www-data gpg --homedir /var/lib/securedrop/keys -k, it should show the public journalist key (and possibly a few source keys)
  • Check out this PR and run make build-debs
  • Copy and install the securedrop-app-code deb onto the app server.

Fails; see inline comments.

  • Verify /var/lib/securedrop/journalist.pub now exists. It should be owned by root, with group www-data, and 640 permissions (root gets read-write, www-data is read-only, world has nothing).
  • Copy journalist.pub to somewhere that has sq installed and pass it to sq inspect - the fingerprint should match the gpg output from earlier and what's specified in your config.py.

Yes, but compare:

root@sd-staging:~# sq inspect journalist.pub 
journalist.pub: OpenPGP Certificate.

    Fingerprint: 65A1B5FF195B56353CC63DFFCC40EF1228271441
                 Invalid: No binding signature at time 2023-08-10T22:03:47Z
Public-key algo: RSA (Encrypt or Sign)
Public-key size: 4096 bits
  Creation time: 2013-10-12 17:48:16 UTC

         UserID: SecureDrop Test/Development (DO NOT USE IN PRODUCTION)
                 Invalid: Policy rejected non-revocation signature (PositiveCertification) requiring second pre-image resistance
                 because: SHA1 is not considered secure since 2023-02-01T00:00:00Z

root@sd-staging:~# cd securedrop
root@sd-staging:~/securedrop# sq inspect securedrop/tests/files/test_journalist_key.pub 
securedrop/tests/files/test_journalist_key.pub: OpenPGP Certificate.

    Fingerprint: 65A1B5FF195B56353CC63DFFCC40EF1228271441
Public-key algo: RSA (Encrypt or Sign)
Public-key size: 4096 bits
  Creation time: 2013-10-12 17:48:16 UTC
      Key flags: certification, signing

         Subkey: C1C4E16BB24E4F4ABF37C3A6C3E7C4C0A2201B2A
Public-key algo: RSA (Encrypt or Sign)
Public-key size: 4096 bits
  Creation time: 2013-10-12 17:48:16 UTC
      Key flags: transport encryption, data-at-rest encryption

         UserID: SecureDrop Test/Development (DO NOT USE IN PRODUCTION)
  • Create a new source and successfully submit a message

No, fails in: (see comments on postinst)

[Thu Aug 10 22:14:37.482295 2023] [wsgi:error] [pid 135388:tid 114964295517952] [remote 127.0.0.1:32846]   File "/var/www/securedrop/source_app/main.py", line 262, in submit
[Thu Aug 10 22:14:37.482302 2023] [wsgi:error] [pid 135388:tid 114964295517952] [remote 127.0.0.1:32846]     Storage.get_default().save_message_submission(
[Thu Aug 10 22:14:37.482305 2023] [wsgi:error] [pid 135388:tid 114964295517952] [remote 127.0.0.1:32846]   File "/var/www/securedrop/store.py", line 363, in save_message_submission
[Thu Aug 10 22:14:37.482306 2023] [wsgi:error] [pid 135388:tid 114964295517952] [remote 127.0.0.1:32846]     EncryptionManager.get_default().encrypt_source_message(
[Thu Aug 10 22:14:37.482308 2023] [wsgi:error] [pid 135388:tid 114964295517952] [remote 127.0.0.1:32846]   File "/var/www/securedrop/encryption.py", line 112, in encrypt_source_message
[Thu Aug 10 22:14:37.482310 2023] [wsgi:error] [pid 135388:tid 114964295517952] [remote 127.0.0.1:32846]     recipients=[self.get_journalist_public_key()],
[Thu Aug 10 22:14:37.482312 2023] [wsgi:error] [pid 135388:tid 114964295517952] [remote 127.0.0.1:32846]   File "/var/www/securedrop/encryption.py", line 92, in get_journalist_public_key
[Thu Aug 10 22:14:37.482314 2023] [wsgi:error] [pid 135388:tid 114964295517952] [remote 127.0.0.1:32846]     return self.journalist_pub_key.read_text()
[Thu Aug 10 22:14:37.482316 2023] [wsgi:error] [pid 135388:tid 114964295517952] [remote 127.0.0.1:32846]   File "/usr/lib/python3.8/pathlib.py", line 1236, in read_text
[Thu Aug 10 22:14:37.482318 2023] [wsgi:error] [pid 135388:tid 114964295517952] [remote 127.0.0.1:32846]     with self.open(mode='r', encoding=encoding, errors=errors) as f:
[Thu Aug 10 22:14:37.482320 2023] [wsgi:error] [pid 135388:tid 114964295517952] [remote 127.0.0.1:32846]   File "/usr/lib/python3.8/pathlib.py", line 1222, in open
[Thu Aug 10 22:14:37.482322 2023] [wsgi:error] [pid 135388:tid 114964295517952] [remote 127.0.0.1:32846]     return io.open(self, mode, buffering, encoding, errors, newline,
[Thu Aug 10 22:14:37.482324 2023] [wsgi:error] [pid 135388:tid 114964295517952] [remote 127.0.0.1:32846]   File "/usr/lib/python3.8/pathlib.py", line 1078, in _opener
[Thu Aug 10 22:14:37.482326 2023] [wsgi:error] [pid 135388:tid 114964295517952] [remote 127.0.0.1:32846]     return self._accessor.open(self, flags, mode)
[Thu Aug 10 22:14:37.482328 2023] [wsgi:error] [pid 135388:tid 114964295517952] [remote 127.0.0.1:32846] PermissionError: [Errno 13] Permission denied: '/var/lib/securedrop/journalist.pub'
  • Log in as a journalist and submit a reply

SDW integration

  • Point an SDW setup at the staging/prod instance you created
  • Log in as the journalist, verify you can decrypt/read both the source message and the journalist reply
  • Submit a reply to the source
  • Switch back to the SI, verify you can read the reply sent from SDW as the source.

securedrop/debian/securedrop-app-code.postinst Outdated Show resolved Hide resolved
@legoktm legoktm force-pushed the sequoia-initial branch 6 times, most recently from ded62da to a35f513 Compare August 24, 2023 13:11
@legoktm
Copy link
Member Author

legoktm commented Aug 24, 2023

This is probably the second most complicated thing we're adding to the postinst (the apache header editing the most complicated IMO). Testing the whole postinst would be very difficult, I'm wondering if we should split this routine into a separate file so we can write automated tests against it.

@legoktm
Copy link
Member Author

legoktm commented Sep 6, 2023

apparmor :(

Sep  6 05:53:24 app-staging kernel: [ 6802.178341] audit: type=1400 audit(1693979604.315:20): apparmor="DENIED" operation="open" profile="/usr/sbin/apache2" name="/var/lib/securedrop/journalist.pub" pid=648 comm="apache2" requested_mask="r" denied_mask="r" fsuid=33 ouid=0

@legoktm
Copy link
Member Author

legoktm commented Sep 6, 2023

Changes:

  • da06ad5 now adds journalist.pub to apparmor and fixes the test_securedrop_application_test_journalist_key staging test
  • postinst now checks config.py exists before trying to export the key (it won't exist in install scenarios)
  • Rebased in such a way that the new postinst block is contained in one commit instead of being spread over 3
  • New smoke test for /public-key route that basically just checks the apparmor permission is in place.

Copy link
Member

@cfm cfm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...of course it was AppArmor! Thanks for fixing that so quickly, @legoktm.

I've now run through the test plan successfully! I've left a couple of one nits inline for your consideration, plus:

  • 4060c66 is unsigned.
    • Edit: Oh, this is unsigned because of my Co-authored-by line. GitHub suggests that we should be able to get this to "partially verified", but I won't lose sleep over it.

Otherwise I'm very happy with this.

("source", "http://localhost:80/", ""),
("journalist", "http://localhost:8080/", "L", "Powered by"),
("source", "http://localhost:80/", "", "Powered by"),
("source", "http://localhost:80/public-key", "", "-----BEGIN PGP PUBLIC KEY BLOCK-----"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice.

securedrop/tests/conftest.py Outdated Show resolved Hide resolved
@cfm
Copy link
Member

cfm commented Sep 8, 2023

Discussed this morning: If either @lsd-cat or @zenmonkeykstop has time next week for a security-focused second review, we'll do that, since I've focused so far on the functionality and the upgrade path. Otherwise I'll approve and merge once #6892 (review) is addressed.

legoktm and others added 13 commits September 8, 2023 12:14
If we restart Apache before database updates are applied, then we have
the possibility of running into missing columns or tables for a short
time. Also there's a theoretical possibility that users could try to
submit data mid-migration, so let's just have restarting apache be the
very last thing we do.

Co-authored-by: Cory Francis Myers <cory@freedom.press>
Add a new script that is similar to admin/bin/validate-gpg-key.sh,
except this also asserts that there's only one PGP key in the file.
This will be used in a follow-up to verify the key file we create
matches what we're looking for.

Ideally this should use Sequoia instead of pretty_bad_protocol, but that
takes some more work and probably makes sense to do in a separate PR.
For Sequoia, we need to have the journalist public key accessible
somewhere that isn't inside the GPG keyring. Storing it on disk as
`journalist.pub` is the simplest solution. This new path is added
to AppArmor.

The initial ansible provisioning is straightforward, we just need to
specify an explicit filename instead of using the admin-provided
filename.

For upgrades it's a bit trickier since we need to get the fingerprint
out of config.py and then have GPG export it.
Sequoia doesn't force us to use an on-disk keyring, so we can store the
keys in the database instead. Going forward this will greatly simplify
source deletion since there's no external state that needs to be kept
in sync.

This migration is different because if a downgrade needs to happen for
whatever reason, we do not want to delete the generated key pairs, since
doing so would break migrated sources. It should be safe to leave the
columns in place, so we'll simply do nothing on downgrade.
This isn't needed for SecureDrop production usage since we only decrypt
UTF-8 strings, but will make writing tests easier, in which we do want
to decrypt possibly-binary files. We need to return `Cow<[u8]>` from
decrypt() so pyo3 will convert it into Python `bytes` (`Vec<u8>` turns
into `list[int]`).

While we're at it, remove the Bytes alias since we now have two
different Rust types to represent Python "bytes".
In the initial implementation, I misunderstood how the interaction
between SecureTemporaryFile and pretty_bad_protocol worked.
SecureTemporaryFile isn't really a file, it's more like a buffer that's
backed by on-disk storage. SecureDrop streamed SecureTemporaryFile into
pretty_bad_protocol, which in turn streamed it to stdin to pass to GPG.

There's no native Python type that represents a stream (ignoring
asyncio), so pyO3 doesn't have a built-in conversion. The simplest thing
is to implement a Rust type that wraps a Python object, calls `.read()`
on it, and implements the `io::Read` trait. The separate pyo3-file[1]
crate (which I found after writing this) does basically the same thing.

The accompanying test case is implemented in Python since we need to
access SecureTemporaryFile.

In theory we could stream files directly into Sequoia and drop the
need for SecureTemporaryFile entirely, but that requires more work
and would need adding compression support to our Rust code.

[1] https://github.com/omerbenamram/pyo3-file
Our Sequoia code will fail if there is no passphrase set on the secret
key being used for decryption, so let's just set one. The passphrase is,
of course, "correcthorsebatterystaple".
Use the standard `Source.public_key` property to check if a public key
has been generated for a source. This hides any Sequoia vs GPG
implementation details from calling code and gets rid of a dynamic
property assignment and read.
This is always called with expected_success=True, remove the dead code.
In summary, this change implements:
* Newly created sources have a Sequoia-generated key pair stored in the
database, not in the GPG keyring.
* All message and file encryption functions are handled by Sequoia. The
journalist public key is read off disk instead of the GPG keyring.
* Decryption of journalist messages for legacy GPG sources is still done
through pretty_bad_protocol since the secret key is still in the GPG
keyring.

== EncryptionManager ==

The journalist public key is now read out of the SECUREDROP_DATA_ROOT
folder instead of from the keyring, so we no longer need to know the
specific fingerprint.

Key generation is invoked directly in source_user.create_source(), so
generate_source_key_pair() is removed, though it lives on in tests that
continue to verify GPG-based source behavior.

Since we can easily get a source's public key out of the GPG keyring,
all of the encryption functions now use Sequoia. Functionally this is
inefficient since we need to pull the key out of GPG and then pass it to
Sequoia, but the next phase of the transition will move the public keys
into the database, which will make this more efficient.

== Missing keypairs ==

It's generally wrong to mutate state during GET requests, so move the
step where we'd generate a keypair on /lookup requests to right after a
successful login. This necessitated refactoring the flow of the login
function to exit early in all failure cases so it's more obvious the new
code only runs for a successful request.

And this nicely leaves us a spot to fit in the migration of the secret
key out of GPG (coming soon).

== Tests ==

We're no longer able to mock the key length of generated keys since it's
hardcoded in Rust, so the tests might overall be slower. We could
probably pregenerate a set of keys and cycle through them over the
entire test run, but that's left for a follow-up.

The journalist public key is now copied into each test's unique
data_root. It is no longer present in the GPG keyring. A number of tests
also imported the journalist secret key, decrypted a submission, and
then deleted the key from the keyring. A new
utils.decrypt_as_journalist() helper function uses Sequoia to do all of
that in a much simpler way.

Some EncryptionManager tests were duplicated to use the
create_legacy_gpg_key() helper, which generates a GPG key pair and
deletes the Sequoia-generated keys to mimic what a pre-Sequoia source
creation would be like. Tests that are explicitly testing GPG-based
functionality should have `gpg` in their name, so it's easy to run them
directly with `-k gpg`.

Fixes #6799.
This is no longer needed since we don't pass SecureTemporaryFile to
pretty_bad_protocol anymore, it goes to Sequoia instead.
This verifies that a public key exists and Apache/SI has permission
to read it (e.g. no AppArmor denials) and serve it back to the user.
@cfm
Copy link
Member

cfm commented Sep 13, 2023

Next week @lsd-cat will do a security review of:

  1. the redwood API called by securedrop.encryption.EncryptionManager; and
  2. redwood's creation of new keys via Sequoia.

@lsd-cat
Copy link
Member

lsd-cat commented Sep 25, 2023

I did a limited review based on the limited capacity I had and my limited experience with both rust and sequoia. I mostly checked that the calls to the sequoia API were OK and that overall the encryption process looks good.
I have still to go over the filesystem/permissions part, but I'll get back to it this week.

@cfm
Copy link
Member

cfm commented Oct 2, 2023

Thanks for #6892 (comment), @lsd-cat. We decided in today's stand-up that we'll go ahead and merge this, although further review and any follow-up feedback are of course welcome.

@cfm cfm merged commit 496e6a2 into develop Oct 2, 2023
@legoktm legoktm deleted the sequoia-initial branch October 3, 2023 00:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

New source creation uses Sequoia
4 participants