Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mongo key encryption #1506

Merged
merged 17 commits into from
Oct 4, 2021
Merged

Mongo key encryption #1506

merged 17 commits into from
Oct 4, 2021

Conversation

VakarisZ
Copy link
Contributor

What does this PR do?

Fixes #1463

PR Checklist

  • Have you added an explanation of what your changes do and why you'd like to include them?
  • Is the TravisCI build passing?
  • Was the CHANGELOG.md updated to reflect the changes?
  • Was the documentation framework updated to reflect the changes?

Testing Checklist

  • Added relevant unit tests?
  • Have you successfully tested your changes locally? Elaborate:

    Tested by Registering, logging in, resetting the user

…nd PasswordBasedByteEncryptor

This change allows to encrypt strings and bytes without any additional conversion done on the caller
This change enables the encryption/decryption of mongo key with a custom secret
…ongo key initialization that happens during launch and initialization of key which happens after login or registration
@@ -19,21 +21,13 @@ def get(self):
return {"needs_registration": is_registration_needed}

def post(self):
credentials = _get_user_credentials_from_request(request)
# TODO delete the old key here, before creating new one
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth it? I think it is. It will allow us to remove the extra step of delete this mongo key from credential reset procedure

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, as long as it's appropriately encapsulated.

@codecov
Copy link

codecov bot commented Sep 30, 2021

Codecov Report

Merging #1506 (ddff2f0) into develop (f387595) will increase coverage by 0.12%.
The diff coverage is 70.71%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #1506      +/-   ##
===========================================
+ Coverage    42.69%   42.82%   +0.12%     
===========================================
  Files          476      478       +2     
  Lines        14202    14242      +40     
===========================================
+ Hits          6064     6099      +35     
- Misses        8138     8143       +5     
Impacted Files Coverage Δ
monkey/monkey_island/cc/app.py 79.69% <ø> (+0.43%) ⬆️
...monkey_island/cc/resources/configuration_import.py 61.29% <0.00%> (ø)
monkey/monkey_island/cc/server_setup.py 0.00% <ø> (ø)
.../server_utils/encryption/encryptors/i_encryptor.py 100.00% <ø> (ø)
...utils/encryption/encryptors/key_based_encryptor.py 100.00% <ø> (ø)
...attack/technique_reports/technique_report_tools.py 23.52% <0.00%> (ø)
monkey/monkey_island/cc/services/database.py 58.33% <ø> (+2.77%) ⬆️
monkey/monkey_island/cc/services/initialize.py 0.00% <0.00%> (ø)
...island/cc/services/telemetry/processing/exploit.py 26.92% <0.00%> (ø)
...nd/cc/services/telemetry/processing/system_info.py 27.41% <0.00%> (ø)
... and 14 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f387595...ddff2f0. Read the comment docs.

@VakarisZ
Copy link
Contributor Author

@mssalvatore take a look at commit 4f17693.
Multiple things are changed there:

  • Password utils renamed to credential utils
  • Credential utils moved to a single file to be more reusable between registration and login. Registration page changed to use 'username' instead of 'user', same as login page
  • Data store encryptor initialization split up into 2 parts: dir initialization and key decryption/setup.

Separating these changes into different commits post-factum seems more work than it's worth. Doing refactoring first requires a lot of forethought. We could separate the things roughly by modified files, but the it will introducuse system-breaking commits.

return f"{username}:{password}"


def get_creds_from_request(_request: Request) -> Tuple[str, str]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dnt abbrv.

Or at least abbreviate consistently. This function is get_creds...(), whereas above there is get_user_credentials...() My preference is to use "credentials" in these names.

Comment on lines 28 to 30
def get_secret_from_request(_request) -> str:
username, password = get_creds_from_request(_request)
return f"{username}:{password}"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be handled in setup_datastore_key(). Pass it the username and password and let it worry about what to do with them.

key_file = os.path.join(key_file_dir, self._KEY_FILENAME)
def __init__(self, key_file_dir: str):
self.key_file_path = os.path.join(key_file_dir, self._KEY_FILENAME)
self._key_base_encryptor = None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"_key_based"

Comment on lines 15 to 33
class PasswordBasedStringEncryptor(IEncryptor):

_BUFFER_SIZE = pyAesCrypt.crypto.bufferSizeDef

def __init__(self, password: str):
self.password = password

def encrypt(self, plaintext: str) -> str:
plaintext_stream = io.BytesIO(plaintext.encode())
ciphertext = PasswordBasedByteEncryptor(self.password).encrypt(plaintext_stream)

return base64.b64encode(ciphertext.getvalue()).decode()

def decrypt(self, ciphertext: str) -> str:
ciphertext = base64.b64decode(ciphertext)
ciphertext_stream = io.BytesIO(ciphertext)

plaintext_stream = PasswordBasedByteEncryptor(self.password).decrypt(ciphertext_stream)
return plaintext_stream.getvalue().decode("utf-8")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class PasswordBasedStringEncryptor(IEncryptor):
_BUFFER_SIZE = pyAesCrypt.crypto.bufferSizeDef
def __init__(self, password: str):
self.password = password
def encrypt(self, plaintext: str) -> str:
plaintext_stream = io.BytesIO(plaintext.encode())
ciphertext = PasswordBasedByteEncryptor(self.password).encrypt(plaintext_stream)
return base64.b64encode(ciphertext.getvalue()).decode()
def decrypt(self, ciphertext: str) -> str:
ciphertext = base64.b64decode(ciphertext)
ciphertext_stream = io.BytesIO(ciphertext)
plaintext_stream = PasswordBasedByteEncryptor(self.password).decrypt(ciphertext_stream)
return plaintext_stream.getvalue().decode("utf-8")
class PasswordBasedStringEncryptor(IEncryptor):
_BUFFER_SIZE = pyAesCrypt.crypto.bufferSizeDef
def __init__(self, password: str):
self._bytes_encryptor = PasswordBasedByteEncryptor(password)
def encrypt(self, plaintext: str) -> str:
plaintext_stream = io.BytesIO(plaintext.encode())
ciphertext = self._bytes_encryptor.encrypt(plaintext_stream)
return base64.b64encode(ciphertext.getvalue()).decode()
def decrypt(self, ciphertext: str) -> str:
ciphertext = base64.b64decode(ciphertext)
ciphertext_stream = io.BytesIO(ciphertext)
plaintext_stream = self._bytes_encryptor.decrypt(ciphertext_stream)
return plaintext_stream.getvalue().decode("utf-8")

@@ -17,36 +17,28 @@
# Note: password != key


class PasswordBasedEncryptor(IEncryptor):
class PasswordBasedByteEncryptor(IEncryptor):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about "PasswordBasedBytesEncryptor"? "Byte" makes it seem like it encrypts a single byte.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file name should match the class name.


_BUFFER_SIZE = pyAesCrypt.crypto.bufferSizeDef

def __init__(self, password: str):
self.password = password

def encrypt(self, plaintext: str) -> str:
plaintext_stream = io.BytesIO(plaintext.encode())
def encrypt(self, plaintext: BytesIO) -> BytesIO:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should take bytes, not BytesIO. Anything that has to use it (such as PasswordBasedStringEncryptor) has to convert first to bytes and then to BytesIO. Anything that already has data as bytes has to then convert it to BytesIO.

The only reason that it needs to be BytesIO is because that's what pyAesCrypt expects. This signature requiring BytesIO leaks implementation details to the callers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree on this. It will also improve readability where we use the functions.

Comment on lines 29 to 33
def init_key(self, secret: str):
if os.path.exists(self.key_file_path):
self._load_existing_key(secret)
else:
self._init_key(key_file)

self._key_base_encryptor = KeyBasedEncryptor(self._cipher_key)

def _init_key(self, password_file_path: str):
self._cipher_key = Random.new().read(self._BLOCK_SIZE)
with open_new_securely_permissioned_file(password_file_path, "wb") as f:
f.write(self._cipher_key)

def _load_existing_key(self, key_file):
with open(key_file, "rb") as f:
self._cipher_key = f.read()
self._create_new_key(secret)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this two-phased construction approach adds unnecessary complexity to the encryptor. I think we can solve this in a better way by using a factory. We can discuss options on Zoom.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean that database initializer is constructed first and the key is initialized later? Or that we either load the key or create one?

Comment on lines 25 to 28
def test_create_new_password_file(tmpdir):
initialize_datastore_encryptor(tmpdir)

assert os.path.isfile(os.path.join(tmpdir, PASSWORD_FILENAME))
setup_datastore_key(ENCRYPTOR_SECRET)
assert os.path.isfile(os.path.join(tmpdir, DataStoreEncryptor._KEY_FILENAME))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While we're here:

def test_create_new_password_file(tmp_path):
    initialize_datastore_encryptor(tmp_path)
    setup_datastore_key(ENCRYPTOR_SECRET)
    assert (tmp_path / DataStoreEncryptor._KEY_FILENAME).is_file()



def _credentials_match_registered_user(username, password):
def _credentials_match_registered_user(username: str, password: str):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def _credentials_match_registered_user(username: str, password: str):
def _credentials_match_registered_user(username: str, password: str) -> bool:

logger = logging.getLogger(__name__)


class PasswordBasedStringEncryptor(IEncryptor):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file name should match the class name.

…ests for data store encryptor

Data store key needs to be deleted upon registration to create a new one.
…hange PasswordBasedBytesEncryptor interface to use bytes instead of io.BytesIO
…thods into EncryptorFactory

This makes encryptor initialization workflow more straight-forward and the files become smaller, easier to read
…equest

This better indicates that get_username_password_from_request returns a username/password pair rather than UserCreds structure
@VakarisZ VakarisZ marked this pull request as ready for review October 1, 2021 12:41
assert (initialized_key_dir / EncryptorFactory._KEY_FILENAME).isfile()


def test_key_removal(initialized_key_dir, monkeypatch):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def test_key_removal(initialized_key_dir, monkeypatch):
def test_key_removal(initialized_key_dir):

initialize_encryptor_factory(tmpdir)
assert not (tmpdir / EncryptorFactory._KEY_FILENAME).isfile()
initialize_datastore_encryptor(MOCK_USERNAME, MOCK_PASSWORD)
assert (tmpdir / EncryptorFactory._KEY_FILENAME).isfile()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add an encryption/decryption test.

if not factory:
raise FactoryNotInitializedError
secret = get_secret_from_credentials(username, password)
_encryptor = DataStoreEncryptor(factory.key_file_path, secret)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Factories are responsible for constructing objects. We should have factory.set_key_file_path() and factory.set_secret() and then call factory.construct_data_store_encryptor() or similar.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is not a factory, it's a class that handles initialization parameters for data store encryptor. I'm not sure what would be a better name, maybe DataStoreEncryptorInitializer? I don't see any benefits in making this an actual factory, since the output object will not be different depending on key file path and secret. What is more, the usage of such factory is less intuitive. The registration resource would have to factory.set_secret() then built_encryptor = factory.construct_data_store_encryptor() and finally set_encryptor(built_encryptor). Now it's only one call - initialize_datastore_encryptor(username, password).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can simplify this way down. I'm going to dig into it a bit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to refactor and see how it looks

_KEY_FILENAME = "mongo_key.bin"

def __init__(self, key_file_dir: str):
self.key_file_path = os.path.join(key_file_dir, self._KEY_FILENAME)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer pathlib

Comment on lines 13 to 14
def __init__(self, key_based_encryptor: KeyBasedEncryptor):
self._key_based_encryptor = key_based_encryptor
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the DataStoreEncryptor is a thin wrapper around KeyBasedEncryptor with no logic, I'd rather get_datastore_encryptor() return an IEncryptor and just return a concrete KeyBasedEncryptor. We can refactor everything to use encrypt() and decrypt() instead of enc() and dec(), which is better anyway.

This separates encryptor classes from other encryption related infrastructure that we have cc\server_utils\encryption
…e encryptor creation related code to data_store_encryptor.py, move the reponsibility to initialize data store encryptor to AuthenticationService
Database initialization can not be done because island doesn't know the key needed for encrypting collections. Since the key only appears after registration, database setup also should happen only after registration
Comment on lines +7 to +9
from monkey_island.cc.server_utils.encryption.encryptors.password_based_bytes_encryption import (
PasswordBasedBytesEncryptor,
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't do from monkey_island.cc.server_utils.encryption import PasswordBasedBytesEncryptor because of circular imports. 🤔

Copy link
Collaborator

@mssalvatore mssalvatore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this iteration looks much better. I have some thoughts on cleaning this up a bit further, but if all of the automated and manual tests pass, let's merge it.

@VakarisZ VakarisZ force-pushed the mongo_key_encryption branch from a8f875d to ddff2f0 Compare October 4, 2021 11:59
@VakarisZ VakarisZ merged commit af99482 into develop Oct 4, 2021
@VakarisZ VakarisZ deleted the mongo_key_encryption branch October 4, 2021 12:10
@mssalvatore mssalvatore mentioned this pull request Nov 19, 2021
7 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Encrypt mongodb key
3 participants