sops sometimes fails to handle files with context #320

Closed
dlovitch opened this issue Apr 3, 2018 · 5 comments
Labels
bug priority/medium Medium priority issues (e.g. breaking changes that have a workaround)

Comments

dlovitch commented Apr 3, 2018

We've been seeing odd/intermittent issues with files that have context, so I generated some small test case files that have working: ok as the only thing in them.

We used the same KMS key (and only that KMS key) for all tests and I used sops 1.16 for the 1.x tests.

The test prints a - for each success and an X for each failure.

sops 3.0.2 with a sops 3.0.2-encrypted file (with context)

$ count=0; while true; do count=$((count+1)); if sops -d with_context.yml 1>/dev/null 2>/dev/null; then echo -n "-"; else echo -n "X"; fi; sleep 1; done
----X-X---X--X----X---X--XX--X-XXXX----X--

sops 3.0.2 with a sops 3.0.2-encrypted file (no context)

$ count=0; while true; do count=$((count+1)); if sops -d no_context.yml 1>/dev/null 2>/dev/null; then echo -n "-"; else echo -n "X"; fi; sleep 1; done
----------------------------

sops 3.0.2 with a sops 1.16-encrypted file (with context)

$ count=0; while true; do count=$((count+1)); if ./venv/bin/sops -d with_context_v1.yml 1>/dev/null 2>/dev/null; then echo -n "-"; else echo -n "X"; fi; sleep 1; done
--------------------------------

sops 1.16 with a sops 1.16-encrypted file (with context)

$ count=0; while true; do count=$((count+1)); if ./venv/bin/sops -d with_context_v1.yml 1>/dev/null 2>/dev/null; then echo -n "-"; else echo -n "X"; fi; sleep 1; done
------------------------------

When running sops -d with_context.yml:
Success:

$ sops -d with_context.yml 
[AWSKMS]	 INFO[0000] Decryption succeeded                          arn="arn:aws:kms:us-west-2:X:key/X"
[SOPS]	 INFO[0000] Data key recovered successfully              
working: ok

Failure:

$ sops -d with_context.yml 
[AWSKMS]	 WARN[0000] Decryption failed                             arn="arn:aws:kms:us-west-2:X:key/X"
Failed to get the data key required to decrypt the SOPS file.

Group 0: FAILED
  arn:aws:kms:us-west-2:X:key/X: FAILED
    - | Error decrypting key: InvalidCiphertextException: 
      | 	status code: 400, request id:
      | X

We log all error messages and when looking up that for the encryptionContext, somehow the app and env end up being the same value (the local sops yaml file shows two different values).

It feels like something isn't iterating over all possible encryption context keys.

Let me know if there's anything I can do to help/research.

For reference:

with_context.yml

working: ENC[AES256_GCM,data:X,tag:X,type:str]
sops:
    kms:
    -   arn: arn:aws:kms:us-west-2:X:key/X
        context:
            app: X
            env: X
        created_at: '2018-04-03T01:19:34Z'
        enc: X
    gcp_kms: []
    lastmodified: '2018-04-03T01:35:47Z'
    mac: ENC[AES256_GCM,data:X,iv:X,tag:X,type:str]
    pgp: []
    unencrypted_suffix: _unencrypted
    version: 3.0.2

no_context.yml

working: ENC[AES256_GCM,data:X,tag:X,type:str]
sops:
    kms:
    -   arn: arn:aws:kms:us-west-2:X:key/X
        created_at: '2018-04-03T01:28:38Z'
        enc: X
    gcp_kms: []
    lastmodified: '2018-04-03T01:28:46Z'
    mac: ENC[AES256_GCM,data:X,iv:X,tag:X,type:str]
    pgp: []
    unencrypted_suffix: _unencrypted
    version: 3.0.2

with_context_v1.yml

working: ENC[AES256_GCM,data:X,iv:X,tag:X,type:str]
sops:
    attention: This section contains key material that should only be modified with
        extra care. See `sops -h`.
    version: '1.16'
    unencrypted_suffix: _unencrypted
    kms:
    -   arn: arn:aws:kms:us-west-2:X:key/X
        context:
            app: X
            env: X
        enc: X
        created_at: '2018-04-03T01:39:57Z'
    lastmodified: '2018-04-03T01:39:57Z'
    mac: ENC[AES256_GCM,data:X,iv:X,tag:X,type:str]
@autrilla
Contributor

autrilla commented Apr 3, 2018

Hmm, weird. Especially weird that it appears to work for files created with 1.x.

> We log all error messages and when looking up that for the encryptionContext, somehow the app and env end up being the same value (the local sops yaml file shows two different values).

Could you clarify what you mean by this? I'm not sure I understand.

Some things that could help diagnose this:

  • Could you test if you still get the errors with SOPS 2.x? The code handling keys changed quite a bit for 3.x, so maybe a bug was introduced then.
  • If you take the with_context_v1.yml file and change the version in it to 3.0.2, does the error still appear?

@autrilla autrilla added the bug label Apr 3, 2018
@autrilla
Contributor

@dlovitch any news on this?

@aripringle

I'm running into this same issue using 3.0.3. I tried using 2.0.10 instead, and it seemed to work fine.

With 3.0.3, decrypting a file created with an encryption context (AWS KMS) gives this error almost exactly 50% of the time (10 failures in 20 attempts):

$ sops -d with-context.yaml
Failed to get the data key required to decrypt the SOPS file.

Group 0: FAILED
arn:aws:kms:us-east-1:[redacted]:key/[redacted]: FAILED
- | Error decrypting key: InvalidCiphertextException:
| status code: 400, request id:
| [redacted]

Recovery failed because no master key was able to decrypt the file. In
order for SOPS to recover the file, at least one key has to be successful,
but none were.

I tried this same process on 2.0.10, and all 20 attempts were successful.
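
For anyone who wants to repeat this kind of comparison, here is a minimal sketch of a bounded retry loop that prints a failure count instead of running forever. The function name `count_failures` is made up for illustration. The only assumption is that the command under test exits non-zero when decryption fails, which is what the if/else in the original test loop already relies on.

```shell
#!/bin/sh
# count_failures N CMD [ARGS...]
# Hypothetical helper: run CMD N times, discard its output,
# and print "failures/attempts".
count_failures() {
    attempts=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # A non-zero exit status counts as one failed attempt.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures/$attempts"
}

# Example invocation (assumes sops is on PATH and the file exists):
#   count_failures 20 sops -d with-context.yaml
```

Running the same call against each sops version gives directly comparable numbers, such as the 10 out of 20 and 0 out of 20 reported above.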

@autrilla autrilla added the priority/medium Medium priority issues (e.g. breaking changes that have a workaround) label Apr 23, 2018
@jpsrn
Contributor

jpsrn commented Mar 6, 2019

I ran into the same issue and was able to catch sops sending incorrect encryption context key-value pairs by inspecting the AWS CloudTrail logs for the KMS service. I created a pull request (#435) that fixes the issue. Please note that in order to trigger the bug, you need to have at least two encryption context key-value pairs with at least two unique values. The pull request I referenced contains more information.
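
To check whether a given file meets that precondition (at least two context pairs with at least two distinct values), something like the following rough sketch may help. The function names `context_values` and `meets_precondition` are invented for this example, and the awk script is not a YAML parser: it assumes the simple indented `key: value` layout shown in with_context.yml earlier in this thread and only looks at the first `context:` block.

```shell
#!/bin/sh
# context_values FILE
# Hypothetical helper: print the values listed under the first "context:" key.
# Assumes one "key: value" pair per line, indented deeper than "context:".
context_values() {
    awk '
        function indent_of(s) { match(s, /^ */); return RLENGTH }
        in_ctx {
            # A line indented no deeper than "context:" ends the block.
            if (indent_of($0) <= ctx_indent) exit
            sub(/^ *[^:]*: */, "")   # strip indentation and the key
            print
            next
        }
        /^ *context: *$/ { in_ctx = 1; ctx_indent = indent_of($0) }
    ' "$1"
}

# meets_precondition FILE
# Succeeds if the file has at least two context pairs
# and at least two distinct values among them.
meets_precondition() {
    values=$(context_values "$1")
    pairs=$(printf '%s\n' "$values" | awk 'NF { n++ } END { print n + 0 }')
    unique=$(printf '%s\n' "$values" | sort -u | awk 'NF { n++ } END { print n + 0 }')
    [ "$pairs" -ge 2 ] && [ "$unique" -ge 2 ]
}

# Example invocation (file name taken from the original report):
#   meets_precondition with_context.yml && echo "file can trigger the bug"
```

Note that the redacted samples in this thread show `X` for both values, so they would not pass this check as pasted. The real files had two different values.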

@ajvb
Contributor

ajvb commented Jun 7, 2019

Fixed in 3.3.0 with #435

@ajvb ajvb closed this as completed Jun 7, 2019