Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] GoalStateTestCase.test_it_should_refresh_the_goal_state_when_it_is_inconsistent failing with openssl 3.2.2 in Ubuntu 24.10 (python 3.12) #3163

Closed
mirespace opened this issue Jul 19, 2024 · 4 comments

Comments

@mirespace
Copy link
Contributor

mirespace commented Jul 19, 2024

Hi team,

Describe the bug: A clear and concise description of what the bug is.

When building WALinuxAgent in Ubuntu 24.10 the following test is failing:

=================================== FAILURES ===================================
_ GoalStateTestCase.test_it_should_refresh_the_goal_state_when_it_is_inconsistent _

self = <tests.common.protocol.test_goal_state.GoalStateTestCase testMethod=test_it_should_refresh_the_goal_state_when_it_is_inconsistent>

    def test_it_should_refresh_the_goal_state_when_it_is_inconsistent(self):
        #
        # Some scenarios can produce inconsistent goal states. For example, during hibernation/resume, the Fabric goal state changes (the
        # tenant certificate is re-generated when the VM is restarted) *without* the incarnation changing. If a Fast Track goal state
        # comes after that, the extensions will need the new certificate. This test simulates that scenario by mocking the certificates
        # request and returning first a set of certificates (certs-2.xml) that do not match those needed by the extensions, and then a
        # set (certs.xml) that does match. The test then ensures that the goal state was refreshed and the correct certificates were
        # fetched.
        #
        data_files = [
            "wire/certs-2.xml",
            "wire/certs.xml"
        ]
    
        def http_get_handler(url, *_, **__):
            if HttpRequestPredicates.is_certificates_request(url):
                http_get_handler.certificate_requests += 1
                if http_get_handler.certificate_requests < len(data_files):
                    data = load_data(data_files[http_get_handler.certificate_requests - 1])
                    return MockHttpResponse(status=200, body=data.encode('utf-8'))
            return None
        http_get_handler.certificate_requests = 0
    
        with mock_wire_protocol(wire_protocol_data.DATA_FILE_VM_SETTINGS) as protocol:
            protocol.set_http_handlers(http_get_handler=http_get_handler)
            protocol.mock_wire_data.reset_call_counts()
    
            goal_state = GoalState(protocol.client)
    
>           self.assertEqual(2, protocol.mock_wire_data.call_counts['goalstate'], "There should have been exactly 2 requests for the goal state (original + refresh)")
E           AssertionError: 2 != 1 : There should have been exactly 2 requests for the goal state (original + refresh)

tests/common/protocol/test_goal_state.py:476: AssertionError

The curious thing is that this was not happening a pair of weeks ago, so I've checked what changed at distro level to see if another package could be impacting WALinuxagent. The suspicious was openssl, and I can confirm it is the cause, but I couldn't discover 100% the underlying root of the problem, because using previous version 3.2.1, the test passes:

dpkg: warning: downgrading libssl3t64:amd64 from 3.2.2-1ubuntu1 to 3.2.1-3ubuntu1
[...]
dpkg: warning: downgrading openssl from 3.2.2-1ubuntu1 to 3.2.1-3ubuntu1
[...]

(oracular-amd64)root@Avalon:/build/walinuxagent-s1lr7I/walinuxagent-2.11.1.4# pytest tests/common/protocol/test_goal_state.py -k test_it_should_refresh_the_goal_state_when_it_is_inconsistent
=================================================================================== test session starts ====================================================================================
platform linux -- Python 3.12.4, pytest-8.2.2, pluggy-1.5.0
rootdir: /build/walinuxagent-s1lr7I/walinuxagent-2.11.1.4
collected 21 items / 20 deselected / 1 selected                                                                                                                                            

tests/common/protocol/test_goal_state.py 

===================================================================================== warnings summary =====================================================================================
[…]
====================================================================== 1 passed, 20 deselected, 183 warnings in 1.88s ====================================================================

Changes between opennsl versions (3.2.1 and 3.2.2): https://github.com/openssl/openssl/blob/openssl-3.2.2/CHANGES.md#changes-between-321-and-322-4-jun-2024

Distro and WALinuxAgent details :

  • Distro and Version: Ubuntu 24.10 (development version)
  • WALinuxAgent version: 2.11.1.4

Additional context

I explored different options to get to the point on what is happening, with no luck... so sharing here all the way I did (thanks in advance!):

Checking when openssl is involved in the test, I saw the check_certificates in azurelinuxagent/common/protocol/goal_state.py that calls Certificates(), and it ends calling decrypt_p7m func from CryptUtil in azurelinuxagent/common/utils/cryptutil.py which executes /usr/bin/openssl cms -decrypt -in Certificates.p7m -inkey TransportPrivate.pem -recip TransportCert.pem | /usr/bin/openssl pkcs12 -nodes -password pass: -out Certificates.pem.

This operation fails once, happening with both openssl 3.2.1 -the "good"- and 3.2.2 - "bad":

azurelinuxagent.common.utils.shellutil.CommandError: '/usr/bin/openssl cms -decrypt -in /tmp/GoalStateTestCase__8ay2k22/Certificates.p7m -inkey /tmp/GoalStateTestCase__8ay2k22/TransportPrivate.pem -recip /tmp/GoalStateTestCase__8ay2k22/TransportCert.pem | /usr/bin/openssl pkcs12 -nodes -password pass: -out /tmp/GoalStateTestCase__8ay2k22/Certificates.pem' failed: 1 (Error decrypting CMS using private key)

But it seems it "recovers" well in one case and it doesn't in the other. I checked how many times are openssl called in bad a good case:

Good

pytest            0 /usr/bin/pytest tests/common/protocol/test_goal_state.py -k test_it_should_refresh_the_goal_state_when_it_is_inconsistent
lsb_release       0 /usr/bin/lsb_release -a
getopt            0 /usr/bin/getopt --name lsb_release -o hvidrcas -l help,version,id,description,release,codename,all,short -- -a
cut               0 /usr/bin/cut -c1
tr                0 /usr/bin/tr [:lower:] [:upper:]
cut               0 /usr/bin/cut -c2-
tr                0 /usr/bin/tr [:upper:] [:lower:]
tr                0 /usr/bin/tr [:upper:] [:lower:]
uname             0 /usr/bin/uname -rs
openssl           0 /usr/bin/openssl cms -decrypt -in /tmp/GoalStateTestCase_dir/Certificates.p7m -inkey /tmp/GoalStateTestCase_dir/TransportPrivate.pem -recip /tmp/GoalStateTestCase_dir/TransportCert.pem
openssl           0 /usr/bin/openssl pkcs12 -nodes -password pass: -out /tmp/GoalStateTestCase_dir/Certificates.pem
openssl           0 /usr/bin/openssl rsa -in /tmp/GoalStateTestCase_dir/0.prv -pubout
openssl           0 /usr/bin/openssl x509 -in /tmp/GoalStateTestCase_dir/1.crt -pubkey -noout
openssl           0 /usr/bin/openssl x509 -in /tmp/GoalStateTestCase_dir/1.crt -fingerprint -noout
openssl           0 /usr/bin/openssl x509 -in /tmp/GoalStateTestCase_dir/2.crt -pubkey -noout
openssl           0 /usr/bin/openssl x509 -in /tmp/GoalStateTestCase_dir/2.crt -fingerprint -noout
openssl           0 /usr/bin/openssl cms -decrypt -in /tmp/GoalStateTestCase_dir/Certificates.p7m -inkey /tmp/GoalStateTestCase_dir/TransportPrivate.pem -recip /tmp/GoalStateTestCase_dir/TransportCert.pem
openssl           0 /usr/bin/openssl pkcs12 -nodes -password pass: -out /tmp/GoalStateTestCase_dir/Certificates.pem
openssl           0 /usr/bin/openssl cms -decrypt -in /tmp/GoalStateTestCase_dir/Certificates.p7m -inkey /tmp/GoalStateTestCase_dir/TransportPrivate.pem -recip /tmp/GoalStateTestCase_dir/TransportCert.pem
openssl           0 /usr/bin/openssl pkcs12 -nodes -password pass: -out /tmp/GoalStateTestCase_dir/Certificates.pem
openssl           0 /usr/bin/openssl rsa -in /tmp/GoalStateTestCase_dir/0.prv -pubout
openssl           0 /usr/bin/openssl x509 -in /tmp/GoalStateTestCase_dir/1.crt -pubkey -noout
openssl           0 /usr/bin/openssl x509 -in /tmp/GoalStateTestCase_dir/1.crt -fingerprint -noout
openssl           0 /usr/bin/openssl x509 -in /tmp/GoalStateTestCase_dir/2.crt -pubkey -noout
openssl           0 /usr/bin/openssl x509 -in /tmp/GoalStateTestCase_dir/2.crt -fingerprint -noout

And in the bad is one missing:

Bad

pytest             0 /usr/bin/pytest tests/common/protocol/test_goal_state.py -k test_it_should_refresh_the_goal_state_when_it_is_inconsistent
lsb_release        0 /usr/bin/lsb_release -a
getopt             0 /usr/bin/getopt --name lsb_release -o hvidrcas -l help,version,id,description,release,codename,all,short -- -a
cut                0 /usr/bin/cut -c1
tr                 0 /usr/bin/tr [:lower:] [:upper:]
cut                0 /usr/bin/cut -c2-
tr                 0 /usr/bin/tr [:upper:] [:lower:]
tr                 0 /usr/bin/tr [:upper:] [:lower:]
uname              0 /usr/bin/uname -rs
openssl            0 /usr/bin/openssl cms -decrypt -in /tmp/GoalStateTestCase_dir/Certificates.p7m -inkey /tmp/GoalStateTestCase_dir/TransportPrivate.pem -recip /tmp/GoalStateTestCase_dir/TransportCert.pem
openssl            0 /usr/bin/openssl pkcs12 -nodes -password pass: -out /tmp/GoalStateTestCase_dir/Certificates.pem
openssl            0 /usr/bin/openssl rsa -in /tmp/GoalStateTestCase_dir/0.prv -pubout
openssl            0 /usr/bin/openssl x509 -in /tmp/GoalStateTestCase_dir/1.crt -pubkey -noout
openssl            0 /usr/bin/openssl x509 -in /tmp/GoalStateTestCase_dir/1.crt -fingerprint -noout
openssl            0 /usr/bin/openssl x509 -in /tmp/GoalStateTestCase_dir/2.crt -pubkey -noout
openssl            0 /usr/bin/openssl x509 -in /tmp/GoalStateTestCase_dir/2.crt -fingerprint -noout
openssl            0 /usr/bin/openssl cms -decrypt -in /tmp/GoalStateTestCase_dir/Certificates.p7m -inkey /tmp/GoalStateTestCase_dir/TransportPrivate.pem -recip /tmp/GoalStateTestCase_dir/TransportCert.pem
openssl            0 /usr/bin/openssl pkcs12 -nodes -password pass: -out /tmp/GoalStateTestCase_dir/Certificates.pem


openssl            0 /usr/bin/openssl rsa -in /tmp/GoalStateTestCase_dir/0.prv -pubout
openssl            0 /usr/bin/openssl x509 -in /tmp/GoalStateTestCase_dir/1.crt -pubkey -noout
openssl            0 /usr/bin/openssl x509 -in /tmp/GoalStateTestCase_dir/1.crt -fingerprint -noout
openssl            0 /usr/bin/openssl x509 -in /tmp/GoalStateTestCase_dir/2.crt -pubkey -noout
openssl            0 /usr/bin/openssl x509 -in /tmp/GoalStateTestCase_dir/2.crt -fingerprint -noout

So, I checked if the Certificates.pem generated are different depending on the SSL version, and is not:

(oracular-amd64)root@Avalon:/tmp/3.2.2# md5sum Certificates.pem 
593387341dce417a3025192902f882eb  Certificates.pem

(oracular-amd64)root@Avalon:/tmp/3.2.1# md5sum Certificates.pem 
593387341dce417a3025192902f882eb  Certificates.pem

Accidentally, when doing a trace with pdb, I jumped out the reading loop when reading that file (to line 621 in azurelinuxagent/common/protocol/goal_state.py) and the test passed.. .I'm sharing that pdb session with you here in case I could give you a clue that I can't be able to identify. I put the breakpoints at the end.

Anything else you need from me, please let me know... Thanks for looking into it!

Miriam

@mirespace mirespace changed the title [BUG] GoalStateTestCase.test_it_should_refresh_the_goal_state_when_it_is_inconsistent failing with openssl 3.2.2 in Ubuntu 24.10 [BUG] GoalStateTestCase.test_it_should_refresh_the_goal_state_when_it_is_inconsistent failing with openssl 3.2.2 in Ubuntu 24.10 (python 3.12) Jul 22, 2024
@nagworld9
Copy link
Contributor

@mirespace Thanks for opening the issue. I was able to repro after installing openssl version 3.2.2 as you highlighted. The other thing I noticed is this test uses 2 test certs. One of the cert(certs.xml) decrypted properly (no issues) but other certs-2.xml is failing while decrypting it. As a result, test validation failed.

'/usr/local/ssl/bin/openssl cms -decrypt -in /tmp/GoalStateTestCase_27xaqevo/Certificates.p7m -inkey /tmp/GoalStateTestCase_27xaqevo/TransportPrivate.pem -recip /tmp/GoalStateTestCase_27xaqevo/TransportCert.pem | /usr/local/ssl/bin/openssl pkcs12 -nodes -password pass: -out /tmp/GoalStateTestCase_27xaqevo/Certificates.pem' failed: 1 (Error decrypting CMS using private key)

I tested this in vm as well with sample certs and able to decrypt the cert.

root@ubu24:/var/lib/waagent# rm Certificates.pem
root@ubu24:/var/lib/waagent# /usr/local/ssl/bin/openssl cms -decrypt -in Certificates.p7m -inkey TransportPrivate.pem -recip TransportCert.pem | /usr/local/ssl/bin/openssl pkcs12 -nodes -password pass: -out Certificates.pem
root@ubu24:/var/lib/waagent# ls Certificates.pem
Certificates.pem

So far this is my observation, and I'll continue investigate further

@narrieta
Copy link
Member

narrieta commented Aug 1, 2024

@mirespace thanks for reporting this. @nagworld9 and me looked into this and we need a small change in the Agent to accommodate a difference of behavior in the new openssl. We'll do this within the next couple of weeks.

@mirespace
Copy link
Contributor Author

Thanks @narrieta and @nagworld9 !

@narrieta
Copy link
Member

narrieta commented Aug 2, 2024

Fixed by #3166

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants