Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Caching signing key #859

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

henryfool91
Copy link

Hi! Recently, we noticed significantly slow creation of access and refresh tokens (RSA256) in our project. After investigating, I found that there was no caching for the signing key, which was affecting the token generation speed, especially in CPU-limited environments (source: https://pyjwt.readthedocs.io/en/stable/usage.html#encoding-decoding-tokens-with-rs256-rsa). Implementing this quick fix improved token generation performance by 15x for us.

All tests passed successfully, except for one old test for PyJWT support. This test checks the type of the signing key and expects it to be a string, but I’m not sure if the older version of PyJWT is still relevant. What do you think?

@henryfool91
Copy link
Author

also it kinda fixes this issue #820 @Andrew-Chen-Wang

Copy link
Member

@Andrew-Chen-Wang Andrew-Chen-Wang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the token backend doesn't use the signing key frequently, so cached_property appears useless since TokenBackend I think is instantiated per request. Rather the generation of the key is what's taking time.

Instead, any key typed str should accept str | Key

Then, in settings.py, you can generate the signing keys and access the cached keys there.

@henryfool91
Copy link
Author

Hey, @Andrew-Chen-Wang , thanks for the quick response!

the token backend doesn't use the signing key frequently, so cached_property appears useless since TokenBackend I think is instantiated per request. Rather the generation of the key is what's taking time.

Actually, no. TokenBackend is instantiated only once, in state.py. But the key was being created every time for every jwt.encode operation, which caused performance issues with RSA256.

Instead, any key typed str should accept str | Key
Then, in settings.py, you can generate the signing keys and access the cached keys there.

That’s certainly possible approach, but is there really any point in this? I mean, is there a specific reason to force users to create a key from a string if they want better performance? Why shouldn't we handle it ourselves, keeping the app as simple as possible?

@Andrew-Chen-Wang
Copy link
Member

Thanks for checking

My worry is that using cached_property will not be able to detect changes during a setting reload in development servers.

Though I could be wrong; I'm not sure if the development server restarts the entire Python process on settings change. If it does, then it LGTM. Can you confirm?

@henryfool91
Copy link
Author

@Andrew-Chen-Wang

In the development server, everything will work fine, confirmed—it reloads the Python process on any code changes. However, changing environment variables still requires a full server restart, regardless of whether it's a production or development server - but ofc, that's always been like that.

Now, about the last issue—this test:

    @patch("jwt.encode", mock.Mock(return_value=b"test"))
    def test_token_encode_should_return_str_for_old_PyJWT(self):
        self.assertIsInstance(TokenBackend("HS256").encode({}), str)

Can I delete it, or should I modify it somehow? I'm not sure what its original purpose was, but it doesn't seem relevant anymore. What do you think?

@Andrew-Chen-Wang
Copy link
Member

We test on both pyjwt 1.7.1 and 2+ where jwt.encode returns a str in 2 and bytes in 1.7.1. The unit test is making sure this line of code returns our jwt as a str since pyjwt 1.7.1 returns bytes


if isinstance(token, bytes):
            # For PyJWT <= 1.7.1
            return token.decode("utf-8")
        # For PyJWT >= 2.0.0a1
        return token

@Andrew-Chen-Wang
Copy link
Member

I'm not sure why that specific test is failing if that's what you're implying, but the test is necessary

@henryfool91
Copy link
Author

@Andrew-Chen-Wang Oh yeah, sorry, I misunderstood the purpose of the test. The issue was caused by the signing key not being specified—this has been fixed in the new commit

@henryfool91
Copy link
Author

"make test" work locally, but it seems there are issues with PyJWT 1.7.1. I'll look into it.

@henryfool91
Copy link
Author

@Andrew-Chen-Wang made a kludge for pyjwt 1.71 support, now tests should be fine i think

@@ -64,6 +65,16 @@ def __init__(
self.leeway = leeway
self.json_encoder = json_encoder

@cached_property
def signing_key(self) -> Any:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised type checking is passing here since I imagine jws_alg.prepare_key is returning a class of some sort. When you make the type change, make sure to update get_verifying_key as well.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, in the alg abstract class of jwt module it looks like this:

    @abstractmethod
    def prepare_key(self, key: Any) -> Any:
        """
        Performs necessary validation and conversions on the key and returns
        the key value in the proper format for sign() and verify().
        """

In implemented classes there are a variety of possible return types, but i'm not sure if it's the right approach - to specify the return types of all existing implementations rather than use return type of abstract class, what do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah that's dumb. I assumed they'd have some form of generic. Thanks for checking

@Andrew-Chen-Wang
Copy link
Member

I still kind of worry with the typing here, and we may need to revert this change in case self.signing_key is assumed by users to always be a string.

What do you think of signing_key remaining as a str and rename the cached property to something else?

regardless, lgtm

@henryfool91
Copy link
Author

@Andrew-Chen-Wang

I still kind of worry with the typing here, and we may need to revert this change in case self.signing_key is assumed by users to always be a string.

What do you think of signing_key remaining as a str and rename the cached property to something else?

regardless, lgtm

Sounds reasonable, I changed this in the last commit and also added caching for the verifying key because, under the hood, it's the same as preparing the signing key for some algorithms. Would you mind taking a look?

@Andrew-Chen-Wang
Copy link
Member

hey thanks so much for being accommodating to suggestions. This lgtm and i'll cut a release soon

@henryfool91
Copy link
Author

hey thanks so much for being accommodating to suggestions. This lgtm and i'll cut a release soon

thank you!
anyway, it looks like there are some problems with tests, gonna look into

@henryfool91
Copy link
Author

@Andrew-Chen-Wang i removed "|" syntax for 3.9 support, hope now everything ok

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants