Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix an error on SSL config reload (plus some cleanup). (#9334) #9644

Closed
wants to merge 2 commits into from

Conversation

ywkaras
Copy link
Contributor

@ywkaras ywkaras commented Apr 26, 2023

  • Fix an error on SSL config reload (plus some cleanup).

This seems to have eliminated some ERROR diags we were seeing in Yahoo Prod when doing config reloads.

The SSLSecret public functions no longer return pointers into the unorded_map of secrets, they return a copy of the secret data. This seemed thread unsafe. A periodic poll running in the background can update the secret data for an entry for a secret name in the map.

To avoid exporting pointers, I had to change the prototype of TSSslSecretGet(). Hopefully there are no existing plugins that are already using this TS API function, so breaking this rule will be moot. I added a new TS API type, TSAllocatedVarLenData, in order to be able to cleanly return a copy of the secret data.

  1. getOrLoadSecret grabbed the secret_map_mutex and called loadSecret.
  2. loadSecret dispatched to Continations that registered for the TS_EVENT_SSL_SECRET event. This would try to grab the Continuation's lock.
  3. In the meantime, those Continuations could call setSecret which would try to grab the secret_map_mutex. If this Continuation did this while holding the lock that step 2 is waiting upon, then there will be a deadlock between the Continuation lock and the secret_map_mutex between the two threads.

This patch avoids the deadlock by releasing the secret_map_mutex lock before calling loadSecret. It also updates the secret_map when loading secrets from a file in loadSecret.


Co-authored-by: Brian Neradt brian.neradt@verizonmedia.com
(cherry picked from commit 0c2488c)

Conflicts:
src/traffic_server/InkAPI.cc

* Fix an error on SSL config reload (plus some cleanup).

This seems to have eliminated some ERROR diags we were seeing in Yahoo Prod when
doing config reloads.

The SSLSecret public functions no longer return pointers into the unorded_map of
secrets, they return a copy of the secret data. This seemed thread unsafe. A
periodic poll running in the background can update the secret data for an entry
for a secret name in the map.

To avoid exporting pointers, I had to change the prototype of TSSslSecretGet().
Hopefully there are no existing plugins that are already using this TS API function,
so breaking this rule will be moot. I added a new TS API type, TSAllocatedVarLenData,
in order to be able to cleanly return a copy of the secret data.

* YTSATS-4067: Fix deadlock with secret_map_mutex (apache#740)

1. getOrLoadSecret grabbed the secret_map_mutex and called loadSecret.
2. loadSecret dispatched to Continations that registered for the
   TS_EVENT_SSL_SECRET event. This would try to grab the Continuation's
   lock.
3. In the meantime, those Continuations could call setSecret which would
   try to grab the secret_map_mutex. If this Continuation did this while
   holding the lock that step 2 is waiting upon, then there will be a
   deadlock between the Continuation lock and the secret_map_mutex
   between the two threads.

This patch avoids the deadlock by releasing the secret_map_mutex lock
before calling loadSecret. It also updates the secret_map when loading
secrets from a file in loadSecret.

---------

Co-authored-by: Brian Neradt <brian.neradt@verizonmedia.com>
(cherry picked from commit 0c2488c)

Conflicts:
src/traffic_server/InkAPI.cc
@ywkaras ywkaras self-assigned this Apr 26, 2023
@ywkaras ywkaras added this to the 9.2.1 milestone Apr 26, 2023
@ywkaras
Copy link
Contributor Author

ywkaras commented Apr 26, 2023

This is a draft PR, pending a decision whether to remove the TSSslSecretXxx TS API functions.

@ywkaras ywkaras force-pushed the os_pkey_cnf_reload_9.2.x branch from 2616f2b to c14fc14 Compare April 26, 2023 04:20
@apache apache deleted a comment from zwoop Apr 26, 2023
@ywkaras ywkaras closed this Apr 26, 2023
@ywkaras ywkaras deleted the os_pkey_cnf_reload_9.2.x branch April 26, 2023 16:25
@zwoop zwoop removed this from the 9.2.1 milestone Apr 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants