Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refs 14322 Refactor SecurityManager synchronization <master> #2682

Merged
merged 34 commits into from
Jun 6, 2022

Conversation

MiguelBarro
Copy link
Contributor

Description

For sanitizer CI see #2628

Refactor Guidelines:

  • What to do with const-use of members and references to the plugins?

    • the pointer members are initialized on init() which is called from participant constructor.
    • the pointer members are removed on destroy() which is called from ~SecurityManager() from the participant destructor.
    • meanwhile discovered_participant and on_process_handshake are called from transport threads and leads to a myriad of member calls. The issue here is that destroy is called concurrently with those transport calls and leads to a race.

    The initial approach was to promote them to smart pointers able to guarantee unique getters and setters see
    but this 2014 proposal of an atomic<unique_ptr> had no success. Though on 2017 the standard admitted
    __cpp_lib_atomic_shared_ptr (c++20) -> atomic<shared_ptr<T>>

    Finally a sentry based approach was favoured. The sentry requirements are:

    • It protects the plugins and other pointer based ancillary that doesn't require sync per-se but only nullptr checking.
    • It uses RAII strategy, thus it must be a method that returns a temporary object. While this object is alive the pointers cannot be modified.
    • The SecurityManager::destroy() method should wait till all the transport driven calls are done.
      Note that all methods call from destroy() must be sentry free to allow their execution.
      Note also that the destroy() must always called from a sentry free stack.
  • Collections must follow C++ const conventions. For example:

           std::map<GUID_t, DiscoveredParticipantInfo> discovered_participants_;

    in C++ philosophy this implies that the collection elements are like members and cannot be modified from SecurityManager const methods. This obviously doesn't fit in the actual use of them. Enconding/decoding operations should be const for the SecurityManager because it doesn't change its state but these members state. I'm going to promote them to:

           std::map<GUID_t, unique_ptr<DiscoveredParticipantInfo>> discovered_participants_;
  • Promote current mutex to shared_mutex. The write lock to protect changes on members (like collections update) and references (new/delete).

  • Collection members update must resort to their own sync methods. This was the premise before seeing how
    DiscoveredParticipantInfo guarantees atomical update of its AuthenticationInfo members by using unique_ptr<> in
    the getters and setters (though the getters and setters must use another sync mechanism).

  • Note that the previous mutex was recursive. Now care must be taken to avoid reentrancy.

After testing these changes it was detected that the sentry approach cannot prevent data removal due to discovery callbacks.
That is when a participant or endpoint was removed nothing prevent its encoding keys from being removed while other threads were using them to encode/decode a message.
In order to prevent this a new handle strategy was enforced:

  • Reference counting was added to the keys and other info required for encoding/decoding. This way even if the participant or endpoint info is removed only when the encode/decode operations are finished the actual removal would take place.
  • The handles cannot be created arbitrarily as before but class factories are assigned. This way the heap allocations and deallocations are prevented outside of the controlled factory code. Using this strategy the reference counting block uniqueness is assured.

Contributor Checklist

  • Commit messages follow the project guidelines.
  • The code follows the style guidelines of this project.
  • NA Tests that thoroughly check the new feature have been added/Regression tests checking the bug and its fix have been added.
  • Any new/modified methods have been properly documented using Doxygen.
  • Fast DDS test suite has been run locally.
  • Changes are ABI compatible.
  • Changes are API compatible.
  • Documentation builds and tests pass locally.
  • NA New feature has been added to the versions.md file (if applicable).
  • NA New feature has been documented/Current behavior is correctly described in the documentation.

Reviewer Checklist

  • Check contributor checklist is correct.
  • Check CI results: changes do not issue any warning.
  • Check CI results: failing tests are unrelated with the changes.

jsan-rt and others added 10 commits May 3, 2022 12:18
Signed-off-by: Javier Santiago <javiersantiago@eprosima.com>
* Refs #14285: Fixed typo. Updated working directory to be in line with master branch to prevent fuzzhash mismatches due to path changes.

Signed-off-by: Javier Santiago <javiersantiago@eprosima.com>

* Refs #14285: Adjusted generated file path

Signed-off-by: Javier Santiago <javiersantiago@eprosima.com>
Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>
Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>
…train workaround

Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>
Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>
Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>
Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>
Signed-off-by: Miguel Company <MiguelCompany@eprosima.com>
Signed-off-by: Miguel Company <MiguelCompany@eprosima.com>
@MiguelBarro MiguelBarro changed the title Test/deadlock/securitymanager Refs 14322 Refactor SecurityManager synchronization May 18, 2022
@MiguelBarro MiguelBarro changed the title Refs 14322 Refactor SecurityManager synchronization Refs 14322 Refactor SecurityManager synchronization <master> May 19, 2022
@EduPonz EduPonz added this to the v2.6.1 milestone Jun 1, 2022
MiguelBarro and others added 16 commits June 2, 2022 14:22
…ature/tsan/fixes> (#2600)

* Refs #14285: Trigger workflow

* Refs #14354: Trigger workflows

Signed-off-by: Javier Santiago <javiersantiago@eprosima.com>

* Refs #14069: Trigger workflow

Signed-off-by: Javier Santiago <javiersantiago@eprosima.com>

* Refs 14237. Prevent false-positive TSan deadlock report.

Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>

* Refs #14069: Workflow trigger

Signed-off-by: Javier Santiago <javiersantiago@eprosima.com>

* Refs #14069: fixing 4134 duplication to trim the table.

Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>

* Refs 14237. Enforcing BuiltinProtocols proper none discovery handling.

Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>

Co-authored-by: Javier Santiago <javiersantiago@eprosima.com>
Signed-off-by: Javier Santiago <javiersantiago@eprosima.com>
Signed-off-by: Javier Santiago <javiersantiago@eprosima.com>
Signed-off-by: Javier Santiago <javiersantiago@eprosima.com>
Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>
Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>
Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>
Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>
Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>
Signed-off-by: Miguel Barro <miguel.barro@live.com>
Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>
Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>
Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>
Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>
Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>
Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>
Miguel Barro added 5 commits June 2, 2022 14:30
Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>
Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>
Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>
Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>
Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>
@MiguelBarro MiguelBarro force-pushed the test/deadlock/securitymanager branch from 52fe3cb to 53ee5b3 Compare June 2, 2022 12:31
Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>
@MiguelCompany
Copy link
Member

@richiprosima Please test mac

Miguel Barro added 2 commits June 6, 2022 08:53
Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>
Signed-off-by: Miguel Barro <miguelbarro@eprosima.com>
@MiguelBarro MiguelBarro added the no-test Skip CI tests if PR marked with this label label Jun 6, 2022
@MiguelCompany MiguelCompany added the no-aarch Skip build & test for aarch64 label Jun 6, 2022
@MiguelCompany
Copy link
Member

@richiprosima Please test this

Copy link
Contributor

@jsan-rt jsan-rt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@MiguelCompany MiguelCompany merged commit 4c46abf into master Jun 6, 2022
@MiguelCompany MiguelCompany deleted the test/deadlock/securitymanager branch June 6, 2022 12:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
first-to-merge no-aarch Skip build & test for aarch64 no-test Skip CI tests if PR marked with this label
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants