Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thrown bad_alloc unhandled during lazy allocation of an application durable thread local error object #915

Closed
ethouris opened this issue Oct 22, 2019 · 4 comments · Fixed by #1718
Assignees
Labels
[core] Area: Changes in SRT library core Priority: Low Type: Bug Indicates an unexpected problem or unintended behavior
Milestone

Comments

@ethouris
Copy link
Collaborator

This fragment, repeated in every wrapper for an API function:

   catch (bad_alloc)
   {
      s_UDTUnited.setError(new CUDTException(MJ_SYSTEMRES, MN_MEMORY, 0));
      return INVALID_SOCK;
   }

It actually catches an exception thrown by the system-default new operator. It is then likely that the next use of new operator will throw this exception again. If this happens, this time another bad_alloc exception will be thrown, and this time this exception isn't handled at all; it will simply make the application abnormally exit and do exactly the thing that this whole set of so loud function wrappers are designed to prevent - the exception will slip through the API layer, which contradicts the basic statement for the SRT API.

The problem isn't easy to solve, as the error is written into the thread-specific data that are allocated dynamically. A nice proposal that probably could be easily implemented is:

  • allow to delete dynamic data and set it to null, which should be next understood as memory allocation error
  • handle bad_alloc with simply deleting the dynamic data and set it to null
  • lazy allocate the memory for the error description, and when once allocated, next setError calls should only overwrite the existing error information

Also, the memory allocation in setError should handle any errors from new operator by itself, that is, set the thread-specific error data to null, regardless of what kind of error was about to be reported. The bad_alloc handlers, again, should only reset the error to null, as well as no explicit call to new operator should appear in any exception handler.

@ethouris ethouris added [core] Area: Changes in SRT library core Priority: Medium Status: Pending Type: Bug Indicates an unexpected problem or unintended behavior labels Oct 22, 2019
@ethouris
Copy link
Collaborator Author

As an extra idea for this one only, the new noexcept feature from C++11 (as optional in case when compiling in C++11 mode) could be used to ensure that the function is even theoretically incapable of throwing an exception.

@ethouris
Copy link
Collaborator Author

One more idea: the object that defines the memory allocation error can be introduced as a global variable and it will be pointed from the thread specific error description. It should be, of course, taken into account before trying to delete the associated object.

@ethouris ethouris added this to the v1.4.2 milestone Nov 4, 2019
@ethouris
Copy link
Collaborator Author

With the changed definition of the thread local error (after merging C++11 sync stuff), the thread local error is no longer using reallocation, but overwrites the application-durable object with the new error specification. Throwing an exception is then prevented in the mentioned cases.

There's one problem remaining, though: there's an uncaught new allocation done by lazy allocation done in the POSIX version of sync; this should be fixed (changing the description).

@ethouris ethouris changed the title Thrown bad_alloc handled by a new allocation Thrown bad_alloc handled by a new allocation during lazy allocation of an application durable thread local error object Jul 22, 2020
@ethouris ethouris changed the title Thrown bad_alloc handled by a new allocation during lazy allocation of an application durable thread local error object Thrown bad_alloc unhandled during lazy allocation of an application durable thread local error object Jul 22, 2020
@maxsharabayko maxsharabayko modified the milestones: v1.5.0, v1.5.0 - Sprint 22 Aug 25, 2020
@ethouris
Copy link
Collaborator Author

ethouris commented Sep 4, 2020

UPDATE: With the new implementation of the "sync" stuff, the problem is likely gone.

@maxsharabayko maxsharabayko modified the milestones: v1.5.0 - Sprint 22, v1.5.0 - Sprint 23 Sep 8, 2020
@maxsharabayko maxsharabayko modified the milestones: v1.5.0 - Sprint 23, v1.5.0 Sep 21, 2020
@maxsharabayko maxsharabayko modified the milestones: v1.5.0, v1.4.3 Oct 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[core] Area: Changes in SRT library core Priority: Low Type: Bug Indicates an unexpected problem or unintended behavior
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants