Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[libunwind] use __unw_add_dynamic_eh_frame_section to register dynamic eh_frame section cause segment fault #76957

Closed
SihangZhu opened this issue Jan 4, 2024 · 12 comments

Comments

@SihangZhu
Copy link
Contributor

SihangZhu commented Jan 4, 2024

In libgcc, we register the eh_frame section of the live patching. We can use the __register_frame interface. This libunwind library also provides __register_frame and __deregister_frame functions, but they are aliases for __unw_add_dynamic_fde and __unw_remove_dynamic_fde and thus can only take a single FDE. I found the __unw_add_dynamic_eh_frame_section function, but during live patching, the eh_frame section may be followed by all 0. Scenario, the content of all 0 will be regarded as legal CIE.
We will fall into an infinite loop or access illegal memory.
the data of patchArea as below
image

Is it better for this function to have length of eh_frame section as a formal parameter?

@EugeneZelenko EugeneZelenko added question A question, not bug report. Check out https://llvm.org/docs/GettingInvolved.html instead! libunwind and removed new issue labels Jan 4, 2024
@SihangZhu
Copy link
Contributor Author

ping

@MaskRay
Copy link
Member

MaskRay commented Jan 7, 2024

__unw_add_dynamic_eh_frame_section is from https://reviews.llvm.org/D111863 (2021) by @housel for compiler-rt/lib/orc/elfnix_platform.cpp (CC @lhames). Perhaps it's fine to change the signature if compiler-rt/lib/orc/elfnix_platform.cpp is the only use?

@lhames
Copy link
Contributor

lhames commented Jan 7, 2024

Scenario, the content of all 0 will be regarded as legal CIE.

That sounds like a bug to me. My understanding is that Linux eh-frame sections should end with a null terminator, and that zero is never a legal start for a CIE (which always has a non-zero length).

On top-of-tree, __unw_add_dynamic_eh_frame_section is defined as:

void __unw_add_dynamic_eh_frame_section(unw_word_t eh_frame_start) {
  // The eh_frame section start serves as the mh_group                                                                                                                                                                                                                                                                        
  unw_word_t mh_group = eh_frame_start;
  CFI_Parser<LocalAddressSpace>::CIE_Info cieInfo;
  CFI_Parser<LocalAddressSpace>::FDE_Info fdeInfo;
  auto p = (LocalAddressSpace::pint_t)eh_frame_start;
  while (true) {
    if (CFI_Parser<LocalAddressSpace>::decodeFDE(
            LocalAddressSpace::sThisAddressSpace, p, &fdeInfo, &cieInfo,
            true) == NULL) {
      DwarfFDECache<LocalAddressSpace>::add((LocalAddressSpace::pint_t)mh_group,
                                            fdeInfo.pcStart, fdeInfo.pcEnd,
                                            fdeInfo.fdeStart);
      p += fdeInfo.fdeLength;
    } else if (CFI_Parser<LocalAddressSpace>::parseCIE(
                   LocalAddressSpace::sThisAddressSpace, p, &cieInfo) == NULL) {
      p += cieInfo.cieLength;
    } else
      return;
  }
}

decodeFDE looks ok, but parseCIE's length check is incompatible with the code above -- it's returning "null" on zero, which is being interpreted as success, leading to an increment by an undefined cieInfo.cieLength value:

  ...
  pint_t cieLength = (pint_t)addressSpace.get32(p);
  p += 4;
  pint_t cieContentEnd = p + cieLength;
  if (cieLength == 0xffffffff) {
    // 0xffffffff means length is really next 8 bytes
    cieLength = (pint_t)addressSpace.get64(p);
    p += 8;
    cieContentEnd = p + cieLength;
  }
  if (cieLength == 0)
    return NULL;
  ...

We'll have to check other users of parseCIE -- we either need to guard the use in __unw_add_dynamic_eh_frame_section with a null-check or start treating CIEs that start with zero as an error. I suspect we want the former.

@lhames
Copy link
Contributor

lhames commented Jan 7, 2024

@SihangZhu I think the fix may be as simple as:

diff --git a/libunwind/src/libunwind.cpp b/libunwind/src/libunwind.cpp
index cd610377b63d..54083c4ab21f 100644
--- a/libunwind/src/libunwind.cpp
+++ b/libunwind/src/libunwind.cpp
@@ -324,7 +324,7 @@ void __unw_add_dynamic_eh_frame_section(unw_word_t eh_frame_start) {
   CFI_Parser<LocalAddressSpace>::CIE_Info cieInfo;
   CFI_Parser<LocalAddressSpace>::FDE_Info fdeInfo;
   auto p = (LocalAddressSpace::pint_t)eh_frame_start;
-  while (true) {
+  while (LocalAddressSpace::sThisAddressSpace.get32(p)) {
     if (CFI_Parser<LocalAddressSpace>::decodeFDE(
             LocalAddressSpace::sThisAddressSpace, p, &fdeInfo, &cieInfo,
             true) == NULL) {

Does that change fix your issue?

@SihangZhu
Copy link
Contributor Author

SihangZhu commented Jan 8, 2024

@SihangZhu I think the fix may be as simple as:

diff --git a/libunwind/src/libunwind.cpp b/libunwind/src/libunwind.cpp
index cd610377b63d..54083c4ab21f 100644
--- a/libunwind/src/libunwind.cpp
+++ b/libunwind/src/libunwind.cpp
@@ -324,7 +324,7 @@ void __unw_add_dynamic_eh_frame_section(unw_word_t eh_frame_start) {
   CFI_Parser<LocalAddressSpace>::CIE_Info cieInfo;
   CFI_Parser<LocalAddressSpace>::FDE_Info fdeInfo;
   auto p = (LocalAddressSpace::pint_t)eh_frame_start;
-  while (true) {
+  while (LocalAddressSpace::sThisAddressSpace.get32(p)) {
     if (CFI_Parser<LocalAddressSpace>::decodeFDE(
             LocalAddressSpace::sThisAddressSpace, p, &fdeInfo, &cieInfo,
             true) == NULL) {

Does that change fix your issue?

Thanks for your reply. In some application scenarios, there may be illegal memory behind the .eh_frame section, and trying to access it in a while loop will still cause segment fault.

@lhames
Copy link
Contributor

lhames commented Jan 8, 2024

HI @SihangZhu. It is not legal on Linux (as I understand it) to have an empty __eh_frame section. If the section exists it should include a null-terminator. If it does not exist it should not be registered.

@lhames
Copy link
Contributor

lhames commented Jan 8, 2024

It's worth noting that the null terminator is added by the static linker: ELF relocatable object files will have non-null-terminated __eh_frame sections, but these aren't expected to be registered with libunwind. JIT linkers are expected to append the null terminator similarly to static linkers.

@SihangZhu
Copy link
Contributor Author

HI @SihangZhu. It is not legal on Linux (as I understand it) to have an empty __eh_frame section. If the section exists it should include a null-terminator. If it does not exist it should not be registered.

Maybe the scenario I described before wasn't detailed enough, that's my fault.
In the embedded scenario, live patching is applied to the running program, and the code and data parts are stored in the hot patch according to a specific arrangement. When registering patches, its content will be loaded into memory, and the __unw_add_dynamic_eh_frame_section function will be called to register .eh_frame, but it is not known whether the memory address after .eh_frame during the patch area is accessible. Therefore, after accessing the .eh_frame section of the patch area in a loop and continuing to access the 32-bit memory, a segment fault may occur.
In this scenario, adding length information may be safer.

@lhames
Copy link
Contributor

lhames commented Jan 8, 2024

@SihangZhu What I'm saying is that the __eh_frame that you're describing isn't legal in a linked image / JIT'd output. If you're able to change the patch format to include a null terminator then that should be done. If you can't change the patch format, but are able to discern the length by some other means then it seems like the best solution would be to modify your registration logic to use:

  if (eh_frame_length > 0)
    __unw_add_dynamic_eh_frame_section(eh_frame_start);

Apologies -- I'm still half thinking in Darwin where the registration logic is different. The above won't help for non-empty __eh_frames.

I think the only valid solution is to add a null-terminator to the __eh_frame in the patch format.

@SihangZhu
Copy link
Contributor Author

@SihangZhu What I'm saying is that the __eh_frame that you're describing isn't legal in a linked image / JIT'd output. If you're able to change the patch format to include a null terminator then that should be done. If you can't change the patch format, but are able to discern the length by some other means then it seems like the best solution would be to modify your registration logic to use:

  if (eh_frame_length > 0)
    __unw_add_dynamic_eh_frame_section(eh_frame_start);

Specifically, in the picture I attached at the beginning, the eh_frame section saved in the patch is a legal one. The address in the memory is 0xffff891f9040 to 0xffff891f90a8, and its length is 104. If the address following 0xffff891f90a8 is inaccessible, problems will occur.
20240104-191637(WeLinkPC)

@lhames
Copy link
Contributor

lhames commented Jan 8, 2024

@SihangZhu: Looking at the section above I don't think it is legal. My read is that you have:

CIE 0: [0xffff891f9040, 0xffff891f9054)
FDE 0: [0xffff891f9054, 0xffff891f9068)
FDE 1: [0xffff891f9068, 0xffff891f907c)
FDE 3: [0xffff891f907c, 0xffff891f9090)
FDE 4: [0xffff891f9090, 0xffff891f90a8)

There is no null terminator at the end, so the section is not a legal loaded eh-frame section. You need to add a 4-byte null-terminator at the end.

@SihangZhu
Copy link
Contributor Author

SihangZhu commented Jan 9, 2024

Looking at the section above I don't think it is legal. My read is that you have

The original data of eh_frame in the patch area comes from a object file and has not been linked. The difference from the original is that relocation has been repaired. I just went back and looked at the relevant implementation in libgcc. It uses the fde structure to parse the memory. Indeed, if there is illegal memory after the last fde in eh_frame, segment fault will also occur in libgcc.
Thanks for your reply and explanation.

The difference from the original is that the relocation has been repaired and a terminator has been added at the end.

@SihangZhu SihangZhu changed the title [libunwind] [libunwind] use __unw_add_dynamic_eh_frame_section to register dynamic eh_frame section cause segment fault Jan 9, 2024
hstk30-hw pushed a commit that referenced this issue Jan 16, 2024
Fix this issue
[#76957](#76957)
Libgcc provides __register_frame to register a dynamic .eh_frame
section, while __unw_add_dynamic_eh_frame_section can be used to do the
same in libunwind. However, the address after dynamic .eh_frame are
padding with 0 value, it will be identified as
legal CIE. And __unw_add_dynamic_eh_frame_section will continue to parse
subsequent addresses until illegal memory or other sections are
accessed.
This patch adds length formal parameter for dynamic registration.
@EugeneZelenko EugeneZelenko removed the question A question, not bug report. Check out https://llvm.org/docs/GettingInvolved.html instead! label Jan 26, 2024
justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this issue Jan 28, 2024
Fix this issue
[llvm#76957](llvm#76957)
Libgcc provides __register_frame to register a dynamic .eh_frame
section, while __unw_add_dynamic_eh_frame_section can be used to do the
same in libunwind. However, the address after dynamic .eh_frame are
padding with 0 value, it will be identified as
legal CIE. And __unw_add_dynamic_eh_frame_section will continue to parse
subsequent addresses until illegal memory or other sections are
accessed.
This patch adds length formal parameter for dynamic registration.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants