Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of .debug_names lookups when DW_IDX_parent attributes are used #91808

Closed
wants to merge 2 commits into from

Conversation

clayborg
Copy link
Collaborator

@clayborg clayborg commented May 10, 2024

When a .debug_names table has entries that use the DW_IDX_parent attributes, we can end up with entries in the .debug_names table that are not full definitions. This is because a class that is foward declared, can contain types. For example:

0x0090cdbf: DW_TAG_compile_unit
              DW_AT_producer	("clang version 15.0.7")
              DW_AT_language	(DW_LANG_C_plus_plus_14)
              DW_AT_name	("UniqueInstance.cpp")

0x0090cdc7:   DW_TAG_namespace
                DW_AT_name	("std")

0x0090cdd5:     DW_TAG_class_type
                  DW_AT_name	("ios_base")
                  DW_AT_declaration	(true)

0x0090cdd7:       DW_TAG_class_type
                    DW_AT_name	("Init")
                    DW_AT_declaration	(true)

0x0090cdda:       DW_TAG_typedef
                    DW_AT_type	(0x0090ce4e "std::_Ios_Seekdir")
                    DW_AT_name	("seekdir")
                    DW_AT_decl_file	(0x11)
                    DW_AT_decl_line	(479)

0x0090cde4:       DW_TAG_typedef
                    DW_AT_type	(0x0090ce45 "std::_Ios_Openmode")
                    DW_AT_name	("openmode")
                    DW_AT_decl_file	(0x11)
                    DW_AT_decl_line	(447)

0x0090cdee:       DW_TAG_typedef
                    DW_AT_type	(0x0090ce3c "std::_Ios_Iostate")
                    DW_AT_name	("iostate")
                    DW_AT_decl_file	(0x11)
                    DW_AT_decl_line	(416)

0x0090cdf8:       NULL

"std::ios_base" is forward declared and it contains typedefs whose entries in the .debug_names table will point to the DIE at offset 0x0090cdd5. These entries cause our type lookups to try and parse a TON of forward declarations and waste time and resources.

This fix makes sure when/if we find an entry in the .debug_names table, we don't process it if it has a DW_AT_declaration(true) attribute.

@llvmbot
Copy link
Collaborator

llvmbot commented May 10, 2024

@llvm/pr-subscribers-lldb

Author: Greg Clayton (clayborg)

Changes

When a .debug_names table has entries that use the DW_IDX_parent attributes, we can end up with entries in the .debug_names table that are not full definitions. This is because a class that is foward declared, can contain types. For example:

0x0090cdbf: DW_TAG_compile_unit
              DW_AT_producer	("clang version 15.0.7")
              DW_AT_language	(DW_LANG_C_plus_plus_14)
              DW_AT_name	("UniqueInstance.cpp")

0x0090cdc7:   DW_TAG_namespace
                DW_AT_name	("std")

0x0090cdd5:     DW_TAG_class_type
                  DW_AT_name	("ios_base")
                  DW_AT_declaration	(true)

0x0090cdd7:       DW_TAG_class_type
                    DW_AT_name	("Init")
                    DW_AT_declaration	(true)

0x0090cdda:       DW_TAG_typedef
                    DW_AT_type	(0x0090ce4e "std::_Ios_Seekdir")
                    DW_AT_name	("seekdir")
                    DW_AT_decl_file	(0x11)
                    DW_AT_decl_line	(479)

0x0090cde4:       DW_TAG_typedef
                    DW_AT_type	(0x0090ce45 "std::_Ios_Openmode")
                    DW_AT_name	("openmode")
                    DW_AT_decl_file	(0x11)
                    DW_AT_decl_line	(447)

0x0090cdee:       DW_TAG_typedef
                    DW_AT_type	(0x0090ce3c "std::_Ios_Iostate")
                    DW_AT_name	("iostate")
                    DW_AT_decl_file	(0x11)
                    DW_AT_decl_line	(416)

0x0090cdf8:       NULL

"std::ios_base" is forward declared and it contains typedefs whose entries in the .debug_names table will point to the DIE at offset 0x0090cdd5. These entries cause our type lookups to try and parse a TON of forward declarations and waste time and resources.

This fix makes sure when/if we find an entry in the .debug_names table, we don't process it if it has a DW_AT_declaration(true) attribute.


Full diff: https://github.com/llvm/llvm-project/pull/91808.diff

1 Files Affected:

  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp (+4)
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
index 4da0d56fdcacb..f54e8071d97b6 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
@@ -83,6 +83,10 @@ bool DebugNamesDWARFIndex::ProcessEntry(
   DWARFDIE die = dwarf.GetDIE(*ref);
   if (!die)
     return true;
+  // Watch out for forward declarations that appear in the .debug_names tables
+  // only due to being there for a DW_IDX_parent.
+  // if (die.GetAttributeValueAsUnsigned(DW_AT_declaration, 0))
+  //   return true;
   return callback(die);
 }
 

…utes are used.

When a .debug_names table has entries that use the DW_IDX_parent attributes, we can end up with entries in the .debug_names table that are not full definitions. This is because a class that is foward declared, can contain types. For example:

0x0090cdbf: DW_TAG_compile_unit
              DW_AT_producer	("clang version 15.0.7")
              DW_AT_language	(DW_LANG_C_plus_plus_14)
              DW_AT_name	("UniqueInstance.cpp")

0x0090cdc7:   DW_TAG_namespace
                DW_AT_name	("std")

0x0090cdd5:     DW_TAG_class_type
                  DW_AT_name	("ios_base")
                  DW_AT_declaration	(true)

0x0090cdd7:       DW_TAG_class_type
                    DW_AT_name	("Init")
                    DW_AT_declaration	(true)

0x0090cdda:       DW_TAG_typedef
                    DW_AT_type	(0x0090ce4e "std::_Ios_Seekdir")
                    DW_AT_name	("seekdir")
                    DW_AT_decl_file	(0x11)
                    DW_AT_decl_line	(479)

0x0090cde4:       DW_TAG_typedef
                    DW_AT_type	(0x0090ce45 "std::_Ios_Openmode")
                    DW_AT_name	("openmode")
                    DW_AT_decl_file	(0x11)
                    DW_AT_decl_line	(447)

0x0090cdee:       DW_TAG_typedef
                    DW_AT_type	(0x0090ce3c "std::_Ios_Iostate")
                    DW_AT_name	("iostate")
                    DW_AT_decl_file	(0x11)
                    DW_AT_decl_line	(416)

0x0090cdf8:       NULL

"std::ios_base" is forward declared and it contains typedefs whose entries in the .debug_names table will point to the DIE at offset 0x0090cdd5. These entries cause our type lookups to try and parse a TON of forward declarations and waste time and resources.

This fix makes sure when/if we find an entry in the .debug_names table, we don't process it if it has a DW_AT_declaration(true) attribute.
@jeffreytan81
Copy link
Contributor

The change/explanation looks intuitive, but I remember having seen DIE entry with DW_AT_declaration (true) but is not a forward declaration (it is a definition and has other attributes) . Will we cause regression in that case?

@felipepiovezan
Copy link
Contributor

we should probably fix the underlying issue instead: #77696

@clayborg
Copy link
Collaborator Author

The change/explanation looks intuitive, but I remember having seen DIE entry with DW_AT_declaration (true) but is not a forward declaration (it is a definition and has other attributes) . Will we cause regression in that case?

No, it is ok for DW_AT_declaration(true) DIEs to be in the DWARF and contain children. They just won't declare the type itself. In the example above, you can see that "ios_base" contains typedefs.

@clayborg
Copy link
Collaborator Author

we should probably fix the underlying issue instead: #77696

This is one fix that is needed. Quoted from an e-mail chain:

I need to find my notes from those days, but I don't think we did. As Greg points out, the standard forbids forward declarations in debug_names; entries whose parents are forward declarations should be marked as having a parent that is not indexed.
The addition of IDX_parent_entries should not cause the addition of entries in the table, if it does then it is a bug in the implementation.

In fact, in the original PR we have this bit of code: https://github.com/llvm/llvm-project/pull/77457/files#diff-587587ad06ddb6f99f9ad8d8deffbc2ea59fde9d62d1b5ff58ace1f52cc75752R405

std::optional<uint64_t>
 DWARF5AccelTableData::getDefiningParentDieOffset(const DIE &Die) {
   if (auto *Parent = Die.getParent();
       Parent && !Parent->findAttribute(dwarf::Attribute::DW_AT_declaration))
     return Parent->getOffset();
   return {};
 }

@clayborg
Copy link
Collaborator Author

we should probably fix the underlying issue instead: #77696

We still have binaries that are floating around for now that contain this issue and this was causing crashes. So it would be nice to fix this in LLDB for now and back this out after we have a stable and trustable .debug_names section?

@felipepiovezan
Copy link
Contributor

we should probably fix the underlying issue instead: #77696

We still have binaries that are floating around for now that contain this issue and this was causing crashes. So it would be nice to fix this in LLDB for now and back this out after we have a stable and trustable .debug_names section?

That's ok by me, but I still don't understand how the problem being solved here is related to IDX_parent. As I said in the email you quoted, we don't add links to parents that are forward declarations. When the parent chain of a DIE is built, if we see an AT_declaration (just like what is being done in this PR), we pretend the parent is not indexed. In other others, the queries involving IDX_parent will never reach the ProcessEntry function.

Note that only DebugNamesDWARFIndex::GetFullyQualifiedType makes use of IDX_parent, but there are tons of other queries that call ProcessEntry. Maybe the slow path you are reaching is from one of those other queries?

@felipepiovezan
Copy link
Contributor

felipepiovezan commented May 11, 2024

Ok, minor correction: if there is no complete parent chain, it just defaults to the older implementation (which calls ProcessEntry). But my point still stands: I don't believe this is related to the presence of IDX_parent, if anything, the presence of complete IDX_parents prunes the number of calls to ProcessEntry (which was the point of the implementation in the first place)

void DebugNamesDWARFIndex::GetFullyQualifiedType(
    const DWARFDeclContext &context,
    llvm::function_ref<bool(DWARFDIE die)> callback) {
  if (context.GetSize() == 0)
    return;

  llvm::StringRef leaf_name = context[0].name;
  llvm::SmallVector<llvm::StringRef> parent_names;
  for (auto idx : llvm::seq<int>(1, context.GetSize()))
    parent_names.emplace_back(context[idx].name);

  // For each entry, grab its parent chain and check if we have a match.
  for (const DebugNames::Entry &entry :
       m_debug_names_up->equal_range(leaf_name)) {
    if (!isType(entry.tag()))
      continue;

    // Grab at most one extra parent, subsequent parents are not necessary to
    // test equality.
    std::optional<llvm::SmallVector<Entry, 4>> parent_chain =
        getParentChain(entry, parent_names.size() + 1);

    if (!parent_chain) {
      // Fallback: use the base class implementation.
      if (!ProcessEntry(entry, [&](DWARFDIE die) {
            return GetFullyQualifiedTypeImpl(context, die, callback);
          }))
        return;
      continue;
    }

    if (SameParentChain(parent_names, *parent_chain) &&
        !ProcessEntry(entry, callback))
      return;
  }
}

@dwblaikie
Copy link
Collaborator

What's an actual test case for this issue? The example given above doesn't look like it'd produce entries for the declaration of ios_base? Like here's something that looks equivalent, but is complete, and doesn't have a DW_IDX_parent on the nested typedef entry in the index: https://godbolt.org/z/efjbze3x1

it'd be good to have a reproducer for this to motivate the discussion...

@dwblaikie
Copy link
Collaborator

& FWIW, I think it is valid to include these declarations as entries, though not as named/index entries, per the spec:

It is possible that an indexed debugging information entry has a parent that is not indexed (for example, if its parent does not have a name attribute). In such a case, a parent attribute may point to a nameless index entry (that is, one that cannot be reached from any entry in the name table), or it may point to the nearest ancestor that does have an index entry

So I think you could have a parent that's a declaration, represented as a nameless index entry. That'd allow faster comparisons when examining the entry for the named child that had such a parent - because you could potentially check that named entry's fully qualified name directly from the named entry and the parents - without needing to parse all the DIEs in the CU to walk and find the start of the parents.

But I don't believe clang is producing such DWARF today.

@ayermolo
Copy link
Contributor

What is "nameless index entry"? I don't quite understand from the spec.

@clayborg
Copy link
Collaborator Author

What's an actual test case for this issue? The example given above doesn't look like it'd produce entries for the declaration of ios_base? Like here's something that looks equivalent, but is complete, and doesn't have a DW_IDX_parent on the nested typedef entry in the index: https://godbolt.org/z/efjbze3x1

it'd be good to have a reproducer for this to motivate the discussion...

I will see what I can do to repro with a minimal test case. But I do believe that this PR should go in as right now we have binaries with thousands of entries that are being found from the top down from a name entry and causes tons of bogus type resolving lookups to happen.

Copy link
Collaborator

@labath labath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think #77696 is justification enough to add this check, but I don't think we should be implicating DW_IDX_parent_entries until we know how those came about. Both me and David have tried to produce entries like you describe and failed (in fact I was not able to produce a DW_IDX_parent_entries referring to any kind of class type whatsoever (only to namespace entries)).

Even when this check goes in, I think it would be useful to get to the bottom of the DW_IDX_parent_entries issue, if nothing else, then because it may affect the decision of when/if to remove this workaround.

Comment on lines 86 to 87
// Watch out for forward declarations that appear in the .debug_names tables
// only due to being there for a DW_IDX_parent.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Watch out for forward declarations that appear in the .debug_names tables
// only due to being there for a DW_IDX_parent.
// Clang erroneously emits index entries for declaration DIEs in case when the
// definition is in a type unit (llvm.org/pr77696). Weed those out.

@labath
Copy link
Collaborator

labath commented May 21, 2024

What is "nameless index entry"? I don't quite understand from the spec.

It is possible that an indexed debugging information entry has a parent that is
not indexed (for example, if its parent does not have a name attribute). In such a
case, a parent attribute may point to a nameless index entry (that is, one that
cannot be reached from any entry in the name table), or it may point to the
nearest ancestor that does have an index entry.

My interpretation of this paragraph is that we're allowed to insert extra "floating" entries into the debug_names table. Such entries must not be reachable from the hash or name tables, but they can be referred to via DW_IDX_parent_entry attributes of other (reachable) entries.

Using Greg's example from #91808 (comment), I believe we could add a debug_names entry for std::ios_base (and refer to it from the entry for std::ios_base::seekdir), as long as the ios_base entry was not present in any of the name tables.

@felipepiovezan
Copy link
Contributor

I think #77696 is justification enough to add this check, but I don't think we should be implicating DW_IDX_parent_entries until we know how those came about.

This is pretty much my point, the workaround is fine, but I don't think the source-code comment in this PR is. The suggestion provided LGTM.

My guess is that the examples Greg is encountering are just hitting the slowness of DWARF parsing through other code paths (anything that causes too many calls to "ProcessEntries", and IDX_parent addressed one of those, @Michael137 was just asking me the other day if we can address some of the other queries as well).

I believe we could add a debug_names entry for std::ios_base (and refer to it from the entry for std::ios_base::seekdir), as long as the ios_base entry was not present in any of the name tables.

Yup, this is very doable!

I was not able to produce a DW_IDX_parent_entries referring to any kind of class type

When I compile the following with clang++ test.cpp -c -gdwarf-5 -O0 -o - | dwarfdump --debug-names -, I get the right parent chain. Is this not true when you try it?

namespace A { namespace B { struct State { class InnerState{}; }; } }
A::B::State::InnerState get_state() { return A::B::State::InnerState(); }
      String: 0x000000c4 "InnerState"
      Entry @ 0xbe {
        Abbrev: 0x138
        Tag: DW_TAG_class_type
        DW_IDX_die_offset: 0x0000003f
        DW_IDX_parent: Entry @ 0xde
      }
      String: 0x000000be "State"
      Entry @ 0xde {
        Abbrev: 0x9b8
        Tag: DW_TAG_structure_type
        DW_IDX_die_offset: 0x00000039
        DW_IDX_parent: Entry @ 0xe9
      }
      String: 0x000000bc "B"
      Entry @ 0xe9 {
        Abbrev: 0x1cb8
        Tag: DW_TAG_namespace
        DW_IDX_die_offset: 0x00000037
        DW_IDX_parent: Entry @ 0xd7
      }
      String: 0x000000ba "A"
      Entry @ 0xd7 {
        Abbrev: 0x1c98
        Tag: DW_TAG_namespace
        DW_IDX_die_offset: 0x00000035
        DW_IDX_parent: <parent not indexed>
      }

@labath
Copy link
Collaborator

labath commented May 21, 2024

I was not able to produce a DW_IDX_parent_entries referring to any kind of class type

When I compile the following with clang++ test.cpp -c -gdwarf-5 -O0 -o - | dwarfdump --debug-names -, I get the right parent chain. Is this not true when you try it?

I do get it. It was my bad, I had a leftover -fdebug-types-section flag in my test cmd from when I was experimenting with the debug types section. I guess that's another thing to chalk up to #77696

@ayermolo
Copy link
Contributor

ayermolo commented May 23, 2024

Using example above, with a fix by @dwblaikie

I see:

 Hash: 0xE0CDC6A2
      String: 0x00000018 "InnerState"
      Entry @ 0x10b {
        Abbrev: 0x3
        Tag: DW_TAG_class_type
        DW_IDX_type_unit: 0x01
        DW_IDX_die_offset: 0x00000030
      }

Since clang no longer emits entry for:

0x00000057:       DW_TAG_structure_type
                    DW_AT_declaration (true)
                    DW_AT_signature (0xe742f49eeadc2244)

Is this the right behavior?
Current BOLT behavior is to skip that DIE and reference it's parent:

Hash: 0xE0CDC6A2
      String: 0x00000018 "InnerState"
      Entry @ 0x109 {
        Abbrev: 0x3
        Tag: DW_TAG_class_type
        DW_IDX_type_unit: 0x01
        DW_IDX_die_offset: 0x00000030
        DW_IDX_parent: Entry @ 0x147
      }
Entry @ 0x147 {
        Abbrev: 0x7
        Tag: DW_TAG_namespace
        DW_IDX_type_unit: 0x01
        DW_IDX_die_offset: 0x00000025
        DW_IDX_parent: Entry @ 0x126
      }
0x00000055:     DW_TAG_namespace
                  DW_AT_name  ("B")

0x00000057:       DW_TAG_structure_type
                    DW_AT_declaration (true)
                    DW_AT_signature (0xe742f49eeadc2244)

0x00000060:         DW_TAG_class_type
                      DW_AT_calling_convention  (DW_CC_pass_by_value)
                      DW_AT_name  ("InnerState")
                      DW_AT_byte_size (0x01)
                      DW_AT_decl_file ("/main.cpp")
                      DW_AT_decl_line (1)

It doesn't emit entry for this because there is no name attribute

0x00000057:       DW_TAG_structure_type
                    DW_AT_declaration (true)
                    DW_AT_signature (0xe742f49eeadc2244)

@labath
Copy link
Collaborator

labath commented May 23, 2024

Using example above, with a fix by @dwblaikie

I see:

 Hash: 0xE0CDC6A2
      String: 0x00000018 "InnerState"
      Entry @ 0x10b {
        Abbrev: 0x3
        Tag: DW_TAG_class_type
        DW_IDX_type_unit: 0x01
        DW_IDX_die_offset: 0x00000030
      }

Since clang no longer emits entry for:

0x00000057:       DW_TAG_structure_type
                    DW_AT_declaration (true)
                    DW_AT_signature (0xe742f49eeadc2244)

Is this the right behavior?

I would say "yes", because the spec is pretty explicit about excluding DW_AT_declaration entries.

I can see a case being made that DW_AT_signature should be treated the same way as DW_AT_specification and DW_AT_abstract_origin (i.e., transparently), but that's definitely not what the spec says right now.

Current BOLT behavior is to skip that DIE and reference it's parent:

Hash: 0xE0CDC6A2
      String: 0x00000018 "InnerState"
      Entry @ 0x109 {
        Abbrev: 0x3
        Tag: DW_TAG_class_type
        DW_IDX_type_unit: 0x01
        DW_IDX_die_offset: 0x00000030
        DW_IDX_parent: Entry @ 0x147
      }
Entry @ 0x147 {
        Abbrev: 0x7
        Tag: DW_TAG_namespace
        DW_IDX_type_unit: 0x01
        DW_IDX_die_offset: 0x00000025
        DW_IDX_parent: Entry @ 0x126
      }
0x00000055:     DW_TAG_namespace
                  DW_AT_name  ("B")

0x00000057:       DW_TAG_structure_type
                    DW_AT_declaration (true)
                    DW_AT_signature (0xe742f49eeadc2244)

0x00000060:         DW_TAG_class_type
                      DW_AT_calling_convention  (DW_CC_pass_by_value)
                      DW_AT_name  ("InnerState")
                      DW_AT_byte_size (0x01)
                      DW_AT_decl_file ("/main.cpp")
                      DW_AT_decl_line (1)

It doesn't emit entry for this because there is no name attribute

0x00000057:       DW_TAG_structure_type
                    DW_AT_declaration (true)
                    DW_AT_signature (0xe742f49eeadc2244)

This gets a bit fuzzy, I think. The spec appears to allow this behavior (In such a
case, a parent attribute may point to a nameless index entry (...), or it may point to the
nearest ancestor that does have an index entry.
), but I don't think this is particularly useful. I think it would be better to have the parent point to the definition in the type unit (streching the meaning of "parent" in the spec), or use one of those nameless entries to point to the physical (declaration) parent)

(IANAL YMMV)

@felipepiovezan
Copy link
Contributor

felipepiovezan commented May 23, 2024

Since clang no longer emits entry for:

But what does the debug_info section look like? In particular, what is the parent of the class DIE?
If the parent of InnerState is not some kind of entry for State (either a declaration or a definition), IMO Clang is generating incorrect information for the type. What caused Clang to stop emitting these entries?

This gets a bit fuzzy, I think. The spec appears to allow this behavior (In such a case, a parent attribute may point to a nameless index entry (...), or it may point to the nearest ancestor that does have an index entry.), but I don't think this is particularly useful. I think it would be better to have the parent point to the definition in the type unit (streching the meaning of "parent" in the spec), or use one of those nameless entries to point to the physical (declaration) parent)

(IANAL YMMV)

We can discuss this, but I think the point is going to be moot given what I mentioned above. The debug_names section is reflecting the state of debug_info. If the debug_info is saying that State is not a parent of InnerState, the debug_names section is correct in the literal sense, but will produce incorrect results for the query: "find A::B::State::InnerState".

In the case where the declaration is there, debug_names will have correct info for InnerState: it will just say the parent is not indexed and things work out just fine.

have the parent point to the definition in the type unit (streching the meaning of "parent" in the spec),

Why do you say this is stretching the meaning of parent? This looks fine to me, but it seems impossible to emit such debug_names section if the compiler is no longer emitting the declaration with the type signature. (we'd need to check if the emitter code has some way of finding the definition, but also how could it possibly know there is a type between any two Parent-Child nodes? It really feels like we can't just elide the definition).

@ayermolo
Copy link
Contributor

Using example above, with a fix by @dwblaikie
I see:

 Hash: 0xE0CDC6A2
      String: 0x00000018 "InnerState"
      Entry @ 0x10b {
        Abbrev: 0x3
        Tag: DW_TAG_class_type
        DW_IDX_type_unit: 0x01
        DW_IDX_die_offset: 0x00000030
      }

Since clang no longer emits entry for:

0x00000057:       DW_TAG_structure_type
                    DW_AT_declaration (true)
                    DW_AT_signature (0xe742f49eeadc2244)

Is this the right behavior?

I would say "yes", because the spec is pretty explicit about excluding DW_AT_declaration entries.

I can see a case being made that DW_AT_signature should be treated the same way as DW_AT_specification and DW_AT_abstract_origin (i.e., transparently), but that's definitely not what the spec says right now.

Current BOLT behavior is to skip that DIE and reference it's parent:

Hash: 0xE0CDC6A2
      String: 0x00000018 "InnerState"
      Entry @ 0x109 {
        Abbrev: 0x3
        Tag: DW_TAG_class_type
        DW_IDX_type_unit: 0x01
        DW_IDX_die_offset: 0x00000030
        DW_IDX_parent: Entry @ 0x147
      }
Entry @ 0x147 {
        Abbrev: 0x7
        Tag: DW_TAG_namespace
        DW_IDX_type_unit: 0x01
        DW_IDX_die_offset: 0x00000025
        DW_IDX_parent: Entry @ 0x126
      }
0x00000055:     DW_TAG_namespace
                  DW_AT_name  ("B")

0x00000057:       DW_TAG_structure_type
                    DW_AT_declaration (true)
                    DW_AT_signature (0xe742f49eeadc2244)

0x00000060:         DW_TAG_class_type
                      DW_AT_calling_convention  (DW_CC_pass_by_value)
                      DW_AT_name  ("InnerState")
                      DW_AT_byte_size (0x01)
                      DW_AT_decl_file ("/main.cpp")
                      DW_AT_decl_line (1)

It doesn't emit entry for this because there is no name attribute

0x00000057:       DW_TAG_structure_type
                    DW_AT_declaration (true)
                    DW_AT_signature (0xe742f49eeadc2244)

This gets a bit fuzzy, I think. The spec appears to allow this behavior (In such a case, a parent attribute may point to a nameless index entry (...), or it may point to the nearest ancestor that does have an index entry.), but I don't think this is particularly useful. I think it would be better to have the parent point to the definition in the type unit (streching the meaning of "parent" in the spec), or use one of those nameless entries to point to the physical (declaration) parent)

(IANAL YMMV)

For "parent point to the definition in the type unit". That definition will be in a different type unit, correct? Is that allowed? For one entry DW_IDX_type_unit will have one index, and it's parent will have another index. There is also DWP scenario. Where we don't know which TU will be picked. So chain might be pointing to "discarded" TU from a different CU.

So in this particular case what would the nameless entry look like?

@ayermolo
Copy link
Contributor

ayermolo commented May 23, 2024

Since clang no longer emits entry for:

But what does the debug_info section look like? In particular, what is the parent of the class DIE? If the parent of InnerState is not some kind of entry for State (either a declaration or a definition), IMO Clang is generating incorrect information for the type. What caused Clang to stop emitting these entries?

This gets a bit fuzzy, I think. The spec appears to allow this behavior (In such a case, a parent attribute may point to a nameless index entry (...), or it may point to the nearest ancestor that does have an index entry.), but I don't think this is particularly useful. I think it would be better to have the parent point to the definition in the type unit (streching the meaning of "parent" in the spec), or use one of those nameless entries to point to the physical (declaration) parent)
(IANAL YMMV)

We can discuss this, but I think the point is going to be moot given what I mentioned above. The debug_names section is reflecting the state of debug_info. If the debug_info is saying that State is not a parent of InnerState, the debug_names section is correct in the literal sense, but will produce incorrect results for the query: "find A::B::State::InnerState".

In the case where the declaration is there, debug_names will have correct info for InnerState: it will just say the parent is not indexed and things work out just fine.

have the parent point to the definition in the type unit (streching the meaning of "parent" in the spec),

Why do you say this is stretching the meaning of parent? This looks fine to me, but it seems impossible to emit such debug_names section if the compiler is no longer emitting the declaration with the type signature. (we'd need to check if the emitter code has some way of finding the definition, but also how could it possibly know there is a type between any two Parent-Child nodes? It really feels like we can't just elide the definition).

For clang I used example here with TOT clang. Which included this change: bd5c636

namespace A { namespace B { struct State { class InnerState{}; }; } }
A::B::State::InnerState get_state() { return A::B::State::InnerState(); }

int main() {
  return 0;
}

clang++ main.cpp -fuse-ld=lld -g2 -O0 -fdebug-types-section -gpubnames -o main.exe

https://godbolt.org/z/3r9bcYv3Y

@labath
Copy link
Collaborator

labath commented May 29, 2024

Discussed with Pavel, I applied this change to #92328 so we can ensure the DIEs from the index is always definition DIEs and avoid duplicate/expensive checks later.

To elaborate, I suggested Zequan does this, because I think there's consensus that this is a good way to filter out (incorrect) declaration dies from the index, and it's a better (faster) fix than what Zequan had in his PR. It's still worthwhile to figure out where those entries are coming from. We know of one case with type units and that has now been fixed (thanks to David), if there are more, we should try to understand where they are coming from).

vg0204 pushed a commit to vg0204/llvm-project that referenced this pull request May 29, 2024
…ing when parsing declaration DIEs. (llvm#92328)

This reapplies
llvm@9a7262c
(llvm#90663) and added llvm#91808 as a
fix.

It was causing tests on macos to fail because
`SymbolFileDWARF::GetForwardDeclCompilerTypeToDIE` returned the map
owned by this symol file. When there were two symbol files, two
different maps were created for caching from compiler type to DIE even
if they are for the same module. The solution is to do the same as
`SymbolFileDWARF::GetUniqueDWARFASTTypeMap`: inquery
SymbolFileDWARFDebugMap first to get the shared underlying SymbolFile so
the map is shared among multiple SymbolFileDWARF.
@felipepiovezan
Copy link
Contributor

Discussed with Pavel, I applied this change to #92328 so we can ensure the DIEs from the index is always definition DIEs and avoid duplicate/expensive checks later.

To elaborate, I suggested Zequan does this, because I think there's consensus that this is a good way to filter out (incorrect) declaration dies from the index, and it's a better (faster) fix than what Zequan had in his PR. It's still worthwhile to figure out where those entries are coming from. We know of one case with type units and that has now been fixed (thanks to David), if there are more, we should try to understand where they are coming from).

Sounds good!

@felipepiovezan
Copy link
Contributor

thanks for well-written summary @dwblaikie !

@ayermolo
Copy link
Contributor

I have a follow up question.

It sounds like you've answered it yourself. :)

With this kind of debug info, lldb would think this refers to a global entity and return it for queries like ::InnerState. Without that entry, lldb will (correctly) look up the context in the DIE tree, and compare it that way.

Yes, although it wasn't clear to me how lldb would handle it. Thanks for elaborating. :)

@ayermolo
Copy link
Contributor

ayermolo commented May 30, 2024

@felipepiovezan
I have another question.
For the same example.
I see:

Name 4 {
      Hash: 0x2F94396D
      String: 0x00000049 "_Z9get_statev"
      Entry @ 0x112 {
        Abbrev: 0x2
        Tag: DW_TAG_subprogram
        DW_IDX_die_offset: 0x00000023
        DW_IDX_parent: <parent not indexed>
      }
...      
Name 8 {
      Hash: 0x2B607
      String: 0x00000041 "B"
      Entry @ 0x13b {
        Abbrev: 0x7
        Tag: DW_TAG_namespace
        DW_IDX_type_unit: 0x00
        DW_IDX_die_offset: 0x00000025
        DW_IDX_parent: Entry @ 0x112
      }

This seems like a bug no?
Looks like it's confusing

0x00000023:   DW_TAG_namespace
                DW_AT_name  ("A")

In TU 0
With Subprogram at the same (relative offset) in the CU

0x0000006a: Compile Unit: length = 0x0000005c, format = DWARF32, version = 0x0005, unit_type = DW_UT_compile, abbr_offset = 0x0000, addr_size = 0x08 (next unit at 0x000000ca)
...
0x0000008d:   DW_TAG_subprogram

I think it should be pointing to:

String: 0x00000023 "A"
      Entry @ 0x11e {
        Abbrev: 0x4
        Tag: DW_TAG_namespace
        DW_IDX_type_unit: 0x00
        DW_IDX_die_offset: 0x00000023
        DW_IDX_parent: <parent not indexed>
      }

@clayborg
Copy link
Collaborator Author

So I am trying to lookup std::ios_base in my example for a real definitions, but I end up with thousands of forward declarations due to the DWARF snippet in my original summary string.

So when I lookup std::ios_base, I found thousands of DW_AT_declaration entries for std::ios_base that I don't want to propagate beyond the DWARF index class handing out valid matches, so thus this patch. There are entries in the name tables for std::ios_base that are declarations only and it would be great if these are not here, but someone is producing them somehow, I am not sure if this is clang or bolt.

The way I saw this was to enable DWARF logging in LLDB and if I did that I saw thousands of ios_base lookups that failed to produce real types and this was why I created this patch.

Thanks for the detailed explanations above everyone.

I would vote to get this in to reduce churn when doing type lookups, even if we do fix the compiler, linker, and or other tools to not produce them in the future, we still have such binaries floating around from past compilers, linkers and other tools.

@Michael137
Copy link
Member

@clayborg this change went in as part of #92328 so i think we can close this now.

Though there's a test failure linked to that PR currently, meaning it might have to get reverted. We should keep an eye out for that

@ayermolo
Copy link
Contributor

Still would be nice to have a small repro to make sure clang, and BOLT, now does the right thing.

@dwblaikie
Copy link
Collaborator

One easy question would be: do you/your users use -fdebug-types-section? If so, that'd probably explain what you were seeing & you could add some test coverage for that wherever you like (in lldb, presumably, maybe in bolt too). But if you/they don't, then it's unclear where this bad index came from - and it'd be good to know more about the DWARF so we could figure out where else there might be problems like this...

@ayermolo
Copy link
Contributor

One easy question would be: do you/your users use -fdebug-types-section? If so, that'd probably explain what you were seeing & you could add some test coverage for that wherever you like (in lldb, presumably, maybe in bolt too). But if you/they don't, then it's unclear where this bad index came from - and it'd be good to know more about the DWARF so we could figure out where else there might be problems like this...

Yes -fdebug-types-section is the default for us.

ayermolo added a commit to ayermolo/llvm-project that referenced this pull request May 30, 2024
declaration.

Summary:
Previously when an entry was skipped in parent chain a child will point to the
next valid entry in the chain. After discussion in
llvm#91808 this is not very useful.
Changed implemenation so that all the children of the entry that is skipped
won't have DW_IDX_parent.
ayermolo added a commit to ayermolo/llvm-project that referenced this pull request May 30, 2024
declaration.

Summary:
Previously when an entry was skipped in parent chain a child will point to the
next valid entry in the chain. After discussion in
llvm#91808 this is not very useful.
Changed implemenation so that all the children of the entry that is skipped
won't have DW_IDX_parent.
@felipepiovezan
Copy link
Contributor

felipepiovezan commented May 30, 2024

@felipepiovezan I have another question. For the same example. I see:

You are right. The fact that they have the same relative offset tells me that some part of the code is failing to account for TUs. I just checked the printing code in the hope that it was a mistake while dumping, but it doesn't seem to be...

@ayermolo
Copy link
Contributor

@felipepiovezan I have another question. For the same example. I see:

You are right. The fact that they have the same relative offset tells me that some part of the code is failing to account for TUs. I just checked the printing code in the hope that it was a mistake while dumping, but it doesn't seem to be...

yeah not print tools, at assembly it points to wrong thing. :(

Do you think you would be able to fix? :D

@dwblaikie
Copy link
Collaborator

I did some looking around on the CU/TU collisison and found/filed this: #93886 - should be more explicit/clearer

@clayborg
Copy link
Collaborator Author

clayborg commented Jun 3, 2024

This fix was committed as part of:

commit 51dd4ea
Author: Zequan Wu zequanwu@google.com
Date: Tue May 28 11:49:07 2024 -0400

 Reapply [lldb][DWARF] Delay struct/class/union definition DIE searching when parsing declaration DIEs. (#92328)

This reapplies
https://github.com/llvm/llvm-project/commit/9a7262c2601874e5aa64c5db19746770212d4b44
(#90663) and added https://github.com/llvm/llvm-project/pull/91808 as a
fix.

It was causing tests on macos to fail because
`SymbolFileDWARF::GetForwardDeclCompilerTypeToDIE` returned the map
owned by this symol file. When there were two symbol files, two
different maps were created for caching from compiler type to DIE even
if they are for the same module. The solution is to do the same as
`SymbolFileDWARF::GetUniqueDWARFASTTypeMap`: inquery
SymbolFileDWARFDebugMap first to get the shared underlying SymbolFile so
the map is shared among multiple SymbolFileDWARF.

@clayborg clayborg closed this Jun 3, 2024
ayermolo added a commit that referenced this pull request Jun 5, 2024
…claration. (#93865)

Previously when an entry was skipped in parent chain a child will point
to the next valid entry in the chain. After discussion in
#91808 this is not very useful.
Changed implemenation so that all the children of the entry that is
skipped won't have DW_IDX_parent.
@labath
Copy link
Collaborator

labath commented Jun 6, 2024

PSA: I've reverted Zequan's patch due to other bugs, so this change is now uncommitted again. Since this could go in as a separate change, I'm going to recreate a patch for it tomorrow (running out of time today), including the fix from #94400, if someone doesn't beat me to it. I'm sorry about all the back and forth.

labath added a commit to labath/llvm-project that referenced this pull request Jun 7, 2024
This makes sure we try to process declaration DIEs that are erroneously
present in the index. Until bd5c636, clang was emitting index
entries for declaration DIEs with DW_AT_signature attributes. This makes
sure to avoid returning those DIEs as the definitions of a type, but
also makes sure to pass through DIEs referring to static constexpr
member variables, which is a (probably nonconforming) extension used by
dsymutil.

It adds test cases for both of the scenarios. It is essentially a
recommit of llvm#91808.
labath added a commit that referenced this pull request Jun 11, 2024
This makes sure we try to process declaration DIEs that are erroneously
present in the index. Until bd5c636, clang was emitting index
entries for declaration DIEs with DW_AT_signature attributes. This makes
sure to avoid returning those DIEs as the definitions of a type, but
also makes sure to pass through DIEs referring to static constexpr
member variables, which is a (probably nonconforming) extension used by
dsymutil.

It adds test cases for both of the scenarios. It is essentially a
recommit of #91808.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants