Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lldb][DWARFUnit] Implement PeekDIEName query #78486

Merged
merged 5 commits into from
Jan 20, 2024

Conversation

felipepiovezan
Copy link
Contributor

This allows us to query the AT_Name of a DIE without parsing the entire CU.

Part of the ongoing effort to support IDX_Parent in accelerator tables 1.

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 17, 2024

@llvm/pr-subscribers-lldb

Author: Felipe de Azevedo Piovezan (felipepiovezan)

Changes

This allows us to query the AT_Name of a DIE without parsing the entire CU.

Part of the ongoing effort to support IDX_Parent in accelerator tables 1.


Full diff: https://github.com/llvm/llvm-project/pull/78486.diff

5 Files Affected:

  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp (+7)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.h (+5)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp (+8)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.h (+5)
  • (modified) lldb/unittests/SymbolFile/DWARF/DWARFDIETest.cpp (+58)
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
index 553b6a4c551d20..775b7a2e73f512 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
@@ -191,3 +191,10 @@ DWARFDebugInfo::GetDIE(const DIERef &die_ref) {
     return cu->GetNonSkeletonUnit().GetDIE(die_ref.die_offset());
   return DWARFDIE(); // Not found
 }
+
+llvm::StringRef
+DWARFDebugInfo::PeekDIEName(const DIERef &die_ref) {
+  if(DWARFUnit *cu = GetUnit(die_ref))
+    return cu->GetNonSkeletonUnit().PeekDIEName(die_ref.die_offset());
+  return llvm::StringRef();
+}
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.h b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.h
index d5e48f312ea0e9..a8b5abc3beed2d 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.h
@@ -43,6 +43,11 @@ class DWARFDebugInfo {
   bool ContainsTypeUnits();
   DWARFDIE GetDIE(const DIERef &die_ref);
 
+  /// Returns the AT_Name of this DIE, if it exists, without parsing the entire
+  /// compile unit. An empty is string is returned upon error or if the
+  /// attribute is not present.
+  llvm::StringRef PeekDIEName(const DIERef &die_ref);
+
   enum {
     eDumpFlag_Verbose = (1 << 0),  // Verbose dumping
     eDumpFlag_ShowForm = (1 << 1), // Show the DW_form type
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
index 0e2f4d45543bb5..7db279ed37d04a 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
@@ -663,6 +663,14 @@ DWARFUnit::GetDIE(dw_offset_t die_offset) {
   return DWARFDIE(); // Not found
 }
 
+llvm::StringRef DWARFUnit::PeekDIEName(dw_offset_t die_offset) {
+  const DWARFDataExtractor &data = GetData();
+  DWARFDebugInfoEntry die;
+  if (!die.Extract(data, this, &die_offset))
+    return llvm::StringRef();
+  return die.GetName(this);
+}
+
 DWARFUnit &DWARFUnit::GetNonSkeletonUnit() {
   ExtractUnitDIEIfNeeded();
   if (m_dwo)
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.h b/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.h
index 3f528e913d8cfa..bc225a52e1d030 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.h
@@ -187,6 +187,11 @@ class DWARFUnit : public UserID {
 
   DWARFDIE GetDIE(dw_offset_t die_offset);
 
+  /// Returns the AT_Name of the DIE at `die_offset`, if it exists, without
+  /// parsing the entire compile unit. An empty is string is returned upon
+  /// error or if the attribute is not present.
+  llvm::StringRef PeekDIEName(dw_offset_t die_offset);
+
   DWARFUnit &GetNonSkeletonUnit();
 
   static uint8_t GetAddressByteSize(const DWARFUnit *cu);
diff --git a/lldb/unittests/SymbolFile/DWARF/DWARFDIETest.cpp b/lldb/unittests/SymbolFile/DWARF/DWARFDIETest.cpp
index 8497855b2f3db5..074873b65bc8dd 100644
--- a/lldb/unittests/SymbolFile/DWARF/DWARFDIETest.cpp
+++ b/lldb/unittests/SymbolFile/DWARF/DWARFDIETest.cpp
@@ -7,6 +7,7 @@
 //===----------------------------------------------------------------------===//
 
 #include "Plugins/SymbolFile/DWARF/DWARFDIE.h"
+#include "Plugins/SymbolFile/DWARF/DWARFDebugInfo.h"
 #include "TestingSupport/Symbol/YAMLModuleTester.h"
 #include "llvm/ADT/STLExtras.h"
 #include "gmock/gmock.h"
@@ -104,3 +105,60 @@ TEST(DWARFDIETest, ChildIteration) {
   DWARFDIE no_children_die(unit, die_child0);
   EXPECT_TRUE(no_children_die.children().empty());
 }
+
+TEST(DWARFDIETest, PeekName) {
+  const char *yamldata = R"(
+--- !ELF
+FileHeader:
+  Class:   ELFCLASS64
+  Data:    ELFDATA2LSB
+  Type:    ET_EXEC
+  Machine: EM_386
+DWARF:
+  debug_str:
+    - 'NameType1'
+    - 'NameType2'
+  debug_abbrev:
+    - Table:
+        - Code:            0x00000001
+          Tag:             DW_TAG_compile_unit
+          Children:        DW_CHILDREN_yes
+          Attributes:
+            - Attribute:       DW_AT_language
+              Form:            DW_FORM_data2
+        - Code:            0x00000002
+          Tag:             DW_TAG_base_type
+          Children:        DW_CHILDREN_no
+          Attributes:
+            - Attribute:       DW_AT_name
+              Form:            DW_FORM_strp
+  debug_info:
+    - Version:         4
+      AddrSize:        8
+      Entries:
+        - AbbrCode:        0x00000001
+          Values:
+            - Value:           0x000000000000000C
+        - AbbrCode:        0x00000002
+          Values:
+            - Value:           0x0000000000000000 # Name = NameType1
+        - AbbrCode:        0x00000002
+          Values:
+            - Value:           0x000000000000000a # Name = NameType2
+        - AbbrCode:        0x00000000
+)";
+
+  YAMLModuleTester t(yamldata);
+  auto *symbol_file =
+      llvm::cast<SymbolFileDWARF>(t.GetModule()->GetSymbolFile());
+  auto &debug_info = symbol_file->DebugInfo();
+
+  DIERef first_die(std::nullopt, DIERef::Section::DebugInfo, 11 /*FirstDIEOffset*/);
+  EXPECT_EQ(debug_info.PeekDIEName(first_die), "");
+
+  DIERef second_die(std::nullopt, DIERef::Section::DebugInfo, 14);
+  EXPECT_EQ(debug_info.PeekDIEName(second_die), "NameType1");
+
+  DIERef third_die(std::nullopt, DIERef::Section::DebugInfo, 19);
+  EXPECT_EQ(debug_info.PeekDIEName(third_die), "NameType2");
+}

This allows us to query the AT_Name of a DIE without parsing the entire CU.

Part of the ongoing effort to support IDX_Parent in accelerator tables [1].

[1]: https://discourse.llvm.org/t/rfc-improve-dwarf-5-debug-names-type-lookup-parsing-speed/74151/44
Copy link

github-actions bot commented Jan 17, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Comment on lines 667 to 672
const DWARFDataExtractor &data = GetData();
DWARFDebugInfoEntry die;
if (!die.Extract(data, this, &die_offset))
return llvm::StringRef();
return die.GetName(this);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One option here is to check if all of the DIEs have been already parsed in the DWARFUnit and if so, grab the existing DIE, and if it hasn't been parsed do this.

This approach doesn't take care of cases where the current DIE doesn't have a DW_AT_name, but it has a DW_AT_specification or DW_AT_abstract_origin attribute that points to another DIE that does have a name. In this case we will get no name back from this function because it doesn't recurse into the DW_AT_specification or DW_AT_abstract_origin die.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One option here is to check if all of the DIEs have been already parsed in the DWARFUnit and if so, grab the existing DIE, and if it hasn't been parsed do this.

Correct me if I am wrong, but even if we take an already parsed DIE, won't we be doing the exact same thing? It eventually calls DWARFDebugInfoEntry::GetAttributeValue, which is a more complicated way of doing the above (but it still creates the extractor, etc...)

This approach doesn't take care of cases where the current DIE doesn't have a DW_AT_name, but it has a DW_AT_specification or DW_AT_abstract_origin

Oh great point, we need this for sure! I will update the implementation and add tests for it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the current approach should take care of that already; we call DWARFDebugInfoEntry::GetName, whose implementation is:

const char *DWARFDebugInfoEntry::GetName(const DWARFUnit *cu) const {
  return GetAttributeValueAsString(cu, DW_AT_name, nullptr, true);
}

This is going to look through AT_specification and AT_abstract_origin.

I should still add a test to verify this though!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the attribute isn't in the current DIE (it is in a DW_AT_specification or DW_AT_abstract_origin), the DWARFDebugInfoEntry::GetAttributeValue(...) function will just run this code:

    if (GetAttributeValue(cu, DW_AT_specification, form_value)) {
      DWARFDIE die = form_value.Reference();
      if (die) {

or

    if (GetAttributeValue(cu, DW_AT_abstract_origin, form_value)) {
      DWARFDIE die = form_value.Reference();
      if (die) {

Both of these go back to the DWARFDIE class which will cause all of the DIEs to be parsed in the DWARFUnit anyway. So the peek function will work, it will just do a lot more work than you expected. Note that if the DW_AT_specification or DW_AT_abstract_origin attributes can represent a DIE from another compile unit of they use the DW_FORM_ref_addr, which can point across compile units. So it will take some more work to make the DWARFUnit::PeekDIEName work and be as efficient as needed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not saying that this works needs to be done, just pointing out that this might not be as efficient as you want it to be and might cause all DIEs to be parsed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh you are right, my bad, I had misinterpreted your original comment.

So, to summarize, the current implementation is technically correct but not correct in its original intent, i.e. it will be slow in the presence of AT_specification/AT_abstract_origin.

There are two improvements we need to do here:

  1. Avoid going through the DWARFDIE class when AT_specification/AT_abstract_origin are references inside the same unit
  2. Figure out if we can also be fast when ref_addr is used.

I will explore those today and come back here

@felipepiovezan
Copy link
Contributor Author

Added tests for AT_specification and AT_abstract origin.

@clayborg I didn't change the implementation because IIUC the "parsed DIE" is merely a structure with computed parent/child/sibling relationship, we don't build any data structures for the attributes themselves.

If we were to go through the parsed DIE (i.e. DWARFDIE instead of DIERef), the implementation will still boil down to the same thing, so we would be adding one more code path to PeekDIEName without any benefits. Does that make sense?

@felipepiovezan
Copy link
Contributor Author

@clayborg with your feedback, I think I was able to solve the issue and reuse existing code.
Please have a look at the latest commit!

if (!die.Extract(GetData(), this, &die_offset))
return llvm::StringRef();

// Does die contain an AT_Name?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Does die contain an AT_Name?
// Does die contain a DW_AT_name?

makes grepping more reliable

@felipepiovezan
Copy link
Contributor Author

Make comments more grep-able.

Copy link
Collaborator

@clayborg clayborg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now this is safe and all my issues are gone. Thanks for improving.

@felipepiovezan
Copy link
Contributor Author

Now this is safe and all my issues are gone. Thanks for improving.

Thanks for spotting the issue with the first implementation!

@felipepiovezan felipepiovezan merged commit 4684507 into llvm:main Jan 20, 2024
4 checks passed
@felipepiovezan felipepiovezan deleted the felipe/peekdie branch January 20, 2024 00:11
@clayborg
Copy link
Collaborator

I was thinking about DIE -> name stuff, and was wondering if we would benefit from a cache of DIE offset to DW_AT_name and DIE offset to DW_AT_linkage_name. Not sure how many times we might lookup DIE names multiple times.

felipepiovezan added a commit to felipepiovezan/llvm-project that referenced this pull request Feb 2, 2024
This allows us to query the AT_Name of a DIE without parsing the entire
CU.

Part of the ongoing effort to support IDX_Parent in accelerator tables
[1].

[1]:
https://discourse.llvm.org/t/rfc-improve-dwarf-5-debug-names-type-lookup-parsing-speed/74151/44

(cherry picked from commit 4684507)
kevinfrei pushed a commit to kevinfrei/llvm that referenced this pull request Sep 25, 2024
Summary:
Revert "[lldb][DWARFUnit] Implement PeekDIEName query (llvm#78486)"

This reverts commit 4684507.

Revert "[lldb][DWARFIndex] Use IDX_parent to implement GetFullyQualifiedType query (llvm#79932)"

This reverts commit 91f4a84.

Note(toyang): added back the DWARFDeclContext arrayref constructor
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants