-
Notifications
You must be signed in to change notification settings - Fork 11.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[lldb][DWARFUnit] Implement PeekDIEName query #78486
Conversation
@llvm/pr-subscribers-lldb Author: Felipe de Azevedo Piovezan (felipepiovezan) ChangesThis allows us to query the AT_Name of a DIE without parsing the entire CU. Part of the ongoing effort to support IDX_Parent in accelerator tables 1. Full diff: https://github.com/llvm/llvm-project/pull/78486.diff 5 Files Affected:
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
index 553b6a4c551d20..775b7a2e73f512 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
@@ -191,3 +191,10 @@ DWARFDebugInfo::GetDIE(const DIERef &die_ref) {
return cu->GetNonSkeletonUnit().GetDIE(die_ref.die_offset());
return DWARFDIE(); // Not found
}
+
+llvm::StringRef
+DWARFDebugInfo::PeekDIEName(const DIERef &die_ref) {
+ if(DWARFUnit *cu = GetUnit(die_ref))
+ return cu->GetNonSkeletonUnit().PeekDIEName(die_ref.die_offset());
+ return llvm::StringRef();
+}
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.h b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.h
index d5e48f312ea0e9..a8b5abc3beed2d 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.h
@@ -43,6 +43,11 @@ class DWARFDebugInfo {
bool ContainsTypeUnits();
DWARFDIE GetDIE(const DIERef &die_ref);
+ /// Returns the AT_Name of this DIE, if it exists, without parsing the entire
+ /// compile unit. An empty is string is returned upon error or if the
+ /// attribute is not present.
+ llvm::StringRef PeekDIEName(const DIERef &die_ref);
+
enum {
eDumpFlag_Verbose = (1 << 0), // Verbose dumping
eDumpFlag_ShowForm = (1 << 1), // Show the DW_form type
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
index 0e2f4d45543bb5..7db279ed37d04a 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
@@ -663,6 +663,14 @@ DWARFUnit::GetDIE(dw_offset_t die_offset) {
return DWARFDIE(); // Not found
}
+llvm::StringRef DWARFUnit::PeekDIEName(dw_offset_t die_offset) {
+ const DWARFDataExtractor &data = GetData();
+ DWARFDebugInfoEntry die;
+ if (!die.Extract(data, this, &die_offset))
+ return llvm::StringRef();
+ return die.GetName(this);
+}
+
DWARFUnit &DWARFUnit::GetNonSkeletonUnit() {
ExtractUnitDIEIfNeeded();
if (m_dwo)
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.h b/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.h
index 3f528e913d8cfa..bc225a52e1d030 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.h
@@ -187,6 +187,11 @@ class DWARFUnit : public UserID {
DWARFDIE GetDIE(dw_offset_t die_offset);
+ /// Returns the AT_Name of the DIE at `die_offset`, if it exists, without
+ /// parsing the entire compile unit. An empty is string is returned upon
+ /// error or if the attribute is not present.
+ llvm::StringRef PeekDIEName(dw_offset_t die_offset);
+
DWARFUnit &GetNonSkeletonUnit();
static uint8_t GetAddressByteSize(const DWARFUnit *cu);
diff --git a/lldb/unittests/SymbolFile/DWARF/DWARFDIETest.cpp b/lldb/unittests/SymbolFile/DWARF/DWARFDIETest.cpp
index 8497855b2f3db5..074873b65bc8dd 100644
--- a/lldb/unittests/SymbolFile/DWARF/DWARFDIETest.cpp
+++ b/lldb/unittests/SymbolFile/DWARF/DWARFDIETest.cpp
@@ -7,6 +7,7 @@
//===----------------------------------------------------------------------===//
#include "Plugins/SymbolFile/DWARF/DWARFDIE.h"
+#include "Plugins/SymbolFile/DWARF/DWARFDebugInfo.h"
#include "TestingSupport/Symbol/YAMLModuleTester.h"
#include "llvm/ADT/STLExtras.h"
#include "gmock/gmock.h"
@@ -104,3 +105,60 @@ TEST(DWARFDIETest, ChildIteration) {
DWARFDIE no_children_die(unit, die_child0);
EXPECT_TRUE(no_children_die.children().empty());
}
+
+TEST(DWARFDIETest, PeekName) {
+ const char *yamldata = R"(
+--- !ELF
+FileHeader:
+ Class: ELFCLASS64
+ Data: ELFDATA2LSB
+ Type: ET_EXEC
+ Machine: EM_386
+DWARF:
+ debug_str:
+ - 'NameType1'
+ - 'NameType2'
+ debug_abbrev:
+ - Table:
+ - Code: 0x00000001
+ Tag: DW_TAG_compile_unit
+ Children: DW_CHILDREN_yes
+ Attributes:
+ - Attribute: DW_AT_language
+ Form: DW_FORM_data2
+ - Code: 0x00000002
+ Tag: DW_TAG_base_type
+ Children: DW_CHILDREN_no
+ Attributes:
+ - Attribute: DW_AT_name
+ Form: DW_FORM_strp
+ debug_info:
+ - Version: 4
+ AddrSize: 8
+ Entries:
+ - AbbrCode: 0x00000001
+ Values:
+ - Value: 0x000000000000000C
+ - AbbrCode: 0x00000002
+ Values:
+ - Value: 0x0000000000000000 # Name = NameType1
+ - AbbrCode: 0x00000002
+ Values:
+ - Value: 0x000000000000000a # Name = NameType2
+ - AbbrCode: 0x00000000
+)";
+
+ YAMLModuleTester t(yamldata);
+ auto *symbol_file =
+ llvm::cast<SymbolFileDWARF>(t.GetModule()->GetSymbolFile());
+ auto &debug_info = symbol_file->DebugInfo();
+
+ DIERef first_die(std::nullopt, DIERef::Section::DebugInfo, 11 /*FirstDIEOffset*/);
+ EXPECT_EQ(debug_info.PeekDIEName(first_die), "");
+
+ DIERef second_die(std::nullopt, DIERef::Section::DebugInfo, 14);
+ EXPECT_EQ(debug_info.PeekDIEName(second_die), "NameType1");
+
+ DIERef third_die(std::nullopt, DIERef::Section::DebugInfo, 19);
+ EXPECT_EQ(debug_info.PeekDIEName(third_die), "NameType2");
+}
|
This allows us to query the AT_Name of a DIE without parsing the entire CU. Part of the ongoing effort to support IDX_Parent in accelerator tables [1]. [1]: https://discourse.llvm.org/t/rfc-improve-dwarf-5-debug-names-type-lookup-parsing-speed/74151/44
0bfe580
to
b0a3348
Compare
✅ With the latest revision this PR passed the C/C++ code formatter. |
const DWARFDataExtractor &data = GetData(); | ||
DWARFDebugInfoEntry die; | ||
if (!die.Extract(data, this, &die_offset)) | ||
return llvm::StringRef(); | ||
return die.GetName(this); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One option here is to check if all of the DIEs have been already parsed in the DWARFUnit and if so, grab the existing DIE, and if it hasn't been parsed do this.
This approach doesn't take care of cases where the current DIE doesn't have a DW_AT_name
, but it has a DW_AT_specification
or DW_AT_abstract_origin
attribute that points to another DIE that does have a name. In this case we will get no name back from this function because it doesn't recurse into the DW_AT_specification
or DW_AT_abstract_origin
die.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One option here is to check if all of the DIEs have been already parsed in the DWARFUnit and if so, grab the existing DIE, and if it hasn't been parsed do this.
Correct me if I am wrong, but even if we take an already parsed DIE, won't we be doing the exact same thing? It eventually calls DWARFDebugInfoEntry::GetAttributeValue
, which is a more complicated way of doing the above (but it still creates the extractor, etc...)
This approach doesn't take care of cases where the current DIE doesn't have a DW_AT_name, but it has a DW_AT_specification or DW_AT_abstract_origin
Oh great point, we need this for sure! I will update the implementation and add tests for it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, the current approach should take care of that already; we call DWARFDebugInfoEntry::GetName
, whose implementation is:
const char *DWARFDebugInfoEntry::GetName(const DWARFUnit *cu) const {
return GetAttributeValueAsString(cu, DW_AT_name, nullptr, true);
}
This is going to look through AT_specification and AT_abstract_origin.
I should still add a test to verify this though!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the attribute isn't in the current DIE (it is in a DW_AT_specification
or DW_AT_abstract_origin
), the DWARFDebugInfoEntry::GetAttributeValue(...)
function will just run this code:
if (GetAttributeValue(cu, DW_AT_specification, form_value)) {
DWARFDIE die = form_value.Reference();
if (die) {
or
if (GetAttributeValue(cu, DW_AT_abstract_origin, form_value)) {
DWARFDIE die = form_value.Reference();
if (die) {
Both of these go back to the DWARFDIE class which will cause all of the DIEs to be parsed in the DWARFUnit anyway. So the peek function will work, it will just do a lot more work than you expected. Note that if the DW_AT_specification
or DW_AT_abstract_origin
attributes can represent a DIE from another compile unit of they use the DW_FORM_ref_addr
, which can point across compile units. So it will take some more work to make the DWARFUnit::PeekDIEName
work and be as efficient as needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not saying that this works needs to be done, just pointing out that this might not be as efficient as you want it to be and might cause all DIEs to be parsed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohh you are right, my bad, I had misinterpreted your original comment.
So, to summarize, the current implementation is technically correct but not correct in its original intent, i.e. it will be slow in the presence of AT_specification/AT_abstract_origin.
There are two improvements we need to do here:
- Avoid going through the DWARFDIE class when AT_specification/AT_abstract_origin are references inside the same unit
- Figure out if we can also be fast when ref_addr is used.
I will explore those today and come back here
Added tests for AT_specification and AT_abstract origin. @clayborg I didn't change the implementation because IIUC the "parsed DIE" is merely a structure with computed parent/child/sibling relationship, we don't build any data structures for the attributes themselves. If we were to go through the parsed DIE (i.e. |
@clayborg with your feedback, I think I was able to solve the issue and reuse existing code. |
if (!die.Extract(GetData(), this, &die_offset)) | ||
return llvm::StringRef(); | ||
|
||
// Does die contain an AT_Name? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Does die contain an AT_Name? | |
// Does die contain a DW_AT_name? |
makes grepping more reliable
Make comments more grep-able. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now this is safe and all my issues are gone. Thanks for improving.
Thanks for spotting the issue with the first implementation! |
I was thinking about DIE -> name stuff, and was wondering if we would benefit from a cache of DIE offset to |
This allows us to query the AT_Name of a DIE without parsing the entire CU. Part of the ongoing effort to support IDX_Parent in accelerator tables [1]. [1]: https://discourse.llvm.org/t/rfc-improve-dwarf-5-debug-names-type-lookup-parsing-speed/74151/44 (cherry picked from commit 4684507)
Summary: Revert "[lldb][DWARFUnit] Implement PeekDIEName query (llvm#78486)" This reverts commit 4684507. Revert "[lldb][DWARFIndex] Use IDX_parent to implement GetFullyQualifiedType query (llvm#79932)" This reverts commit 91f4a84. Note(toyang): added back the DWARFDeclContext arrayref constructor
This allows us to query the AT_Name of a DIE without parsing the entire CU.
Part of the ongoing effort to support IDX_Parent in accelerator tables 1.