-
Notifications
You must be signed in to change notification settings - Fork 11.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[LTO] Introduce a new class ImportIDTable #106503
[LTO] Introduce a new class ImportIDTable #106503
Conversation
The new class implements a deduplication table to convert import list elements: {SourceModule, GUID, Definition/Declaration} into 32-bit integers, and vice versa. This patch adds a unit test but does not add a use yet. To be precise, the deduplication table holds {SourceModule, GUID} pairs. We use the bottom one bit of the 32-bit integers to indicate whether we have a definition or declaration. A subsequent patch will collapse the import list hierarchy -- FunctionsToImportTy holding many instances of FunctionsToImportTy -- down to DenseSet<uint32_t> with each element indexing into the deduplication table above. This will address multiple sources of space inefficiency.
@llvm/pr-subscribers-llvm-transforms Author: Kazu Hirata (kazutakahirata) ChangesThe new class implements a deduplication table to convert import list {SourceModule, GUID, Definition/Declaration} into 32-bit integers, and vice versa. This patch adds a unit test but To be precise, the deduplication table holds {SourceModule, GUID} A subsequent patch will collapse the import list hierarchy -- Full diff: https://github.com/llvm/llvm-project/pull/106503.diff 3 Files Affected:
diff --git a/llvm/include/llvm/Transforms/IPO/FunctionImport.h b/llvm/include/llvm/Transforms/IPO/FunctionImport.h
index b7280c56be9cc8..99fe110191dec9 100644
--- a/llvm/include/llvm/Transforms/IPO/FunctionImport.h
+++ b/llvm/include/llvm/Transforms/IPO/FunctionImport.h
@@ -10,6 +10,7 @@
#define LLVM_TRANSFORMS_IPO_FUNCTIONIMPORT_H
#include "llvm/ADT/DenseSet.h"
+#include "llvm/ADT/MapVector.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/IR/GlobalValue.h"
#include "llvm/IR/ModuleSummaryIndex.h"
@@ -96,6 +97,76 @@ class FunctionImporter {
std::tuple<unsigned, const GlobalValueSummary *,
std::unique_ptr<ImportFailureInfo>>>;
+ // Issues import IDs. Each ID uniquely corresponds to a tuple of
+ // (FromModule, GUID, Definition/Declaration).
+ //
+ // The import IDs make the import list space efficient by referring to each
+ // import with a 32-bit integer ID while maintaining a central table that maps
+ // those integer IDs to tuples of (FromModule, GUID, Def/Decl).
+ //
+ // In one large application, a pair of (FromModule, GUID) is mentioned in
+ // import lists more than 50 times on average across all destination modules.
+ // Mentioning the 32-byte tuple:
+ //
+ // std::tuple<StringRef, GlobalValue::GUID, GlobalValueSummary::ImportKind>
+ //
+ // 50 times by value in various import lists would be costly. We can reduce
+ // the memory footprint of import lists by placing one copy in a central table
+ // and referring to it with 32-bit integer IDs.
+ //
+ // To save space within the central table, we only store pairs of
+ // (FromModule, GUID) in the central table. In the actual 32-bit integer ID,
+ // the top 31 bits index into the central table while the bottom 1 bit
+ // indicates whether an ID is for GlobalValueSummary::Declaration or
+ // GlobalValueSummary::Definition.
+ class ImportIDTable {
+ public:
+ using ImportIDTy = uint32_t;
+
+ // Create a pair of import IDs [Def, Decl] for a given pair of FromModule
+ // and GUID.
+ std::pair<ImportIDTy, ImportIDTy> createImportIDs(StringRef FromModule,
+ GlobalValue::GUID GUID) {
+ auto Key = std::make_pair(FromModule, GUID);
+ auto InsertResult = TheTable.try_emplace(Key, TheTable.size());
+ return makeIDPair(InsertResult.first->second);
+ }
+
+ // Get a pair of previously created import IDs [Def, Decl] for a given pair
+ // of FromModule and GUID. Returns std::nullopt if not available.
+ std::optional<std::pair<ImportIDTy, ImportIDTy>>
+ getImportIDs(StringRef FromModule, GlobalValue::GUID GUID) {
+ auto Key = std::make_pair(FromModule, GUID);
+ auto It = TheTable.find(Key);
+ if (It != TheTable.end())
+ return makeIDPair(It->second);
+ return std::nullopt;
+ }
+
+ // Return a tuple of [FromModule, GUID, Def/Decl] that a given ImportID
+ // corresponds to.
+ std::tuple<StringRef, GlobalValue::GUID, GlobalValueSummary::ImportKind>
+ lookup(ImportIDTy ImportID) const {
+ GlobalValueSummary::ImportKind Kind =
+ (ImportID & 1) ? GlobalValueSummary::Declaration
+ : GlobalValueSummary::Definition;
+ auto It = TheTable.begin() + (ImportID >> 1);
+ StringRef FromModule = It->first.first;
+ GlobalValue::GUID GUID = It->first.second;
+ return std::make_tuple(FromModule, GUID, Kind);
+ }
+
+ private:
+ // Make a pair of import IDs [Def, Decl] from an index into TheTable.
+ static std::pair<ImportIDTy, ImportIDTy> makeIDPair(ImportIDTy Index) {
+ ImportIDTy Def = Index << 1;
+ ImportIDTy Decl = Def | 1;
+ return std::make_pair(Def, Decl);
+ }
+
+ MapVector<std::pair<StringRef, GlobalValue::GUID>, ImportIDTy> TheTable;
+ };
+
/// The map maintains the list of imports. Conceptually, it is a collection
/// of tuples of the form:
///
diff --git a/llvm/unittests/Transforms/IPO/CMakeLists.txt b/llvm/unittests/Transforms/IPO/CMakeLists.txt
index ac632450d57305..65be2966bec049 100644
--- a/llvm/unittests/Transforms/IPO/CMakeLists.txt
+++ b/llvm/unittests/Transforms/IPO/CMakeLists.txt
@@ -13,4 +13,5 @@ add_llvm_unittest(IPOTests
WholeProgramDevirt.cpp
AttributorTest.cpp
FunctionSpecializationTest.cpp
+ ImportIDTableTests.cpp
)
diff --git a/llvm/unittests/Transforms/IPO/ImportIDTableTests.cpp b/llvm/unittests/Transforms/IPO/ImportIDTableTests.cpp
new file mode 100644
index 00000000000000..7554ee59c6869a
--- /dev/null
+++ b/llvm/unittests/Transforms/IPO/ImportIDTableTests.cpp
@@ -0,0 +1,79 @@
+//===- ImportIDTableTests.cpp - Unit tests for ImportIDTable --------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Transforms/IPO/FunctionImport.h"
+#include "gmock/gmock.h"
+#include "gtest/gtest.h"
+#include <set>
+#include <type_traits>
+
+using namespace llvm;
+
+TEST(ImportIDTableTests, Basic) {
+ FunctionImporter::ImportIDTable Table;
+
+ auto [Def, Decl] = Table.createImportIDs("mod", 123U);
+ auto [Def2, Decl2] = Table.createImportIDs("stuff", 456U);
+
+ // Def and Decl must be of the same unsigned integer type.
+ static_assert(
+ std::is_unsigned_v<FunctionImporter::ImportIDTable::ImportIDTy>);
+ static_assert(std::is_same_v<FunctionImporter::ImportIDTable::ImportIDTy,
+ decltype(Def)>);
+ static_assert(std::is_same_v<FunctionImporter::ImportIDTable::ImportIDTy,
+ decltype(Decl)>);
+
+ // Check that all IDs are unique.
+ std::set<FunctionImporter::ImportIDTable::ImportIDTy> IDs = {Def, Decl, Def2,
+ Decl2};
+ EXPECT_THAT(IDs, ::testing::SizeIs(4));
+
+ // Verify what Def maps to.
+ auto DefTuple = Table.lookup(Def);
+ EXPECT_EQ(std::get<0>(DefTuple), StringRef("mod"));
+ EXPECT_EQ(std::get<1>(DefTuple), 123U);
+ EXPECT_EQ(std::get<2>(DefTuple), GlobalValueSummary::Definition);
+
+ // Verify what Def maps to.
+ auto DeclTuple = Table.lookup(Decl);
+ EXPECT_EQ(std::get<0>(DeclTuple), StringRef("mod"));
+ EXPECT_EQ(std::get<1>(DeclTuple), 123U);
+ EXPECT_EQ(std::get<2>(DeclTuple), GlobalValueSummary::Declaration);
+
+ // Verify what Def2 maps to.
+ auto Def2Tuple = Table.lookup(Def2);
+ EXPECT_EQ(std::get<0>(Def2Tuple), StringRef("stuff"));
+ EXPECT_EQ(std::get<1>(Def2Tuple), 456U);
+ EXPECT_EQ(std::get<2>(Def2Tuple), GlobalValueSummary::Definition);
+
+ // Verify what Def2 maps to.
+ auto Decl2Tuple = Table.lookup(Decl2);
+ EXPECT_EQ(std::get<0>(Decl2Tuple), StringRef("stuff"));
+ EXPECT_EQ(std::get<1>(Decl2Tuple), 456U);
+ EXPECT_EQ(std::get<2>(Decl2Tuple), GlobalValueSummary::Declaration);
+}
+
+TEST(ImportIDTableTests, Duplicates) {
+ FunctionImporter::ImportIDTable Table;
+
+ auto [Def1, Decl1] = Table.createImportIDs("mod", 123U);
+ auto [Def2, Decl2] = Table.createImportIDs("mod", 123U);
+
+ // Verify we get the same IDs back.
+ EXPECT_EQ(Def1, Def2);
+ EXPECT_EQ(Decl1, Decl2);
+}
+
+TEST(ImportIDTableTests, Missing) {
+ FunctionImporter::ImportIDTable Table;
+
+ auto Result = Table.getImportIDs("mod", 123U);
+
+ // Verify that we get std::nullopt for a non-existent pair.
+ EXPECT_EQ(Result, std::nullopt);
+}
|
Add a test for getImportIDs returning existing IDs.
Addressed the feedback. PTAL. Thanks! |
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/146/builds/999 Here is the relevant piece of the build log for the reference
|
The new class implements a deduplication table to convert import list
elements:
{SourceModule, GUID, Definition/Declaration}
into 32-bit integers, and vice versa. This patch adds a unit test but
does not add a use yet.
To be precise, the deduplication table holds {SourceModule, GUID}
pairs. We use the bottom one bit of the 32-bit integers to indicate
whether we have a definition or declaration.
A subsequent patch will collapse the import list hierarchy --
FunctionsToImportTy holding many instances of FunctionsToImportTy --
down to DenseSet<uint32_t> with each element indexing into the
deduplication table above. This will address multiple sources of
space inefficiency.