Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Serialization] Use 32 bit aligned decl id instead of unaligned decl id #95348

Closed
wants to merge 1 commit into from

Conversation

ChuanqiXu9
Copy link
Member

See the post commit message in
#92083.

I suspect the compile time regression in AArch64 is related to alignments.

I am not sure if this is the problem since I can't reproduce.

See the post commit message in
llvm#92083.

I suspect the compile time regression in AArch64 is related to
alignments.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:modules C++20 modules and Clang Header Modules labels Jun 13, 2024
@ChuanqiXu9 ChuanqiXu9 requested a review from alexfh June 13, 2024 03:44
@llvmbot
Copy link
Collaborator

llvmbot commented Jun 13, 2024

@llvm/pr-subscribers-clang-modules

@llvm/pr-subscribers-clang

Author: Chuanqi Xu (ChuanqiXu9)

Changes

See the post commit message in
#92083.

I suspect the compile time regression in AArch64 is related to alignments.

I am not sure if this is the problem since I can't reproduce.


Full diff: https://github.com/llvm/llvm-project/pull/95348.diff

4 Files Affected:

  • (modified) clang/include/clang/Serialization/ASTBitCodes.h (+11-9)
  • (modified) clang/include/clang/Serialization/ASTReader.h (+5-5)
  • (modified) clang/include/clang/Serialization/ModuleFile.h (+1-1)
  • (modified) clang/lib/Serialization/ASTReader.cpp (+11-11)
diff --git a/clang/include/clang/Serialization/ASTBitCodes.h b/clang/include/clang/Serialization/ASTBitCodes.h
index 52a6c5e10f802..be5b6c4e3b9bd 100644
--- a/clang/include/clang/Serialization/ASTBitCodes.h
+++ b/clang/include/clang/Serialization/ASTBitCodes.h
@@ -168,13 +168,17 @@ const unsigned int NUM_PREDEF_SUBMODULE_IDS = 1;
 /// because blobs in bitstream are 32-bit aligned). This structure is
 /// serialized "as is" to the AST file.
 class UnalignedUInt64 {
-  uint32_t BitLow = 0;
-  uint32_t BitHigh = 0;
+  uint32_t BitLow;
+  uint32_t BitHigh;
 
 public:
   UnalignedUInt64() = default;
   UnalignedUInt64(uint64_t BitOffset) { set(BitOffset); }
 
+  operator uint64_t() const {
+    return get();
+  }
+
   void set(uint64_t Offset) {
     BitLow = Offset;
     BitHigh = Offset >> 32;
@@ -255,11 +259,9 @@ class DeclOffset {
   }
 };
 
-// The unaligned decl ID used in the Blobs of bistreams.
-using unaligned_decl_id_t =
-    llvm::support::detail::packed_endian_specific_integral<
-        serialization::DeclID, llvm::endianness::native,
-        llvm::support::unaligned>;
+// The 32 bits aligned decl ID used in the Blobs of bistreams due the blobs
+// are 32 bits aligned.
+using SerializedDeclID = UnalignedUInt64;
 
 /// The number of predefined preprocessed entity IDs.
 const unsigned int NUM_PREDEF_PP_ENTITY_IDS = 1;
@@ -1986,9 +1988,9 @@ enum CleanupObjectKind { COK_Block, COK_CompoundLiteral };
 
 /// Describes the categories of an Objective-C class.
 struct ObjCCategoriesInfo {
-  // The ID of the definition. Use unaligned_decl_id_t to keep
+  // The ID of the definition. Use SerializedDeclID to keep
   // ObjCCategoriesInfo 32-bit aligned.
-  unaligned_decl_id_t DefinitionID;
+  SerializedDeclID DefinitionID;
 
   // Offset into the array of category lists.
   unsigned Offset;
diff --git a/clang/include/clang/Serialization/ASTReader.h b/clang/include/clang/Serialization/ASTReader.h
index 0a9006223dcbd..07c4f3b624441 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -586,11 +586,11 @@ class ASTReader
 
   struct FileDeclsInfo {
     ModuleFile *Mod = nullptr;
-    ArrayRef<serialization::unaligned_decl_id_t> Decls;
+    ArrayRef<serialization::SerializedDeclID> Decls;
 
     FileDeclsInfo() = default;
     FileDeclsInfo(ModuleFile *Mod,
-                  ArrayRef<serialization::unaligned_decl_id_t> Decls)
+                  ArrayRef<serialization::SerializedDeclID> Decls)
         : Mod(Mod), Decls(Decls) {}
   };
 
@@ -599,7 +599,7 @@ class ASTReader
 
   /// An array of lexical contents of a declaration context, as a sequence of
   /// Decl::Kind, DeclID pairs.
-  using LexicalContents = ArrayRef<serialization::unaligned_decl_id_t>;
+  using LexicalContents = ArrayRef<serialization::SerializedDeclID>;
 
   /// Map from a DeclContext to its lexical contents.
   llvm::DenseMap<const DeclContext*, std::pair<ModuleFile*, LexicalContents>>
@@ -1482,7 +1482,7 @@ class ASTReader
 public:
   class ModuleDeclIterator
       : public llvm::iterator_adaptor_base<
-            ModuleDeclIterator, const serialization::unaligned_decl_id_t *,
+            ModuleDeclIterator, const serialization::SerializedDeclID *,
             std::random_access_iterator_tag, const Decl *, ptrdiff_t,
             const Decl *, const Decl *> {
     ASTReader *Reader = nullptr;
@@ -1492,7 +1492,7 @@ class ASTReader
     ModuleDeclIterator() : iterator_adaptor_base(nullptr) {}
 
     ModuleDeclIterator(ASTReader *Reader, ModuleFile *Mod,
-                       const serialization::unaligned_decl_id_t *Pos)
+                       const serialization::SerializedDeclID *Pos)
         : iterator_adaptor_base(Pos), Reader(Reader), Mod(Mod) {}
 
     value_type operator*() const {
diff --git a/clang/include/clang/Serialization/ModuleFile.h b/clang/include/clang/Serialization/ModuleFile.h
index 56193d44dd6f3..a9cbd4bb267aa 100644
--- a/clang/include/clang/Serialization/ModuleFile.h
+++ b/clang/include/clang/Serialization/ModuleFile.h
@@ -458,7 +458,7 @@ class ModuleFile {
   unsigned BaseDeclIndex = 0;
 
   /// Array of file-level DeclIDs sorted by file.
-  const serialization::unaligned_decl_id_t *FileSortedDecls = nullptr;
+  const serialization::SerializedDeclID *FileSortedDecls = nullptr;
   unsigned NumFileSortedDecls = 0;
 
   /// Array of category list location information within this
diff --git a/clang/lib/Serialization/ASTReader.cpp b/clang/lib/Serialization/ASTReader.cpp
index 59338b44db32f..6ef74296c829d 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -1266,7 +1266,7 @@ bool ASTReader::ReadLexicalDeclContextStorage(ModuleFile &M,
   if (!Lex.first) {
     Lex = std::make_pair(
         &M, llvm::ArrayRef(
-                reinterpret_cast<const unaligned_decl_id_t *>(Blob.data()),
+                reinterpret_cast<const SerializedDeclID *>(Blob.data()),
                 Blob.size() / sizeof(DeclID)));
   }
   DC->setHasExternalLexicalStorage(true);
@@ -1658,7 +1658,7 @@ bool ASTReader::ReadSLocEntry(int ID) {
 
     unsigned NumFileDecls = Record[7];
     if (NumFileDecls && ContextObj) {
-      const unaligned_decl_id_t *FirstDecl = F->FileSortedDecls + Record[6];
+      const SerializedDeclID *FirstDecl = F->FileSortedDecls + Record[6];
       assert(F->FileSortedDecls && "FILE_SORTED_DECLS not encountered yet ?");
       FileDeclIDs[FID] =
           FileDeclsInfo(F, llvm::ArrayRef(FirstDecl, NumFileDecls));
@@ -3388,7 +3388,7 @@ llvm::Error ASTReader::ReadASTBlock(ModuleFile &F,
     case TU_UPDATE_LEXICAL: {
       DeclContext *TU = ContextObj->getTranslationUnitDecl();
       LexicalContents Contents(
-          reinterpret_cast<const unaligned_decl_id_t *>(Blob.data()),
+          reinterpret_cast<const SerializedDeclID *>(Blob.data()),
           static_cast<unsigned int>(Blob.size() / sizeof(DeclID)));
       TULexicalDecls.push_back(std::make_pair(&F, Contents));
       TU->setHasExternalLexicalStorage(true);
@@ -3616,7 +3616,7 @@ llvm::Error ASTReader::ReadASTBlock(ModuleFile &F,
       break;
 
     case FILE_SORTED_DECLS:
-      F.FileSortedDecls = (const unaligned_decl_id_t *)Blob.data();
+      F.FileSortedDecls = (const SerializedDeclID *)Blob.data();
       F.NumFileSortedDecls = Record[0];
       break;
 
@@ -7911,23 +7911,23 @@ class UnalignedDeclIDComp {
   UnalignedDeclIDComp(ASTReader &Reader, ModuleFile &M)
       : Reader(Reader), Mod(M) {}
 
-  bool operator()(unaligned_decl_id_t L, unaligned_decl_id_t R) const {
+  bool operator()(SerializedDeclID L, SerializedDeclID R) const {
     SourceLocation LHS = getLocation(L);
     SourceLocation RHS = getLocation(R);
     return Reader.getSourceManager().isBeforeInTranslationUnit(LHS, RHS);
   }
 
-  bool operator()(SourceLocation LHS, unaligned_decl_id_t R) const {
+  bool operator()(SourceLocation LHS, SerializedDeclID R) const {
     SourceLocation RHS = getLocation(R);
     return Reader.getSourceManager().isBeforeInTranslationUnit(LHS, RHS);
   }
 
-  bool operator()(unaligned_decl_id_t L, SourceLocation RHS) const {
+  bool operator()(SerializedDeclID L, SourceLocation RHS) const {
     SourceLocation LHS = getLocation(L);
     return Reader.getSourceManager().isBeforeInTranslationUnit(LHS, RHS);
   }
 
-  SourceLocation getLocation(unaligned_decl_id_t ID) const {
+  SourceLocation getLocation(SerializedDeclID ID) const {
     return Reader.getSourceManager().getFileLoc(
         Reader.getSourceLocationForDeclID(
             Reader.getGlobalDeclID(Mod, (LocalDeclID)ID)));
@@ -7954,7 +7954,7 @@ void ASTReader::FindFileRegionDecls(FileID File,
   SourceLocation EndLoc = BeginLoc.getLocWithOffset(Length);
 
   UnalignedDeclIDComp DIDComp(*this, *DInfo.Mod);
-  ArrayRef<unaligned_decl_id_t>::iterator BeginIt =
+  ArrayRef<SerializedDeclID>::iterator BeginIt =
       llvm::lower_bound(DInfo.Decls, BeginLoc, DIDComp);
   if (BeginIt != DInfo.Decls.begin())
     --BeginIt;
@@ -7967,12 +7967,12 @@ void ASTReader::FindFileRegionDecls(FileID File,
              ->isTopLevelDeclInObjCContainer())
     --BeginIt;
 
-  ArrayRef<unaligned_decl_id_t>::iterator EndIt =
+  ArrayRef<SerializedDeclID>::iterator EndIt =
       llvm::upper_bound(DInfo.Decls, EndLoc, DIDComp);
   if (EndIt != DInfo.Decls.end())
     ++EndIt;
 
-  for (ArrayRef<unaligned_decl_id_t>::iterator DIt = BeginIt; DIt != EndIt;
+  for (ArrayRef<SerializedDeclID>::iterator DIt = BeginIt; DIt != EndIt;
        ++DIt)
     Decls.push_back(GetDecl(getGlobalDeclID(*DInfo.Mod, (LocalDeclID)(*DIt))));
 }

Copy link

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 78ee473784e5ef6f0b19ce4cb111fb6e4d23c6b2 eeef0c06e2a17f49ce3bdf8ae78b9bf1cd05a077 -- clang/include/clang/Serialization/ASTBitCodes.h clang/include/clang/Serialization/ASTReader.h clang/include/clang/Serialization/ModuleFile.h clang/lib/Serialization/ASTReader.cpp
View the diff from clang-format here.
diff --git a/clang/include/clang/Serialization/ASTBitCodes.h b/clang/include/clang/Serialization/ASTBitCodes.h
index be5b6c4e3b..2b672f01fd 100644
--- a/clang/include/clang/Serialization/ASTBitCodes.h
+++ b/clang/include/clang/Serialization/ASTBitCodes.h
@@ -175,9 +175,7 @@ public:
   UnalignedUInt64() = default;
   UnalignedUInt64(uint64_t BitOffset) { set(BitOffset); }
 
-  operator uint64_t() const {
-    return get();
-  }
+  operator uint64_t() const { return get(); }
 
   void set(uint64_t Offset) {
     BitLow = Offset;
diff --git a/clang/lib/Serialization/ASTReader.cpp b/clang/lib/Serialization/ASTReader.cpp
index 6ef74296c8..15be400454 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -1265,9 +1265,9 @@ bool ASTReader::ReadLexicalDeclContextStorage(ModuleFile &M,
   auto &Lex = LexicalDecls[DC];
   if (!Lex.first) {
     Lex = std::make_pair(
-        &M, llvm::ArrayRef(
-                reinterpret_cast<const SerializedDeclID *>(Blob.data()),
-                Blob.size() / sizeof(DeclID)));
+        &M,
+        llvm::ArrayRef(reinterpret_cast<const SerializedDeclID *>(Blob.data()),
+                       Blob.size() / sizeof(DeclID)));
   }
   DC->setHasExternalLexicalStorage(true);
   return false;
@@ -7972,8 +7972,7 @@ void ASTReader::FindFileRegionDecls(FileID File,
   if (EndIt != DInfo.Decls.end())
     ++EndIt;
 
-  for (ArrayRef<SerializedDeclID>::iterator DIt = BeginIt; DIt != EndIt;
-       ++DIt)
+  for (ArrayRef<SerializedDeclID>::iterator DIt = BeginIt; DIt != EndIt; ++DIt)
     Decls.push_back(GetDecl(getGlobalDeclID(*DInfo.Mod, (LocalDeclID)(*DIt))));
 }
 

@ChuanqiXu9
Copy link
Member Author

@alexfh Could you try to test this? And if this doesn't mitigate it, it will be helpful to provide the hotspot.

@alexfh
Copy link
Contributor

alexfh commented Aug 1, 2024

As mentioned in #92083 (comment), this didn't change a lot. Is there any value in this change now that the culprit (hash collisions) was found and addressed?

@ChuanqiXu9
Copy link
Member Author

Thanks. I just forgot to handle this. Closed.

@ChuanqiXu9 ChuanqiXu9 closed this Aug 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:modules C++20 modules and Clang Header Modules clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants