Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More aggressively deduplicate global warnings based on contents. #112801

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

adrian-prantl
Copy link
Collaborator

I've been getting complaints from users being spammed by -gmodules missing file warnings going out of control because each object file depends on an entire DAG of PCM files that usually are all missing at once. To reduce this problem, this patch does two things:

  1. Module now maintains a DenseMap<hash, once> that is used to display each warning only once, based on its actual text.

  2. The PCM warning itself is reworded to include less details, such as the DIE offset, which is only useful to LLDB developers, who can get this from the dwarf log if they need it. Because the detail is omitted the hashing from (1) deduplicates the warnings.

rdar://138144624

I've been getting complaints from users being spammed by -gmodules
missing file warnings going out of control because each object file
depends on an entire DAG of PCM files that usually are all missing at
once. To reduce this problem, this patch does two things:

1. Module now maintains a DenseMap<hash, once> that is used to display
each warning only once, based on its actual text.

2. The PCM warning itself is reworded to include less details, such as
the DIE offset, which is only useful to LLDB developers, who can get
this from the dwarf log if they need it. Because the detail is omitted
the hashing from (1) deduplicates the warnings.

rdar://138144624
@llvmbot
Copy link
Collaborator

llvmbot commented Oct 17, 2024

@llvm/pr-subscribers-lldb

Author: Adrian Prantl (adrian-prantl)

Changes

I've been getting complaints from users being spammed by -gmodules missing file warnings going out of control because each object file depends on an entire DAG of PCM files that usually are all missing at once. To reduce this problem, this patch does two things:

  1. Module now maintains a DenseMap<hash, once> that is used to display each warning only once, based on its actual text.

  2. The PCM warning itself is reworded to include less details, such as the DIE offset, which is only useful to LLDB developers, who can get this from the dwarf log if they need it. Because the detail is omitted the hashing from (1) deduplicates the warnings.

rdar://138144624


Full diff: https://github.com/llvm/llvm-project/pull/112801.diff

4 Files Affected:

  • (modified) lldb/include/lldb/Core/Module.h (+6-3)
  • (modified) lldb/source/Core/Module.cpp (+19-10)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp (+13-12)
  • (added) lldb/test/Shell/Diagnostics/TestDedupWarnings.test (+22)
diff --git a/lldb/include/lldb/Core/Module.h b/lldb/include/lldb/Core/Module.h
index 5589c1c9a350dc..e2a99cc7bcf50b 100644
--- a/lldb/include/lldb/Core/Module.h
+++ b/lldb/include/lldb/Core/Module.h
@@ -30,6 +30,7 @@
 
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/STLFunctionalExtras.h"
+#include "llvm/ADT/StableHashing.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/Support/Chrono.h"
 
@@ -1057,9 +1058,10 @@ class Module : public std::enable_shared_from_this<Module>,
   /// time for the symbol tables can be aggregated here.
   StatsDuration m_symtab_index_time;
 
-  std::once_flag m_optimization_warning;
-  std::once_flag m_language_warning;
-
+  /// A set of hashes of all warnings and errors, to avoid reporting them
+  /// multiple times to the same Debugger.
+  llvm::DenseMap<llvm::stable_hash, std::unique_ptr<std::once_flag>>
+      m_shown_diagnostics;
   void SymbolIndicesToSymbolContextList(Symtab *symtab,
                                         std::vector<uint32_t> &symbol_indexes,
                                         SymbolContextList &sc_list);
@@ -1086,6 +1088,7 @@ class Module : public std::enable_shared_from_this<Module>,
   void ReportWarning(const llvm::formatv_object_base &payload);
   void ReportError(const llvm::formatv_object_base &payload);
   void ReportErrorIfModifyDetected(const llvm::formatv_object_base &payload);
+  std::once_flag *GetDiagnosticOnceFlag(llvm::StringRef msg);
 };
 
 } // namespace lldb_private
diff --git a/lldb/source/Core/Module.cpp b/lldb/source/Core/Module.cpp
index 88cc957e91fac4..1139f7c0bbbae7 100644
--- a/lldb/source/Core/Module.cpp
+++ b/lldb/source/Core/Module.cpp
@@ -1093,8 +1093,8 @@ void Module::ReportWarningOptimization(
   ss << file_name
      << " was compiled with optimization - stepping may behave "
         "oddly; variables may not be available.";
-  Debugger::ReportWarning(std::string(ss.GetString()), debugger_id,
-                          &m_optimization_warning);
+  llvm::StringRef msg = ss.GetString();
+  Debugger::ReportWarning(msg.str(), debugger_id, GetDiagnosticOnceFlag(msg));
 }
 
 void Module::ReportWarningUnsupportedLanguage(
@@ -1104,8 +1104,8 @@ void Module::ReportWarningUnsupportedLanguage(
      << Language::GetNameForLanguageType(language)
      << "\". "
         "Inspection of frame variables will be limited.";
-  Debugger::ReportWarning(std::string(ss.GetString()), debugger_id,
-                          &m_language_warning);
+  llvm::StringRef msg = ss.GetString();
+  Debugger::ReportWarning(msg.str(), debugger_id, GetDiagnosticOnceFlag(msg));
 }
 
 void Module::ReportErrorIfModifyDetected(
@@ -1125,20 +1125,29 @@ void Module::ReportErrorIfModifyDetected(
   }
 }
 
+std::once_flag *Module::GetDiagnosticOnceFlag(llvm::StringRef msg) {
+  std::lock_guard<std::recursive_mutex> guard(m_mutex);
+  auto &once_ptr = m_shown_diagnostics[llvm::stable_hash_name(msg)];
+  if (!once_ptr)
+    once_ptr = std::make_unique<std::once_flag>();
+  return once_ptr.get();
+}
+
 void Module::ReportError(const llvm::formatv_object_base &payload) {
   StreamString strm;
   GetDescription(strm.AsRawOstream(), lldb::eDescriptionLevelBrief);
-  strm.PutChar(' ');
-  strm.PutCString(payload.str());
-  Debugger::ReportError(strm.GetString().str());
+  std::string msg = payload.str();
+  strm << ' ' << msg;
+  Debugger::ReportError(strm.GetString().str(), {}, GetDiagnosticOnceFlag(msg));
 }
 
 void Module::ReportWarning(const llvm::formatv_object_base &payload) {
   StreamString strm;
   GetDescription(strm.AsRawOstream(), lldb::eDescriptionLevelFull);
-  strm.PutChar(' ');
-  strm.PutCString(payload.str());
-  Debugger::ReportWarning(std::string(strm.GetString()));
+  std::string msg = payload.str();
+  strm << ' ' << msg;
+  Debugger::ReportWarning(strm.GetString().str(), {},
+                          GetDiagnosticOnceFlag(msg));
 }
 
 void Module::LogMessage(Log *log, const llvm::formatv_object_base &payload) {
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
index 9287d4baf19e9c..e5b8eee8d08c24 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
@@ -2069,13 +2069,15 @@ void SymbolFileDWARF::UpdateExternalModuleListIfNeeded() {
     Status error = ModuleList::GetSharedModule(dwo_module_spec, module_sp,
                                                nullptr, nullptr, nullptr);
     if (!module_sp) {
+      // ReportWarning also rate-limits based on the warning string,
+      // but in a -gmodules build, each object file has a similar DAG
+      // of module dependencies that would all be listed here.
       GetObjectFile()->GetModule()->ReportWarning(
-          "{0:x16}: unable to locate module needed for external types: "
-          "{1}\nerror: {2}\nDebugging will be degraded due to missing "
-          "types. Rebuilding the project will regenerate the needed "
-          "module files.",
-          die.GetOffset(), dwo_module_spec.GetFileSpec().GetPath().c_str(),
-          error.AsCString("unknown error"));
+          "{0}", error.AsCString("unknown error"));
+      GetObjectFile()->GetModule()->ReportWarning(
+          "Unable to locate module needed for external types.\n"
+          "Debugging will be degraded due to missing types. Rebuilding the "
+          "project will regenerate the needed module files.");
       continue;
     }
 
@@ -2095,12 +2097,11 @@ void SymbolFileDWARF::UpdateExternalModuleListIfNeeded() {
 
     if (dwo_id != dwo_dwo_id) {
       GetObjectFile()->GetModule()->ReportWarning(
-          "{0:x16}: Module {1} is out-of-date (hash mismatch). Type "
-          "information "
-          "from this module may be incomplete or inconsistent with the rest of "
-          "the program. Rebuilding the project will regenerate the needed "
-          "module files.",
-          die.GetOffset(), dwo_module_spec.GetFileSpec().GetPath().c_str());
+          "Module {0} is out-of-date (hash mismatch).\n"
+          "Type information from this module may be incomplete or inconsistent "
+          "with the rest of the program. Rebuilding the project will "
+          "regenerate the needed module files.",
+          dwo_module_spec.GetFileSpec().GetPath());
     }
   }
 }
diff --git a/lldb/test/Shell/Diagnostics/TestDedupWarnings.test b/lldb/test/Shell/Diagnostics/TestDedupWarnings.test
new file mode 100644
index 00000000000000..d4fcf78d01b81c
--- /dev/null
+++ b/lldb/test/Shell/Diagnostics/TestDedupWarnings.test
@@ -0,0 +1,22 @@
+# REQUIRES: system-darwin
+# Test the rate-limiting of module not found warnings.
+# RUN: rm -rf %t
+# RUN: mkdir -p %t
+
+# RUN: echo 'module "C" { header "c.h" }' >%t/module.modulemap
+# RUN: echo 'struct c {};' >>%t/c.h
+# RUN: echo '@import C;'                  >%t/a.m
+# RUN: echo 'struct a { struct c c; } a;' >>%t/a.m
+# RUN: echo '@import C;'                  >%t/b.m
+# RUN: echo 'struct b { struct c c; } b;' >>%t/b.m
+# RUN: echo 'int main() {}'               >>%t/b.m
+
+# RUN: %clang_host -fmodules -Xclang -fmodules-cache-path=%t/cache -I%t -g -gmodules %t/a.m -o %t/a.o -c
+# RUN: %clang_host -fmodules -Xclang -fmodules-cache-path=%t/cache -I%t -g -gmodules %t/b.m -o %t/b.o -c
+# RUN: %clang_host %t/a.o %t/b.o -o %t/a.out
+# RUN: rm -rf %t/cache
+# RUN: %lldb %t/a.out -o "b main" -o run -o "p a" -o "p b" -o q 2>&1 | FileCheck %s
+# CHECK: {{[ab]}}.o{{.*}}/cache/{{.*}}/C-{{.*}}.pcm' does not exist
+# CHECK-NOT: /cache/{{.*}}/C-{.*}.pcm' does not exist
+# CHECK: {{[ab]}}.o{{.*}}/cache/{{.*}}/C-{{.*}}.pcm' does not exist
+# CHECK-NOT: /cache/{{.*}}/C-{.*}.pcm' does not exist

Copy link
Member

@JDevlieghere JDevlieghere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for fixing this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants