Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DependencyScanningFilesystem] Make sure the local/shared cache filename lookups use only absolute paths #66122

Merged
merged 1 commit into from
Sep 20, 2023

Conversation

akyrtzi
Copy link
Contributor

@akyrtzi akyrtzi commented Sep 12, 2023

Previously a relative path would be used as a key for cache lookup and if the same relative path was used from another compiler invocation with a different working directory then the first cache entry was erroneously returned.

@akyrtzi akyrtzi requested a review from a team as a code owner September 12, 2023 18:30
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Sep 12, 2023
@llvmbot
Copy link
Collaborator

llvmbot commented Sep 12, 2023

@llvm/pr-subscribers-clang

Changes

Previously a relative path would be used as a key for cache lookup and if the same relative path was used from another compiler invocation with a different working directory then the first cache entry was erroneously returned.

Full diff: https://github.com/llvm/llvm-project/pull/66122.diff

3 Files Affected:

  • (modified) clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h (+15-3)
  • (modified) clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp (+55-15)
  • (added) clang/test/ClangScanDeps/relative-filenames.c (+38)
diff --git a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
index 4b4e3c7eb2ecd06..5d904c6437873b0 100644
--- a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
+++ b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
@@ -215,6 +215,7 @@ class DependencyScanningFilesystemLocalCache {
 public:
   /// Returns entry associated with the filename or nullptr if none is found.
   const CachedFileSystemEntry *findEntryByFilename(StringRef Filename) const {
+    assert(llvm::sys::path::is_absolute_gnu(Filename));
     auto It = Cache.find(Filename);
     return It == Cache.end() ? nullptr : It->getValue();
   }
@@ -224,6 +225,7 @@ class DependencyScanningFilesystemLocalCache {
   const CachedFileSystemEntry &
   insertEntryForFilename(StringRef Filename,
                          const CachedFileSystemEntry &Entry) {
+    assert(llvm::sys::path::is_absolute_gnu(Filename));
     const auto *InsertedEntry = Cache.insert({Filename, &Entry}).first->second;
     assert(InsertedEntry == &Entry && "entry already present");
     return *InsertedEntry;
@@ -282,13 +284,14 @@ class DependencyScanningWorkerFilesystem : public llvm::vfs::ProxyFileSystem {
 public:
   DependencyScanningWorkerFilesystem(
       DependencyScanningFilesystemSharedCache &SharedCache,
-      IntrusiveRefCntPtr FS)
-      : ProxyFileSystem(std::move(FS)), SharedCache(SharedCache) {}
+      IntrusiveRefCntPtr FS);
 
   llvm::ErrorOr status(const Twine &Path) override;
   llvm::ErrorOr>
   openFileForRead(const Twine &Path) override;
 
+  std::error_code setCurrentWorkingDirectory(const Twine &Path) override;
+
   /// Returns entry for the given filename.
   ///
   /// Attempts to use the local and shared caches first, then falls back to
@@ -304,8 +307,11 @@ class DependencyScanningWorkerFilesystem : public llvm::vfs::ProxyFileSystem {
   /// For a filename that's not yet associated with any entry in the caches,
   /// uses the underlying filesystem to either look up the entry based in the
   /// shared cache indexed by unique ID, or creates new entry from scratch.
+  /// \p FilenameForLookup will always be an absolute path, and different than
+  /// \p OriginalFilename if \p OriginalFilename is relative.
   llvm::ErrorOr
-  computeAndStoreResult(StringRef Filename);
+  computeAndStoreResult(StringRef OriginalFilename,
+                        StringRef FilenameForLookup);
 
   /// Scan for preprocessor directives for the given entry if necessary and
   /// returns a wrapper object with reference semantics.
@@ -388,6 +394,12 @@ class DependencyScanningWorkerFilesystem : public llvm::vfs::ProxyFileSystem {
   /// The local cache is used by the worker thread to cache file system queries
   /// locally instead of querying the global cache every time.
   DependencyScanningFilesystemLocalCache LocalCache;
+
+  /// The working directory to use for making relative paths absolute before
+  /// using them for cache lookups.
+  std::string WorkingDirForCacheLookup;
+
+  void updateWorkingDirForCacheLookup(llvm::ErrorOr CWD);
 };
 
 } // end namespace dependencies
diff --git a/clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp b/clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
index 31404855e3b1dc3..9f0daaed4fdf9ff 100644
--- a/clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
+++ b/clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
@@ -96,6 +96,7 @@ DependencyScanningFilesystemSharedCache::
 DependencyScanningFilesystemSharedCache::CacheShard &
 DependencyScanningFilesystemSharedCache::getShardForFilename(
     StringRef Filename) const {
+  assert(llvm::sys::path::is_absolute_gnu(Filename));
   return CacheShards[llvm::hash_value(Filename) % NumShards];
 }
 
@@ -109,6 +110,7 @@ DependencyScanningFilesystemSharedCache::getShardForUID(
 const CachedFileSystemEntry *
 DependencyScanningFilesystemSharedCache::CacheShard::findEntryByFilename(
     StringRef Filename) const {
+  assert(llvm::sys::path::is_absolute_gnu(Filename));
   std::lock_guard LockGuard(CacheLock);
   auto It = EntriesByFilename.find(Filename);
   return It == EntriesByFilename.end() ? nullptr : It->getValue();
@@ -189,6 +191,14 @@ static bool shouldCacheStatFailures(StringRef Filename) {
   return shouldScanForDirectivesBasedOnExtension(Filename);
 }
 
+DependencyScanningWorkerFilesystem::DependencyScanningWorkerFilesystem(
+    DependencyScanningFilesystemSharedCache &SharedCache,
+    IntrusiveRefCntPtr FS)
+    : ProxyFileSystem(std::move(FS)), SharedCache(SharedCache) {
+  updateWorkingDirForCacheLookup(
+      getUnderlyingFS().getCurrentWorkingDirectory());
+}
+
 bool DependencyScanningWorkerFilesystem::shouldScanForDirectives(
     StringRef Filename) {
   return shouldScanForDirectivesBasedOnExtension(Filename);
@@ -215,44 +225,57 @@ DependencyScanningWorkerFilesystem::findEntryByFilenameWithWriteThrough(
 }
 
 llvm::ErrorOr
-DependencyScanningWorkerFilesystem::computeAndStoreResult(StringRef Filename) {
-  llvm::ErrorOr Stat = getUnderlyingFS().status(Filename);
+DependencyScanningWorkerFilesystem::computeAndStoreResult(
+    StringRef OriginalFilename, StringRef FilenameForLookup) {
+  llvm::ErrorOr Stat =
+      getUnderlyingFS().status(OriginalFilename);
   if (!Stat) {
-    if (!shouldCacheStatFailures(Filename))
+    if (!shouldCacheStatFailures(OriginalFilename))
       return Stat.getError();
     const auto &Entry =
-        getOrEmplaceSharedEntryForFilename(Filename, Stat.getError());
-    return insertLocalEntryForFilename(Filename, Entry);
+        getOrEmplaceSharedEntryForFilename(FilenameForLookup, Stat.getError());
+    return insertLocalEntryForFilename(FilenameForLookup, Entry);
   }
 
   if (const auto *Entry = findSharedEntryByUID(*Stat))
-    return insertLocalEntryForFilename(Filename, *Entry);
+    return insertLocalEntryForFilename(FilenameForLookup, *Entry);
 
   auto TEntry =
-      Stat->isDirectory() ? TentativeEntry(*Stat) : readFile(Filename);
+      Stat->isDirectory() ? TentativeEntry(*Stat) : readFile(OriginalFilename);
 
   const CachedFileSystemEntry *SharedEntry = [&]() {
     if (TEntry) {
       const auto &UIDEntry = getOrEmplaceSharedEntryForUID(std::move(*TEntry));
-      return &getOrInsertSharedEntryForFilename(Filename, UIDEntry);
+      return &getOrInsertSharedEntryForFilename(FilenameForLookup, UIDEntry);
     }
-    return &getOrEmplaceSharedEntryForFilename(Filename, TEntry.getError());
+    return &getOrEmplaceSharedEntryForFilename(FilenameForLookup,
+                                               TEntry.getError());
   }();
 
-  return insertLocalEntryForFilename(Filename, *SharedEntry);
+  return insertLocalEntryForFilename(FilenameForLookup, *SharedEntry);
 }
 
 llvm::ErrorOr
 DependencyScanningWorkerFilesystem::getOrCreateFileSystemEntry(
-    StringRef Filename, bool DisableDirectivesScanning) {
-  if (const auto *Entry = findEntryByFilenameWithWriteThrough(Filename))
-    return scanForDirectivesIfNecessary(*Entry, Filename,
+    StringRef OriginalFilename, bool DisableDirectivesScanning) {
+  StringRef FilenameForLookup;
+  SmallString<256> PathBuf;
+  if (llvm::sys::path::is_absolute_gnu(OriginalFilename)) {
+    FilenameForLookup = OriginalFilename;
+  } else {
+    PathBuf = WorkingDirForCacheLookup;
+    llvm::sys::path::append(PathBuf, OriginalFilename);
+    FilenameForLookup = PathBuf.str();
+  }
+  if (const auto *Entry =
+          findEntryByFilenameWithWriteThrough(FilenameForLookup))
+    return scanForDirectivesIfNecessary(*Entry, OriginalFilename,
                                         DisableDirectivesScanning)
         .unwrapError();
-  auto MaybeEntry = computeAndStoreResult(Filename);
+  auto MaybeEntry = computeAndStoreResult(OriginalFilename, FilenameForLookup);
   if (!MaybeEntry)
     return MaybeEntry.getError();
-  return scanForDirectivesIfNecessary(*MaybeEntry, Filename,
+  return scanForDirectivesIfNecessary(*MaybeEntry, OriginalFilename,
                                       DisableDirectivesScanning)
       .unwrapError();
 }
@@ -330,3 +353,20 @@ DependencyScanningWorkerFilesystem::openFileForRead(const Twine &Path) {
     return Result.getError();
   return DepScanFile::create(Result.get());
 }
+
+std::error_code DependencyScanningWorkerFilesystem::setCurrentWorkingDirectory(
+    const Twine &Path) {
+  updateWorkingDirForCacheLookup(Path.str());
+  return ProxyFileSystem::setCurrentWorkingDirectory(Path);
+}
+
+void DependencyScanningWorkerFilesystem::updateWorkingDirForCacheLookup(
+    llvm::ErrorOr CWD) {
+  if (CWD && !CWD->empty()) {
+    WorkingDirForCacheLookup = *CWD;
+  } else {
+    // The cache lookup functions will not accept relative paths for safety, so
+    // at least make it absolute from a "root".
+    WorkingDirForCacheLookup = "/";
+  }
+}
diff --git a/clang/test/ClangScanDeps/relative-filenames.c b/clang/test/ClangScanDeps/relative-filenames.c
new file mode 100644
index 000000000000000..03f2be7ec4c1f6a
--- /dev/null
+++ b/clang/test/ClangScanDeps/relative-filenames.c
@@ -0,0 +1,38 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+// RUN: sed -e "s|DIR|%/t|g" %t/cdb.json.template > %t/cdb.json
+
+// RUN: clang-scan-deps -compilation-database %t/cdb.json -format make -j 1 > %t/result.txt
+// RUN: FileCheck %s -input-file=%t/result.txt
+
+// CHECK: {{/|\\}}dir1{{/|\\}}t1.c
+// CHECK: {{/|\\}}dir1{{/|\\}}head.h
+// CHECK: {{/|\\}}dir2{{/|\\}}t2.c
+// CHECK: {{/|\\}}dir2{{/|\\}}head.h
+
+//--- cdb.json.template
+[
+  {
+    "directory": "DIR/dir1",
+    "command": "clang -fsyntax-only t1.c",
+    "file": "t1.c"
+  },
+  {
+    "directory": "DIR/dir2",
+    "command": "clang -fsyntax-only t2.c",
+    "file": "t2.c"
+  }
+]
+
+//--- dir1/t1.c
+#include "head.h"
+
+//--- dir1/head.h
+#ifndef BBB
+#define BBB
+#endif
+
+//--- dir2/t2.c
+#include "head.h"
+
+//--- dir2/head.h

Copy link
Contributor

@jansvoboda11 jansvoboda11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, this seems like an important aspect to get right. I left a couple of comments/questions.

@akyrtzi akyrtzi force-pushed the akyrtzi/fix-depscan-relative-paths branch 3 times, most recently from a457c0c to 5794986 Compare September 18, 2023 18:21
@akyrtzi akyrtzi force-pushed the akyrtzi/fix-depscan-relative-paths branch from 5794986 to 1423e87 Compare September 18, 2023 22:09
…ame lookups use only absolute paths

Previously a relative path would be used as a key for cache lookup and if the same relative path
was used from another compiler invocation with a different working directory then the first cache entry
was erroneously returned.
@akyrtzi akyrtzi force-pushed the akyrtzi/fix-depscan-relative-paths branch from 1423e87 to 5901b47 Compare September 19, 2023 20:34
@akyrtzi akyrtzi force-pushed the akyrtzi/fix-depscan-relative-paths branch from 5901b47 to 26b0440 Compare September 19, 2023 22:11
@akyrtzi akyrtzi merged commit 36b37c7 into llvm:main Sep 20, 2023
1 of 2 checks passed
@akyrtzi akyrtzi deleted the akyrtzi/fix-depscan-relative-paths branch September 20, 2023 01:18
akyrtzi added a commit to swiftlang/llvm-project that referenced this pull request Sep 27, 2023
…ame lookups use only absolute paths (llvm#66122)

Previously a relative path would be used as a key for cache lookup and
if the same relative path was used from another compiler invocation with
a different working directory then the first cache entry was erroneously
returned.

(cherry picked from commit 36b37c7)
akyrtzi added a commit to swiftlang/llvm-project that referenced this pull request Sep 27, 2023
…-path-fix

[DependencyScanningFilesystem] Make sure the local/shared cache filename lookups use only absolute paths (llvm#66122)
tru pushed a commit that referenced this pull request Sep 29, 2023
…ame lookups use only absolute paths (#66122)

Previously a relative path would be used as a key for cache lookup and
if the same relative path was used from another compiler invocation with
a different working directory then the first cache entry was erroneously
returned.

(cherry picked from commit 36b37c7)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants