Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversion between local file paths and public file urls - consider scheme and port #4363

Open
stefan-korn opened this issue Dec 19, 2024 · 3 comments
Labels

Comments

@stefan-korn
Copy link
Contributor

Current Behavior

The UrlHostTokenResolver converts a local file path to a public file URL via the hostify token. This gets the URL wrong if there is a difference in scheme or port between the URL that should be resolved (the url with the host TOKEN) and the public files url (that originates from the current request).

Partly this is related to resource_mapping entity which saves the hostified file path (for local files) in the file path field. This uses scheme and port that was used when creating the resource mapping.

Now, if for example you change the URL of the DKAN site in a way that affects either scheme or port, the links created from the host token do not work any more.

I am unsure if this is for some reason intented, but given the flexibility that Drupal itself offers regarding a change of the websites URL, this seems a bit limitating.

Expected Behavior

Local file resource should still be reachable if the DKAN site URL changes in port or scheme.

Steps To Reproduce

  • Have your DKAN site reachable under two ports (i. e. 80 and 443, or 443 and 8443)
  • Create a local file resource under url with one port
  • Visit the the dataset/distribution with this local file resource under the other port
  • Note that the link points to the first port.

The problem gets more clear if you change the port instead of having two ports open.

Relevant log output (optional)

Anything else?

No response

@stefan-korn
Copy link
Contributor Author

Something like this is doing the trick for me:

diff --git a/modules/common/src/UrlHostTokenResolver.php b/modules/common/src/UrlHostTokenResolver.php
index 724e6e20f..062184467 100644
--- a/modules/common/src/UrlHostTokenResolver.php
+++ b/modules/common/src/UrlHostTokenResolver.php
@@ -42,10 +42,29 @@ class UrlHostTokenResolver {
     $serverPublicFilesUrl = self::getServerPublicFilesUrl();
     $serverPublicFilesUrl = isset($serverPublicFilesUrl) ? parse_url($serverPublicFilesUrl) : NULL;
     $serverHost = $serverPublicFilesUrl['host'] ?? \Drupal::request()->getHost();
+    $serverScheme = $serverPublicFilesUrl['scheme'] ?? \Drupal::request()->getScheme();
+    $serverPort = $serverPublicFilesUrl['port'] ?? \Drupal::request()->getPort();
+    $parsedResourceUrl = parse_url($resourceUrl);
+    $resourceUrlScheme = $parsedResourceUrl['scheme'] ?? NULL;
+    $resourceUrlPort = $parsedResourceUrl['port'] ?? NULL;
+    $hostToken = self::TOKEN;
+    if ($resourceUrlScheme !== $serverScheme) {
+      if (!empty($resourceUrlScheme)) {
+        $hostToken = $resourceUrlScheme . '://' . $hostToken;
+      }
+      $serverHost = $serverScheme . '://' . $serverHost;
+    }
+    if ($resourceUrlPort !== $serverPort) {
+      if (!empty($resourceUrlPort)) {
+        $hostToken .= ':' . $resourceUrlPort;
+      }
+      $serverHost .= ':' . $serverPort;
+    }
     // Determine whether the hostified token is present in the resource URL, and
     // replace the token if necessary.
     if (substr_count($resourceUrl, self::TOKEN) > 0) {
-      $resourceUrl = str_replace(self::TOKEN, $serverHost, $resourceUrl);
+      $resourceUrl = str_replace($hostToken, $serverHost, $resourceUrl);
+      //$resourceUrl = str_replace(self::TOKEN, $serverHost, $resourceUrl);
     }
     return $resourceUrl;
   }

But I am unsure if this is the right approach, seems a bit overblown at the moment. Maybe there is an easier way.

Would like to hear your feedback before maybe starting with a PR.

@dafeder
Copy link
Member

dafeder commented Dec 27, 2024

@stefan-korn this is one of those things done a long long time ago when certain things were supposed to be able to work independently of Drupal. I think that's the reason we're using that "host" token thing instead of just "public://". If we just standardized on that I think we could eliminate a lot of this stuff, and maybe leverage something like https://www.drupal.org/project/convert_url_filter.

In the meantime, if this is something that needs to be fixed soon I think something like this is fine, although I feel like there is probably an easier way to code this. Like just do a simple comparison of two arrays of URL parts rather than all this nested conditional logic. That's just an initial response though... if you move forward we can look at some options more in-depth.

@stefan-korn
Copy link
Contributor Author

@dafeder : Thanks for the background infos.

If we need a fix, we will probably patch it like shown above, but maybe we can also avoid the port differences in our setup altogether.

Regarding convert_url_filter: The module is for use in text format filters? So you mean using the logic this module is using for DKAN rather than using the module itself? So it's about the Regex the module uses primarily?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants