Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CRM-20599 Identify a separate public files directory from the private one #10513

Closed
wants to merge 8 commits into from
Closed

Conversation

herbdool
Copy link
Contributor

@herbdool herbdool commented Jun 16, 2017

For where the site isn't using Apache the site builder needs a way to put certain CiviCRM files under a secure, private directory. This can be done by assigning CIVICRM_TEMPLATE_COMPILEDIR to a private path. But there are some CiviCRM files that need to be public (such as generated js files by CKEditor). So this PR proposes a method of assigning a public files path in civicrm.settings.php which is separate from the template compiled directory.

This works in testing. It's a simple but radical change so I'm mostly creating this PR to feel out the waters.


mlutfy
mlutfy previously requested changes Jun 20, 2017
@@ -596,6 +596,25 @@ public static function baseFilePath() {
}
return $_path;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not obvious on the web, but seems like the syntax check detected a trailing whitespace on this line?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've fixed it.

@totten
Copy link
Member

totten commented Jun 20, 2017

@herbdool, I like the idea of being able to separate the public/private data dirs.

A couple things that stick out:

  • It seems like the PR makes 'CIVICRM_PUBLIC_FILES_DIR mandatory.
    • Wouldn't that make upgrades more complicated?
    • Wouldn't we need patches for various installers to set %%publicFilesDir%%?

Have you noticed 313a57a (#10511) - which was merged for 4.7.21 around the same time that you opened this PR? This might be an alternative -- ie allowing you to (optionally) set the [civicrm.files] path in civicrm.settings.php:

$GLOBALS['civicrm_paths']['civicrm.files'] = array(
  'path' => '/var/www/public/files',
  'url' => 'http://example.com/files'
);

However, it would need similar updates for the civicrm.settings.php.template to make it more visible to sysadmins (and maybe update to the installers if you want to prepopulate).

@herbdool
Copy link
Contributor Author

herbdool commented Jun 21, 2017

@totten My initial PR was mostly to start the conversation so I hadn't really thought about some of this--thanks for bring up the issues.

Perhaps CIVICRM_PUBLIC_FILES_DIR should be commented out in the settings template and then in the method basePublicFilePath it checks if the constant is defined with a fallback on the baseFilePath if not.

Then if it becomes optional then we might want to skip it in the installers.

I think being able to set [civicrm.files] in settings is a good option, but I found that it doesn't help with separating public and private since IIRC we'd still need a way to differentiate things that need to be private while allowing things like the CKEditor generated js files to be stored in public directories.

So maybe we'd still want a [civicrm.publicfiles] and be able to set that as well? As an alternative to this constant.

@totten
Copy link
Member

totten commented Jun 22, 2017

  1. Agree we should have different variables for the public+private folders.

  2. Since the current interpretation of [civicrm.files] is to return the public data folder, and we want existing sites to continue working seamlessly, that variable should still be the public one. Just flip it around -- the new variable would be the private folder, e.g.

diff --git a/Civi/Core/Paths.php b/Civi/Core/Paths.php
index 718b4bc..8784027 100644
--- a/Civi/Core/Paths.php
+++ b/Civi/Core/Paths.php
@@ -35,10 +35,16 @@ class Paths {
         return \CRM_Core_Config::singleton()->userSystem->getCiviSourceStorage();
       })
       ->register('civicrm.files', function () {
         return \CRM_Core_Config::singleton()->userSystem->getDefaultFileStorage();
       })
+      ->register('civicrm.private', function () {
+        return array(
+          'path' => fixmeCalculatePrivatePath(),
+          'url' => '**INVALID-URL**', // URL for private folder would be nonsensical.
+        );
+      })
       ->register('cms', function () {
         return array(
           'path' => \CRM_Core_Config::singleton()->userSystem->cmsRootPath(),
           'url' => \CRM_Utils_System::baseCMSURL(),
         );
  1. At the risk of being pedantic, there are a few different paths/URLs, and it might be good to list out how they relate to the "public/private" split:
  • CIVICRM_TEMPLATE_COMPILEDIR => should be private
  • customFileUploadDir => should be private
  • imageUploadDir, imageUploadURL => should be public
  • uploadDir => not sure
  • extensionsDir, extensionsURL => should be public
  1. In terms of general flow of this kind of work... it seems like implementing the variable [civicrm.private] is the main starting-point, and it would be mergeable/useful within 4.7.x. These changes would also be OK in near-term:
  • as you mentioned, in civicrm.settings.php.template, there should be comments about setting these variables
  • in the help ui for "System Settings => Directories", display the [civicrm.private] value

@herbdool
Copy link
Contributor Author

herbdool commented Jun 22, 2017

Not pedantic at all! :) I keep referring back to a list here https://civicrm.org/advisory/civi-sa-2014-001-risk-information-disclosure which identified four directories which need to be "private":

configAndLogDir | Prohibit all web access | determined by templateCompileDir
imageUploadDir | Prohibit browsing directory listings | determined by settings
templateCompileDir | Prohibit all web access | determined by settings
uploadDir |Prohibit all web access | determined by settings

Maybe Civi would need to stop using CIVICRM_TEMPLATE_COMPILEDIR to figure out configAndLogDir, etc.

While imageUploadDir is typically sites/default/files/civicrm/persist/contribute and semi-private, CKEditor generated js files are stored in: sites/default/files/civicrm/persist which needs to be public. I had been putting imageUploadDir in a private directory under sites/default/files/private/civicrm/persist/contribute which caused the issue with CKEditor. But maybe all it needs is the empty index.html generated in the imageUploadDir to avoid a warning, and it can be in the public directory. At any rate, that's a side issue. [UPDATE: looks like Civi is already creating the index.html in checkDirectoriesAreNotBrowseable() so looks like this at least is not an issue].

@MegaphoneJon
Copy link
Contributor

uploadDir is a bit of a weird case, because the images can be used for multiple purposes. If you're storing headshots for internal use, you'll want uploadDir to be private. However, if you need the headshots to appear in a Drupal View for your membership directory, you'll need uploadDir to be public.

When I've encountered this in the past, I simply disable the check in CRM_Utils_Check that complains that uploadDir isn't protected, but since we're on the topic I thought I'd complicate things a bit.

@herbdool
Copy link
Contributor Author

So it makes sense to create a [civicrm.private] instead, but I think it'll mean that there will be an issue because of how [civicrm.file] is by default being determined by CIVICRM_TEMPLATE_COMPILEDIR which might be have [civicrm.private] in its path (or even an explicit path that's private).

Shouldn't we also change how getDefaultFileStorage() and baseFilePath() work? Using the compile directory, which needs to be private, to set the public file path won't work.

@totten
Copy link
Member

totten commented Jun 22, 2017

  1. Reflecting a bit... I want to emphasize my general thematic agreement: the idea of using CIVICRM_TEMPLATE_COMPILEDIR to determine the public files path is insane/brain-damaging poppycock. Moreover, the general structure of the data folders is poppycock. And presenting these as configurable GUI options is counter-productive. (If you have the necessary skills+commitment to maintain such fine-grained customizations, then you can handle civicrm.settings.php, and you'd probably prefer that anyway; and putting this stuff in the DB makes initialization more complicated.)

  2. I'm extremely nervous about changing the relationship among getDefaultFileStorage(), baseFilePath(), and CIVICRM_TEMPLATE_COMPILEDIR in a minor-revision. If getDefaultFileStorage() is computed from something else, then I think you'll break some percentage of sites.

  3. If we want to get a more sane arrangement, I would honestly say:

  • All customizations for system paths/URLs should go in civicrm.settings.php.
  • Remove the admin GUI ("System Settings => Resource URLs, Directories").
  • Remove the path/URL settings from the database (civicrm_setting).
  • By default (on existing/upgraded installations), civicrm.settings.php sets values for all paths/URLs. (Lock these in so that changes in the logic don't break existing sites.)
  • By default (on new installations), civicrm.settings.php sets all substantive folders (configAndLogDir, templateCompileDir, etc) using the public and private variables.
  • In the CRM_Utils_Check_* framework, compare the values in civicrm.settings.php with the values returned by CMS APIs. If they don't match, display a warning.
  1. Of course, that would break backward-compatibility. Absent a major-release in the near future, we could go in a few phases. I think this path could preserve our existing contracts:
  • Phase 1: Allow new notation (v4.7.x)
    • Define the updated notation for paths in civicrm.settings.php. (ex: https://gist.github.com/totten/3fa698d88bb6a52f49df62a3ffcdf034 )
    • Keep all the default computations untouched. You can safely upgrade without special actions or breakage.
    • If civicrm.settings.php has the new notation, then it overrides the old/default computations.
  • Phase 2: The gentle switch (v4.7.x)
    • Update civicrm.settings.php.template and the various installers to use new notation.
    • Add notices in CRM_Utils_Check_* to facilitate migration from civicrm_settings to civicrm.settings.php.
    • Add notices in "System Settings => Resource URLs, Directories" to direct users to civicrm.settings.php.
  • Phase 3: The hard switch (v5.0)
    • Remove legacy computations (baseFilePath(), etal)
    • Remove admin GUI
    • Remove path/URL settings

(Also ping @xurizaemon @kcristiano since you guys have been talking about similar issues.)

@agileware
Copy link
Contributor

@totten Great write up and I agree totally with your proposed changes in 3.

Would this change make it easier to support CiviCRM multi-site implementations?

@totten
Copy link
Member

totten commented Jun 22, 2017

@agileware re: multisite -- yes-ish

For me, "multisite" usually refers to "Drupal 7 + CiviCRM; one codebase; several DNS domains; each DNS domain has its own CMS+CRM DBs". As long as the installer provides a suitable copy of civicrm.settings.php for each DNS domain, #3 should be just as good.

"multisite" can mean different things for other orgs/use-cases (with files/URLs/paths/databases/tables wired up differently). I wouldn't expect to make ever permutation drop-in seamless, but IMHO #3 and 3fa6 have more transparency and less redundancy, which should make it easier to do custom weirdnesses in civicrm.settings.php.

@agileware
Copy link
Contributor

@totten thanks mate, yes that's the answer I was looking for. Cheers for that

@kcristiano
Copy link
Member

@totten I do think #3 can work, but we need to add additional paths and urls for WP.

If we base the admin url on ['cms.root']['url'] we will have issues if WP is in it's own directory. We also need to define the path to the wp-content directory, the plugins directory and the uploads directory. We've done most of this already and it's in 4.7.20. #10214 and civicrm/civicrm-wordpress#105 completes the changes we need to support WP and the most common directory configurations. We've done this via WP functions and I'd like to continue doing that - I'd like the paths in civicrm.settings.php to be built by finding the directories. It would be best to find the wp-admin url via the WP function admin_url as opposed to [cms.root]/wp-admin (and all other paths/urls as well) but we've hit issues when the CMS is not bootstrapped.

@xurizaemon and I did talk about ensuring we bootstrap the CMS as this will allow us to define the urls and paths via CMS functions. I don't believe there has been any movement on this.

As phase 1 leaves the existing computations, we'd be fine. we will just need to test and add these options by phase 2.

@herbdool
Copy link
Contributor Author

I think this is a good approach.

So we can already put in civicrm_paths into civicrm.settings.php as per the recent commit as you noted @totten. So we just need to override the old computation if [civicrm.private] and [civicrm.files] is defined in the settings. Would that be an if statement in baseFilePath() which will use [civicrm.files] if it is set instead of CIVICRM_TEMPLATE_COMPILEDIR? And then in CRM_Core_Config_Runtime setting $this->configAndLogDir should no longer rely on baseFilePath() if [civicrm.private] is set. Or something like that.

I'm interested in creating a patch for myself so phase 1 works, which I can use as these things shape up.

totten added a commit to totten/civicrm-core that referenced this pull request Jun 24, 2017
Following up on the discussion from
[civicrm#10513](civicrm#10513), this converts
the proposed constants `CIVICRM_UF_WP_BASEURL` and `CIVICRM_UF_ADMINURL` to
variables in the `Paths` system.

A few benefits:
 * Reduces code duplication between `civicrm.php` and `WordPress.php`.
 * Can construct sub-paths with prettier notation (`Civi::paths()->getUrl('[wp.frontend]/foo.txt')`)
 * Has options to output relative or absolute URLs
 * Can expand on `Paths` to provide more inspection/validation

Notes:

 * `CIVICRM_UF_WP_BASEURL` => `wp.frontend.base`
 * `CIVICRM_UF_ADMINURL` => `wp.backend.base`

----------------------------------------
* CRM-16421: Work to get CiviCRM for WordPress in WordPress' official Repository
  https://issues.civicrm.org/jira/browse/CRM-16421
@totten
Copy link
Member

totten commented Jun 24, 2017

@kcristiano I've opened PR's on your repos with some suggested updates for #10214 and #105. In short, they'd use $civicrm_paths instead of constants, and they consolidate back down to one settings file (civicrm.settings.php without civicrm.settings.extra.php).

@herbdool Those fallbacks sounds about right. (Note: In those particular cases, we want to know if the admin has explicitly overridden the values, so you'd check for $GLOBALS['civicrm_paths']['civicrm.files'] and $GLOBALS['civicrm_paths']['civicrm.private'].)

@herbdool
Copy link
Contributor Author

@totten I first tried using Civi::paths()->hasVariable('civicrm.files') in baseFilePath() but got fatal error which makes sense since I think it creates a self-referential, infinite loop. But accessing $GLOBALS directly seems fine.

totten added a commit to totten/civicrm-core that referenced this pull request Jun 28, 2017
This partially reverts changes made under CRM-19303:
 * The changes to `getDefaultFileStorage()` are preserved
 * The changes to `getCiviSourceStorage()` are reverted
 * To avoid conflicts, `parseDrupalSiteName()` has been split into the older `parseDrupalSiteNameFromRoot()` and the newer `parseDrupalSiteNameFromRequest()`.
   (IMHO, `parseDrupalSiteNameFromRequest()` is more correct, but `parseDrupalSiteNameFromRoot()` is more compatible with existing deployments).

As discussed in [PR civicrm#10513](civicrm#10513 (comment)),
we should stop trying so hard to autodetect these things.  We'll treat the
auto-detection stuff as legacy, and should shift toward a simpler/flatter arrangement
which encourages paths/URLs to be stored in `civicrm.settings.php`.
@herbdool
Copy link
Contributor Author

jenkins, test this please

@herbdool
Copy link
Contributor Author

@totten I've updated the PR. Not sure if this is the approach you prefer now.

kcristiano added a commit to kcristiano/civicrm-core that referenced this pull request Sep 27, 2017
…w for civicrn.settings.extra.php for WP. Create template for civicrm.settings.extra.php. Update install/civicrm.php for all needed params for different common install scenarios. Fix linting issues in civicrm.php.

CRM 16421 CRM 17633 - update CRM_Utils_System_WordPress to allow for common install configurations

CRM-16421 - Convert constants to `$civicrm_paths`

Following up on the discussion from
[civicrm#10513](civicrm#10513), this converts
the proposed constants `CIVICRM_UF_WP_BASEURL` and `CIVICRM_UF_ADMINURL` to
variables in the `Paths` system.

A few benefits:
 * Reduces code duplication between `civicrm.php` and `WordPress.php`.
 * Can construct sub-paths with prettier notation (`Civi::paths()->getUrl('[wp.frontend]/foo.txt')`)
 * Has options to output relative or absolute URLs
 * Can expand on `Paths` to provide more inspection/validation

Notes:

 * `CIVICRM_UF_WP_BASEURL` => `wp.frontend.base`
 * `CIVICRM_UF_ADMINURL` => `wp.backend.base`

----------------------------------------
* CRM-16421: Work to get CiviCRM for WordPress in WordPress' official Repository
  https://issues.civicrm.org/jira/browse/CRM-16421

CRM-16421 - Assimilate `civicrm.settings.extra.php` into `civicrm.settings.php`

----------------------------------------
* CRM-16421: Work to get CiviCRM for WordPress in WordPress' official Repository
  https://issues.civicrm.org/jira/browse/CRM-16421

CRM-17633 merge current master changes to civicrm.settings.php.template
seamuslee001 pushed a commit to seamuslee001/civicrm-core that referenced this pull request Oct 12, 2017
…w for civicrn.settings.extra.php for WP. Create template for civicrm.settings.extra.php. Update install/civicrm.php for all needed params for different common install scenarios. Fix linting issues in civicrm.php.

CRM 16421 CRM 17633 - update CRM_Utils_System_WordPress to allow for common install configurations

CRM-16421 - Convert constants to `$civicrm_paths`

Following up on the discussion from
[civicrm#10513](civicrm#10513), this converts
the proposed constants `CIVICRM_UF_WP_BASEURL` and `CIVICRM_UF_ADMINURL` to
variables in the `Paths` system.

A few benefits:
 * Reduces code duplication between `civicrm.php` and `WordPress.php`.
 * Can construct sub-paths with prettier notation (`Civi::paths()->getUrl('[wp.frontend]/foo.txt')`)
 * Has options to output relative or absolute URLs
 * Can expand on `Paths` to provide more inspection/validation

Notes:

 * `CIVICRM_UF_WP_BASEURL` => `wp.frontend.base`
 * `CIVICRM_UF_ADMINURL` => `wp.backend.base`

----------------------------------------
* CRM-16421: Work to get CiviCRM for WordPress in WordPress' official Repository
  https://issues.civicrm.org/jira/browse/CRM-16421

CRM-16421 - Assimilate `civicrm.settings.extra.php` into `civicrm.settings.php`

----------------------------------------
* CRM-16421: Work to get CiviCRM for WordPress in WordPress' official Repository
  https://issues.civicrm.org/jira/browse/CRM-16421

CRM-17633 merge current master changes to civicrm.settings.php.template
@dsnopek
Copy link
Contributor

dsnopek commented Feb 27, 2018

It seems like conversation has died down here. @herbdool and I were discussing issues with the compiled Smarty templates on Pantheon on the Pantheon power users mailing list, and this issue came up.

The main problem on Pantheon (or any infrastructure with multiple webservers where the user uploaded files are stored in a network file system) is that writing the compiled templates during a web request is super slow. A Pantheon engineer suggested storing the compiled templates in the "per instance temporary storage" (basically, a local file directory only for that site on that web server - it's NOT a network file system) which could certainly make sense.

The problem with doing that is setting CIVICRM_TEMPLATE_COMPILEDIR won't just move the compiled Smarty templates there, it'll also move some other, totally unrelated stuff. :-(

While that isn't the main goal of this issue, the patch here appears to solve that problem: it would allow setting pubilc and private files directories that are totally separate from the compiled templates directory!

@totten What is need to re-invigorate this PR?

@dsnopek
Copy link
Contributor

dsnopek commented Feb 27, 2018

Ah, sorry, I see now that discussion is going on here:

https://lab.civicrm.org/dev/cloud-native/issues/1

:-)

@herbdool
Copy link
Contributor Author

This might still be the better place to continue the discussion. So I'll take Tim's comments on Gitlab and update the PR.

@herbdool
Copy link
Contributor Author

herbdool commented Mar 6, 2018

I took another look at this to see how we can move this forward. To me it seems we can't change too much in the first phase.

@totten mentions here https://lab.civicrm.org/dev/cloud-native/issues/1#note_3124 that part of the change requires changing the boot order. But it's not clear to me if that's backwards-compatible. Tim suggests putting Civi\Core\Paths before CRM_Core_Config_Runtime, but the former relies on CRM_Core_Config::singleton()->userSystem->getDefaultFileStorage(), CRM_Core_Config::singleton()->userSystem->getCiviSourceStorage(), CRM_Core_Config::singleton()->userSystem->cmsRootPath(). Do they need to be implemented differently then?

@totten
Copy link
Member

totten commented Mar 7, 2018

@herbdool -Civi\Core\Paths does lazy lookups -- e.g. register('civicrm.files', function(...)...) defines a callback. The callback doesn't execute until someone actually requests [civicrm.files]. In theory, calling new \Civi\Core\Paths() is pretty simple/lightweight/self-contained.

Of course, we're in the middle of a pretty complex process -- the only way to see if it works is to try. But it seems like a plausible line of investigation.

@eileenmcnaughton
Copy link
Contributor

I'm going to close this PR as it seems like it's a conversation not something that can be reviewed. I think the conversation can continue while the PR is closed, and reduce workload on PR triagers. Once there is agreement on approach the PR can be re-opened for formatl review

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants