Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check return value and improve error handling on certificate manager #35092

Merged

Conversation

Messj1
Copy link
Contributor

@Messj1 Messj1 commented Nov 11, 2022

close #33487

With S3 primary storage there was a problem with getting the CA bundle from the storage without having the CA bundle for the connection which causes that the CertificateManager was throwing an Error. (handled in bffa67c)
This commit improves the handling in CertificateManager and log unexpected behaviors.

this issue is releated to #31605 so it should also be backported to Nextcloud v22

@szaimen szaimen added bug 3. to review Waiting for reviews labels Nov 12, 2022
@szaimen szaimen added this to the Nextcloud 26 milestone Nov 12, 2022
@szaimen
Copy link
Contributor

szaimen commented Nov 12, 2022

/backport to stable25

@szaimen
Copy link
Contributor

szaimen commented Nov 12, 2022

/backport to stable24

@szaimen
Copy link
Contributor

szaimen commented Nov 12, 2022

/backport to stable23

@kesselb
Copy link
Contributor

kesselb commented Nov 30, 2022

Hi and thanks for your pull request 👍

I would prefer not to use Throwable here.
getLocalFile may return false and your suggestion to handle this case properly is valid.

Would you mind updating the PR to throw an exception (one from https://www.php.net/manual/en/spl.exceptions.php should do it) when bundlePath is false.

@Messj1
Copy link
Contributor Author

Messj1 commented Dec 2, 2022

@kesselb I don't see the reasons to make those changes since they don't improve or fix something. Maybe unofficial guideline? The developer manual itself has contains no chapter how to handle exception only how to throw them.

There are 2 change requests:

  1. Exception or Throwable

I would prefer not to use Throwable here. getLocalFile may return false and your suggestion to handle this case properly is valid.

Reason?

I think it is more robust if we left the Throwable since it doesn't matter if there is an Exception or an Error. There is a fallback which work in both cases.
The goal should be to give the application a chance to finish the work since there is a fallback. This was the root cause which lead to the issue #33487


2) Throwing an Exception

Would you mind updating the PR to throw an exception (one from https://www.php.net/manual/en/spl.exceptions.php should do it) when bundlePath is false.

Your Idea is to throw an Exception which gets catched in the same function?

				if ($bundlePath === false) {
+					throw(new \RuntimeException("Unable to get certificate bundle '" . $certificateBundle . "'. Fallback to default ca-bundle.crt"));
-					$this->logger->warning("Unable to get certificate bundle '" . $certificateBundle . "'. Fallback to default ca-bundle.crt");
-					return \OC::$SERVERROOT . '/resources/config/ca-bundle.crt';
				}
				$this->bundlePath = $bundlePath;
			}
			return $this->bundlePath;
		} catch (\Throwable $e) {
			$this->logger->error("Error occurred during fetch certificate bundle. Fallback to default ca-bundle.crt", ['exception' => $e]);
			return \OC::$SERVERROOT . '/resources/config/ca-bundle.crt';
		}

Reason?

The idea was to handle the "exception" directly since it is a known return value. IMHO this it is not directly an error since the source of this exception is unknown and it could be something simple like a poor connection.
So the difference is that waning is used instead of an error. The error should occur in the underlying function.

https://www.php-fig.org/psr/psr-3/

error
Runtime errors that do not require immediate action but should typically be logged and monitored.
warning
Exceptional occurrences that are not errors.
Example: Use of deprecated APIs, poor use of an API, undesirable things that are not necessarily wrong.

@kesselb
Copy link
Contributor

kesselb commented Dec 3, 2022

I think it is more robust if we left the Throwable since it doesn't matter if there is an Exception or an Error. There is a fallback which work in both cases.

https://stackoverflow.com/questions/6083248/is-it-a-bad-practice-to-catch-throwable is a good read.

You need to be as specific as possible. Otherwise unforeseen bugs might creep away this way.
Besides, Throwable covers Error as well and that's usually no point of return. You don't want to catch/handle that, you want your program to die immediately so that you can fix it properly.

The actual bug, that we are not handling getLocalFile properly, would be much harder to spot with catch throwable.

Your Idea is to throw an Exception which gets catched in the same function?

I find it easier to read when the error handling is in one place, especially for the same error handling.
But that's not a hill I want to die on.

@Messj1
Copy link
Contributor Author

Messj1 commented Dec 3, 2022

https://stackoverflow.com/questions/6083248/is-it-a-bad-practice-to-catch-throwable is a good read.

You need to be as specific as possible. Otherwise unforeseen bugs might creep away this way.
Besides, Throwable covers Error as well and that's usually no point of return. You don't want to catch/handle that, you want your program to die immediately so that you can fix it properly.

First of all. It's an opinion to let the script die. Then the end user see an error. => But this behavior seems not to be user friendly.
IMHO since there is a legal fallback, the important thing is to inform the administrator with a hint in the log file. (it is the main reason to have a log file ;)

The actual bug, that we are not handling getLocalFile properly, would be much harder to spot with catch throwable.

Yes, the bug was occurred cause of incorrect handling.
And also no, the issue was created cause of the missing error/fallback handling.

👍 Will change it.
But to be clear, there is no hidden information at all (since it get logged) and the use of Throwable is legal to use since it was developed for such purposes.

Your Idea is to throw an Exception which gets catched in the same function?

I find it easier to read when the error handling is in one place, especially for the same error handling. But that's not a hill I want to die on.

Thought the same as I started to write the code. Then I asked myself: Is this an error since it is legal to have false as return value?

@kesselb Simple question: Is this an error or warning?

  • Warning: Code stay as is
  • Error: return points can be merged (throw error)

@Messj1
Copy link
Contributor Author

Messj1 commented Dec 3, 2022

There is another unhandled case with null as return value from getLocalFile.

View.php

	public function getLocalFile($path) {
		...
		if (Filesystem::isValidPath($parent) and $storage) {
			return $storage->getLocalFile($internalPath);
		} else {
			return null;
		}
	}

@kesselb Should this lead to fallback? Currently the function returns null.

So the new if condition would be something like this:

+				if ($bundlePath === null || $bundlePath === false) {
-				if ($bundlePath === false) {

@Messj1
Copy link
Contributor Author

Messj1 commented Dec 4, 2022

One thing I would change is the logic with has Certificates. Currently if there are no certificates it caches the "system" ca bundle. And then it tries to write it in the storage. Is there any reason for that behavior since the files are then the same.

The try .. catch statement could also be changed inside the if block since it isn't needed to check a variable. And so we don't need it if the file is cached.

public function getAbsoluteBundlePath(): string {
-	try {
	if ($this->bundlePath === null) {
+		try {
-			if (!$this->hasCertificates()) {
-				$this->bundlePath = \OC::$SERVERROOT . '/resources/config/ca-bundle.crt';
-			}
+			if ($this->hasCertificates()) {

			if ($this->needsRebundling()) {
				$this->createCertificateBundle();
			}

			$certificateBundle = $this->getCertificateBundle();
			$bundlePath = $this->view->getLocalFile($certificateBundle);
			if ($bundlePath === null) {
				throw(new \RuntimeException("Unable to get certificate bundle cause of invalid path or nonexistent storage '" . $certificateBundle . "'."));
			} else if ($bundlePath === false) {
				throw(new \RuntimeException("Unable to get certificate bundle '" . $certificateBundle . "'."));
			}
			$this->bundlePath = $bundlePath;
+			} else {
+				$this->bundlePath = \OC::$SERVERROOT . '/resources/config/ca-bundle.crt';
+			}
-		}
-		return $this->bundlePath;
		} catch (\Exception $e) {
			$this->logger->error("Error occurred during fetch certificate bundle. Fallback to default ca-bundle.crt", ['exception' => $e]);
			return \OC::$SERVERROOT . '/resources/config/ca-bundle.crt';
		}
+	}
+	return $this->bundlePath;
}

Another Task is to handle the following error since it isn't honest to return an invalid fallback.

	public function createCertificateBundle(): void {
		...
		$defaultCertificates = file_get_contents(\OC::$SERVERROOT . '/resources/config/ca-bundle.crt');
		if (strlen($defaultCertificates) < 1024) { // sanity check to verify that we have some content for our bundle
			// log as exception so we have a stacktrace
			$e = new \Exception('Shipped ca-bundle is empty, refusing to create certificate bundle');
			$this->logger->error($e->getMessage(), ['exception' => $e]);
			return;
		}

Last but not least there are plenty of incomplete phpdoc for Example in the View::getLocalFile

-	 * @return string
+	 * @return string|null|false
+	 * @throws InvalidPathException
+	 * @throws NotFoundException if no mount for path existing
+	 * @throws HintException
+	 * @throws ServerNotAvailableException
+	 * @throws \Exception if no user home storage got configured
+	 * @throws \Exception if mount provider name exceeds the limit of 128 characters
+	 * @throws \Exception if the result from storage wrapper was invalid
+	 * @throws \Exception in general if storage got problem with plenty of concrete exceptions
+	 * @throws \Exception if the root storage could not be initialized

But I think those task aren't fit into this pull request.?

@kesselb
Copy link
Contributor

kesselb commented Dec 5, 2022

Thanks for your feedback 👍

Some context why we ship a ca-bundle copy: #32963

Documentation for occ security:certificates command: https://docs.nextcloud.com/server/latest/admin_manual/configuration_server/occ_command.html#security

Simple question: Is this an error or warning?

I don't have an opinion here.

One thing I would change is the logic with has Certificates. Currently if there are no certificates it caches the "system" ca bundle. And then it tries to write it in the storage. Is there any reason for that behavior since the files are then the same.

It's possible to import certificates with occ security:certificates:import.
Uploaded certificates are stored in $datadirectory/files_external/uploads.
Uploaded certificates and default ca-bundle are merged to $datadirectory/files_external/rootcerts.crt.

if ($this->needsRebundling()) {
$this->createCertificateBundle();
}

I believe this block is mostly relevant for an update situation.
Updater replaced the server files and we have a new /resources/config/ca-bundle.crt bundle.
Upgrade (occ upgrade) will initialize the CertificateManager and trigger a rebuild (cf. #33487, stack trace in the first post).

if (!$this->bundlePath) {
if (!$this->hasCertificates()) {
$this->bundlePath = \OC::$SERVERROOT . '/resources/config/ca-bundle.crt';
}
if ($this->needsRebundling()) {
$this->createCertificateBundle();
}
$this->bundlePath = $this->view->getLocalFile($this->getCertificateBundle());
}
return $this->bundlePath;

You are right, this looks weird.
hasCertificates() is true when a certificate was imported with occ security:certificates:import.
It looks good to me to use the default path when hasCertificates() is false.
hasCertificates() is expensive and we should cache the result.
I guess it's a bug that we run needsRebundling without user certificates.

getCertificateBundle and getAbsoluteBundlePath can return a different bundle.
Luckily getCertificateBundle() is not called externally and we should make the function private (in a different PR because that's an API change and not backportable).

Another Task is to handle the following error since it isn't honest to return an invalid fallback.

True. #2910 (comment) sounds unusual.
I tend to ignore the related code block. There is not a single bug report for the error message on GitHub.

But I think those task aren't fit into this pull request.?

Yes. Should be done in another one.

@Messj1
Copy link
Contributor Author

Messj1 commented Dec 5, 2022

@kesselb So it is like it now is. An Error 😄

The CI pipeline errors are not mine:

  • Still problem with cloning (this time the submodul)
  • Performance test is hanging on some crypto (openssl_seal) problem and uncleaned testing base (locking exceptions)

So there are following tasks left:

  1. Open issue to change the behavior as commented above.

I guess it's a bug that we run needsRebundling without user certificates.

I think the idea was to regenerate certificate bundle if it is missing. But is it nonsense since the getCertificateBundle() is not delivering the real path and has also no fallback in it. So in some race condition it will fail 😱

Will open new Issue with that.

  1. Add deprecated warning for external call of getCertificateBundle()

getCertificateBundle and getAbsoluteBundlePath can return a different bundle. Luckily getCertificateBundle() is not called externally and we should make the function private (in a different PR because that's an API change and not backportable).

I think we should at least backport an deprecated warning. So user will be informed if they use this API. Any suggestions?

  1. Open issue to extend missing php docs

@Messj1
Copy link
Contributor Author

Messj1 commented Dec 6, 2022

@kesselb I forgot to sign all my commits. They are not verified in github. Should I sign them with rebase and use force push to correct it?

@kesselb
Copy link
Contributor

kesselb commented Dec 6, 2022

I forgot to sign all my commits. They are not verified in github. Should I sign them with rebase and use force push to correct it?

Go ahead if you want to sign them but it's not necessary for us.
We require Signed-off-by: for every commit message, and that's already done.

@kesselb
Copy link
Contributor

kesselb commented Dec 7, 2022

Hi, please rebase onto master to pull the latest changes to fix some CI failures.

@Messj1
Copy link
Contributor Author

Messj1 commented Jan 4, 2023

@pserwylo What went wrong on performance testing? https://github.com/nextcloud/server/actions/runs/3644023086/jobs/6158316871

If works local on my computer without warnings and error if I try to reproduce the error. 😒

@kesselb
Copy link
Contributor

kesselb commented Jan 4, 2023

@Messj1 Performance testing is not a required check yet. It's fine to ignore it for now.

@kesselb
Copy link
Contributor

kesselb commented Jan 4, 2023

/rebase

@blizzz blizzz mentioned this pull request Feb 1, 2023
@skjnldsv skjnldsv mentioned this pull request Feb 23, 2023
@blizzz blizzz mentioned this pull request Mar 7, 2023
@blizzz blizzz modified the milestones: Nextcloud 26, Nextcloud 27 Mar 9, 2023
@Messj1 Messj1 force-pushed the bugfix/type-error-cert-manager-cache-path branch from f2269a0 to 32603da Compare April 4, 2023 20:03
@Messj1 Messj1 requested a review from kesselb April 4, 2023 20:05
With S3 primary storage there was a problem with getting the CA bundle from the storage without having the CA bundle for the connection which causes that the CertificateManager was throwing an Error.
This commit improves the handling in CertificateManager and log unexpected behaviors.

Signed-off-by: Jan Messer <jan@mtec-studios.ch>
…ndler (only exceptions are catch)

Signed-off-by: Jan Messer <jan@mtec-studios.ch>
@Messj1 Messj1 force-pushed the bugfix/type-error-cert-manager-cache-path branch from 32603da to 647c65a Compare April 6, 2023 21:05
@Messj1
Copy link
Contributor Author

Messj1 commented Apr 6, 2023

rebased and solved merge conflict.

Copy link
Member

@skjnldsv skjnldsv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code make sense

@skjnldsv skjnldsv changed the title [BUGFIX] check return value and improve error handling Check return value and improve error handling on certificate manager Apr 7, 2023
@skjnldsv skjnldsv mentioned this pull request May 3, 2023
@skjnldsv skjnldsv merged commit 46459ae into nextcloud:master May 4, 2023
@welcome
Copy link

welcome bot commented May 4, 2023

Thanks for your first pull request and welcome to the community! Feel free to keep them coming! If you are looking for issues to tackle then have a look at this selection: https://github.com/nextcloud/server/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22

@solracsf
Copy link
Member

solracsf commented May 4, 2023

/backport to stable26

@mschilli87
Copy link

mschilli87 commented May 11, 2023

Just wanted to report that I am still hitting an error while upgrading from 26.0.0 to 26.0.1:

Nextcloud or one of the apps require upgrade - only a limited number of commands are available
You may use your browser or the occ upgrade command to do the upgrade
Setting log level to debug
Turned on maintenance mode
Updating database schema
Updated database
An unhandled exception has been thrown:
TypeError: fwrite(): Argument #1 ($stream) must be of type resource, bool given in /usr/share/webapps/nextcloud/lib/private/Security/CertificateManager.php:161
Stack trace:
#0 /usr/share/webapps/nextcloud/lib/private/Security/CertificateManager.php(161): fwrite()
#1 /usr/share/webapps/nextcloud/lib/private/Security/CertificateManager.php(247): OC\Security\CertificateManager->createCertificateBundle()
#2 /usr/share/webapps/nextcloud/lib/private/Http/Client/Client.php(129): OC\Security\CertificateManager->getAbsoluteBundlePath()
#3 /usr/share/webapps/nextcloud/lib/private/Http/Client/Client.php(76): OC\Http\Client\Client->getCertBundle()
#4 /usr/share/webapps/nextcloud/lib/private/Http/Client/Client.php(226): OC\Http\Client\Client->buildRequestOptions()
#5 /usr/share/webapps/nextcloud/lib/private/App/AppStore/Fetcher/Fetcher.php(120): OC\Http\Client\Client->get()
#6 /usr/share/webapps/nextcloud/lib/private/App/AppStore/Fetcher/AppFetcher.php(86): OC\App\AppStore\Fetcher\Fetcher->fetch()
#7 /usr/share/webapps/nextcloud/lib/private/App/AppStore/Fetcher/Fetcher.php(190): OC\App\AppStore\Fetcher\AppFetcher->fetch()
#8 /usr/share/webapps/nextcloud/lib/private/App/AppStore/Fetcher/AppFetcher.php(187): OC\App\AppStore\Fetcher\Fetcher->get()
#9 /usr/share/webapps/nextcloud/lib/private/Installer.php(421): OC\App\AppStore\Fetcher\AppFetcher->get()
#10 /usr/share/webapps/nextcloud/lib/private/Updater.php(420): OC\Installer->isUpdateAvailable()
#11 /usr/share/webapps/nextcloud/lib/private/Updater.php(281): OC\Updater->upgradeAppStoreApps()
#12 /usr/share/webapps/nextcloud/lib/private/Updater.php(140): OC\Updater->doUpgrade()
#13 /usr/share/webapps/nextcloud/core/Command/Upgrade.php(225): OC\Updater->upgrade()
#14 /usr/share/webapps/nextcloud/3rdparty/symfony/console/Command/Command.php(255): OC\Core\Command\Upgrade->execute()
#15 /usr/share/webapps/nextcloud/3rdparty/symfony/console/Application.php(1009): Symfony\Component\Console\Command\Command->run()
#16 /usr/share/webapps/nextcloud/3rdparty/symfony/console/Application.php(273): Symfony\Component\Console\Application->doRunCommand()
#17 /usr/share/webapps/nextcloud/3rdparty/symfony/console/Application.php(149): Symfony\Component\Console\Application->doRun()
#18 /usr/share/webapps/nextcloud/lib/private/Console/Application.php(215): Symfony\Component\Console\Application->run()
#19 /usr/share/webapps/nextcloud/console.php(100): OC\Console\Application->run()
#20 /usr/share/webapps/nextcloud/occ(11): require_once('...')
#21 {main}%

Applying the workaround from #33487 (comment) fixed allowed the update to complete.


edit: I guess that is exxpected until #38091 finally gets merged. 😅

@mschilli87
Copy link

Just to follow up: The 26.0.1 -> 26.0.2 update was the first one since 24.0.5 to complete successfully without the need for any workarounds/hacks. Thanks to everybody contributing to fixing this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: Nextcloud 24 upgrade fail - bool given in /var/www/html/lib/private/Security/CertificateManager.php
7 participants