-
Notifications
You must be signed in to change notification settings - Fork 191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Daemonized daemon runners through circus #1217
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
giovannipizzi
previously approved these changes
Mar 5, 2018
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks very good to me!
@giovannipizzi thanks. I will discuss this with Martin tomorrow. I just found a small problem with some more testing, so I will fix that first |
sphuber
force-pushed
the
fix_1055_daemon_process
branch
8 times, most recently
from
March 6, 2018 18:19
a7cb9af
to
5622e48
Compare
The AiiDA logger was still being hijacked by one of the old daemon tasks, in this case the workflow_stepper that is called by the new daemon, but this task was calling configure_logging with the old daemon log file 'celery.log'. By removing the logging configuration call and doing this only in the start_daemon call and passing the same log file as the circus watcher, the aiida logger now correctly writes to the profile specific logfile that the circus watcher also writes too
With the new circus daemon, every profile can have its very own daemon, that runs completely isolated and independent of all the other profile daemons. The original concept of a 'process' in the profile configuration, which allowed one to set a different profile for the 'daemon' and the 'verdi' process, no longer makes sense. Since every profile has its own daemon, a single profile is fully self contained, for both the verdi and daemon usage.
The functionality related to the circus daemon was too specific to be put in the basic profile configuration class and so a separate container is created
Now that each profile runs its own daemon, the concept of a daemon user is no longer relevant. Remove all the old code that dealt with this
Celery is no longer necessary now that the daemon is daemonized by the circus library
There was a problem with the config.json configuration file, which led to an exception in one of the profile_daemon.get_endpoint() calls in 'verdi daemon start'. Because the daemonize call was done before this, the exception was silently swallowed because the output was already redirected to the background. By moving the creating of the arbiter configuration before the daemonize call, the exception will now be displayed.
The default 'verdi daemon restart' will keep the circus daemon process alive and simply restart the workers. This will thus keep the number of workers that were running at the invocation of the command. The '--reset' flag will force the circus daemon process to also be shutdown and the whole thing to be restarted. This behavior emulates the calling of 'verdi daemon stop' followed by 'verdi daemon start' manually. After restart a new circus daemon process will be running with a single worker
The '--wait' option now properly changes the information that is printed to the screen
This allows us to call one single command in a subprocess from both the 'verdi daemon start' and 'verdi daemon restart --reset' commands. It also allows us to print a message that we're attempting to start the daemon and as soon as the subprocess is spawned we can try to call the client to verify whether it started and notify the user. This hopefully accomplishes that if the daemon fails to start, it does not do so completly silently
The recommended behavior for stopping or restarting the daemon should be to wait for it to be completed. The old default was not to wait and in order to wait the '--wait' flag had to be set. Here, we invert the logic and wait by default and if the user does not want to wait , they can set the '--no-wait' flag. This is the safer default behavior
On March 5 2018, tornado v5.0 was released, which would get installed automatically, however, this breaks another dependency, circus. As of this writing circus has not yet fixed this in a release or locked down the tornado dependency and so we have to lock it. Without this circus will not be able to start the daemon, which is rather crucial
sphuber
force-pushed
the
fix_1055_daemon_process
branch
from
March 6, 2018 18:36
adbcfe9
to
5927768
Compare
This was referenced Mar 7, 2018
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Fixes #1055
The initial work was done by @DropD in PR #1213. In this PR I built on that work to mostly rip out the old logic and remove anything that was made redundant.
This PR replaces
celery
withcircus
as the tool to daemonize theDaemonRunner
process, which has replaced the old polling based daemon. The new setup allows to easily run an individual and completely independent daemon for each profile. Callingverdi daemon start
will call theaiida.daemon.runner.start_daemon
function, which will start aDaemonRunner
, and the whole thing will be daemonized by acircus
process. The messages from the aiida logger will be logged toaiida-{profile_name}
in the log folder, and the messages from circus will go tocircus-{profile_name}
. Due to these changes a few concepts became redundant and have been removed:verdi
anddaemon
. This allowed for a different default profile to be set and used. This way a user could let the daemon use a different profile then the one they used for verdi commands. Again this concept has been made redundant and therefore removed.What I have tested so far, is that two different profiles that have two different databases and repositories can successfully have their respective daemons running in parallel and submitting for example a workchain under a specific profile, will be picked up by the correct daemon as evidenced by changes in the profiles database, repository and log file.
One potential problem that is not addressed here is what happens or should happen when two different launch their daemons in parallel and both point to the same database, repository or both. Probably this is not a good idea, and we should make the usage of a db/repo pair unique to a profile. Or at the very least, only allow one of them to run a daemon at one time. This will be addressed in a separate PR.
Things to do: