Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dropping of unknown configuration keys can lead to inconsistent config state #4523

Closed
greschd opened this issue Oct 28, 2020 · 9 comments
Closed
Milestone

Comments

@greschd
Copy link
Member

greschd commented Oct 28, 2020

Describe the bug

When an unknown key is encountered in the configuration, it is simply dropped instead of going through the config migrations. This can lead to an inconsistent configuration that is not recoverable without manually editing it.

Steps to reproduce

Steps to reproduce the behavior:

  1. pip install aiida-core==1.4.1
  2. Set up a new profile. The config will contain
    "CONFIG_VERSION": {
        "CURRENT": 4,
        "OLDEST_COMPATIBLE": 3
    }
    and something like
    "tmpfoo": {
        "PROFILE_UUID": "d925405ed3f14935b8fc04dba127ba18",
        "AIIDADB_ENGINE": "postgresql_psycopg2",
        "AIIDADB_BACKEND": "django",
        "AIIDADB_NAME": "tmpfoo",
        "AIIDADB_PORT": 5432,
        "AIIDADB_HOST": "localhost",
        "AIIDADB_USER": "tmpfoo",
        "AIIDADB_PASS": "1234",
        "broker_protocol": "amqp",
        "broker_username": "guest",
        "broker_password": "guest",
        "broker_host": "127.0.0.1",
        "broker_port": 5672,
        "broker_virtual_host": "",
        "AIIDADB_REPOSITORY_URI": "file:///home/a-dogres/.aiida/repository/tmpfoo",
        "options": {},
        "default_user_email": ""
    }
  3. pip install aiida-core==1.3.1
  4. Run a command, e.g. plain verdi. The config will still contain
    "CONFIG_VERSION": {
        "CURRENT": 4,
        "OLDEST_COMPATIBLE": 3
    }
    but now with
    tmpfoo": {
        "PROFILE_UUID": "d925405ed3f14935b8fc04dba127ba18",
        "AIIDADB_ENGINE": "postgresql_psycopg2",
        "AIIDADB_BACKEND": "django",
        "AIIDADB_NAME": "tmpfoo",
        "AIIDADB_PORT": 5432,
        "AIIDADB_HOST": "localhost",
        "AIIDADB_USER": "tmpfoo",
        "AIIDADB_PASS": "1234",
        "AIIDADB_REPOSITORY_URI": "file:///home/a-dogres/.aiida/repository/tmpfoo",
        "options": {},
        "default_user_email": ""
    }
    
  5. pip install aiida-core==1.4.1
  6. Try running something
    In [1]: from aiida.engine import workfunction
    
    In [2]: @workfunction
    ...: def foo():
    ...:     pass
    ...:
    
    In [3]: foo()
    ---------------------------------------------------------------------------
    KeyError                                  Traceback (most recent call last)
    <ipython-input-3-c19b6d9633cf> in <module>
    ----> 1 foo()
    
    ~/.virtualenvs/tmp-59f8558ca0453c0/lib/python3.7/site-packages/aiida/engine/processes/functions.py in decorated_function(*args, **kwargs)
        179         def decorated_function(*args, **kwargs):
        180             """This wrapper function is the actual function that is called."""
    --> 181             result, _ = run_get_node(*args, **kwargs)
        182             return result
        183
    
    ~/.virtualenvs/tmp-59f8558ca0453c0/lib/python3.7/site-packages/aiida/engine/processes/functions.py in run_get_node(*args, **kwargs)
        118             """
        119             manager = get_manager()
    --> 120             runner = manager.create_runner(with_persistence=False)
        121             inputs = process_class.create_inputs(*args, **kwargs)
        122
    
    ~/.virtualenvs/tmp-59f8558ca0453c0/lib/python3.7/site-packages/aiida/manage/manager.py in create_runner(self, with_persistence, **kwargs)
        283         if 'communicator' not in settings:
        284             # Only call get_communicator if we have to as it will lazily create
    --> 285             settings['communicator'] = self.get_communicator()
        286
        287         if with_persistence and 'persister' not in settings:
    
    ~/.virtualenvs/tmp-59f8558ca0453c0/lib/python3.7/site-packages/aiida/manage/manager.py in get_communicator(self)
        161         """
        162         if self._communicator is None:
    --> 163             self._communicator = self.create_communicator()
        164
        165         return self._communicator
    
    ~/.virtualenvs/tmp-59f8558ca0453c0/lib/python3.7/site-packages/aiida/manage/manager.py in create_communicator(self, task_prefetch_count, with_orm)
        183             task_prefetch_count = self.get_config().get_option('daemon.worker_process_slots', profile.name)
        184
    --> 185         url = profile.get_rmq_url()
        186         prefix = profile.rmq_prefix
        187
    
    ~/.virtualenvs/tmp-59f8558ca0453c0/lib/python3.7/site-packages/aiida/manage/configuration/profile.py in get_rmq_url(self)
        342         from aiida.manage.external.rmq import get_rmq_url
        343         return get_rmq_url(
    --> 344             protocol=self.broker_protocol,
        345             username=self.broker_username,
        346             password=self.broker_password,
    
    ~/.virtualenvs/tmp-59f8558ca0453c0/lib/python3.7/site-packages/aiida/manage/configuration/profile.py in broker_protocol(self)
        192     @property
        193     def broker_protocol(self):
    --> 194         return self._attributes[self.KEY_BROKER_PROTOCOL]
        195
        196     @broker_protocol.setter
    
    KeyError: 'broker_protocol'

Expected behavior

The unknown keys should either remain in place, or the CURRENT_VERSION of the config needs to be changed to match its content.

Some thoughts

The original intent of the CURRENT_VERSION and OLDEST_COMPATIBLE in the config is that an older version of AiiDA can happily run on a newer config version, as long as all the keys it expects are still there and unchanged. As such, the config doesn't need migrating, and once we go back to the newer AiiDA version no migration is necessary.
Since there were quite some changes to all that since way-back-when, I'm not sure this is still the case. Thinking of it, adding a new profile while on the old AiiDA version (or doing any other kind of change) is also bound to confuse the newer AiiDA version. Maybe then the "compatible versions" concepts needs to be dropped, but we should always make sure to go through migrations (in both directions)?

@csadorf
Copy link
Contributor

csadorf commented Nov 5, 2020

I encountered this bug in the context of AiiDAlab.

@greschd
Copy link
Member Author

greschd commented Nov 5, 2020

Tagging @sphuber for comment.

@sphuber
Copy link
Contributor

sphuber commented Nov 5, 2020

I think we first need to decide whether we want to support bidirectional migrations of the config. Because if we decide to do so, that will come with significant development costs and complications. I am also not sure that we can always provide fully compatible backwards migrations. Take for example the migration that added a profile UUID. This was generated randomly. If the user uses the profile at that stage, the RabbitMQ queues would be based on this and their active processes associated to it. If they then migrate backwards, the profile UUID is dropped and when they migrate forwards again, a different UUID is generated, and they will have lost all their active processes. There might be an ad-hoc solution to this, but it is just an example that providing this functionality in a bullet-proof way is far from trivial.

The alternative, of not guaranteeing backwards migration, seems like the more reasonable option to me. Especially, given that we also don't provide this on the database level. Once you migrate to a specific version of the database schema, there is no going back. Of course config and database do not change their versions in sync, so there would be scenarios possible where you can change versions of aiida-core with the schema changing and not the database, but I think this is an edge case. Ultimately, the people running into the problem presented in this issue are mostly developers that are switching between AiiDA versions in the same environment. Maybe it is acceptable to not have full bidirectional migrations for that case.

That being said, maybe we can make the life of the developer a bit easier. We could think to remove the logic that drops unknown keys from the config once it is parsed, and instead simply ignore them, optionally printing a warning. I will leave it up to you to decide which of these three scenarios are most developer-friendly.

@greschd
Copy link
Member Author

greschd commented Nov 5, 2020

I don't really understand why the unknown configuration keys need to be dropped here in the first place. The

"CONFIG_VERSION": {
    "CURRENT": 4,
    "OLDEST_COMPATIBLE": 3
}

means (or is supposed to mean) that the latest AiiDA version that created this config has config version 4, but AiiDA versions with config version 3 should also be able to run under this config (without needing to change it back).

By dropping the keys, the assumption behind this system is broken. If we want to keep the OLDEST_COMPATIBLE system, the keys shouldn't be dropped. Maybe the version needs to be per profile, because an older version of AiiDA can come along and create a new profile.
Or we can go the "simple" route of only ever allowing one config version to exist. Note that the "warning" code should already exist, it's what happens when the CURRENT_VERSION is lower than OLDEST_COMPATIBLE.

I agree that bi-directional migrations are probably not the right solution.

@greschd
Copy link
Member Author

greschd commented Nov 5, 2020

Another option: Only allow config changes to be made when CURRENT_CONFIG_VERSION (of the code) matches the CURRENT key in the config.
That would still allow older AiiDA versions to run, but they can no longer change the config after a version with newer (but still backwards-compatible) config has run.

@csadorf
Copy link
Contributor

csadorf commented Nov 5, 2020

In my case it is possible that I accidentally ran an older version of AiiDA or so on my instance and it reverted the config (dropped the keys), but did not change the config version number. Then I ran a newer version and I started to observe this error with no idea what was going on. As a user I would have much preferred that AiiDA had transparently explained to me that I'm running an old version of AiiDA against a newer config and this just can't work instead of silently corrupting my configuration.

@greschd
Copy link
Member Author

greschd commented Nov 5, 2020

Yeah, it's clear the status quo is a bug - it's less clear what the "right" way to fix it would be.

@chrisjsewell
Copy link
Member

A few things to note here:

  1. This problem is somewhat alleviated since 6d8ec0b, as now the config will not be overwritten if you simply run any command, only if you run a command that specifically modifies the config.
  2. As of ♻️ REFACTOR: configuration management API and CLI #4712, the config.json is validated on load (after migrations) against https://github.com/aiidateam/aiida-core/blob/develop/aiida/manage/configuration/schema/config-v5.schema.json, which should help to identify early any corruption in the config

I would also be in favour of ditching OLDEST_COMPATIBLE, which I think is overcomplicates things

@chrisjsewell chrisjsewell added this to the v2.0.0 milestone Apr 30, 2021
@sphuber sphuber modified the milestones: v2.0.0, Post 2.0 Sep 22, 2021
@sphuber
Copy link
Contributor

sphuber commented Mar 14, 2022

This is fixed for v2.0.0b1

@sphuber sphuber closed this as completed Mar 14, 2022
@sphuber sphuber modified the milestones: Post 2.0, v2.0.0 Apr 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants