Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cloudinit/importer.py: print a meaningful message if import fails #1170

Merged
merged 1 commit into from
Jan 12, 2022

Conversation

esposem
Copy link
Contributor

@esposem esposem commented Jan 7, 2022

Proposed Commit Message

cloudinit/importer.py: print a meaningful message if import fails

Sometimes an import might fail for different reasons: the string
is wrongly typed, or the module has a dependency that is not
installed in python.

We should print that there is an import error, otherwise it might be
really difficult to understand what is the root cause of this
issue. Currently, cloud-init just ignores the error and continues.
This can have fatal consequences, especially when used to pick
the datasource to use.

For example, we can have a missing/misspelled datasource file:
$ ls cloudinit/sources/DataSourceInexistent.py
ls: cannot access 'cloudinit/sources/DataSourceInexistent.py': No such file or directory
$ python
>>> __import__("cloudinit.sources.DataSourceInexistent")

and in this case cloud-init should also hint that the error is
"No module named 'cloudinit.sources.DataSourceInexistent'"

If instead the datasource exists but uses a dependency that
is not installed (for example netifaces for DataSourceVMware):
$ ls cloudinit/sources/DataSourceVMWare.py
cloudinit/sources/DataSourceVMWare.py
$ python
>>> __import__("cloudinit.sources.DataSourceVMware")

then cloud-init should hint that the error is
"No module named 'netifaces'"

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>

Checklist:

  • My code follows the process laid out in the documentation
  • I have updated or added any unit tests accordingly
  • I have updated or added any documentation accordingly

@esposem esposem force-pushed the import_fix branch 3 times, most recently from 6cdd047 to ece452c Compare January 7, 2022 11:57
@TheRealFalcon TheRealFalcon self-assigned this Jan 10, 2022
@TheRealFalcon
Copy link
Member

Hey @esposem , theoretically, I agree that it's better to log when there's a failure, but with this enabled, we get about 50 or so lines about failed import. Those failed imports aren't a problem, but all of the log messages about them would be misleading. It looks like in the past logging around this was intentionally silenced: 141caf7#diff-5ad5eef05a628e49ee49dfa5c2128663ea6dd587e5af03d44a230aee9f7bdf8cR48-R49

Is it possible to add additional logging around where the find_import is being called?

@esposem
Copy link
Contributor Author

esposem commented Jan 11, 2022

Hi @TheRealFalcon, yes I also was not too happy about the tests I had to modify because of this additional log, but I did not expect it would be so many honestly.

The problem is that list_sources in cloudinit/sources/__init.py__ calls the find_module function, that then returns an empty list. And we don't know why the list is empty: it might be that the attributes of the datasource seached do not match, that the import fails, that the name is wrong (or file is missing) and so on. Do you have a better idea on how to handle this?

I don't really know how to handle this better (Update: see below), but I agree that this might not be the optimal solution, even though I wonder why are there all these imported modules missing, mainly the ubuntu. If they are missing, they should not be imported maybe?

@esposem
Copy link
Contributor Author

esposem commented Jan 11, 2022

@TheRealFalcon See if this is better: add the log as optional, and print only in cloudinit/sources/__init__.py, which is where I would like to have the log reported 😄

@TheRealFalcon
Copy link
Member

@esposem , this still results in some false negatives as it can search for a datasource in multiple places.

Would something like this (against main) work?

diff --git a/cloudinit/sources/__init__.py b/cloudinit/sources/__init__.py
index 9083f3995..d55997bab 100644
--- a/cloudinit/sources/__init__.py
+++ b/cloudinit/sources/__init__.py
@@ -929,6 +929,11 @@ def list_sources(cfg_list, depends, pkg_list):
         m_locs, _looked_locs = importer.find_module(
             ds_name, pkg_list, ["get_datasource_list"]
         )
+        if not m_locs:
+            LOG.error(
+                f"Could not import {ds_name}. Does the DataSource exist and "
+                "is it importable?"
+            )
         for m_loc in m_locs:
             mod = importer.import_module(m_loc)
             lister = getattr(mod, "get_datasource_list")

@esposem
Copy link
Contributor Author

esposem commented Jan 12, 2022

Not exactly what I was hoping for, as the exact cause of the empty list is still not explained, but I guess there is nothing else we can do about it, isn't it?
I will change the PR with the suggested change.

Thank you!
Emanuele

@esposem esposem force-pushed the import_fix branch 2 times, most recently from 2cd48e6 to 68c002e Compare January 12, 2022 11:06
Sometimes an import might fail for different reasons: the string
is wrongly typed, or the module has a dependency that is not
installed in python.

We should print that there is an import error, otherwise it might be
really difficult to understand what is the root cause of this
issue. Currently, cloud-init just ignores the error and continues.
This can have fatal consequences, especially when used to pick
the datasource to use.

For example, we can have a missing/misspelled datasource file:
$ ls cloudinit/sources/DataSourceInexistent.py
ls: cannot access 'cloudinit/sources/DataSourceInexistent.py': No such file or directory
$ python
>>> __import__("cloudinit.sources.DataSourceInexistent")

and in this case cloud-init should also hint that the error is
"No module named 'cloudinit.sources.DataSourceInexistent'"

If instead the datasource exists but uses a dependency that
is not installed (for example netifaces for DataSourceVMware):
$ ls cloudinit/sources/DataSourceVMWare.py
cloudinit/sources/DataSourceVMWare.py
$ python
>>> __import__("cloudinit.sources.DataSourceVMware")

then cloud-init should hint that the error is
"No module named 'netifaces'"

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Copy link
Member

@TheRealFalcon TheRealFalcon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I'm open to refactoring the import logic so we can get some more granularity without additional failures, but that would obviously be a larger undertaking.

@TheRealFalcon TheRealFalcon merged commit fe745d4 into canonical:main Jan 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants