Errors during running aiida workflows #61

Closed
Angmar1989 opened this issue Apr 7, 2021 · 37 comments · Fixed by #60

Comments

@Angmar1989

Angmar1989 commented Apr 7, 2021

I am still learning AiiDA. Following the instructions in the AiiDA tutorials, I tried to run a workflow that calculates the Si structure step by step. The code and errors are as follows:

max@qmobile:~$ workon aiida
(aiida) max@qmobile:~$ verdi shell
Python 3.7.10 (default, Feb 20 2021, 21:17:23) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.21.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: PwBandsWorkChain = WorkflowFactory('quantumespresso.pw.bands')

In [2]: code = load_code(1)
   ...: structure = load_node(176)

In [3]: builder = PwBandsWorkChain.get_builder_from_protocol(code=code, structure=structure)
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~/codes/aiida-core/aiida/orm/implementation/entities.py in get_extra(self, key)
    322         try:
--> 323             return self._dbmodel.extras[key]
    324         except KeyError as exception:

KeyError: '_default_stringency'

The above exception was the direct cause of the following exception:

AttributeError                            Traceback (most recent call last)
~/.virtualenvs/aiida/lib/python3.7/site-packages/aiida_pseudo/groups/mixins/cutoffs.py in get_default_stringency(self)
     85         try:
---> 86             return self.get_extra(self._key_default_stringency)
     87         except AttributeError as exception:

~/codes/aiida-core/aiida/orm/entities.py in get_extra(self, key, default)
    503         try:
--> 504             extra = self.backend_entity.get_extra(key)
    505         except AttributeError:

~/codes/aiida-core/aiida/orm/implementation/entities.py in get_extra(self, key)
    324         except KeyError as exception:
--> 325             raise AttributeError(f'extra `{exception}` does not exist') from exception
    326 

AttributeError: extra `'_default_stringency'` does not exist

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
<ipython-input-3-38edd14c83bc> in <module>
----> 1 builder = PwBandsWorkChain.get_builder_from_protocol(code=code, structure=structure)

~/.virtualenvs/aiida/lib/python3.7/site-packages/aiida_quantumespresso/workflows/pw/bands.py in get_builder_from_protocol(cls, code, structure, protocol, overrides, **kwargs)
    129         builder = cls.get_builder()
    130 
--> 131         relax = PwRelaxWorkChain.get_builder_from_protocol(*args, overrides=inputs.get('relax', None), **kwargs)
    132         scf = PwBaseWorkChain.get_builder_from_protocol(*args, overrides=inputs.get('scf', None), **kwargs)
    133         bands = PwBaseWorkChain.get_builder_from_protocol(*args, overrides=inputs.get('bands', None), **kwargs)

~/.virtualenvs/aiida/lib/python3.7/site-packages/aiida_quantumespresso/workflows/pw/relax.py in get_builder_from_protocol(cls, code, structure, protocol, overrides, relax_type, **kwargs)
    117         builder = cls.get_builder()
    118 
--> 119         base = PwBaseWorkChain.get_builder_from_protocol(*args, overrides=inputs.get('base', None), **kwargs)
    120         base_final_scf = PwBaseWorkChain.get_builder_from_protocol(
    121             *args, overrides=inputs.get('base_final_scf', None), **kwargs

~/.virtualenvs/aiida/lib/python3.7/site-packages/aiida_quantumespresso/workflows/pw/base.py in get_builder_from_protocol(cls, code, structure, protocol, overrides, electronic_type, spin_type, initial_magnetic_moments, **_)
    177             ) from exception
    178 
--> 179         cutoff_wfc, cutoff_rho = pseudo_family.get_recommended_cutoffs(structure=structure)
    180 
    181         parameters = inputs['pw']['parameters']

~/.virtualenvs/aiida/lib/python3.7/site-packages/aiida_pseudo/groups/mixins/cutoffs.py in get_recommended_cutoffs(self, elements, structure, stringency)
    160         cutoffs_wfc = []
    161         cutoffs_rho = []
--> 162         cutoffs = self.get_cutoffs(stringency=stringency)
    163 
    164         for element in symbols:

~/.virtualenvs/aiida/lib/python3.7/site-packages/aiida_pseudo/groups/mixins/cutoffs.py in get_cutoffs(self, stringency)
    127         :raises ValueError: if the requested stringency is not defined for this family.
    128         """
--> 129         stringency = stringency or self.get_default_stringency()
    130 
    131         try:

~/.virtualenvs/aiida/lib/python3.7/site-packages/aiida_pseudo/groups/mixins/cutoffs.py in get_default_stringency(self)
     86             return self.get_extra(self._key_default_stringency)
     87         except AttributeError as exception:
---> 88             raise ValueError('no default stringency has been defined.') from exception
     89 
     90     def get_cutoff_stringencies(self) -> tuple:

ValueError: no default stringency has been defined.

I didn't see any command for defining or setting the stringency in the tutorial. The error message suggests it has something to do with cutoffs.py? I remember that when I downloaded the pseudo library, I also downloaded a cutoffs table as an extra JSON file (although I don't know how to use it at the moment). Are they connected? My learning is stuck again, and I am totally at a loss right now.

@Angmar1989
Author

It seems that the KeyError: '_default_stringency' raised in entities.py is the main problem, but what is that?

@Angmar1989
Author

@sphuber, I really need your help!

@sphuber
Contributor

sphuber commented Apr 7, 2021

The problem is that you probably installed the pseudo family with the aiida-pseudo install family command instead of the automated aiida-pseudo install sssp. This gives you a working pseudopotential family, but it won't have the recommended cutoffs installed; only the automated installer does this. The QE protocol input generators rely on these recommended cutoffs to determine the required cutoffs automatically, which is why it is failing.

I already foresaw this problem and have added a feature that allows defining the cutoffs manually as well. There is an open PR for this: #55
Once this gets merged and released you can also create working families manually.
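
For readers puzzled by the three stacked exceptions in the traceback: each layer catches the error from the layer below and re-raises a different one with "raise ... from", so the KeyError is the root cause and the ValueError is only its outermost wrapper. Below is a minimal, self-contained sketch of that pattern. FakeFamily and describe_chain are hypothetical stand-ins written for illustration; they are not the aiida-core or aiida-pseudo classes, and the 'normal' value is only an example.

```python
class FakeFamily:
    """Hypothetical stand-in for a pseudo family group (not the aiida-pseudo class)."""

    _key_default_stringency = '_default_stringency'

    def __init__(self, extras=None):
        # Extras play the role of the metadata stored on the group.
        self.extras = dict(extras or {})

    def get_extra(self, key):
        try:
            return self.extras[key]
        except KeyError as exception:
            # Same re-raise pattern as seen in the traceback.
            raise AttributeError(f'extra `{exception}` does not exist') from exception

    def get_default_stringency(self):
        try:
            return self.get_extra(self._key_default_stringency)
        except AttributeError as exception:
            raise ValueError('no default stringency has been defined.') from exception


def describe_chain(error):
    """Return exception type names from the outermost error down to the root cause."""
    names = []
    while error is not None:
        names.append(type(error).__name__)
        error = error.__cause__
    return names


# A family without the extra behaves like the manually installed one in this issue.
manual = FakeFamily()
try:
    manual.get_default_stringency()
except ValueError as error:
    chain = describe_chain(error)

# A family with the extra set behaves like one that has cutoffs defined.
automated = FakeFamily({'_default_stringency': 'normal'})
```

Reading the chain from the bottom up shows that the only thing missing is a single extra on the family, which is what setting the recommended cutoffs provides.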

@Angmar1989
Author

Thanks very much, I'll try it right away.

@Angmar1989
Author

Angmar1989 commented Apr 8, 2021

I read #55 as well as #52 and still have no idea how to add a JSON file to make this work. Should the family with the added cutoffs have a suggested label, as the SSSP family did? And how can the code published on this website be used after downloading it? Which folder should I place it in? So many questions. Maybe I should try the automated SSSP pseudo library install again.

@Angmar1989
Author

Angmar1989 commented Apr 8, 2021

And I already installed sssp, so how to uninstall it?

@sphuber
Contributor

sphuber commented Apr 8, 2021

You can simply delete the corresponding group, for example:

verdi group delete SSSP/1.1/PBE/efficiency

@Angmar1989
Author

(aiida) max@qmobile:~$ aiida-pseudo install sssp -t
Info: downloading selected pseudo potentials archive... [OK]
Info: downloading selected pseudo potentials metadata... [OK]
Info: unpacking archive and parsing pseudos... [FAILED]
Critical: Compressed file ended before the end-of-stream marker was reached
Traceback (most recent call last):
  File "/home/max/.virtualenvs/aiida/lib/python3.7/site-packages/aiida_pseudo/cli/utils.py", line 23, in attempt
    yield
  File "/home/max/.virtualenvs/aiida/lib/python3.7/site-packages/aiida_pseudo/cli/install.py", line 128, in cmd_install_sssp
    family = create_family_from_archive(SsspFamily, label, filepath_archive)
  File "/home/max/.virtualenvs/aiida/lib/python3.7/site-packages/aiida_pseudo/cli/utils.py", line 55, in create_family_from_archive
    shutil.unpack_archive(filepath_archive, dirpath, format=fmt)
  File "/usr/lib/python3.7/shutil.py", line 1002, in unpack_archive
    func(filename, extract_dir, **kwargs)
  File "/usr/lib/python3.7/shutil.py", line 937, in _unpack_tarfile
    tarobj.extractall(extract_dir)
  File "/usr/lib/python3.7/tarfile.py", line 2002, in extractall
    numeric_owner=numeric_owner)
  File "/usr/lib/python3.7/tarfile.py", line 2044, in extract
    numeric_owner=numeric_owner)
  File "/usr/lib/python3.7/tarfile.py", line 2114, in _extract_member
    self.makefile(tarinfo, targetpath)
  File "/usr/lib/python3.7/tarfile.py", line 2163, in makefile
    copyfileobj(source, target, tarinfo.size, ReadError, bufsize)
  File "/usr/lib/python3.7/tarfile.py", line 247, in copyfileobj
    buf = src.read(bufsize)
  File "/usr/lib/python3.7/gzip.py", line 287, in read
    return self._buffer.read(size)
  File "/usr/lib/python3.7/_compression.py", line 68, in readinto
    data = self.read(len(byte_view))
  File "/usr/lib/python3.7/gzip.py", line 493, in read
    raise EOFError("Compressed file ended before the "
EOFError: Compressed file ended before the end-of-stream marker was reached

Strange, the automated install still fails, but this time the error report is much longer than before.
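
The EOFError at the bottom of this traceback is what Python's gzip module raises when a compressed stream stops before its end marker, which is typical of a truncated file. The following is a small sketch, using only the standard library, of how one might check a gzip payload for truncation before unpacking it. The helper name gzip_is_complete and the sample payload are made up for illustration and are not part of aiida-pseudo.

```python
import gzip
import io


def gzip_is_complete(data: bytes) -> bool:
    """Return True if ``data`` can be decompressed up to the end-of-stream marker."""
    try:
        with gzip.GzipFile(fileobj=io.BytesIO(data)) as handle:
            # Read in chunks until the stream is exhausted.
            while handle.read(64 * 1024):
                pass
    except EOFError:
        # Raised when the compressed stream ends prematurely.
        return False
    return True


# Build a valid gzip payload, then cut it in half to simulate a partial download.
payload = gzip.compress(b'pseudopotential data ' * 1000)
truncated = payload[: len(payload) // 2]
```

Here gzip_is_complete(payload) succeeds while gzip_is_complete(truncated) does not, which reproduces the failure mode seen above without touching the network.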

@sphuber
Contributor

sphuber commented Apr 8, 2021

Do you have enough free space on your filesystem? The command fails while it is unpacking the downloaded archive, so the download was fine. I am not sure why the unpacking is failing, but it might be that you just don't have enough space on your disk.

@Angmar1989
Author

Free space is sufficient for extraction; 64 GB is assigned to the system.

@Angmar1989
Author

Since I could install the downloaded compressed Pseudo library file, the empty space is not a problem.

@Angmar1989
Author

I would still prefer to install the pseudo library offline, so that I can freely choose my own pseudo library.

@sphuber
Contributor

sphuber commented Apr 8, 2021

Since I could install the downloaded compressed Pseudo library file, the empty space is not a problem.

That is not necessarily true. Let's say the archive is 1 GB and you have 1.5 GB available. After the download you only have 0.5 GB remaining, so there is not enough space to extract. What does df -h print?
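
The same check can be sketched in Python with the standard library's shutil.disk_usage. The helper name has_room_to_unpack and the expansion factor of 3 are assumptions chosen for illustration only; how much a given archive grows when unpacked depends on its contents.

```python
import shutil


def has_room_to_unpack(path, archive_size, expansion_factor=3.0):
    """Return True if the filesystem holding ``path`` can plausibly hold the unpacked archive.

    ``archive_size`` is the size of the already downloaded archive in bytes.
    ``expansion_factor`` is a rough guess of how much larger the unpacked
    content is than the compressed archive (an assumption, not a measured value).
    """
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= archive_size * expansion_factor
```

For example, has_room_to_unpack('.', 38 * 1024**2) asks whether the current directory's filesystem has room to unpack an archive of about 38 MB.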

@Angmar1989
Author

[Image "Untitled": screenshot posted in reply to the df -h question]

@sphuber
Contributor

sphuber commented Apr 8, 2021

Ok, it seems there is indeed enough space. Then maybe the downloaded archive is corrupted and that is why unzipping fails. I should maybe add a check to the download that verifies the md5 checksum of the archive to make sure it downloaded correctly. I have no idea why the download step would be successful for a corrupted file though.
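
A checksum verification of the kind mentioned here could look roughly like the sketch below. It uses only hashlib from the standard library. The function names md5_of_file and archive_matches are hypothetical, and where the expected checksum would come from is left open; this is not how aiida-pseudo implements it.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, 'rb') as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def archive_matches(path, expected_md5):
    """Return True if the file at ``path`` has the expected MD5 checksum (case-insensitive)."""
    return md5_of_file(path) == expected_md5.lower()
```

Running such a check right after the download would turn a confusing unpacking error into an explicit report that the archive is incomplete or corrupted.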

@Angmar1989
Author

It seems that unpacking starts before the file is fully downloaded: the pseudo library archive is about 38 MB, but unpacking starts when only 34 or 36 MB have been downloaded. Strange.

@Angmar1989
Author

I wonder how I could add the changes you presented on this website to my Quantum Mobile? Do I just download the code and place it in the HOME folder?

@sphuber
Contributor

sphuber commented Apr 8, 2021

You just clone this repo, check out whatever branch you need, and install it, e.g.:

git clone https://github.com/aiidateam/aiida-pseudo
cd aiida-pseudo
git checkout fix/050/cli-set-recommended-cutoffs
pip install -e .
reentry scan

Note that the PR might not necessarily provide all the functionality you need as there are multiple interdependent open PRs.

@Angmar1989
Author

Thanks for your patient explanation, but is there any way to supply my downloaded cutoffs JSON file to the AiiDA workflow to solve the problem described in this issue, on the basis of the AiiDA installation I have now, with SSSP installed offline?

@sphuber
Contributor

sphuber commented Apr 8, 2021

With the fix/050/cli-set-recommended-cutoffs branch checked out and installed as I explained above, you should be able to run:

aiida-pseudo family cutoffs set SSSP/1.1/PBE/efficiency SSSP_1.1_PBE_efficiency.json

@Angmar1989
Author

[Image: Untitled]
What should the argument of -s be?

@sphuber
Contributor

sphuber commented Apr 8, 2021

Use -s normal

@Angmar1989
Author

[Image: Untitled 1]
Does that mean I should delete everything in the JSON file except cutoff and rho_cutoff? The JSON file is as follows:
[Image: Untitled 2]

@sphuber
Contributor

sphuber commented Apr 8, 2021

You need to transform it into the format:

{
    "Ag": {
        "cutoff_wfc": 50.0,
        "cutoff_rho": 200.0
    }
}

That is, rename cutoff to cutoff_wfc and add cutoff_rho, which is cutoff * dual.
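
The transformation described above can be sketched in a few lines of Python. This assumes that each element entry in the downloaded file carries a cutoff and a dual key, as the wording in this thread suggests; the function name convert_cutoffs, the sample filename Ag.upf and the sample values are illustrative only.

```python
import json


def convert_cutoffs(raw):
    """Convert entries with ``cutoff`` and ``dual`` into ``cutoff_wfc`` and ``cutoff_rho``.

    Any other keys in the input entries (filenames, checksums, ...) are dropped.
    """
    converted = {}
    for element, values in raw.items():
        cutoff_wfc = float(values['cutoff'])
        converted[element] = {
            'cutoff_wfc': cutoff_wfc,
            # The charge density cutoff is the wavefunction cutoff times the dual.
            'cutoff_rho': cutoff_wfc * float(values['dual']),
        }
    return converted


# Hypothetical input entry in the assumed source format.
raw = {'Ag': {'filename': 'Ag.upf', 'cutoff': 50.0, 'dual': 4.0}}
print(json.dumps(convert_cutoffs(raw), indent=4))
```

To convert a real file, one would load it with json.load, pass the result through convert_cutoffs, and write it back with json.dump before handing it to the cutoffs set command.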

@Angmar1989
Author

It works! Thank you very much for your incredible patience. The workflow can now use the cutoffs table I downloaded! I'm sure I'll be back with more questions as I continue to learn AiiDA calculations. Many thanks!
[Image: Untitled 3]
[Image: Untitled 4]

@sphuber sphuber self-assigned this Apr 9, 2021
@sphuber
Contributor

sphuber commented Apr 9, 2021

I have added additional information to the README in PR #60, so with that I will close this issue.

@Angmar1989
Author

When I continued, a new problem emerged: in the tutorial the job state is "running", not "queued". Why is that?
[Image: Untitled 5]
[Image: Untitled 6]
[Image: Untitled 7]

@Angmar1989
Author

I hope it has nothing to do with the pseudopotentials and cutoffs table that we installed in this unusual way...

@sphuber
Contributor

sphuber commented Apr 9, 2021

This should have nothing to do with the pseudopotentials. This should be specific to the aiida-quantumespresso plugin. However, looking at the images, there doesn't seem to be a problem with the plugin. The workchain failed because the PwCalculation<479> failed with exit code 305, which is an unrecoverable error. Both the stdout and the output xml of the job were invalid. You can have a look at them with:

verdi calcjob outputcat 479 | less

That should tell you why the calculation failed. If you expect this is due to a problem with the plugin, please open an issue on that repository: https://github.com/aiidateam/aiida-quantumespresso/issues/new

@Angmar1989
Author

The result looks like this:
[Image: Untitled 8]

@Angmar1989
Author

Does "2 processors" mean 2 CPUs or 2 cores?

@Angmar1989
Author

I think I should reset the ibrav value

@Angmar1989
Author

No, that's not all...
[Image: Untitled 9]

@sphuber
Contributor

sphuber commented Apr 9, 2021

As I said, this is a problem with QE, so please open an issue on aiida-quantumespresso. Report the inputs with which you ran the PwBandsWorkChain and show the error "too many bands, or too few plane waves". It would be good to also include the input file generated by the plugin. You can get it with verdi calcjob inputcat 479.

@Angmar1989
Author

OK, I'll open an issue there, thanks.

@Angmar1989
Author

I have one more question: all codes default to using 2 CPUs. How do I change the number of CPUs?

@Angmar1989
Author

I mean changing the number of CPUs for the code (for example, the pw code).
