Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate and resolve Python 3.8 METplus interdependencies #2260

Closed
7 of 20 tasks
JohnHalleyGotway opened this issue Sep 14, 2022 · 8 comments
Closed
7 of 20 tasks

Investigate and resolve Python 3.8 METplus interdependencies #2260

JohnHalleyGotway opened this issue Sep 14, 2022 · 8 comments
Assignees
Labels
component: build process Build process issue component: testing Software testing issue MET: Python Embedding priority: high High Priority requestor: METplus Team METplus Development Team type: task An actionable item of work
Milestone

Comments

@JohnHalleyGotway
Copy link
Collaborator

JohnHalleyGotway commented Sep 14, 2022

Describe the Task

During the MET-11.0.0-beta3 development cycle, issue #2196 updated the version of python used in the MET Dockerfiles from 3.6 to 3.8.6. This was intended as an improvement to have MET use the same python version as what's used for automated testing in the METplus GHA. However this change had unintended consequences and caused some METplus use cases that leverage Python embedding to fail. A significant amount of time and debugging did not reveal an obvious solution. Prior to the met-11.0.0-beta3 development release, we rolled back to python version 3.6 by switching to base image dtcenter/met-base:v1.0.

This issue is to further investigate and hopefully resolve the downstream METplus Python embedding use case failures. If that is successful, update the MET python version to 3.8 in the following spots:

  • In Dockerfile and Dockerfile.copy, switch to dtcenter/met-base:v1.1.
  • In development.docker, switch libraries from 3.6 to 3.8.
  • In jobs/set_job_controls.sh, switch to met_base_tag=v1.1.
  • In build_docker_and_trigger_metplus.yml, switch to MET_BASE_TAG: v1.1.
    Also consider whether we could define the MET base image version in fewer spots.

Here are several things to consider:

  1. First, replicate the behavior. Replicating the METplus failures via Docker is rather involved. But the environment issue can be seen rather easily on seneca:
# Compile MET using python 3.8
# From top-level MET directory run a python embedding example and confirm that it works
bin/plot_data_plane PYTHON_NUMPY plot.ps 'name="scripts/python/read_ascii_numpy.py data/python/fcst.txt FCST";'
# Prepend to the path
export PATH=/home/met_test/.conda/envs/metplus_base.v5/bin:${PATH}
# Rerun the example
bin/plot_data_plane PYTHON_NUMPY plot.ps 'name="scripts/python/read_ascii_numpy.py data/python/fcst.txt FCST";'
# Observe the failure
ModuleNotFoundError: No module named 'numpy'
WARNING: 
WARNING: python_dataplane() -> an error occurred importing module "scripts/python/read_ascii_numpy"

Placing a bad (i.e. without required numpy and netcdf4 packages) python version in the path causes the python embedding to fail. We'd assumed that the python embedding version was defined at compilation time and were surprised that the bad python in the path triggered this error.

  1. Perhaps this situation has existed for a long time but was just being masked? Python 3.6 links to libraries that include 3.6 in their names while 3.8 has libraries with 3.8 in their names. So a bad METplus python 3.8 instance and it's 3.8 libraries wouldn't interfere with MET linking to libraries named 3.6.

  2. Recommend looking at the actual MET code (src/libcode/vx_data2d_python/python_dataplane.cc) to understand how python is being invoked. In particular, the "sys.path" lines should be examined. Recommend printing out the contents of the system path that the GlobalPython instance is using. See if/how it differs between the successful and unsuccessful runs in 1.

Time Estimate

Unknown

Sub-Issues

Consider breaking the task down into sub-issues.
Likely no sub-issues needed.

Relevant Deadlines

List relevant project deadlines here or state NONE.

Funding Source

2792541

Define the Metadata

Assignee

  • Select engineer(s) or no engineer required
  • Select scientist(s) or no scientist required

Labels

  • Select component(s)
  • Select priority
  • Select requestor(s)

Projects and Milestone

  • Select Repository and/or Organization level Project(s) or add alert: NEED PROJECT ASSIGNMENT label
  • Select Milestone as the next official version or Future Versions

Define Related Issue(s)

Consider the impact to the other METplus components.

Task Checklist

See the METplus Workflow for details.

  • Complete the issue definition above, including the Time Estimate and Funding Source.
  • Fork this repository or create a branch of develop.
    Branch name: feature_<Issue Number>_<Description>
  • Complete the development and test your changes.
  • Add/update log messages for easier debugging.
  • Add/update unit tests.
  • Add/update documentation.
  • Push local changes to GitHub.
  • Submit a pull request to merge into develop.
    Pull request: feature <Issue Number> <Description>
  • Define the pull request metadata, as permissions allow.
    Select: Reviewer(s) and Linked issues
    Select: Repository level development cycle Project for the next official release
    Select: Milestone as the next official version
  • Iterate until the reviewer(s) accept and merge your changes.
  • Delete your fork or branch.
  • Close this issue.
@hankenstein2
Copy link
Contributor

I have a similar but slightly different situation. I have MET compile with a 3.9 version of Python installed via conda that has all the required packages installed along with it. If I then run python embedding without using PYTHON_EXE, the user environment will use a generic, system installed python 3.9 that does not contain required packages.

My solution, which worked for MET but messed up the users environment for other python tools, was to use a LD_LIBRARY_PATH pointing to the conda python environment.

@JohnHalleyGotway JohnHalleyGotway moved this from Backlog to To Do in MET-11.0.0-beta4 (11/02/22) Sep 21, 2022
@JohnHalleyGotway
Copy link
Collaborator Author

Recommend that @jprestop take the lead on this issue with @JohnHalleyGotway and @georgemccabe providing support.

@jprestop jprestop removed the alert: NEED ACCOUNT KEY Need to assign an account key to this issue label Oct 4, 2022
@jprestop
Copy link
Collaborator

I performed the steps listed in the issue and was able to recreate the problem. @JohnHalleyGotway, thank you for the detailed, clear instructions and the information on what the problem is. Tagging @georgemccabe in case you're interested in the update here too.

Setting the environment variable PYTHONHOME resolves this problem on seneca (see below), but does not resolve this problem in Docker. I'm still working on that. Putting notes below from testing on seneca in case I need to reference it later.

PATH is set such that /home/met_test/.conda/envs/metplus_base.v5/bin is before the version of Python that MET was compiled with (/usr/local/met-python3/bin):

(metplus_dev) jpresto@seneca:/d1/personal/jpresto/MET/git/MET-feature_2260_py3.8$ echo $PATH
/d1/personal/jpresto/MET/git/MET-feature_2260_py3.8/bin:/home/met_test/.conda/envs/metplus_base.v5/bin:/usr/local/netcdf-4.7.0/gcc-8.3.0/bin:/home/met_test/.conda/envs/metplus_dev/bin:/usr/local/anaconda3-20210517/condabin:/home/mccabe/METplus/ush:/usr/local/python3/bin:/usr/local/met-python3/bin:/usr/local/intel-psxe-2018u3/bin:/usr/bin:/bin:/etc:/rap/bin:/usr/local/bin:/home/jpresto/scripts/python:/home/jpresto/scripts/perl:/home/jpresto/scripts/perl/misc_scripts:/usr/local/R/bin

PYTHONHOME is not set:

(metplus_dev) jpresto@seneca:/d1/personal/jpresto/MET/git/MET-feature_2260_py3.8$ echo $PYTHONHOME

Run the code and receive the expected error:

(metplus_dev) jpresto@seneca:/d1/personal/jpresto/MET/git/MET-feature_2260_py3.8$ bin/plot_data_plane -v 4 PYTHON_NUMPY plot.ps 'name="scripts/python/read_ascii_numpy.py data/python/fcst.txt FCST";' 
DEBUG 1: Start plot_data_plane by jpresto(7414) at 2022-10-11 18:07:07Z  cmd: bin/plot_data_plane -v 4 PYTHON_NUMPY plot.ps name="scripts/python/read_ascii_numpy.py data/python/fcst.txt FCST"; 
DEBUG 1: Opening data file: PYTHON_NUMPY
DEBUG 4: Met2dDataFileFactory::new_met_2d_data_file() -> created new Met2dDataFile object of type "FileType_Python_Numpy".
DEBUG 4: VarInfoFactory::new_var_info() -> created new VarInfo object of type "FileType_Python_Numpy".
DEBUG 3: Calling /usr/local/met-python3/bin/python3 to run user's python script (scripts/python/read_ascii_numpy).
DEBUG 4: Writing temporary Python dataplane file:
DEBUG 4: 	/usr/local/met-python3/bin/python3 /d1/personal/jpresto/MET/git/MET-feature_2260_py3.8/share/met/wrappers/write_tmp_dataplane.py /tmp/tmp_met_nc_23158_0 scripts/python/read_ascii_numpy data/python/fcst.txt FCST
Python Script:	'/d1/personal/jpresto/MET/git/MET-feature_2260_py3.8/share/met/wrappers/write_tmp_dataplane.py'
User Command:	'scripts/python/read_ascii_numpy data/python/fcst.txt FCST'
Temporary File:	'/tmp/tmp_met_nc_23158_0'
Python Script:	'scripts/python/read_ascii_numpy'
Input File:	'data/python/fcst.txt'
Data Name:	'FCST'
Data Shape:	(129, 185)
Data Type:	dtype('float64')
Attributes:	{'valid': '20050807_120000', 'init': '20050807_000000', 'lead': '120000', 'accum': '120000', 'name': 'FCST', 'long_name': 'FCST_word', 'level': 'Surface', 'units': 'None', 'grid': {'type': 'Lambert Conformal', 'hemisphere': 'N', 'name': 'FooGrid', 'scale_lat_1': 25.0, 'scale_lat_2': 25.0, 'lat_pin': 12.19, 'lon_pin': -135.459, 'x_pin': 0.0, 'y_pin': 0.0, 'lon_orient': -95.0, 'd_km': 40.635, 'r_km': 6371.2, 'nx': 185, 'ny': 129}}
DEBUG 4: Reading temporary Python dataplane file: /tmp/tmp_met_nc_23158_0
Traceback (most recent call last):
  File "/d1/personal/jpresto/MET/git/MET-feature_2260_py3.8/share/met/wrappers/read_tmp_dataplane.py", line 10, in <module>
    import numpy as np
ModuleNotFoundError: No module named 'numpy'
WARNING: 
WARNING: tmp_nc_dataplane() -> an error occurred importing module "read_tmp_dataplane"
WARNING: 
ERROR  : 
ERROR  : plot_data_plane -> trouble getting field "name="scripts/python/read_ascii_numpy.py data/python/fcst.txt FCST";" from file "PYTHON_NUMPY"
ERROR  : 

Set PYTHONHOME:

(metplus_dev) jpresto@seneca:/d1/personal/jpresto/MET/git/MET-feature_2260_py3.8$ export PYTHONHOME=/usr/local/met-python3
(metplus_dev) jpresto@seneca:/d1/personal/jpresto/MET/git/MET-feature_2260_py3.8$ echo $PYTHONHOME
/usr/local/met-python3

and rerun, getting a successful result:

(metplus_dev) jpresto@seneca:/d1/personal/jpresto/MET/git/MET-feature_2260_py3.8$ bin/plot_data_plane PYTHON_NUMPY plot.ps 'name="scripts/python/read_ascii_numpy.py data/python/fcst.txt FCST";' 
DEBUG 1: Start plot_data_plane by jpresto(7414) at 2022-10-11 18:15:55Z  cmd: bin/plot_data_plane PYTHON_NUMPY plot.ps name="scripts/python/read_ascii_numpy.py data/python/fcst.txt FCST"; 
DEBUG 1: Opening data file: PYTHON_NUMPY
Python Script:	'/d1/personal/jpresto/MET/git/MET-feature_2260_py3.8/share/met/wrappers/write_tmp_dataplane.py'
User Command:	'scripts/python/read_ascii_numpy data/python/fcst.txt FCST'
Temporary File:	'/tmp/tmp_met_nc_23836_0'
Python Script:	'scripts/python/read_ascii_numpy'
Input File:	'data/python/fcst.txt'
Data Name:	'FCST'
Data Shape:	(129, 185)
Data Type:	dtype('float64')
Attributes:	{'valid': '20050807_120000', 'init': '20050807_000000', 'lead': '120000', 'accum': '120000', 'name': 'FCST', 'long_name': 'FCST_word', 'level': 'Surface', 'units': 'None', 'grid': {'type': 'Lambert Conformal', 'hemisphere': 'N', 'name': 'FooGrid', 'scale_lat_1': 25.0, 'scale_lat_2': 25.0, 'lat_pin': 12.19, 'lon_pin': -135.459, 'x_pin': 0.0, 'y_pin': 0.0, 'lon_orient': -95.0, 'd_km': 40.635, 'r_km': 6371.2, 'nx': 185, 'ny': 129}}
DEBUG 1: Creating postscript file: plot.ps
DEBUG 1: Finish plot_data_plane by jpresto(7414) at 2022-10-11 18:15:55Z

Despite PYTHONHOME being set, the Python version in the PATH is still used first as expected:

(metplus_dev) jpresto@seneca:/d1/personal/jpresto/MET/git/MET-feature_2260_py3.8$ which python
/home/met_test/.conda/envs/metplus_base.v5/bin/python
(metplus_dev) jpresto@seneca:/d1/personal/jpresto/MET/git/MET-feature_2260_py3.8$ python
Python 3.8.6 | packaged by conda-forge | (default, Jan 25 2021, 23:21:18) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
Traceback (most recent call last):
...

As noted above, more work is needed to get this to work in Docker.

@georgemccabe
Copy link
Collaborator

@jprestop , finding a function call in either the Python C++ library or the relevant Python script called by MET to set the same setting that occurs by setting PYTHONHOME may be a good solution to ensure that this happens only in that environment. Setting PYTHONHOME in the environment may introduce other issues.

@jprestop jprestop moved this from To Do to In Progress in MET-11.0.0-beta4 (11/02/22) Oct 13, 2022
@JohnHalleyGotway JohnHalleyGotway removed their assignment Oct 13, 2022
@jprestop jprestop moved this from Backlog to In Progress in MET-11.0.0-beta5 (11/20/22) Oct 24, 2022
@JohnHalleyGotway JohnHalleyGotway added the alert: NEED CYCLE ASSIGNMENT Need to assign to a release development cycle label Nov 10, 2022
@jprestop
Copy link
Collaborator

I'm going through some GitHub Issues. @JohnHalleyGotway and @georgemccabe Now that we are using Python 3.10 and it is working, is this still an issue that needs to be worked on?

@georgemccabe
Copy link
Collaborator

If I understand this issue correctly, I believe this was resolved with PR #2407

@georgemccabe
Copy link
Collaborator

This may be a duplicate of #2388

@DanielAdriaansen
Copy link
Contributor

FYI @jprestop @JohnHalleyGotway:

@georgemccabe and myself reviewed this issue and #2388, which was closed via #2407. We agreed that those two items addressed the problem outlined in this issue, and that this issue can be closed.

@JohnHalleyGotway JohnHalleyGotway moved this from 📋 Backlog to ✅ Done in MET-11.1.0 Development Jun 15, 2023
@JohnHalleyGotway JohnHalleyGotway removed the alert: NEED CYCLE ASSIGNMENT Need to assign to a release development cycle label Oct 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component: build process Build process issue component: testing Software testing issue MET: Python Embedding priority: high High Priority requestor: METplus Team METplus Development Team type: task An actionable item of work
Projects
No open projects
Status: Done
Development

No branches or pull requests

5 participants