ValueError reading head file #801

Closed
johnrobertcraven opened this issue Feb 4, 2020 · 10 comments

Comments

@johnrobertcraven

Not sure if this is an error or a question for Stack Overflow. Please advise if SO is the better place and I'll remove this issue.

I'm getting the following error while trying to extract records from a head file.

flopy is installed in C:\Anaconda\lib\site-packages\flopy
Traceback (most recent call last):
File "read_mdl.py", line 20, in <module>
recs = hdobj.get_alldata()
File "C:\Anaconda\lib\site-packages\flopy\utils\datafile.py", line 476, in get_alldata
h = self.get_data(totim=totim, mflay=mflay)
File "C:\Anaconda\lib\site-packages\flopy\utils\datafile.py", line 438, in get_data
data = self._get_data_array(totim1)
File "C:\Anaconda\lib\site-packages\flopy\utils\datafile.py", line 343, in _get_data_array
data = np.empty((self.nlay, nrow, ncol), dtype=self.realtype)
ValueError: array is too big; arr.size * arr.dtype.itemsize is larger than the maximum possible size.

The file is fairly large, 1.73 GB. The model is 9 layers, ~300 rows, ~150 cols. I'm saving ~500 time steps.
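As a rough sanity check on those numbers, the expected size of a double-precision head file can be estimated from the grid dimensions. The sketch below assumes the usual MODFLOW head-file layout (one record per layer per saved time, each with a header of 2 integers, 2 reals, a 16-character label and 3 integers) and uses the rounded dimensions quoted above. `estimate_head_file_bytes` is a hypothetical helper, not part of flopy.

```python
def estimate_head_file_bytes(nlay, nrow, ncol, ntimes, double=True):
    """Estimate the size of a headerless ("true binary") MODFLOW head file.

    Assumption: one record per layer per saved time, each made of a fixed
    header (2 ints + 2 reals + 16 chars + 3 ints) followed by nrow * ncol reals.
    """
    real_bytes = 8 if double else 4
    header_bytes = 2 * 4 + 2 * real_bytes + 16 + 3 * 4  # 52 bytes in double precision
    record_bytes = header_bytes + nrow * ncol * real_bytes
    return nlay * ntimes * record_bytes


# Rounded values from the post: 9 layers, ~300 rows, ~150 columns, ~500 saved times.
approx = estimate_head_file_bytes(nlay=9, nrow=300, ncol=150, ntimes=500)
print(f"{approx / 1e9:.2f} GB")  # prints "1.62 GB"
```

Under these assumptions the estimate is of the same order as the reported 1.73 GB, which suggests the file size itself is plausible and that the suspicious part is the dimensions being read back.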

I'm calling the function here:

hdobj = flopy.utils.HeadFile(h_file,precision='double')
recs = hdobj.get_alldata()

Trying to look at the time steps in the file seems to return junk:

hdobj.get_times()
[0.0, 6.013469302926925e-154, 6.013470016999068e-154]

hdobj.nrow
160
hdobj.ncol
1145128264
hdobj.nlay
2147005518

If I exclude precision:

Error. Precision could not be determined for heads_cut_test.hds
Traceback (most recent call last):
File "", line 1, in
File "C:\Anaconda\lib\site-packages\flopy\utils\binaryfile.py", line 450, in __init__
raise Exception()
Exception

I'm using flopy v3.2.11, Python 3.7.1 (64-bit), and my machine has 32 GB RAM installed.

Thanks!

@jdhughes-usgs
Contributor

jdhughes-usgs commented Feb 4, 2020

What program wrote the head file? Only MODFLOW 6 is double precision by default. Also, you need to use a version of MODFLOW that creates true binary files (no Fortran headers). MODFLOW 6 and the Windows versions of MODFLOW from the USGS website are correctly configured. You can also get versions with the correct binary file format from:

https://github.com/MODFLOW-USGS/executables/releases
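To illustrate the difference between a true binary file and one with Fortran headers, here is a minimal sketch that inspects the first record header of a head file. It assumes the usual head-record header layout (kstp, kper, pertim, totim, a 16-character label, ncol, nrow, ilay), little-endian byte order, and that a compiler writing Fortran sequential unformatted files puts a 4-byte length marker before each record. The function names and the plausibility thresholds are hypothetical, and this is a heuristic, not how flopy detects the format.

```python
import struct


def parse_header(buf, offset=0, double=True):
    """Unpack one head-record header starting at `offset` (little-endian)."""
    real = "d" if double else "f"
    fmt = f"<2i2{real}16s3i"
    if len(buf) < offset + struct.calcsize(fmt):
        return None  # not enough bytes to hold a header at this offset
    kstp, kper, pertim, totim, text, ncol, nrow, ilay = struct.unpack_from(fmt, buf, offset)
    return {"kstp": kstp, "kper": kper, "pertim": pertim, "totim": totim,
            "text": text, "ncol": ncol, "nrow": nrow, "ilay": ilay}


def looks_plausible(header, max_dim=1_000_000):
    """Heuristic: label is printable ASCII and the dimensions are sane."""
    if header is None:
        return False
    text_ok = all(32 <= b <= 126 for b in header["text"]) and header["text"].strip() != b""
    dims_ok = all(1 <= header[key] <= max_dim for key in ("ncol", "nrow", "ilay"))
    return text_ok and dims_ok


def guess_layout(buf, double=True):
    """Return a label for the layout that yields a plausible first header."""
    if looks_plausible(parse_header(buf, 0, double)):
        return "true binary"
    if looks_plausible(parse_header(buf, 4, double)):
        return "record markers"
    return None


# Usage (the path is a placeholder):
# with open("heads_cut_test.hds", "rb") as f:
#     print(guess_layout(f.read(64)))
```

If the second layout is the one that parses cleanly, the file was most likely written with record markers, which flopy's reader does not expect.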

Also, it probably isn't ideal to use get_alldata() for your head file since it loads all of the head data into a 4D array (size(times), nlay, nrow, ncol). I would iterate over get_times() and use get_data(totim=time).
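A minimal sketch of that per-time iteration, using only the `get_times()` and `get_data(totim=...)` methods mentioned above. `iter_heads` is a hypothetical helper name; it holds one (nlay, nrow, ncol) array in memory at a time instead of the full 4D array.

```python
def iter_heads(hdobj):
    """Yield (totim, heads) for each saved time, one array at a time.

    `hdobj` is expected to behave like flopy.utils.HeadFile: get_times()
    returns the saved simulation times and get_data(totim=...) returns the
    head array for one of them.
    """
    for totim in hdobj.get_times():
        yield totim, hdobj.get_data(totim=totim)


# Usage (requires flopy and a readable head file; the path is a placeholder):
# import flopy
# hdobj = flopy.utils.HeadFile("heads_cut_test.hds", precision="double")
# for totim, heads in iter_heads(hdobj):
#     print(totim, heads.shape, heads.max())
```

Processing each array inside the loop (for example, extracting a few cells or a summary statistic) keeps peak memory at roughly one time step's worth of data.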

If you could, it would be nice to add this as a question on Stack Overflow once we resolve it.

@johnrobertcraven
Author

I'm using the mflgr_double.exe that originated here (in the bin.zip)

https://water.usgs.gov/GIS/metadata/usgswrd/XML/sir2019-5052.xml
https://pubs.er.usgs.gov/publication/sir20195052

I'm running the 2050y_lgr model using the batch file that was provided. In the parent model I modified the output control to save specific sp/ts.

Are you familiar with this model/exe?

Thanks for the note re: get_alldata(). I've used this in the past without any performance issues.

Thank you for your help.

@jdhughes-usgs
Contributor

I am not familiar with that specific executable, but I suspect the problem is a result of the binary file header format type. Currently there isn't a double-precision version of mflgr in the Windows executables zipfile on GitHub. There are double-precision Linux and OSX versions of mflgr on GitHub.

If you need that particular version of mflgr you will have to compile it for yourself if you want it to work with flopy.
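One quick check that appears consistent with the header-format suspicion: the `hdobj.ncol` value reported in the original post, 1145128264, is exactly the ASCII string "HEAD" when its four bytes are read as little-endian characters. The sketch below only reinterprets that integer. The conclusion drawn from it (that the reader is offset by 4 bytes and is picking up the end of the record's text label) is an inference, not something confirmed in this thread.

```python
import struct

# Value of hdobj.ncol reported in the original post.
ncol_reported = 1145128264

# Reinterpret the 4-byte little-endian integer as raw bytes.
raw = struct.pack("<i", ncol_reported)
print(raw)  # b'HEAD'
```

Assuming the usual header order (label, then ncol, nrow, ilay), a 4-byte shift would also explain why `hdobj.nrow` came back as 160, the column count that appears in the later array shape (2147066436, 312, 160).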

@johnrobertcraven
Author

Thanks for taking a look.

I stopped the model so that the hds file would be smaller to test whether or not the file size was the issue. The partial run hds file is less than half a gig.

It looks like nrow and ncol are correct; however, a huge number is being read for nlay.

See below:

recs = hdobj.get_alldata()
Traceback (most recent call last):
File "", line 1, in
File "C:\Anaconda\lib\site-packages\flopy\utils\datafile.py", line 476, in get_alldata
h = self.get_data(totim=totim, mflay=mflay)
File "C:\Anaconda\lib\site-packages\flopy\utils\datafile.py", line 438, in get_data
data = self._get_data_array(totim1)
File "C:\Anaconda\lib\site-packages\flopy\utils\datafile.py", line 343, in _get_data_array
data = np.empty((self.nlay, nrow, ncol), dtype=self.realtype)
MemoryError: Unable to allocate array with shape (2147066436, 312, 160) and data type float64

Does that confirm your suspicion re: binary file header format type?

Thanks,

John

@jdhughes-usgs
Contributor

Probably. Please try the executable in the attached zip file.
mflgrdbl.zip

@johnrobertcraven
Author

The exe you provided results in the following:

MODFLOW-LGR2
U.S. GEOLOGICAL SURVEY MODULAR FINITE-DIFFERENCE GROUND-WATER FLOW MODEL
Version 2.0.0 06/25/2013

RUNNING MODFLOW WITH LGR
NGRIDS = 2
Using NAME file: ../2050y_parent_hydros/tran_projmedDmedS.nam
Run start date and time (yyyy/mm/dd hh:mm:ss): 2020/02/04 14:14:49

Using NAME file: ../2050y_child/child4_projmedDmedS.nam
forrtl: severe (157): Program Exception - access violation
Image PC Routine Line Source

mflgrdbl.exe 00007FF6D55915FB GWF2MNW27AD 1868 gwf2mnw27.f
mflgrdbl.exe 00007FF6D550A4D3 MAIN__ 236 mflgr.f
mflgrdbl.exe 00007FF6D5863322 Unknown Unknown Unknown
mflgrdbl.exe 00007FF6D586EB0C Unknown Unknown Unknown
KERNEL32.DLL 00007FFA50937BD4 Unknown Unknown Unknown
ntdll.dll 00007FFA522ECED1 Unknown Unknown Unknown

I recompiled the modified source code for the USGS model (found in the above links).
The version I recompiled ran. When I stopped the model and tried to read the hds file I got the following:

File "C:\Anaconda\lib\site-packages\flopy\utils\datafile.py", line 476, in get_alldata
h = self.get_data(totim=totim, mflay=mflay)
File "C:\Anaconda\lib\site-packages\flopy\utils\datafile.py", line 438, in get_data
data = self._get_data_array(totim1)
File "C:\Anaconda\lib\site-packages\flopy\utils\datafile.py", line 343, in _get_data_array
data = np.empty((self.nlay, nrow, ncol), dtype=self.realtype)
MemoryError: Unable to allocate array with shape (1183774756, 1, 312) and data type float64

This is slightly different from the previous error I was seeing.

@jdhughes-usgs
Contributor

Not sure why the executable I provided is not working.

Since you can compile the source code on your end, replace openspec.inc in your source directory with the attached file. This modified file will create the correct header information in binary output files. I would delete your existing head and binary budget file before running your recompiled executable.

openspec.inc.zip

@johnrobertcraven
Author

johnrobertcraven commented Feb 5, 2020

Thanks for sending that over.

Using that file makes the exe very slow.
The model crashes with a segmentation fault:

Program received signal SIGSEGV: Segmentation
fault - invalid memory reference.

Backtrace for this error:
-0 0x6510d3
-1 0x6480cb
-2 0x6390ee
-3 0x755fa88f

*replaced pound sign with - so as to not link to other issues

@jdhughes-usgs
Contributor

That file should not have any effect on run times since it just changes the output header information. It could be an issue with the optimization you are using with your compiler.

I don't have any idea why there is a segfault in your compiled version of the code. It could be a difference between the compiler used to compile the original version of the executable and your compiler (Intel, gfortran/gcc, g95, etc.).

If you want to compile the code yourself from the source files in the model repository and you have the Intel Fortran (ifort) compiler, you could uncomment DATA FORM/'BINARY'/ and comment out DATA FORM/'UNFORMATTED'/ in the original openspec.inc file. It is unlikely that the makefile_intel makefile in the model repository will work for you unless the paths to the compilers are the same as in the makefile. The makefile named makefile in the model repository is set up for the g95 compiler. Also, the two makefiles do not appear to be set up the same way to create a double-precision executable.

I have compiled and attached a single-precision (mflgr.exe) version of MODFLOW-LGR2 for you. It was compiled with the Intel Fortran (ifort) and Microsoft C++ (cl) compilers and /O2 optimization. I suspect a standard single-precision version may be good enough for the model.

mflgr.exe.zip

At this point this is no longer a flopy issue. You will need to reach out to the authors of the model for further assistance if the attached executables don't work for you.

@johnrobertcraven
Author

Thank you for all of your help. The exe you provided runs, but ends before the first stress period finishes with
"pause
Press any key to continue . . ."

I've reached out to the authors of the model. I'll close this issue.
If I'm able to get it worked out and the solution seems relevant I'll update this thread with the solution and move it over to the Stack Overflow page.

Thanks!
