Reimplement the NumArray class based on an STL template. #1899

Closed · 8 of 20 tasks · JohnHalleyGotway opened this issue Aug 31, 2021 · 9 comments · Fixed by #1944

Labels: component: code optimization, MET: Aggregation Tools, priority: medium, reporting: DTC NCAR Base, reporting: DTC NOAA R2O, requestor: Community, type: enhancement
Milestone: MET 10.1.0

JohnHalleyGotway (Collaborator) commented Aug 31, 2021

Describe the Enhancement

This issue arose during development for MET #1875. For that issue, we found and fixed a bug that caused Stat-Analysis to consume far too much memory. However, testing for that issue revealed an opportunity for additional optimization of memory usage, as described in this comment.

When running jobs over many, many combinations of station id, variable, level, and lead time via the -by option (over 30,000 jobs!), the Stat-Analysis tool consumes a great deal of memory. For each case, Stat-Analysis creates several NumArray objects to store arrays of values (i.e. forecast, observation, climo mean, climo stdev, and so on). However, in MET version 10.0.0 the default allocation increment for NumArray is 1000, so any array with length less than 1000 actually consumes the same amount of memory as an array of length 1000.
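To illustrate the cost, here is a hypothetical simplification (a sketch, not the actual MET source) of increment-based growth: every request is rounded up to the next multiple of the allocation increment, so even a 5-element array ends up backed by storage for 1000 doubles.

    #include <cstring>

    // Hypothetical sketch of increment-based growth, for illustration only.
    static const int num_array_alloc_inc = 1000;

    void extend(double *&e, int &n_alloc, int n_needed) {
       if (n_needed <= n_alloc) return;            // current buffer suffices
       int k = n_needed / num_array_alloc_inc;     // round the request up to the
       if (n_needed % num_array_alloc_inc) k++;    // next multiple of the increment,
       int n_new = k * num_array_alloc_inc;        // e.g. 5 -> 1000, 1001 -> 2000
       double *u = new double[n_new];
       if (e) { std::memcpy(u, e, n_alloc * sizeof(double)); delete [] e; }
       e = u;
       n_alloc = n_new;
    }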

For the test described in that comment, each case processes fewer than 100 time points. Changing the default NumArray allocation increment from 1000 down to 100 led to approximately a 75% reduction in memory usage for these two tests.

While running tens of thousands of jobs in Stat-Analysis is not done routinely, it is a good stress test and reveals an opportunity for optimization. Recommend that we reimplement the NumArray class using an STL template (such as std::array or std::vector) rather than managing chunks of memory ourselves. Hopefully that will slim down our memory usage.
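A minimal sketch of that direction, assuming a std::vector<double> member named e (the name used in the eventual pull request); the method names are illustrative, not the final interface:

    #include <vector>

    // Sketch: NumArray backed by std::vector<double>, so the STL manages
    // capacity and each array consumes only roughly what it actually holds.
    class NumArray {
       private:
          std::vector<double> e;                     // replaces double *e

       public:
          void add(double v)             { e.push_back(v); }
          int n_elements() const         { return (int) e.size(); }
          double operator[](int i) const { return e[i]; }
          void clear()                   { e.clear(); }
    };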

When these changes are available on a feature branch for testing, recommend that we:

  • Configure DockerHub to build that feature branch.
  • Coordinate with @lindsayrblank to rerun tests on that feature branch.

Time Estimate

3 days.

Sub-Issues

Consider breaking the enhancement down into sub-issues.
No sub-issues needed.

Relevant Deadlines

List relevant project deadlines here or state NONE.

Funding Source

Split 2793541 and 2702691

Define the Metadata

Assignee

  • Select engineer(s) or no engineer required: John HG
  • Select scientist(s) or no scientist required: No scientist needed but coordinate with @lindsayrblank for testing.

Labels

  • Select component(s)
  • Select priority
  • Select requestor(s)

Projects and Milestone

  • Select Repository and/or Organization level Project(s) or add alert: NEED PROJECT ASSIGNMENT label
  • Select Milestone as the next official version or Future Versions

Define Related Issue(s)

Consider the impact to the other METplus components.

Enhancement Checklist

See the METplus Workflow for details.

  • Complete the issue definition above, including the Time Estimate and Funding Source.
  • Fork this repository or create a branch of develop.
    Branch name: feature_<Issue Number>_<Description>
  • Complete the development and test your changes.
  • Add/update log messages for easier debugging.
  • Add/update unit tests.
  • Add/update documentation.
  • Push local changes to GitHub.
  • Submit a pull request to merge into develop.
    Pull request: feature <Issue Number> <Description>
  • Define the pull request metadata, as permissions allow.
    Select: Reviewer(s) and Linked issues
    Select: Repository level development cycle Project for the next official release
    Select: Milestone as the next official version
  • Iterate until the reviewer(s) accept and merge your changes.
  • Delete your fork or branch.
  • Close this issue.
JohnHalleyGotway added the type: enhancement, priority: medium, requestor: Community, alert: NEED ACCOUNT KEY, alert: NEED CYCLE ASSIGNMENT, component: code optimization, and MET: Aggregation Tools labels Aug 31, 2021
JohnHalleyGotway added this to the MET 10.1.0 milestone Aug 31, 2021
JohnHalleyGotway self-assigned this Aug 31, 2021
JohnHalleyGotway removed the alert: NEED CYCLE ASSIGNMENT label Aug 31, 2021
sethlinden pushed a commit that referenced this issue Sep 22, 2021
…r<double> e. Modifying functions accordingly. SL
TaraJensen added the reporting: DTC NCAR Base and reporting: DTC NOAA R2O labels and removed the alert: NEED ACCOUNT KEY label Sep 23, 2021
sethlinden pushed a commit that referenced this issue Sep 24, 2021
…tions instead of arrays. In progress. SL
sethlinden pushed a commit that referenced this issue Sep 24, 2021
JohnHalleyGotway added a commit that referenced this issue Sep 27, 2021
…Array::clear() instead of vector::clear().
sethlinden pushed a commit that referenced this issue Oct 4, 2021
sethlinden pushed a commit that referenced this issue Oct 4, 2021
sethlinden pushed a commit that referenced this issue Oct 5, 2021
sethlinden pushed a commit that referenced this issue Oct 6, 2021
…vector for storage (instead of array). In progress. SL
sethlinden pushed a commit that referenced this issue Oct 6, 2021
JohnHalleyGotway added a commit that referenced this issue Oct 7, 2021
…tc_poly_array_alloc_inc instead of num_array_alloc_inc in tc_poly.cc.
sethlinden pushed a commit that referenced this issue Oct 7, 2021
JohnHalleyGotway added a commit that referenced this issue Oct 7, 2021
…fter switching to STL::vector, it'll ALWAYS be an exact allocation. Previously, we always rounded up to the next allocation increment, but there's no allocation increment anymore. Updating the other code to remove calls to exact.
sethlinden pushed a commit that referenced this issue Oct 7, 2021
sethlinden (Contributor) commented

Ran a set of stress tests to compare the feature-branch version of stat_analysis, which uses the new num_array class (backed by vectors), against the development-branch version, which uses the original num_array class (backed by arrays). Below I show the command line and the results of running it with "/usr/bin/time -v", which reports memory usage and runtime.

sethlinden (Contributor) commented

Here is the command line I used for both versions of stat_analysis:

/usr/bin/time -v ./stat_analysis -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -lookin fake_test_data -job aggregate_stat -line_type MPR -out_line_type CNT -by OBS_SID,FCST_VAR,FCST_LEV -out MPR_to_CNT_by_station.out

sethlinden (Contributor) commented Oct 11, 2021

Here is the tar-ball that contains the input files:

fake_test_data.tar.gz

sethlinden (Contributor) commented Oct 11, 2021

Looking at the "time -v" output from the two large stress tests: overall, the new version (from this feature branch) uses quite a bit less memory but takes slightly longer to run, while the older version uses more memory and runs slightly faster.

Note that we look at the maximum resident set size: "A process's resident set size is the amount of memory that belongs to it and is currently present (resident) in RAM (real RAM, not swapped or otherwise not-resident)."

Here are the results from the feature-branch version (new num_array class):

    User time (seconds): 2803.46
    System time (seconds): 17.20
    Percent of CPU this job got: 99%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 47:01.24
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 760896
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 186849
    Voluntary context switches: 75
    Involuntary context switches: 2977
    Swaps: 0
    File system inputs: 0
    File system outputs: 2351864
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0

Here are the results from the development-branch version (original num_array class):

    User time (seconds): 2661.18
    System time (seconds): 18.28
    Percent of CPU this job got: 99%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 44:39.88
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 3276800
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 815905
    Voluntary context switches: 12
    Involuntary context switches: 2442
    Swaps: 0
    File system inputs: 0
    File system outputs: 2351888
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0

sethlinden (Contributor) commented

Tried some vector techniques within the num_array class to try to speed up stat_analysis on the big stress test. Within num_array.cc, modified the clear() function to call e.reserve(80). Experimented with several reserve values and ran several smaller stress tests along with some iterations of the large stress test.
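A sketch of that experimental change, assuming the vector member is named e (per the commit history on this issue); 80 was one of several reserve values tried:

    // Experimental NumArray::clear(): drop the contents, then pre-reserve
    // capacity so early push_back calls avoid reallocation.
    void NumArray::clear() {
       e.clear();        // size -> 0; clear() leaves any existing capacity alone
       e.reserve(80);    // reserve only ever grows capacity, never shrinks it
    }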

From Slack: "Initial results from the big stress test (with 80 "-lookin fake_test_data") do not show an improvement. I used a reserve value of 80. It took a bit more time to run and used more memory compared to the baseline (from yesterday, without reserve):"

    User time (seconds): 2819.30
    System time (seconds): 15.80
    Percent of CPU this job got: 99%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 47:15.67
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 967416
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 238427
    Voluntary context switches: 23
    Involuntary context switches: 3016

sethlinden (Contributor) commented

Also tried running with a much larger reserve value of 880.

From Slack: "Here are the results of the large stress test using 880 for the reserve value:"

    User time (seconds): 2748.81
    System time (seconds): 17.02
    Percent of CPU this job got: 99%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 46:06.35
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 3486980
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 868346
    Voluntary context switches: 19
    Involuntary context switches: 17044

sethlinden (Contributor) commented

Today, tried one more test: using the swap function in erase(). It actually used slightly less memory but took longer to run, so it is not worth it.
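For reference, this is the standard C++ swap idiom for releasing a vector's capacity (a sketch of the technique, not necessarily the exact change tried); the results of that run follow:

    // Swap trick: swapping with an empty temporary releases the vector's
    // allocated capacity, which clear() alone does not do.
    void NumArray::erase() {
       std::vector<double>().swap(e);   // e is left empty with zero capacity
    }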

    User time (seconds): 2928.14
    System time (seconds): 14.29
    Percent of CPU this job got: 99%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 49:03.63
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 760628
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 11
    Minor (reclaiming a frame) page faults: 186850
    Voluntary context switches: 275
    Involuntary context switches: 3151

sethlinden (Contributor) commented

So we will leave the num_array class as is. Relative to the development branch, it uses roughly 23% of the memory (a maximum resident set size of 760896 versus 3276800 kbytes, about a 77% reduction) but takes about 5% longer to run, which was deemed acceptable.

sethlinden mentioned this issue Oct 14, 2021
sethlinden (Contributor) commented

Forgot to mention that, initially, I ran a regression test to make sure the output is exactly the same between this feature branch and the development branch:

mkdir -p /d1/projects/MET/MET_pull_requests/met-10.1.0/met-10.1.0_beta4/feature_1899
cd /d1/projects/MET/MET_pull_requests/met-10.1.0/met-10.1.0_beta4/feature_1899
/d1/projects/MET/MET_pull_requests/MET/scripts/regression/test_regression.sh feature_1899_num_array develop >& test_regression.log&

JohnHalleyGotway removed their assignment Oct 14, 2021
sethlinden linked a pull request Oct 14, 2021 that will close this issue
sethlinden added a commit that referenced this issue Oct 15, 2021
* Per issue #1899, changed the main storage array (double * e) to vector<double> e. Modifying functions accordingly. SL

* Per issue #1899: removed commented out code. Still in progress. SL

* Per issue #1899: modified more functions to use vector or vector functions instead of arrays. In progress. SL

* Per issue #1899: updated add() functions to use push_back to add values to vector. SL

* Per #1899, correct logic in NumArray::init_from_scratch() to call NumArray::clear() instead of vector::clear().

* Per issue #1899: added some temporary print statements for debugging. SL

* Per issue #1899: added some more temporary print statements for debugging. In progress. SL

* Per issue #1899: added some temporary print statements for debugging. In progress. SL

* Per issue #1899: replaced Nelements with n_elements(). SL

* Per issue #1899: modified the extend function some more for using vectors. In progress. SL

* Per issue #1899: continued to modify some functions for using a base vector for storage (instead of array). In progress. SL

* Per issue #1899: removed / added a few temporary print statements for debugging. SL

* Per #1899, this change seems to clearly be a bug. We should be using tc_poly_array_alloc_inc instead of num_array_alloc_inc in tc_poly.cc.

* Per issue #1899: commented out some print statements for now. For running make test. SL

* Per #1899, updating NumArray class to use STL::vector.

* Per #1899, I removed the exact option from NumArray::extend() since, after switching to STL::vector, it'll ALWAYS be an exact allocation. Previously, we always rounded up to the next allocation increment, but there's no allocation increment anymore. Updating the other code to remove calls to exact.

* Per #1899, removing debug cout statements.

* Per #1899, cleaning up one leftover debug cout line.

* Per issue #1899: updated the erase() function to use vector.clear() and vector.reserve(). SL

* Per issue: #1899, cleaned up clear() and erase() functions. SL

Co-authored-by: Seth Linden <linden@kiowa.rap.ucar.edu>
Co-authored-by: John Halley Gotway <johnhg@ucar.edu>