Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Framework Bump for QoL and Performance monitoring features #1251

Merged
merged 5 commits into from
Jan 23, 2024

Conversation

tomeichlersmith
Copy link
Member

@tomeichlersmith tomeichlersmith commented Jan 15, 2024

I am updating ldmx-sw, here are the details.

What are the issues that this addresses?

This addresses several non-critical Quality of Life (QoL) issues within Framework itself. I am opening this draft PR to do a first run of the validations. More documentation about what has changed will come as I merge the changes in the various submodules and make releases.

Check List

  • I successfully compiled ldmx-sw with my developments
  • I ran my developments and the following shows that they are successful. (ldmx ctest passes locally)
  • I attached any sub-module related changes to this PR.

Related Sub-Module PRs

Patches after opening

@tomeichlersmith
Copy link
Member Author

There is only one histogram failing the KS test, but it does not appear to be separated.

hcal_dqm_back_hcal_dqm_back_noise-1

I'm going to leave this aside and check in more detail when I am closer to actually merging this. I suspect this is just a numerical issue that is not something we need to fix.


Cropping done with

unzip it_pileup-pr-validation.zip
mkdir it_pileup
cd it_pileup
tar xzf ../it_pileup_recon_validation_plots.tar.gz
pdftoppm -png -x 580 -W 1080 -H 900 fail/hcal_dqm_back_hcal_dqm_back_noise.pdf fail/hcal_dqm_back_hcal_dqm_back_noise

@EinarElen
Copy link
Contributor

I'm going to leave this aside and check in more detail when I am closer to actually merging this. I suspect this is just a numerical issue that is not something we need to fix.

I've seen the hcal noise simulation being extremely sensitive before so that wouldn't surprise me

@EinarElen
Copy link
Contributor

EinarElen commented Jan 16, 2024

The removal of the EventDef header should probably be recorded in our list of backwards-incompatible changes. Not because it changes any physics but I expect at least some people ending up with mixed versions[1]. So having it written down in one place would be good.

Might as well note the removal of the generator, for posterity.

[1] I certainly did not run into this myself...

@tomeichlersmith
Copy link
Member Author

All of the tests that are failing are failing due to a seg-vio during the configure python constructor

Stack trace from the CI:

  1: #5  <signal handler called>
  1: #6  __GI___fileno (fp=0x0) at ./libio/fileno.c:35
  1: #7  0x00007fecb61021cd in _Py_FdIsInteractive () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0
  1: #8  0x00007fecb610c27a in _PyRun_AnyFileObject () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0
  1: #9  0x00007fecb610c34e in PyRun_AnyFileExFlags () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0
  1: #10 0x00007fecb6567cdb in framework::ConfigurePython::ConfigurePython(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, char**, int) () from /home/runner/work/ldmx-sw/ldmx-sw/install/lib/libFramework.so
  1: #11 0x0000556f7da0eb17 in CATCH2_INTERNAL_TEST_0() ()

I don't know why it would be failing now, but it looks like PyRun_AnyFileExFlags is tripping over itself or (more likely) we are calling it in correctly.

@tomeichlersmith
Copy link
Member Author

It looks like I merged LDMX-Software/cmake#11 which included

@@ -348,11 +355,21 @@ macro(build_test)
   # Install the run_test  executable
   foreach(entry ${test_modules})
     add_test(
-      NAME ${entry}
-      COMMAND run_test "[${entry}]"
-      WORKING_DIRECTORY ${CMAKE_BINARY_DIR}/test)
+      NAME ${entry} 
+      COMMAND run_test "[${entry}]" 
+      WORKING_DIRECTORY ${PROJECT_SOURCE_DIR}/${entry}
+      )
   endforeach()

which would be running stuff in the wrong directory, potentially causing a seg-vio if the file doesn't exist.

@tomeichlersmith
Copy link
Member Author

I was able to confirm that this was the case, going to do two things:

  1. Patch Framework to have a nicer error if the passed config is not accessible.
  2. Patch cmake to run tests from within a module's test directory.

@tomeichlersmith
Copy link
Member Author

The tests are now failing when trying to run the basic.py config in SimCore:

       Start 7: Run_SimCore_basic.py
  7/7 Test #7: Run_SimCore_basic.py .............***Not Run   0.00 sec
  
  86% tests passed, 1 tests failed out of 7
  
  Total Test time (real) =  10.44 sec
  
  The following tests FAILED:
  	  7 - Run_SimCore_basic.py (Not Run)
  Errors while running CTest
  Output from these tests are in: /home/runner/work/ldmx-sw/ldmx-sw/build/Testing/Temporary/LastTest.log

I do not see this locally, but there may be something special about my local setup that I didn't push.

@tomeichlersmith
Copy link
Member Author

Force-pushed to cleanup commit history and label the versions of the submodules I'm bumping

@tomeichlersmith tomeichlersmith marked this pull request as ready for review January 22, 2024 19:46
@tomeichlersmith
Copy link
Member Author

Same single-plot failure as before...

hcal_dqm_back_hcal_dqm_back_noise-1

🤔 hmmmm

@tomeichlersmith
Copy link
Member Author

I cannot find an actual, factual cause of this slight change in behavior, so I'm going to chalk it up to floating-point error and move on.

@tomeichlersmith tomeichlersmith merged commit 505a213 into trunk Jan 23, 2024
6 of 7 checks passed
@tomeichlersmith tomeichlersmith deleted the Framework-bump branch January 23, 2024 19:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants