Settings validation resolution when user can't be prompted #651

keckler · 2022-05-02T18:45:17Z

Within the settingsValidation module, there is the ability for a case settings object to be inspected using an Inspector. The Inspector will go through all valid queries and check the user's case settings to see that a bunch of common-sense criteria are met. All of those queries in the framework are located here:

armi/armi/operators/settingsValidation.py

Lines 311 to 622 in bbec775

    
           def _inspectSettings(self): 
        
               """Check settings for inconsistencies.""" 
        
               # import here to avoid cyclic issues 
        
               from armi import operators 
        
               self.addQueryBadLocationWillLikelyFail("operatorLocation") 
        
               self.addQuery( 
        
                   lambda: self.cs["outputFileExtension"] == "pdf" and self.cs["genReports"], 
        
                   "Output files of '.pdf' format are not supported by the reporting HTML generator. '.pdf' " 
        
                   "images will not be included.", 
        
                   "Switch to '.png'?", 
        
                   lambda: self._assignCS("outputFileExtension", "png"), 
        
               ) 
        
               self.addQuery( 
        
                   lambda: ( 
        
                       self.cs[globalSettings.CONF_ZONING_STRATEGY] == "manual" 
        
                       and not self.cs["zoneDefinitions"] 
        
                   ), 
        
                   "`manual` zoningStrategy requires that `zoneDefinitions` setting be defined. Run will have " 
        
                   "no zones.", 
        
                   "", 
        
                   self.NO_ACTION, 
        
               ) 
        
               self.addQuery( 
        
                   lambda: ( 
        
                       ( 
        
                           self.cs["beta"] 
        
                           and isinstance(self.cs["beta"], list) 
        
                           and not self.cs["decayConstants"] 
        
                       ) 
        
                       or (self.cs["decayConstants"] and not self.cs["beta"]) 
        
                   ), 
        
                   "Both beta components and decay constants should be provided if either are " 
        
                   "being supplied.", 
        
                   "", 
        
                   self.NO_ACTION, 
        
               ), 
        
               self.addQuery( 
        
                   lambda: self.cs["skipCycles"] > 0 and not self.cs["reloadDBName"], 
        
                   "You have chosen to do a restart case without specifying a database to load from. " 
        
                   "Run will load from output files, if they exist but burnup, etc. will not be updated.", 
        
                   "", 
        
                   self.NO_ACTION, 
        
               ) 
        
               self.addQuery( 
        
                   lambda: self.cs["runType"] != operators.RunTypes.SNAPSHOTS 
        
                   and self.cs["loadStyle"] == "fromDB" 
        
                   and self.cs["startCycle"] == 0 
        
                   and self.cs["startNode"] == 0, 
        
                   "Starting from cycle 0, and time node 0 was chosen. Restart runs load from " 
        
                   "the time node just before the restart. There is no time node to load from " 
        
                   "before cycle 0 node 0. Either switch to the snapshot operator, start from " 
        
                   "a different time step or load from inputs rather than database as " 
        
                   "`loadStyle`.", 
        
                   "", 
        
                   self.NO_ACTION, 
        
               ) 
        
               self.addQuery( 
        
                   lambda: self.cs["runType"] == operators.RunTypes.SNAPSHOTS 
        
                   and not (self.cs["dumpSnapshot"] or self.cs["defaultSnapshots"]), 
        
                   "The Snapshots operator was specified, but no dump snapshots were chosen." 
        
                   "Please specify snapshot steps with the `dumpSnapshot` setting.", 
        
                   "", 
        
                   self.NO_ACTION, 
        
               ) 
        
               self.addQuery( 
        
                   lambda: self.cs.caseTitle.lower() 
        
                   == os.path.splitext(os.path.basename(self.cs["reloadDBName"].lower()))[0], 
        
                   "Snapshot DB ({0}) and main DB ({1}) cannot have the same name." 
        
                   "Change name of settings file and resubmit.".format( 
        
                       self.cs["reloadDBName"], self.cs.caseTitle 
        
                   ), 
        
                   "", 
        
                   self.NO_ACTION, 
        
               ) 
        
               self.addQuery( 
        
                   lambda: self.cs["reloadDBName"] != "" 
        
                   and not os.path.exists(self.cs["reloadDBName"]), 
        
                   "Reload database {} does not exist. \nPlease point to an existing DB, " 
        
                   "or set to empty and load from input.".format(self.cs["reloadDBName"]), 
        
                   "", 
        
                   self.NO_ACTION, 
        
               ) 
        
               def _willBeCopiedFrom(fName): 
        
                   for copyFile in self.cs["copyFilesFrom"]: 
        
                       if fName == os.path.split(copyFile)[1]: 
        
                           return True 
        
                   return False 
        
               self.addQuery( 
        
                   lambda: self.cs["explicitRepeatShuffles"] 
        
                   and not self._csRelativePathExists(self.cs["explicitRepeatShuffles"]) 
        
                   and not _willBeCopiedFrom(self.cs["explicitRepeatShuffles"]), 
        
                   "The specified repeat shuffle file `{0}` does not exist, and won't be copied from elsewhere. " 
        
                   "Run will crash.".format(self.cs["explicitRepeatShuffles"]), 
        
                   "", 
        
                   self.NO_ACTION, 
        
               ) 
        
               self.addQuery( 
        
                   lambda: not self.cs["power"], 
        
                   "No power level set. You must always start by importing a base settings file.", 
        
                   "", 
        
                   self.NO_ACTION, 
        
               ) 
        
               # The gamma cross sections generated for MC2-3 by ANL were done with NJOY with 
        
               # P3 scattering. MC2-3 would have to be modified and the gamma cross sections 
        
               # re-generated with NJOY for MC2-3 to allow any other scattering order with 
        
               # gamma cross sections enabled. 
        
               self.addQuery( 
        
                   lambda: ( 
        
                       "MC2v3" in self.cs["xsKernel"] 
        
                       and neutronics.gammaXsAreRequested(self.cs) 
        
                       and self.cs["xsScatteringOrder"] != 3 
        
                   ), 
        
                   "MC2-3 will crash if a scattering order is not set to 3 when generating gamma XS.", 
        
                   "Would you like to set the `xsScatteringOrder` to 3?", 
        
                   lambda: self._assignCS("xsScatteringOrder", 3), 
        
               ) 
        
               self.addQuery( 
        
                   lambda: self.cs["outputCacheLocation"] 
        
                   and not os.path.exists(self.cs["outputCacheLocation"]), 
        
                   "`outputCacheLocation` path {} does not exist. Please specify a location that exists.".format( 
        
                       self.cs["outputCacheLocation"] 
        
                   ), 
        
                   "", 
        
                   self.NO_ACTION, 
        
               ) 
        
               self.addQuery( 
        
                   lambda: self.cs["numCoupledIterations"] > 0, 
        
                   "You have {0} coupling iterations selected.".format( 
        
                       self.cs["numCoupledIterations"] 
        
                   ), 
        
                   "1 coupling iteration doubles run time (2 triples, etc). Do you want to use 0 instead? ", 
        
                   lambda: self._assignCS("numCoupledIterations", 0), 
        
               ) 
        
               def _factorsAreValid(factors, maxVal=1.0): 
        
                   try: 
        
                       expandedList = expandRepeatedFloats(factors) 
        
                   except (ValueError, IndexError): 
        
                       return False 
        
                   return ( 
        
                       all(0.0 <= val <= maxVal for val in expandedList) 
        
                       and len(expandedList) == self.cs["nCycles"] 
        
                   ) 
        
               self.addQuery( 
        
                   lambda: self.cs["availabilityFactors"] 
        
                   and not _factorsAreValid(self.cs["availabilityFactors"]), 
        
                   "`availabilityFactors` was not set to a list compatible with the number of cycles. " 
        
                   "Please update input or use constant duration.", 
        
                   "Use constant availability factor specified in `availabilityFactor` setting?", 
        
                   lambda: self._assignCS("availabilityFactors", []), 
        
               ) 
        
               self.addQuery( 
        
                   lambda: self.cs["powerFractions"] 
        
                   and not _factorsAreValid(self.cs["powerFractions"]), 
        
                   "`powerFractions` was not set to a compatible list. " 
        
                   "Please update input or use full power at all cycles.", 
        
                   "Use full power for all cycles?", 
        
                   lambda: self._assignCS("powerFractions", []), 
        
               ) 
        
               self.addQuery( 
        
                   lambda: ( 
        
                       self.cs["cycleLengths"] 
        
                       and not _factorsAreValid(self.cs["cycleLengths"], maxVal=1e10) 
        
                   ), 
        
                   "The number of cycles defined in `cycleLengths` is not equal to the number of cycles in " 
        
                   "the run `nCycles`." 
        
                   "Please ensure that there is exactly one duration for each cycle in the run or use " 
        
                   "{} days for all cycles.".format(self.cs["cycleLength"]), 
        
                   "Use {} days for all cycles?".format(self.cs["cycleLength"]), 
        
                   lambda: self._assignCS("cycleLengths", []), 
        
               ) 
        
               def _correctCycles(): 
        
                   newSettings = {"nCycles": 1, "burnSteps": 0} 
        
                   self.cs = self.cs.modified(newSettings=newSettings) 
        
               self.addQuery( 
        
                   lambda: not self.cs["cycleLengths"] and self.cs["nCycles"] == 0, 
        
                   "Cannot run 0 cycles. Set burnSteps to 0 to activate a single time-independent case.", 
        
                   "Set 1 cycle and 0 burnSteps for single time-independent case?", 
        
                   _correctCycles, 
        
               ) 
        
               self.addQuery( 
        
                   lambda: ( 
        
                       self.cs["runType"] == "Standard" 
        
                       and self.cs["burnSteps"] == 0 
        
                       and (len(self.cs["cycleLengths"]) > 1 or self.cs["nCycles"] > 1) 
        
                   ), 
        
                   "Cannot run multi-cycle standard cases with 0 burnSteps per cycle. Please update settings.", 
        
                   "", 
        
                   self.NO_ACTION, 
        
               ) 
        
               def decayCyclesHaveInputThatWillBeIgnored(): 
        
                   """Check if there is any decay-related input that will be ignored.""" 
        
                   try: 
        
                       powerFracs = expandRepeatedFloats(self.cs["powerFractions"]) 
        
                       availabilities = expandRepeatedFloats( 
        
                           self.cs["availabilityFactors"] 
        
                       ) or ([self.cs["availabilityFactor"]] * self.cs["nCycles"]) 
        
                   except:  # pylint: disable=bare-except 
        
                       return True 
        
                   for pf, af in zip(powerFracs, availabilities): 
        
                       if pf > 0.0 and af == 0.0: 
        
                           # this will be a full decay step and any power fraction will be ignored. May be ok, but warn. 
        
                           return True 
        
                   return False 
        
               self.addQuery( 
        
                   lambda: ( 
        
                       self.cs["cycleLengths"] 
        
                       and self.cs["powerFractions"] 
        
                       and decayCyclesHaveInputThatWillBeIgnored() 
        
                   ), 
        
                   "At least one cycle has a non-zero power fraction but an availability of zero. Please " 
        
                   "update the input.", 
        
                   "", 
        
                   self.NO_ACTION, 
        
               ) 
        
               self.addQuery( 
        
                   lambda: self.cs["operatorLocation"] 
        
                   and self.cs["runType"] != operators.RunTypes.STANDARD, 
        
                   "The `runType` setting is set to `{0}` but there is a `custom operator location` defined".format( 
        
                       self.cs["runType"] 
        
                   ), 
        
                   "", 
        
                   self.NO_ACTION, 
        
               ) 
        
               self.addQuery( 
        
                   lambda: self.cs["operatorLocation"] 
        
                   and self.cs["runType"] != operators.RunTypes.STANDARD, 
        
                   "The `runType` setting is set to `{0}` but there is a `custom operator location` defined".format( 
        
                       self.cs["runType"] 
        
                   ), 
        
                   "", 
        
                   self.NO_ACTION, 
        
               ) 
        
               self.addQuery( 
        
                   lambda: self.cs["skipCycles"] > 0 
        
                   and not os.path.exists(self.cs.caseTitle + ".restart.dat"), 
        
                   "This is a restart case, but the required restart file {0}.restart.dat is not found".format( 
        
                       self.cs.caseTitle 
        
                   ), 
        
                   "", 
        
                   self.NO_ACTION, 
        
               ) 
        
               self.addQuery( 
        
                   lambda: self.cs["deferredInterfacesCycle"] > self.cs["nCycles"], 
        
                   "The deferred interface activation cycle exceeds set cycle occurrence. " 
        
                   "Interfaces will not be activated in this run!", 
        
                   "", 
        
                   self.NO_ACTION, 
        
               ) 
        
               self.addQuery( 
        
                   lambda: ( 
        
                       self.cs["boundaries"] != neutronics.GENERAL_BC 
        
                       and self.cs["bcCoefficient"] 
        
                   ), 
        
                   "General neutronic boundary condition was not selected, but `bcCoefficient` was defined. " 
        
                   "Please enable `Generalized` neutronic boundary condition or disable `bcCoefficient`.", 
        
                   "", 
        
                   self.NO_ACTION, 
        
               ) 
        
               self.addQuery( 
        
                   lambda: self.cs["geomFile"] 
        
                   and str(self.geomType) not in geometry.VALID_GEOMETRY_TYPE, 
        
                   "{} is not a valid geometry Please update geom type on the geom file. " 
        
                   "Valid (case insensitive) geom types are: {}".format( 
        
                       self.geomType, geometry.VALID_GEOMETRY_TYPE 
        
                   ), 
        
                   "", 
        
                   self.NO_ACTION, 
        
               ) 
        
               self.addQuery( 
        
                   lambda: self.cs["geomFile"] 
        
                   and not geometry.checkValidGeomSymmetryCombo( 
        
                       self.geomType, self.coreSymmetry 
        
                   ), 
        
                   "{}, {} is not a valid geometry and symmetry combination. Please update " 
        
                   "either geometry or symmetry on the geom file.".format( 
        
                       str(self.geomType), str(self.coreSymmetry) 
        
                   ), 
        
                   "", 
        
                   self.NO_ACTION, 
        
               )

When the user submits a job locally and one or more of the queries fails inspection, a prompt will be sent to the user asking how to resolve the issue. For instance

Each of the queries is also associated with a potential resolution (or sometimes only the option to do nothing), and if the user opts to use that resolution, then ARMI will make the necessary change to the case settings object, re-inspect the settings, and keep prompting the user until everything looks kosher.

However, in cases where the user cannot be prompted for whatever reason (perhaps they are running on a remote machine), the question is instead printed in the log file, but the user cannot make any decisions, nothing is changed in the case settings, and the run continues even though ostensibly something is wrong.

I think that this system of just continuing the run when a setting-clash is known to exist is probably bad practice, because it gives the user the impression that their run is valid when in fact something is known to be wrong.

I think there are two alternatives that we might consider for cases when the user cannot be prompted to make a decision about clashing settings:

When a query fails, the run should fail, forcing the user to actually update their case settings to be compatible with the queries and resubmit. This is annoying to the user and might break some existing workflows, but feels more in line with the intent of the settings validation system in the first place.
When a query fails, the default action is auto-implemented and a note is printed in the log that some case setting has been changed. This is nicer to the user, but may somewhat hide from them that they did something wrong in the first place. Also, this option would need to handle the possibility that an infinite loop may be created through the resolution of one query breaking another.

Any thoughts?

@ntouran @jakehader @onufer @john-science

The text was updated successfully, but these errors were encountered:

ntouran · 2022-05-03T14:30:37Z

We have fought with the settings validation system a lot and gone back and forth. Lots of times the questions are just to guide you through things that might be wrong. When the system just crashes in a validation error, lots of power users complain that they can't get their jobs to run even though they know what they're doing. This can be really frustrating.

Running remotely must be automatically triggering the --batch argument, right (https://terrapower.github.io/armi/.apidocs/armi.context.html#armi.context.Mode)? That's why they're not crashing?

Maybe we could have it crash automatically in INTERACTIVE mode but run no matter what in BATCH mode.

keckler · 2022-05-03T15:36:07Z

I sympathize with the frustration of the power users.

But I also see the other side of things, where a developer (such as myself) might try to use the settings validation to force a user to make decisions about certain case settings that don't have clear defaults. So if I, as a developer, need a value to exist in the case settings under certain conditions but can't pick a reasonable default for that value, the settings validation system looks like a pretty good way to force the user to pick something. But if they just submit their job in BATCH mode and it effectively bypasses the settings validation, then that won't work.

Maybe we could have it crash automatically in INTERACTIVE mode but run no matter what in BATCH mode.

I think with this suggestion you are effectively saying that we no longer even prompt the user for a response at all? From my understanding, INTERACTIVE mode is currently the scenario in which a user can answer the prompt and fix the issue. So I don't think understand your suggestion, but that could be a misunderstanding on my part.

Another option I could think of is to add a new case setting to ARMI that let's the user decide how to proceed when the settings validation has questions. Something like settingsValidationDefaultBehavior with the following options:

askOrFail : The default case. Prompt the user, if possible, or kill the job if not
askOrAutoCorrect : Prompt the user, if possible, or just accept the suggested answer to the query if not
autoCorrect : Don't even bother prompting the user, just accept the suggested answer to all queries
ignore : Don't even bother prompting the user and do not take any action on all queries. Continue running the case as-is

This should satisfy power users, and makes it very explicitly clear that they are overriding behavior that a developer believes is incorrect. It's like SUDO capabilities to override the settings validator.

Nebbychadnezzar · 2023-01-27T18:08:21Z

I think that this system of just continuing the run when a setting-clash is known to exist is probably bad practice, because it gives the user the impression that their run is valid when in fact something is known to be wrong.

I'd like to log here that I didn't know this was an issue and ended up making this assumption, and so was burned by it. 😢

keckler · 2023-09-11T23:01:56Z

Ultimately it is on the user to check their output and justify any warnings. On this note, I will close this ticket because otherwise we would have to make an API-breaking change.

I still don't like it, but that's okay 😄

keckler mentioned this issue May 3, 2022

Upgrade the cycles inputs #637

Merged

6 tasks

Nebbychadnezzar mentioned this issue Mar 23, 2023

Settings validation resolution appears to not work when submitted via cmd line #1221

Closed

keckler self-assigned this Sep 11, 2023

keckler added the enhancement New feature or request label Sep 11, 2023

keckler closed this as completed Sep 11, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Settings validation resolution when user can't be prompted #651

Settings validation resolution when user can't be prompted #651

keckler commented May 2, 2022

ntouran commented May 3, 2022

keckler commented May 3, 2022

Nebbychadnezzar commented Jan 27, 2023

keckler commented Sep 11, 2023

Settings validation resolution when user can't be prompted #651

Settings validation resolution when user can't be prompted #651

Comments

keckler commented May 2, 2022

ntouran commented May 3, 2022

keckler commented May 3, 2022

Nebbychadnezzar commented Jan 27, 2023

keckler commented Sep 11, 2023