
the `options` data structure

Lucy Forrest edited this page Sep 21, 2023 · 2 revisions

The options data structure collects information from multiple sources, listed here in decreasing order of priority:

  • the command line;
  • the mandatory instruction file specified in the command line;
  • the default values in the load_options function of the initialize_repository library.

The data structure is an ordered dictionary (collections.OrderedDict) containing the following sections (strictly in this order):

  • GENERAL: mandatory options;
  • PATHS: external paths EncoMPASS needs in order to run its dependencies;
  • RUN: the run mode (local, on a cluster, etc.) and the related options;
  • PARAMETERS: tunable parameters in EncoMPASS (e.g. thresholds for internal filters and definitions);
  • EXECUTION: the execution mode (debug, rerun, update, etc.);
  • ALL: all the keys from the sections above, in a more convenient format.

The keys inside each section (except the ALL section) are 2-tuples composed of the short and long key names, as defined in load_options. Both short and long key names are unique (except for keys without a short name): this is what allows ALL to map every short and every long key name to the corresponding option.

Example: you can call

options['PARAMETERS'][('seqid_thr', 'sequence_identity_threshold')]

or equivalently

options['ALL']['sequence_identity_threshold']

or

options['ALL']['seqid_thr']
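
The equivalence of the three lookups can be illustrated with a minimal sketch. It assumes that ALL simply maps each short and each long name to the same value as the tuple-keyed entry. The helper build_all, the value 0.85, and the convention that a missing short name is an empty string or None are all hypothetical and are not taken from the EncoMPASS code.

```python
from collections import OrderedDict


def build_all(options):
    """Hypothetical helper: index every short and long key name in one flat dict."""
    all_section = OrderedDict()
    for section, entries in options.items():
        if section == 'ALL':
            continue
        for (short, long_name), value in entries.items():
            for name in (short, long_name):
                if not name:
                    # Assumption: keys without a short name use '' or None
                    continue
                if name in all_section:
                    # Name uniqueness is what makes the flat index unambiguous
                    raise KeyError(f"duplicate key name: {name}")
                all_section[name] = value
    return all_section


# Sections are created in the order given on this page
options = OrderedDict(
    (section, OrderedDict())
    for section in ('GENERAL', 'PATHS', 'RUN', 'PARAMETERS', 'EXECUTION')
)

# Illustrative value only
options['PARAMETERS'][('seqid_thr', 'sequence_identity_threshold')] = 0.85

# ALL is added last, so it stays the final section
options['ALL'] = build_all(options)
```

In this sketch ALL is a snapshot built once, so a later change to a section entry would require rebuilding it. How the real code keeps the two views in sync is not described on this page.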

The structure is filled in the following steps:

  • The function command_line_parser is called: a local copy of the options data structure is initialized via load_options;
  • The function instruction_file_parser is called: another local copy of the options data structure is initialized via load_options;
  • The function check_options is called, taking the two copies of the options data structure. The function applies the priorities (e.g., if the same option is specified both in the command line and in the instruction file, the command-line value takes precedence), checks the compatibility of the chosen options, and returns the final options data structure.
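
The priority rule applied in the last step can be sketched as follows. This is a simplification under stated assumptions: it works on flat dictionaries and assumes each parser records only the values the user set explicitly, whereas the real check_options receives two full copies of the structure and also checks option compatibility. The function merge_by_priority, the key debug, and all values are hypothetical.

```python
def merge_by_priority(cli_values, file_values, defaults):
    """Hypothetical helper: for each known key, keep the highest-priority value.

    Priority: command line > instruction file > defaults.
    """
    merged = {}
    for key, default in defaults.items():
        if key in cli_values:
            merged[key] = cli_values[key]
        elif key in file_values:
            merged[key] = file_values[key]
        else:
            merged[key] = default
    return merged


# Illustrative values only
defaults = {'seqid_thr': 0.85, 'debug': False}
file_values = {'seqid_thr': 0.90, 'debug': True}  # set in the instruction file
cli_values = {'seqid_thr': 0.95}                  # set on the command line

final = merge_by_priority(cli_values, file_values, defaults)
```

Here seqid_thr comes from the command line, because it overrides the instruction file, while debug comes from the instruction file, because the command line does not set it.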