This package allows you to configure functions explicitly and safely. You will be able to create an intuitive type-checked configuration file that directly sets function arguments, globally.
This is a lightweight package with only two widely used dependencies and only a couple hundred lines of code.
Make sure you have Python 3. Then install pythonic-config using pip:
pip install pythonic-config
Install the latest code via:
pip install git+https://github.com/PetrochukM/Config.git
Any function can be configured and then used anywhere:
import config as cf
# Define function
def do_something_cool(how_many_times: int):
    pass
# Configure function
cf.add({do_something_cool: cf.Args(how_many_times=5)})
# Use the configured function anywhere! 🎉
do_something_cool(how_many_times=cf.get())
This approach is simple but powerful. Now, each configuration can be directly attributed to a documented function argument.
Furthermore, config incorporates typeguard 💂‍♀️ so every configuration is type checked at runtime.
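To give a feel for what checking a configuration at runtime involves, here is a minimal standard-library sketch. It is not how config or typeguard are implemented: `check_args` is a hypothetical helper, and it only handles plain class annotations such as `int`, whereas typeguard supports far more.

```python
import inspect
from typing import Any, Callable, Dict, get_type_hints


def check_args(func: Callable, kwargs: Dict[str, Any]) -> None:
    """Raise TypeError if a keyword is unknown or does not match its annotation.

    Hypothetical helper for illustration only.
    """
    parameters = inspect.signature(func).parameters
    hints = get_type_hints(func)
    for name, value in kwargs.items():
        if name not in parameters:
            raise TypeError(f"{func.__name__}() has no argument named {name!r}")
        expected = hints.get(name)
        # Only plain classes are checked; generics such as List[int] are skipped.
        if isinstance(expected, type) and not isinstance(value, expected):
            raise TypeError(
                f"{func.__name__}() argument {name!r} expects "
                f"{expected.__name__}, got {type(value).__name__}"
            )


def do_something_cool(how_many_times: int):
    pass


# A well-typed configuration passes silently.
check_args(do_something_cool, {"how_many_times": 5})
```

Passing `{"how_many_times": "5"}` or a misspelled argument name raises `TypeError` in this sketch, so the mistake surfaces when the configuration is set rather than deep inside a run.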
The simple example above can be extended to create a configuration file, for example:
import config as cf
import data
import train
cf.add({
    data.get_data: cf.Args(
        train_data_path="url_lists/all_train.txt",
        val_data_path="url_lists/all_val.txt",
    ),
    data.dataset_reader: cf.Args(
        type_="cnn_dm",
        source_max_tokens=1022,
        target_max_tokens=54,
    ),
    train.make_model: cf.Args(type_="bart"),
    train.Trainer.make_optimizer: cf.Args(
        type_="huggingface_adamw",
        lr=3e-5,
        correct_bias=True,
    ),
    train.Trainer.__init__: cf.Args(
        num_epochs=3,
        learning_rate_scheduler="polynomial_decay",
        grad_norm=1.0,
    ),
})
With this approach, the configuration file makes it clear which (hyper)parameters are set and where, which improves its overall readability.
🐍 Last but not least, the configuration file is written in Python, so you can use variables, lambdas, etc. to further modularize it.
In case you want to change one variable at a time, this package supports configuration from the command line, for example:
python example.py --sorted='Args(reverse=True)'
import sys
import config as cf
cf.add(cf.parse_cli_args(sys.argv[1:]))
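To illustrate the shape of such arguments, the sketch below parses `--name=Args(...)` strings using only the standard library. It is an assumption-laden stand-in, not the package's parser: `parse_args_expression` and `parse_cli` are hypothetical names, values are limited to Python literals, and the function name is kept as a plain string rather than resolved to a function.

```python
import ast
from typing import Any, Dict, List


def parse_args_expression(expression: str) -> Dict[str, Any]:
    """Turn a string such as "Args(reverse=True)" into {"reverse": True}."""
    node = ast.parse(expression, mode="eval").body
    is_args_call = (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "Args"
    )
    if not is_args_call:
        raise ValueError(f"expected Args(...), got {expression!r}")
    if node.args:
        raise ValueError("only keyword arguments are supported")
    # literal_eval accepts literals only, so arbitrary code is never executed.
    return {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}


def parse_cli(argv: List[str]) -> Dict[str, Dict[str, Any]]:
    """Parse items like "--sorted=Args(reverse=True)" into a nested dict."""
    parsed: Dict[str, Dict[str, Any]] = {}
    for item in argv:
        if not item.startswith("--") or "=" not in item:
            raise ValueError(f"expected --name=Args(...), got {item!r}")
        name, expression = item[2:].split("=", 1)
        parsed[name] = parse_args_expression(expression)
    return parsed


# The shell removes the single quotes before Python sees the argument.
overrides = parse_cli(["--sorted=Args(reverse=True)"])
```

Restricting values to literals keeps the sketch safe to run on untrusted input; whether the package applies the same restriction is not stated here.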
Lastly, it's useful to track the configuration file by logging it. This package supports that via config.log. In the example below, we log the configuration to Comet.
from comet_ml import Experiment
import config as cf
experiment = Experiment()
experiment.log_parameters(cf.log())
In multiprocessing, it may be useful to share the configuration between processes. In this case, the configuration can be exported from one process and then imported into another:
from multiprocessing import Process
import config as cf
def handler(configs: cf.Config):
    cf.add(configs)

if __name__ == "__main__":
    process = Process(target=handler, args=(cf.export(),))
    process.start()
    process.join()
In a large code base, it might be hard to tell if the configuration has been set for every function call. In this case, we've exposed config.trace, which can double-check every function call against the configuration:
import sys
import config as cf
def configured(a=111):
    pass
sys.settrace(cf.trace)
cf.add({configured: cf.Args(a=1)})
configured() # `cf.trace` issues a WARNING!
configured(a=cf.get())
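For readers unfamiliar with `sys.settrace`, the following standard-library sketch shows the general mechanism of checking calls from a trace function. It is a simplified illustration, not the package's `trace`: the `EXPECTED` registry and `check_calls` are hypothetical, and mismatches are collected in a list instead of being logged.

```python
import sys
from typing import Any, List, Tuple


def configured(a=111):
    pass


# Hypothetical registry: code object -> argument values the configuration expects.
EXPECTED = {configured.__code__: {"a": 1}}
mismatches: List[Tuple[str, str, Any]] = []


def check_calls(frame, event, arg):
    """Global trace function, invoked whenever a new Python frame is entered."""
    if event == "call" and frame.f_code in EXPECTED:
        for name, value in EXPECTED[frame.f_code].items():
            actual = frame.f_locals[name]  # arguments are already bound here
            if actual != value:
                mismatches.append((frame.f_code.co_name, name, actual))
    return None  # returning None skips per-line tracing of the frame


sys.settrace(check_calls)
try:
    configured()     # falls back to the default a=111, so it is recorded
    configured(a=1)  # matches the expected value, nothing is recorded
finally:
    sys.settrace(None)  # always remove the trace function afterwards
```

Because the trace function runs for every Python call in the thread, this style of check adds overhead to the whole program, not just to the configured functions.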
We also have another option for faster tracing with config.enable_fast_trace. Instead of a system-wide trace, this traces the configured functions by modifying their code and inserting a trace function at the beginning of the function definition. This has a MUCH lower overhead; however, it is still in beta due to the number of edge cases.
In a large code base, you may have a lot of configurations, some of which are no longer being used. config.purge can be run on process exit, and it'll warn you if configurations were not used.
import atexit
import config as cf
atexit.register(cf.purge)
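The idea of detecting stale settings at exit can be sketched with the standard library alone. This is not the package's implementation: `TrackedConfig` and its methods are hypothetical, and the sketch tracks plain string keys rather than function arguments.

```python
import atexit
import warnings
from typing import Any, Dict, List, Set


class TrackedConfig:
    """Hypothetical store that remembers which settings were ever read."""

    def __init__(self, values: Dict[str, Any]) -> None:
        self._values = dict(values)
        self._used: Set[str] = set()

    def get(self, key: str) -> Any:
        value = self._values[key]  # raises KeyError for unknown keys
        self._used.add(key)
        return value

    def unused(self) -> List[str]:
        """Return the keys that were set but never read, in sorted order."""
        return sorted(set(self._values) - self._used)

    def warn_unused(self) -> None:
        for key in self.unused():
            warnings.warn(f"configuration {key!r} was never used")


settings = TrackedConfig({"lr": 3e-5, "num_epochs": 3})
atexit.register(settings.warn_unused)  # runs once when the interpreter exits

settings.get("lr")  # "num_epochs" is never read, so it is reported at exit
```

Registering the check with `atexit` means it sees the full run, so a setting read late in the program is not flagged by mistake.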