diff --git a/docs/environment.md b/docs/environment.md index 13713b4..c5792b2 100644 --- a/docs/environment.md +++ b/docs/environment.md @@ -309,6 +309,19 @@ The idea here is to allow a catch-all Uvicorn config variable in the spirit of ` - `LOG_COLORS="true"` - `LOG_COLORS="false"` +`LOG_FILTERS` + +- Comma-separated string identifying log records to filter out. The string will be split on commas and converted to a set. Each log message will then be checked for each filter in the set. If any matches are present in the log message, the logger will not log that message. +- Default: `None` (don't filter out any log records, just log every record) +- Custom: `LOG_FILTERS="/health, /heartbeat"` (filter out log messages that contain either the string `"/health"` or the string `"/heartbeat"`, to avoid logging health checks) +- See also: + - [AWS Builders' Library: Implementing health checks](https://aws.amazon.com/builders-library/implementing-health-checks/) + - [AWS Elastic Load Balancing docs: Target groups - Health checks for your target groups](https://docs.aws.amazon.com/elasticloadbalancing/latest/application/target-group-health-checks.html) + - [benoitc/gunicorn#1781](https://github.com/benoitc/gunicorn/issues/1781) + - [Python 3 docs: How-To - Logging Cookbook - Using Filters to impart contextual information](https://docs.python.org/3/howto/logging-cookbook.html#using-filters-to-impart-contextual-information) + - [Python 3 docs: What's new in Python 3.2 - logging](https://docs.python.org/3/whatsnew/3.2.html#logging) + - [Django 4.0 docs: Topics - Logging](https://docs.djangoproject.com/en/4.0/topics/logging/) + `LOG_FORMAT` - [Python logging format](https://docs.python.org/3/library/logging.html#formatter-objects). diff --git a/docs/logging.md b/docs/logging.md index a2edb38..7b358e1 100644 --- a/docs/logging.md +++ b/docs/logging.md @@ -10,6 +10,82 @@ See [environment variable reference](environment.md). - When running Uvicorn alone, logging is configured programmatically from within the [`start.py` start script](https://github.com/br3ndonland/inboard/blob/HEAD/inboard/start.py), by passing the `LOGGING_CONFIG` dictionary to `uvicorn.run()`. - When running Gunicorn with the Uvicorn worker, the logging configuration dictionary is specified within the [`gunicorn_conf.py`](https://github.com/br3ndonland/inboard/blob/HEAD/inboard/gunicorn_conf.py) configuration file. +## Filtering log messages + +[Filters](https://docs.python.org/3/howto/logging-cookbook.html#using-filters-to-impart-contextual-information) identify log messages to filter out, so that the logger does not log messages containing any of the filters. If any matches are present in a log message, the logger will not output the message. + +One of the primary use cases for log message filters is health checks. When applications with APIs are deployed, it is common to perform "health checks" on them. Health checks are usually performed by making HTTP requests to a designated API endpoint. These checks are made at frequent intervals, and so they can fill up the access logs with large numbers of unnecessary log records. To avoid logging health checks, add those endpoints to the `LOG_FILTERS` environment variable. + +The `LOG_FILTERS` environment variable can be used to specify filters as a comma-separated string, like `LOG_FILTERS="/health, /heartbeat"`. To then add the filters to a class instance, the `LogFilter.set_filters()` method can produce the set of filters from the environment variable value. + +```py +# start a REPL session in a venv in which inboard is installed +.venv ❯ python +>>> import os +>>> import inboard +>>> os.environ["LOG_FILTERS"] = "/health, /heartbeat" +>>> inboard.LogFilter.set_filters() +{'/heartbeat', '/health'} +``` + +inboard will do this automatically by reading the `LOG_FILTERS` environment variable. + +Let's see this in action by using the [VSCode debugger](https://code.visualstudio.com/docs/python/debugging), with the configuration in _inboard/.vscode/launch.json_. We'll have one terminal instance open to see the server logs from Uvicorn, and another one open to make client HTTP requests. + +Start the server: + +```log +/path/to/inboard +❯ export LOG_FILTERS="/health, /heartbeat" + +/path/to/inboard +❯ /usr/bin/env /path/to/inboard/.venv/bin/python \ + /path/to/the/python/vscode/extension/pythonFiles/lib/python/debugpy/launcher \ + 61527 -- -m inboard.start +DEBUG: Logging dict config loaded from inboard.logging_conf. +DEBUG: Checking for pre-start script. +DEBUG: Running pre-start script with python /path/to/inboard/inboard/app/prestart.py. +[prestart] Hello World, from prestart.py! Add database migrations and other scripts here. +DEBUG: Ran pre-start script with python /path/to/inboard/inboard/app/prestart.py. +DEBUG: App module set to inboard.app.main_fastapi:app. +DEBUG: Running Uvicorn without Gunicorn. +INFO: Will watch for changes in these directories: ['/path/to/inboard/inboard'] +INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit) +INFO: Started reloader process [69531] using watchgod +INFO: Started server process [69552] +INFO: Waiting for application startup. +INFO: Application startup complete. +``` + +Make a request to an endpoint that should be logged, using an HTTP client like [HTTPie](https://httpie.io/) or the [HTTPX CLI](https://www.python-httpx.org/): + +```sh +❯ http :8000 -b +{ + "Hello": "World" +} + +``` + +The request will be logged through `uvicorn.access`: + +```log +INFO: 127.0.0.1:61575 - "GET / HTTP/1.1" 200 +``` + +Next, make a request to an endpoint that should be filtered out of the logs. The username and password you see here are just test values set in the _launch.json_. + +```sh +❯ http :8000/health -a test_user:r4ndom_bUt_memorable -b +{ + "application": "inboard", + "message": null, + "status": "active" +} +``` + +The server does not display a log message. + ## Extending the logging config If inboard is installed from PyPI with `poetry add inboard` or `pip install inboard`, the logging configuration can be easily customized as explained in the [Python logging configuration docs](https://docs.python.org/3/library/logging.config.html). diff --git a/inboard/__init__.py b/inboard/__init__.py index 5eccfb4..e53f516 100644 --- a/inboard/__init__.py +++ b/inboard/__init__.py @@ -13,10 +13,11 @@ from .app.utilities_starlette import BasicAuth as StarletteBasicAuth except ImportError: # pragma: no cover pass -from .logging_conf import LOGGING_CONFIG, configure_logging +from .logging_conf import LOGGING_CONFIG, LogFilter, configure_logging __all__ = ( "LOGGING_CONFIG", + "LogFilter", "StarletteBasicAuth", "configure_logging", "fastapi_basic_auth", diff --git a/inboard/logging_conf.py b/inboard/logging_conf.py index 5bf64ed..c9a41b7 100644 --- a/inboard/logging_conf.py +++ b/inboard/logging_conf.py @@ -4,7 +4,7 @@ import os import sys from pathlib import Path -from typing import Optional +from typing import Optional, Set def find_and_load_logging_conf(logging_conf: str) -> dict: @@ -48,6 +48,62 @@ def configure_logging( raise +class LogFilter(logging.Filter): + """Subclass of `logging.Filter` used to filter log messages. + --- + + Filters identify log messages to filter out, so that the logger does not log + messages containing any of the filters. If any matches are present in a log + message, the logger will not output the message. + + The environment variable `LOG_FILTERS` can be used to specify filters as a + comma-separated string, like `LOG_FILTERS="/health, /heartbeat"`. To then + add the filters to a class instance, the `LogFilter.set_filters()` + method can produce the set of filters from the environment variable value. + """ + + __slots__ = "name", "nlen", "filters" + + def __init__( + self, + name: str = "", + filters: Optional[Set[str]] = None, + ) -> None: + """Initialize a filter.""" + self.name = name + self.nlen = len(name) + self.filters = filters + + def filter(self, record: logging.LogRecord) -> bool: + """Determine if the specified record is to be logged. + + Returns True if the record should be logged, or False otherwise. + """ + if self.filters is None: + return True + message = record.getMessage() + return all(match not in message for match in self.filters) + + @staticmethod + def set_filters(input_filters: Optional[str] = None) -> Optional[Set[str]]: + """Set log message filters. + + Filters identify log messages to filter out, so that the logger does not + log messages containing any of the filters. The argument to this method + should be supplied as a comma-separated string. The string will be split + on commas and converted to a set of strings. + + This method is provided as a `staticmethod`, instead of as part of `__init__`, + so that it only runs once when setting the `LOG_FILTERS` module-level constant. + In contrast, the `__init__` method runs each time a logger is instantiated. + """ + return ( + {log_filter.strip() for log_filter in str(log_filters).split(sep=",")} + if (log_filters := input_filters or os.getenv("LOG_FILTERS")) + else None + ) + + LOG_COLORS = ( True if (value := os.getenv("LOG_COLORS")) and value.lower() == "true" @@ -55,11 +111,15 @@ def configure_logging( if value and value.lower() == "false" else sys.stdout.isatty() ) +LOG_FILTERS = LogFilter.set_filters() LOG_FORMAT = str(os.getenv("LOG_FORMAT", "simple")) LOG_LEVEL = str(os.getenv("LOG_LEVEL", "info")).upper() LOGGING_CONFIG: dict = { "version": 1, "disable_existing_loggers": False, + "filters": { + "filter_log_message": {"()": LogFilter, "filters": LOG_FILTERS}, + }, "formatters": { "simple": { "class": "logging.Formatter", @@ -87,6 +147,7 @@ def configure_logging( "handlers": { "default": { "class": "logging.StreamHandler", + "filters": ["filter_log_message"], "formatter": LOG_FORMAT, "level": LOG_LEVEL, "stream": "ext://sys.stdout", diff --git a/tests/test_logging_conf.py b/tests/test_logging_conf.py index de3dfb4..764b0eb 100644 --- a/tests/test_logging_conf.py +++ b/tests/test_logging_conf.py @@ -1,5 +1,6 @@ import os from pathlib import Path +from typing import Set import pytest from pytest_mock import MockerFixture @@ -145,6 +146,9 @@ class TestLoggingOutput: --- """ + def _uvicorn_access_log_args(self, path: str) -> tuple: + return ('%s - "%s %s HTTP/%s" %d', "127.0.0.1:60364", "GET", path, "1.1", 200) + def test_logging_output_default(self, capfd: pytest.CaptureFixture) -> None: """Test logger output with default format.""" logger = logging_conf.logging.getLogger() @@ -177,3 +181,74 @@ def test_logging_output_custom_format( assert log_level_output in captured.out assert f"Logging dict config loaded from {logging_conf_file}." in captured.out assert "Hello, Customized World!" in captured.out + + @pytest.mark.parametrize( + "log_filters_input,log_filters_output", + ( + ("/health", {"/health"}), + ("/health, /heartbeat", {"/health", "/heartbeat"}), + ("foo, bar, baz", {"foo", "bar", "baz"}), + ), + ) + def test_logging_filters( + self, + capfd: pytest.CaptureFixture, + log_filters_input: str, + log_filters_output: Set[str], + mocker: MockerFixture, + monkeypatch: pytest.MonkeyPatch, + ) -> None: + """Test that log message filters are applied as expected.""" + monkeypatch.setenv("LOG_FILTERS", log_filters_input) + mocker.patch.object( + logging_conf, "LOG_FILTERS", logging_conf.LogFilter.set_filters() + ) + mocker.patch.dict( + logging_conf.LOGGING_CONFIG["filters"]["filter_log_message"], + {"()": logging_conf.LogFilter, "filters": logging_conf.LOG_FILTERS}, + clear=True, + ) + path_to_log = "/status" + logger = logging_conf.logging.getLogger("test.logging_conf.output.filters") + logging_conf.configure_logging(logger=logger) + logger.info(*self._uvicorn_access_log_args(path_to_log)) + logger.info(log_filters_input) + for log_filter in log_filters_output: + logger.info(*self._uvicorn_access_log_args(log_filter)) + captured = capfd.readouterr() + assert logging_conf.LOG_FILTERS == log_filters_output + assert path_to_log in captured.out + for log_filter in log_filters_output: + assert log_filter not in captured.out + + @pytest.mark.parametrize( + "log_filters_input", ("/health", "/healthy /heartbeat", "/healthy foo bar") + ) + def test_logging_filters_with_known_limitations( + self, + capfd: pytest.CaptureFixture, + log_filters_input: str, + mocker: MockerFixture, + ) -> None: + """Test known limitations of log message filters. + + - Filters in the input string should be separated with commas, not spaces. + - Filters are applied with string matching, so a filter of `/health` will + also filter out messages including `/healthy`. + """ + filters = logging_conf.LogFilter.set_filters(log_filters_input) + mocker.patch.dict( + logging_conf.LOGGING_CONFIG["filters"]["filter_log_message"], + {"()": logging_conf.LogFilter, "filters": filters}, + clear=True, + ) + logger = logging_conf.logging.getLogger("test.logging_conf.output.filtererrors") + logging_conf.configure_logging(logger=logger) + logger.info(log_filters_input) + logger.info("/healthy") + captured = capfd.readouterr() + assert log_filters_input not in captured.out + if log_filters_input == "/health": + assert "/healthy" not in captured.out + else: + assert "/healthy" in captured.out diff --git a/tests/test_start.py b/tests/test_start.py index cf45721..cc459e4 100644 --- a/tests/test_start.py +++ b/tests/test_start.py @@ -258,11 +258,16 @@ def test_set_uvicorn_options_default_from_json( def test_set_uvicorn_options_custom_from_json( self, uvicorn_options_custom: dict, + mocker: MockerFixture, monkeypatch: pytest.MonkeyPatch, ) -> None: """Assert that the correct Uvicorn server options are set, in the correct order, when the `UVICORN_CONFIG_OPTIONS` environment variable is also set. + + The `LOGGING_CONFIG` dictionary can't be encoded as a JSON string when it has + a class definition (the filter class), so the "filters" dict has to be removed. """ + mocker.patch.dict(uvicorn_options_custom["log_config"]["filters"], clear=True) uvicorn_options_json = start.json.dumps(uvicorn_options_custom) monkeypatch.setenv("UVICORN_CONFIG_OPTIONS", uvicorn_options_json) monkeypatch.setenv("WITH_RELOAD", "false")