Merge branch 'master' into regisb/docs-devstack-settings

openedx · Dec 10, 2024 · 784b732 · 784b732
2 parents 99d8ff3 + 31476e3
commit 784b732
Show file tree

Hide file tree

Showing 32 changed files with 585 additions and 563 deletions.
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -16,7 +16,7 @@ jobs:
     strategy:
       matrix:
         os: [ubuntu-latest]
-        python-version: ['3.8', '3.11', '3.12']
+        python-version: ['3.12']
         toxenv: [docs, quality, django42]
 
     steps:

diff --git a/.github/workflows/pypi-publish.yml b/.github/workflows/pypi-publish.yml
@@ -16,7 +16,7 @@ jobs:
       - name: setup python
         uses: actions/setup-python@v5
         with:
-          python-version: 3.8
+          python-version: 3.12
 
       - name: Install pip
         run: pip install -r requirements/pip.txt

diff --git a/.gitignore b/.gitignore
@@ -22,6 +22,7 @@ lib64
 
 # Installer logs
 pip-log.txt
+venv
 
 # Unit test / coverage reports
 .cache/

diff --git a/CHANGELOG.rst b/CHANGELOG.rst
@@ -11,6 +11,49 @@ Change Log
 
 .. There should always be an "Unreleased" section for changes pending release.
 
+7.1.0 - 2024-12-05
+------------------
+Added
+~~~~~
+* Added signals monitoring_support_process_request, monitoring_support_process_response, and monitoring_support_process_exception to the MonitoringSupportMiddleware to enable plugging in monitoring capabilities.
+
+7.0.1 - 2024-11-21
+------------------
+Fixed
+~~~~~
+* Fix bug where code owner custom attributes were being defined, even when the CODE_OWNER_MAPPINGS setting was not defined.
+
+7.0.0 - 2024-10-16
+------------------
+Removed
+~~~~~~~
+* Remove unused ``background_task`` monitoring function.
+* Remove ``get_current_transaction`` (used internally only) from the public API.
+
+[6.1.0] - 2024-10-15
+---------------------
+Changed
+~~~~~~~
+* Added Datadog implementation of  ``set_monitoring_transaction_name`` and refactored the functionality.
+
+[6.0.0] - 2024-10-09
+---------------------
+Added
+~~~~~
+* Added support for python3.12
+* Dropped support for python<3.12 versions
+
+[5.16.0] - 2024-09-27
+---------------------
+Added
+~~~~~
+* Added a new method to backends for ``tag_root_span_with_error`` and added Datadog implementation of the functionality.
+* Uses the new method to tag the root span when processing exceptions.
+
+Changed
+~~~~~~~
+* Renamed ``CachedCustomMonitoringMiddleware`` to ``MonitoringSupportMiddleware`` and deprecated the old name. It will be removed in a future release.
+
 [5.15.0] - 2024-07-29
 ---------------------
 Added

diff --git a/docs/decisions/0006-content-security-policy-middleware.rst b/docs/decisions/0006-content-security-policy-middleware.rst
@@ -17,14 +17,18 @@ Context
 - Gather reports of violations, allowing deployers to discover vulnerabilities even while CSP prevents them from being exploited
 - Enforce the use of a business process for adding new external scripts to the site
 
-At the most basic level, CSP allows the server to tell the browser to refuse to load any Javascript that isn't on an allowlist. More advanced deployments can restrict other kinds of resources (including service workers, child frames, images, and XHR connections). A static set of headers can only support an allowlist that is based on domain names and paths, but features such as ``strict-dynamic`` and ``nonce`` allow the server to vouch for scripts individually. This is more secure but requires more integration with application code.
+The most basic use of CSP is to allow the server to tell the browser to refuse to load any Javascript that isn't on an allowlist. More complete deployments can restrict other kinds of resources (including service workers, child frames, images, and XHR connections). A static set of headers can only support an allowlist that is based on domain names and paths, but features such as ``strict-dynamic`` and ``nonce`` allow the server to vouch for ``<script>`` elements individually. This is more secure but requires more integration with application code.
 
 If fully deployed, there is a reduction of risk to learners, partners, and the site's reputation. However, for full effect, these security headers must be returned on all responses, including non-HTML resources and error pages. CSP headers must also be tailored not only to each IDA and MFE, but to each *deployment*, as different deployers will load images and scripts from different domains.
 
 CSP policies are often very long (1 kB in one example) and are built from a number of directives that each contain an allowlist or configuration. They cannot be loosened by the addition of later headers, but only tightened. This means that the server must return the entire policy in one shot, or alternatively emit some directives as separate headers. The use of ``strict-dynamic`` can shrink the header and make the policy more flexible, but requires integration between body and headers: The headers must either contain hashes of the scripts, or nonces that are also referenced in the HTML.
 
 Microfrontends are generally served from static file hosting such as Amazon S3 and cannot take advantage of ``strict-dynamic``; they are limited to source allowlists set by the fileserver or a CDN. In contrast, Django-based IDAs could use ``strict-dynamic`` (via nonces and hashes) because of the ability to customize header and body contents for each response.
 
+Different views also have different needs. For example, some of our views need to allow framing by arbitrary other websites, but by default we want all other pages to have a restrictive ``frame-ancestors`` directive. Because CSP is purely additive, this means we can't take the approach of returning one default header with a restrictive ``frame-ancestors`` and then a second header to override it for this handful of views; unlike ``X-Frame-Options``, CSP does not have a concept of overrides. Instead, these views need to return a different version of the default header. String manipulation in a post-response middleware is unpalatable, so an ideally flexible solution might involve views registering their individual needs with a framework that then compiles the results into a final header (with all overrides applied at the individual directive level). Mozilla's `django-csp package <https://github.com/mozilla/django-csp>`__ takes this approach.
+
+Because it's easy to accidentally break the entire website by use of an overly restrictive CSP, there is a way to try out a policy without enforcing it, via the ``Content-Security-Policy-Report-Only`` header. This runs the same policy but only reports violations to a specified server rather than enforcing the policy. For example, if you have a policy that restricts to loading scripts from domains A, B, C, and D and you want to see if you can remove D from that list, you would deploy the pair ``Content-Security-Policy: script-src A B C D`` and ``Content-Security-Policy-Report-Only: script-src A B C`` and check for reports about domain D. This reporting mechanism is critical for low-impact rollouts, so any CSP solution needs to accomodate it.
+
 As of April 2023, most MFEs and other IDAs include inline Javascript, which severely limits the protection that CSP can provide. However, an initial deployment of CSP that locks down script source domains to an allowlist and reports on the locations of remaining inline scripts would be a good start towards cleaning up these inline scripts. In other words, the goal here is to get *started* on CSP.
 
 Decision
@@ -37,7 +41,7 @@ We will enable this middleware for all Django-based IDAs.
 Consequences
 ************
 
-Django-based IDAs will for now be restricted to using static headers, and will not support ``strict-dynamic``. This is equivalent in configurability to having a CDN attach headers, and is good enough for a first pass, but will almost certainly need to be replaced with a more integrated system later as IDAs are adjusted to become amenable to ``strict-dynamic``.
+Django-based IDAs will for now be restricted to using static headers, and will not support ``strict-dynamic`` or view-specific variation. This is equivalent in configurability to having a CDN attach headers, and is good enough for a first pass, but will almost certainly need to be replaced with a more integrated system later as IDAs are adjusted to become amenable to ``strict-dynamic`` and we've already collected all the gains we can achieve with service-wide policies.
 
 Rejected Alternatives
 *********************

diff --git a/edx_django_utils/__init__.py b/edx_django_utils/__init__.py
@@ -2,7 +2,7 @@
 EdX utilities for Django Application development..
 """
 
-__version__ = "5.15.0"
+__version__ = "7.1.0"
 
 default_app_config = (
     "edx_django_utils.apps.EdxDjangoUtilsConfig"

diff --git a/edx_django_utils/ip/internal/tests/test_ip.py b/edx_django_utils/ip/internal/tests/test_ip.py
@@ -150,7 +150,8 @@ def test_get_client_ips_via_xff(self, xff, remote_addr, expected_strs):
         # Warning: Bad IP address
         ('Some-Thing', 0, {'HTTP_SOME_THING': 'XXXXXXXXX'}, None, "invalid IP"),
     )
-    def test_get_trusted_header_ip(self, header_name, index, add_meta, expected, warning_substr):
+    def test_get_trusted_header_ip(  # pylint: disable=too-many-positional-arguments
+            self, header_name, index, add_meta, expected, warning_substr):
         self.request.META.update(add_meta)
 
         with warning_messages() as caught_warnings:

diff --git a/edx_django_utils/monitoring/README.rst b/edx_django_utils/monitoring/README.rst
@@ -32,18 +32,18 @@ Feature support matrix for built-in telemetry backends:
      - ✅
      - ❌
      - ✅
-   * - Retrieve and manipulate spans (``get_current_transaction``, ``ignore_transaction``, ``set_monitoring_transaction_name``)
+   * - Set local root span name (``set_monitoring_transaction_name``)
+     - ✅
+     - ❌
+     - ✅
+   * - Manipulate spans (``ignore_transaction``)
      - ✅
      - ❌
      - ❌
    * - Record exceptions (``record_exception``)
      - ✅
      - ✅
      - ✅
-   * - Instrument non-web tasks (``background_task``)
-     - ✅
-     - ❌
-     - ❌
 
 Additional requirements for using these backends:
 
@@ -78,18 +78,26 @@ Here is how you add the middleware:
         'edx_django_utils.cache.middleware.RequestCacheMiddleware',
 
         # Add monitoring middleware immediately after RequestCacheMiddleware
+        'edx_django_utils.monitoring.MonitoringSupportMiddleware',
         'edx_django_utils.monitoring.DeploymentMonitoringMiddleware',
         'edx_django_utils.monitoring.CookieMonitoringMiddleware',
         'edx_django_utils.monitoring.CodeOwnerMonitoringMiddleware',
-        'edx_django_utils.monitoring.CachedCustomMonitoringMiddleware',
         'edx_django_utils.monitoring.FrontendMonitoringMiddleware',
         'edx_django_utils.monitoring.MonitoringMemoryMiddleware',
     )
 
-Cached Custom Monitoring Middleware
------------------------------------
+Monitoring Support Middleware and Monitoring Plugins
+----------------------------------------------------
+
+The middleware ``MonitoringSupportMiddleware`` provides a number of monitoring capabilities:
+
+* It enables plugging in outside monitoring capabilities, by sending the signals ``monitoring_support_process_request``, ``monitoring_support_process_response``, and ``monitoring_support_process_exception``. These can be useful because this middleware should be available in all Open edX services, and should appear early enough in the list of middleware to monitor most requests, even those that respond early from another middleware.
+* It allows certain utility methods, like ``accumulate`` and ``increment``, to work appropriately.
+* It adds error span tags to the root span.
+
+In order to use the monitoring signals, import them as follows::
 
-The middleware ``CachedCustomMonitoringMiddleware`` is required to allow certain utility methods, like ``accumulate`` and ``increment``, to work appropriately.
+    from edx_django_utils.monitoring.signals import monitoring_support_process_response
 
 Code Owner Custom Attribute
 ---------------------------

diff --git a/edx_django_utils/monitoring/__init__.py b/edx_django_utils/monitoring/__init__.py
@@ -1,7 +1,8 @@
 """
-Metrics utilities public api
+Monitoring utilities public api
 
-See README.rst for details.
+Does not include signals.py, which is also part of the public api.
+See README.rst for additional details.
 """
 from .internal.backends import DatadogBackend, NewRelicBackend, OpenTelemetryBackend, TelemetryBackend
 from .internal.code_owner.middleware import CodeOwnerMonitoringMiddleware
@@ -15,17 +16,18 @@
     CookieMonitoringMiddleware,
     DeploymentMonitoringMiddleware,
     FrontendMonitoringMiddleware,
-    MonitoringMemoryMiddleware
+    MonitoringMemoryMiddleware,
+    MonitoringSupportMiddleware
 )
-from .internal.transactions import get_current_transaction, ignore_transaction, set_monitoring_transaction_name
+from .internal.transactions import ignore_transaction
 from .internal.utils import (
     accumulate,
-    background_task,
     function_trace,
     increment,
     record_exception,
     set_custom_attribute,
-    set_custom_attributes_for_course_key
+    set_custom_attributes_for_course_key,
+    set_monitoring_transaction_name
 )
 # "set_custom_metric*" methods are deprecated
 from .utils import set_custom_metric, set_custom_metrics_for_course_key
diff --git a/edx_django_utils/monitoring/internal/backends.py b/edx_django_utils/monitoring/internal/backends.py
@@ -62,6 +62,20 @@ def create_span(self, name):
         or create a new root span if not currently in a span.
         """
 
+    @abstractmethod
+    def tag_root_span_with_error(self, exception):
+        """
+        Tags the root span with the given exception. This is primarily useful for
+        Datadog as New Relic handles this behavior correctly. Unclear if this is also needs to
+        be implemented for OTEL.
+        """
+
+    @abstractmethod
+    def set_local_root_span_name(self, name, group=None, priority=None):
+        """
+        Sets the name, group, and priority for a span.
+        """
+
 
 class NewRelicBackend(TelemetryBackend):
     """
@@ -96,6 +110,13 @@ def create_span(self, name):
             nr_transaction = newrelic.agent.current_transaction()
             return newrelic.agent.FunctionTrace(nr_transaction, name)
 
+    def tag_root_span_with_error(self, exception):
+        # Does not need to be implemented for NewRelic, because it is handled automatically.
+        pass
+
+    def set_local_root_span_name(self, name, group=None, priority=None):
+        newrelic.agent.set_transaction_name(name, group, priority)
+
 
 class OpenTelemetryBackend(TelemetryBackend):
     """
@@ -121,6 +142,14 @@ def create_span(self, name):
         # Currently, this is not implemented.
         pass
 
+    def tag_root_span_with_error(self, exception):
+        # Currently, this is not implemented for OTel
+        pass
+
+    def set_local_root_span_name(self, name, group=None, priority=None):
+        # Currently this is not implemented
+        pass
+
 
 class DatadogBackend(TelemetryBackend):
     """
@@ -145,6 +174,15 @@ def record_exception(self):
     def create_span(self, name):
         return self.dd_tracer.trace(name)
 
+    def tag_root_span_with_error(self, exception):
+        root_span = self.dd_tracer.current_root_span()
+        root_span.set_exc_info(type(exception), exception, exception.__traceback__)
+
+    def set_local_root_span_name(self, name, group=None, priority=None):
+        # For Datadog, this updates the 'resource_name' to the given name.
+        local_root_span = self.dd_tracer.current_root_span()
+        local_root_span.resource = name
+
 
 # We're using an lru_cache instead of assigning the result to a variable on
 # module load. With the default settings (pointing to a TelemetryBackend

diff --git a/edx_django_utils/monitoring/internal/code_owner/middleware.py b/edx_django_utils/monitoring/internal/code_owner/middleware.py
@@ -6,7 +6,6 @@
 from django.urls import resolve
 from django.urls.exceptions import Resolver404
 
-from ..transactions import get_current_transaction
 from ..utils import set_custom_attribute
 from .utils import (
     _get_catch_all_code_owner,
@@ -15,9 +14,48 @@
     set_code_owner_custom_attributes
 )
 
+try:
+    import newrelic.agent
+except ImportError:
+    newrelic = None  # pylint: disable=invalid-name
+
 log = logging.getLogger(__name__)
 
 
+class MonitoringTransaction():
+    """
+    Represents a monitoring transaction (likely the current transaction).
+    """
+    def __init__(self, transaction):
+        self.transaction = transaction
+
+    @property
+    def name(self):
+        """
+        The name of the transaction.
+
+        For NewRelic, the name may look like:
+            openedx.core.djangoapps.contentserver.middleware:StaticContentServer
+
+        """
+        if self.transaction and hasattr(self.transaction, 'name'):
+            return self.transaction.name
+        return None
+
+
+def get_current_transaction():
+    """
+    Returns the current transaction. This is only used internally and won't
+    be ported over to the backends framework, because transactions will be
+    very different based on the backend.
+    """
+    current_transaction = None
+    if newrelic:
+        current_transaction = newrelic.agent.current_transaction()
+
+    return MonitoringTransaction(current_transaction)
+
+
 class CodeOwnerMonitoringMiddleware:
     """
     Django middleware object to set custom attributes for the owner of each view.
@@ -107,7 +145,7 @@ def _get_module_from_request_path(self, request):
         # TODO: Replace ImportError with ModuleNotFoundError when Python 3.5 support is dropped.
         except (ImportError, Resolver404) as e:
             return None, str(e)
-        except Exception as e:  # pylint: disable=broad-except; #pragma: no cover
+        except Exception as e:  # pragma: no cover
             # will remove broad exceptions after ensuring all proper cases are covered
             set_custom_attribute('deprecated_broad_except__get_module_from_request_path', e.__class__)
             return None, str(e)
@@ -131,7 +169,7 @@ def _get_module_from_current_transaction(self):
             module = transaction_name.split(':')[0]
             set_custom_attribute('code_owner_transaction_name', transaction_name)
             return module, None
-        except Exception as e:  # pylint: disable=broad-except
+        except Exception as e:
             # will remove broad exceptions after ensuring all proper cases are covered
             set_custom_attribute('deprecated_broad_except___get_module_from_current_transaction', e.__class__)
             return None, str(e)
diff --git a/edx_django_utils/monitoring/internal/code_owner/utils.py b/edx_django_utils/monitoring/internal/code_owner/utils.py
@@ -88,7 +88,9 @@ def get_code_owner_mappings():
     # .. setting_description: Used for monitoring and reporting of ownership. Use a
     #      dict with keys of code owner name and value as a list of dotted path
     #      module names owned by the code owner.
-    code_owner_mappings = getattr(settings, 'CODE_OWNER_MAPPINGS', {})
+    code_owner_mappings = getattr(settings, 'CODE_OWNER_MAPPINGS', None)
+    if code_owner_mappings is None:
+        return None
 
     try:
         for code_owner in code_owner_mappings:
@@ -121,7 +123,7 @@ def _get_catch_all_code_owner():
     try:
         code_owner = get_code_owner_from_module('*')
         return code_owner
-    except Exception as e:  # pylint: disable=broad-except; #pragma: no cover
+    except Exception as e:  # pragma: no cover
         # will remove broad exceptions after ensuring all proper cases are covered
         set_custom_attribute('deprecated_broad_except___get_module_from_current_transaction', e.__class__)
         return None
-Original file line number
+Diff line change
@@ Expand Up / @@ -22,6 +22,7 @@ lib64 @@
     # Installer logs
     pip-log.txt
+    venv
     # Unit test / coverage reports
     .cache/
@@ Expand Down @@