Repeated headers list for ASGI frameworks #2361

Merged
merged 24 commits into from
Jun 20, 2024
Changes from 18 commits
Commits
c7f28eb
avoid loosing repeated HTTP headers
samuelcolvin Feb 23, 2024
2f6a998
fix fof wsgi, test in falcon
samuelcolvin Feb 23, 2024
88a6553
add changelog
samuelcolvin Feb 26, 2024
109ac8c
add more tests
samuelcolvin Feb 26, 2024
d71e093
linting
samuelcolvin Feb 26, 2024
8c89f4a
Merge branch 'main' into avoid-loosing-repeated-headers
samuelcolvin Mar 18, 2024
3ad7587
fix falcon and flask
samuelcolvin Mar 18, 2024
f29c5b9
remove unused test
samuelcolvin Mar 18, 2024
113d346
Merge branch 'main' into avoid-loosing-repeated-headers
lzchen Mar 18, 2024
0fa0a36
Use a list for repeated HTTP headers
samuelcolvin Mar 20, 2024
8f7ff48
linting
samuelcolvin Mar 20, 2024
a9a260a
Merge branch 'main' into repeated-headers-list
samuelcolvin May 1, 2024
dc39259
add changelog entry
samuelcolvin May 1, 2024
2ea4ebd
update docs and improve fastapi tests
samuelcolvin May 1, 2024
94ad9c1
revert changes in wsgi based webframeworks
samuelcolvin May 1, 2024
299ea99
fix linting
samuelcolvin May 1, 2024
7fe3d12
Merge branch 'main' into repeated-headers-list
ocelotl May 6, 2024
b32716b
Merge branch 'main' into repeated-headers-list
ocelotl May 8, 2024
26d9c73
Merge branch 'main' into repeated-headers-list
ocelotl May 14, 2024
fa3683f
Fix import path of typing symbols
ocelotl May 14, 2024
8dc8c16
Merge branch 'main' into repeated-headers-list
lzchen May 24, 2024
06c9816
Merge branch 'main' into repeated-headers-list
lzchen May 28, 2024
6c3b8e5
Merge branch 'main' into repeated-headers-list
lzchen May 30, 2024
d5bfbc1
Merge branch 'main' into repeated-headers-list
ocelotl Jun 20, 2024
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -19,6 +19,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   ([#2425](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/2425))
 - `opentelemetry-instrumentation-flask` Add `http.method` to `span.name`
   ([#2454](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/2454))
+- Record repeated HTTP headers in lists, rather than comma-separated strings, for ASGI-based web frameworks
+  ([#2361](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/2361))

 ### Added

Original file line number Diff line number Diff line change
@@ -129,10 +129,10 @@ def client_response_hook(span: Span, message: dict):

 The name of the added span attribute will follow the format ``http.request.header.<header_name>`` where ``<header_name>``
 is the normalized HTTP header name (lowercase, with ``-`` replaced by ``_``). The value of the attribute will be a
-single item list containing all the header values.
+list containing the header values.

 For example:
-``http.request.header.custom_request_header = ["<value1>,<value2>"]``
+``http.request.header.custom_request_header = ["<value1>", "<value2>"]``

 Response headers
 ****************
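
To make the documented behavior concrete, here is a minimal sketch of how a repeated request header would map to a list-valued attribute under the naming rule quoted above. It is an illustration only, not the instrumentation's code: `normalize_name` and `to_attributes` are hypothetical helper names, and the header values are the placeholders from the docs.

```python
from collections import defaultdict


def normalize_name(header_name: str) -> str:
    """Lowercase the header name and replace '-' with '_', as the docs describe."""
    return header_name.lower().replace("-", "_")


def to_attributes(raw_headers, prefix="http.request.header."):
    """Group repeated headers into one list-valued attribute per header name.

    `raw_headers` is assumed to be an iterable of (name, value) string pairs.
    """
    grouped = defaultdict(list)
    for name, value in raw_headers:
        grouped[prefix + normalize_name(name)].append(value)
    return dict(grouped)


attrs = to_attributes(
    [
        ("Custom-Request-Header", "<value1>"),
        ("Custom-Request-Header", "<value2>"),
    ]
)
print(attrs)
# {'http.request.header.custom_request_header': ['<value1>', '<value2>']}
```

Before this change the same input would have produced a single-item list holding the joined string `"<value1>,<value2>"`.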
@@ -163,10 +163,10 @@ def client_response_hook(span: Span, message: dict):

 The name of the added span attribute will follow the format ``http.response.header.<header_name>`` where ``<header_name>``
 is the normalized HTTP header name (lowercase, with ``-`` replaced by ``_``). The value of the attribute will be a
-single item list containing all the header values.
+list containing the header values.

 For example:
-``http.response.header.custom_response_header = ["<value1>,<value2>"]``
+``http.response.header.custom_response_header = ["<value1>", "<value2>"]``

 Sanitizing headers
 ******************
@@ -193,9 +193,10 @@ def client_response_hook(span: Span, message: dict):

 import typing
 import urllib
+from collections import defaultdict
 from functools import wraps
 from timeit import default_timer
-from typing import Any, Awaitable, Callable, Tuple
+from typing import Any, Awaitable, Callable, DefaultDict, Tuple

 from asgiref.compatibility import guarantee_single_callable

@@ -339,24 +340,19 @@ def collect_custom_headers_attributes(
     sanitize: SanitizeValue,
     header_regexes: list[str],
     normalize_names: Callable[[str], str],
-) -> dict[str, str]:
+) -> dict[str, list[str]]:
     """
     Returns custom HTTP request or response headers to be added into SERVER span as span attributes.

     Refer specifications:
     - https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/http.md#http-request-and-response-headers
     """
-    # Decode headers before processing.
-    headers: dict[str, str] = {}
+    headers: DefaultDict[str, list[str]] = defaultdict(list)
     raw_headers = scope_or_response_message.get("headers")
     if raw_headers:
-        for _key, _value in raw_headers:
-            key = _key.decode().lower()
-            value = _value.decode()
-            if key in headers:
-                headers[key] += f",{value}"
-            else:
-                headers[key] = value
+        for key, value in raw_headers:
+            # Decode headers before processing.
+            headers[key.decode()].append(value.decode())

     return sanitize.sanitize_header_values(
         headers,
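
The hunk above replaces string concatenation with a `defaultdict(list)`. A standalone sketch of that collection step is shown here, assuming an ASGI-style message whose `headers` entry is a list of `(bytes, bytes)` pairs. `collect_headers` and the sample `scope` are hypothetical names used for illustration; sanitizing and regex filtering are left out.

```python
from collections import defaultdict
from typing import DefaultDict, List

# Hypothetical ASGI-style scope: header names and values arrive as bytes,
# and a repeated header simply appears more than once.
scope = {
    "type": "http",
    "headers": [
        (b"custom-test-header-1", b"test-header-value-1"),
        (b"custom-test-header-1", b"test-header-value-2"),
        (b"accept", b"*/*"),
    ],
}


def collect_headers(scope_or_message: dict) -> DefaultDict[str, List[str]]:
    """Decode raw headers and keep every value of a repeated header."""
    headers: DefaultDict[str, List[str]] = defaultdict(list)
    for key, value in scope_or_message.get("headers") or []:
        headers[key.decode()].append(value.decode())
    return headers


collected = collect_headers(scope)
print(dict(collected))
# {'custom-test-header-1': ['test-header-value-1', 'test-header-value-2'], 'accept': ['*/*']}
```

Keeping the values separate avoids guessing later whether a comma was a separator or part of a single header value.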
Original file line number Diff line number Diff line change
@@ -152,7 +152,8 @@ def test_http_repeat_request_headers_in_span_attributes(self):
         span_list = self.exporter.get_finished_spans()
         expected = {
             "http.request.header.custom_test_header_1": (
-                "test-header-value-1,test-header-value-2",
+                "test-header-value-1",
+                "test-header-value-2",
             ),
         }
         span = next(span for span in span_list if span.kind == SpanKind.SERVER)
@@ -225,7 +226,8 @@ def test_http_repeat_response_headers_in_span_attributes(self):
         span_list = self.exporter.get_finished_spans()
         expected = {
             "http.response.header.custom_test_header_1": (
-                "test-header-value-1,test-header-value-2",
+                "test-header-value-1",
+                "test-header-value-2",
             ),
         }
         span = next(span for span in span_list if span.kind == SpanKind.SERVER)
Original file line number Diff line number Diff line change
@@ -115,7 +115,7 @@ def client_response_hook(span: Span, message: dict):
 single item list containing all the header values.

 For example:
-``http.request.header.custom_request_header = ["<value1>,<value2>"]``
+``http.request.header.custom_request_header = ["<value1>", "<value2>"]``

 Response headers
 ****************
@@ -146,10 +146,10 @@ def client_response_hook(span: Span, message: dict):

 The name of the added span attribute will follow the format ``http.response.header.<header_name>`` where ``<header_name>``
 is the normalized HTTP header name (lowercase, with ``-`` replaced by ``_``). The value of the attribute will be a
-single item list containing all the header values.
+list containing the header values.

 For example:
-``http.response.header.custom_response_header = ["<value1>,<value2>"]``
+``http.response.header.custom_response_header = ["<value1>", "<value2>"]``

 Sanitizing headers
 ******************
Original file line number Diff line number Diff line change
Expand Up @@ -11,9 +11,9 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import unittest
from timeit import default_timer
from typing import Mapping, Tuple
from unittest.mock import patch

import fastapi
@@ -549,6 +549,24 @@ def test_mark_span_internal_in_presence_of_span_from_other_framework(self):
         )


+class MultiMapping(Mapping):
+
+    def __init__(self, *items: Tuple[str, str]):
+        self._items = items
+
+    def __len__(self):
+        return len(self._items)
+
+    def __getitem__(self, __key):
+        raise NotImplementedError("use .items() instead")
+
+    def __iter__(self):
+        raise NotImplementedError("use .items() instead")
+
+    def items(self):
+        return self._items
+
+
 @patch.dict(
     "os.environ",
     {
@@ -575,13 +593,15 @@ def _create_app():

         @app.get("/foobar")
         async def _():
-            headers = {
-                "custom-test-header-1": "test-header-value-1",
-                "custom-test-header-2": "test-header-value-2",
-                "my-custom-regex-header-1": "my-custom-regex-value-1,my-custom-regex-value-2",
-                "My-Custom-Regex-Header-2": "my-custom-regex-value-3,my-custom-regex-value-4",
-                "My-Secret-Header": "My Secret Value",
-            }
+            headers = MultiMapping(
+                ("custom-test-header-1", "test-header-value-1"),
+                ("custom-test-header-2", "test-header-value-2"),
+                ("my-custom-regex-header-1", "my-custom-regex-value-1"),
+                ("my-custom-regex-header-1", "my-custom-regex-value-2"),
+                ("My-Custom-Regex-Header-2", "my-custom-regex-value-3"),
+                ("My-Custom-Regex-Header-2", "my-custom-regex-value-4"),
+                ("My-Secret-Header", "My Secret Value"),
+            )
             content = {"message": "hello world"}
             return JSONResponse(content=content, headers=headers)
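
A plain `dict` cannot hold the same key twice, which is presumably why the test introduces `MultiMapping`: its `items()` can yield a header name more than once. The sketch below reproduces the helper and shows the effect. `encode_raw_headers` is a hypothetical stand-in for whatever the framework does when it turns a headers mapping into raw `(bytes, bytes)` pairs; it is assumed here to iterate `.items()`, and it is not taken from any library.

```python
from typing import Mapping, Tuple


class MultiMapping(Mapping):
    """Mapping look-alike whose items() may repeat a key (mirrors the test helper)."""

    def __init__(self, *items: Tuple[str, str]):
        self._items = items

    def __len__(self):
        return len(self._items)

    def __getitem__(self, key):
        raise NotImplementedError("use .items() instead")

    def __iter__(self):
        raise NotImplementedError("use .items() instead")

    def items(self):
        return self._items


def encode_raw_headers(headers):
    # Assumption: the consumer only calls .items() and encodes each pair.
    return [
        (name.lower().encode("latin-1"), value.encode("latin-1"))
        for name, value in headers.items()
    ]


plain = {"my-custom-regex-header-1": "my-custom-regex-value-1,my-custom-regex-value-2"}
multi = MultiMapping(
    ("my-custom-regex-header-1", "my-custom-regex-value-1"),
    ("my-custom-regex-header-1", "my-custom-regex-value-2"),
)

raw_plain = encode_raw_headers(plain)  # one pair with a comma-joined value
raw_multi = encode_raw_headers(multi)  # two pairs sharing the same name
print(len(raw_plain), len(raw_multi))
```

With the dict, the response carries one header line whose value contains a comma. With `MultiMapping`, it carries two separate header lines, which is the case the new list-valued attributes are meant to capture.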

@@ -657,10 +677,12 @@ def test_http_custom_response_headers_in_span_attributes(self):
                 "test-header-value-2",
             ),
             "http.response.header.my_custom_regex_header_1": (
-                "my-custom-regex-value-1,my-custom-regex-value-2",
+                "my-custom-regex-value-1",
+                "my-custom-regex-value-2",
             ),
             "http.response.header.my_custom_regex_header_2": (
-                "my-custom-regex-value-3,my-custom-regex-value-4",
+                "my-custom-regex-value-3",
+                "my-custom-regex-value-4",
             ),
             "http.response.header.my_secret_header": ("[REDACTED]",),
         }
Original file line number Diff line number Diff line change
@@ -108,10 +108,10 @@ def client_response_hook(span: Span, message: dict):

 The name of the added span attribute will follow the format ``http.request.header.<header_name>`` where ``<header_name>``
 is the normalized HTTP header name (lowercase, with ``-`` replaced by ``_``). The value of the attribute will be a
-single item list containing all the header values.
+list containing the header values.

 For example:
-``http.request.header.custom_request_header = ["<value1>,<value2>"]``
+``http.request.header.custom_request_header = ["<value1>", "<value2>"]``

 Response headers
 ****************
@@ -142,10 +142,10 @@ def client_response_hook(span: Span, message: dict):

 The name of the added span attribute will follow the format ``http.response.header.<header_name>`` where ``<header_name>``
 is the normalized HTTP header name (lowercase, with ``-`` replaced by ``_``). The value of the attribute will be a
-single item list containing all the header values.
+list containing the header values.

 For example:
-``http.response.header.custom_response_header = ["<value1>,<value2>"]``
+``http.response.header.custom_response_header = ["<value1>", "<value2>"]``

 Sanitizing headers
 ******************
Original file line number Diff line number Diff line change
@@ -18,7 +18,7 @@
 from re import IGNORECASE as RE_IGNORECASE
 from re import compile as re_compile
 from re import search
-from typing import Callable, Iterable, Optional
+from typing import Callable, Iterable, Mapping, Optional
 from urllib.parse import urlparse, urlunparse

 from opentelemetry.semconv.trace import SpanAttributes
@@ -87,32 +87,32 @@ def sanitize_header_value(self, header: str, value: str) -> str:

     def sanitize_header_values(
         self,
-        headers: dict[str, str],
+        headers: Mapping[str, str | list[str]],
         header_regexes: list[str],
         normalize_function: Callable[[str], str],
-    ) -> dict[str, str]:
-        values: dict[str, str] = {}
+    ) -> dict[str, list[str]]:
+        values: dict[str, list[str]] = {}

         if header_regexes:
             header_regexes_compiled = re_compile(
-                "|".join("^" + i + "$" for i in header_regexes),
+                "|".join(header_regexes),
                 RE_IGNORECASE,
             )

-            for header_name in list(
-                filter(
-                    header_regexes_compiled.match,
-                    headers.keys(),
-                )
-            ):
-                header_values = headers.get(header_name)
-                if header_values:
+            for header_name, header_value in headers.items():
+                if header_regexes_compiled.fullmatch(header_name):
                     key = normalize_function(header_name.lower())
-                    values[key] = [
-                        self.sanitize_header_value(
-                            header=header_name, value=header_values
-                        )
-                    ]
+                    if isinstance(header_value, str):
+                        values[key] = [
+                            self.sanitize_header_value(
+                                header_name, header_value
+                            )
+                        ]
+                    else:
+                        values[key] = [
+                            self.sanitize_header_value(header_name, value)
+                            for value in header_value
+                        ]

         return values
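
The reworked method does two things: it matches header names with `fullmatch` on the joined alternation instead of wrapping each pattern in `^...$`, and it accepts either a single string or a list of strings per header. The following standalone sketch mirrors that logic under stated assumptions: it is a free function rather than a method, and `sanitize_value` with its "secret" rule is a hypothetical stand-in for the library's own sanitizer.

```python
import re
from typing import Callable, Dict, List, Mapping, Union

REDACTED = "[REDACTED]"


def sanitize_value(name: str, value: str) -> str:
    # Hypothetical redaction rule, for illustration only.
    return REDACTED if "secret" in name.lower() else value


def sanitize_header_values(
    headers: Mapping[str, Union[str, List[str]]],
    header_regexes: List[str],
    normalize: Callable[[str], str],
) -> Dict[str, List[str]]:
    values: Dict[str, List[str]] = {}
    if not header_regexes:
        return values
    # fullmatch requires the whole header name to match one alternative,
    # so per-pattern "^...$" anchors are no longer needed.
    pattern = re.compile("|".join(header_regexes), re.IGNORECASE)
    for name, value in headers.items():
        if pattern.fullmatch(name):
            raw = [value] if isinstance(value, str) else value
            values[normalize(name.lower())] = [
                sanitize_value(name, item) for item in raw
            ]
    return values


result = sanitize_header_values(
    {
        "custom-test-header-1": ["a", "b"],
        "custom-test-header-10": "ignored",  # only a prefix matches, so it is skipped
        "My-Secret-Header": "My Secret Value",
    },
    ["custom-test-header-1", "my-secret.*"],
    lambda name: "http.response.header." + name.replace("-", "_"),
)
print(result)
# {'http.response.header.custom_test_header_1': ['a', 'b'],
#  'http.response.header.my_secret_header': ['[REDACTED]']}
```

Accepting both `str` and `list[str]` lets callers that still pass one string per header, such as the WSGI-based instrumentations whose changes this PR reverted, share the helper with the ASGI path.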
