enable ASAN/UBSAN in pandas CI #55102

Merged 57 commits on Dec 21, 2023
66d83d1
enable ASAN/UBSAN in pandas CI
WillAyd Sep 11, 2023
7aa2e7a
try input
WillAyd Sep 11, 2023
a5b3808
try removing sanitize
WillAyd Sep 12, 2023
7b58c6d
try no CFLAGS
WillAyd Sep 12, 2023
18111b0
try GH string substituion
WillAyd Sep 12, 2023
438cdfa
change flags in build script
WillAyd Sep 12, 2023
b18cf9d
quotes
WillAyd Sep 12, 2023
69cb6f6
update script run
WillAyd Sep 12, 2023
6f5fb11
single_cpu updates
WillAyd Sep 12, 2023
eb258ca
Merge branch 'main' into pandas-asan
WillAyd Sep 14, 2023
663d6d4
asan checks for datetime funcs
WillAyd Sep 14, 2023
466056d
try smaller config
WillAyd Sep 15, 2023
91f2e17
Merge remote-tracking branch 'upstream/main' into pandas-asan
WillAyd Sep 15, 2023
d4074ca
checkpoint
WillAyd Sep 15, 2023
aeff50e
Merge branch 'main' into pandas-asan
WillAyd Oct 26, 2023
e303ba1
bool fixup
WillAyd Oct 27, 2023
4220d82
Merge branch 'main' into pandas-asan
WillAyd Nov 16, 2023
46d1034
reverts
WillAyd Nov 16, 2023
89706a4
Merge branch 'main' into pandas-asan
WillAyd Nov 17, 2023
929c731
known UB marker
WillAyd Nov 17, 2023
b01242b
Merge branch 'main' into pandas-asan
lithomas1 Nov 28, 2023
6483e07
Finished marking tests with known UB
WillAyd Dec 2, 2023
de13605
Merge remote-tracking branch 'upstream/main' into pandas-asan
WillAyd Dec 2, 2023
b87a210
dedicated CI job
WillAyd Dec 2, 2023
77d1e00
Merge remote-tracking branch 'upstream/main' into pandas-asan
WillAyd Dec 2, 2023
46ec023
identifier fix
WillAyd Dec 2, 2023
8695dca
fixes
WillAyd Dec 2, 2023
05319ae
more test skip
WillAyd Dec 2, 2023
6d76a57
try quotes
WillAyd Dec 2, 2023
f5dd440
simplify ci
WillAyd Dec 2, 2023
12aa1d1
try CFLAGS
WillAyd Dec 2, 2023
628d1c2
preload args
WillAyd Dec 2, 2023
1de633e
skip single_cpu tests
WillAyd Dec 2, 2023
3e295c5
wording
WillAyd Dec 2, 2023
252197e
Merge remote-tracking branch 'upstream/main' into pandas-asan
WillAyd Dec 5, 2023
d5809b8
removed unneeded marker
WillAyd Dec 5, 2023
6266422
float set implementations
WillAyd Dec 5, 2023
b68a533
Revert "float set implementations"
WillAyd Dec 5, 2023
47dc305
Merge branch 'main' into pandas-asan
WillAyd Dec 6, 2023
636b8dd
Merge remote-tracking branch 'upstream/main' into pandas-asan
WillAyd Dec 13, 2023
a03ad1e
change marker name
WillAyd Dec 15, 2023
656edb1
dedicated actions file
WillAyd Dec 15, 2023
2aabda1
consolidated into matrix
WillAyd Dec 15, 2023
a9f2419
Merge remote-tracking branch 'upstream/main' into pandas-asan
WillAyd Dec 15, 2023
3056e5f
fixup
WillAyd Dec 15, 2023
89b2b80
typos
WillAyd Dec 15, 2023
d591b78
fixups
WillAyd Dec 16, 2023
6442066
add qt?
WillAyd Dec 16, 2023
c59703d
Merge branch 'main' into pandas-asan
WillAyd Dec 19, 2023
02bf20d
intentional UB with verbose
WillAyd Dec 19, 2023
01070f3
disable pytest-xdist
WillAyd Dec 20, 2023
9f1adbc
Merge remote-tracking branch 'upstream/main' into pandas-asan
WillAyd Dec 20, 2023
57ed286
original issue
WillAyd Dec 20, 2023
677da0e
remove UB
WillAyd Dec 20, 2023
af0150a
Revert "remove UB"
WillAyd Dec 21, 2023
4647f12
merge fixup
WillAyd Dec 21, 2023
cba79f6
remove UB
WillAyd Dec 21, 2023
11 changes: 9 additions & 2 deletions .github/actions/build_pandas/action.yml
@@ -4,6 +4,12 @@ inputs:
editable:
description: Whether to build pandas in editable mode (default true)
default: true
meson_args:
description: Extra flags to pass to meson
required: false
cflags_adds:
description: Items to append to the CFLAGS variable
required: false
runs:
using: composite
steps:
@@ -24,11 +30,12 @@ runs:

     - name: Build Pandas
       run: |
+        export CFLAGS="$CFLAGS ${{ inputs.cflags_adds }}"
         if [[ ${{ inputs.editable }} == "true" ]]; then
-          pip install -e . --no-build-isolation -v --no-deps \
+          pip install -e . --no-build-isolation -v --no-deps ${{ inputs.meson_args }} \
             --config-settings=setup-args="--werror"
         else
-          pip install . --no-build-isolation -v --no-deps \
+          pip install . --no-build-isolation -v --no-deps ${{ inputs.meson_args }} \
             --config-settings=setup-args="--werror"
         fi
       shell: bash -el {0}
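
To see what the two new inputs do to the build step, here is a minimal Python sketch that assembles the same environment and pip argument list. The helper name `build_command` is hypothetical and not part of the PR; the flag values in the usage lines are the ones the "ASAN / UBSAN" matrix entry passes. The real work is done by the bash step, so treat this only as an illustration of the string expansion.

```python
import shlex


def build_command(editable, meson_args="", cflags_adds="", base_cflags=""):
    """Mirror the 'Build Pandas' step: extend CFLAGS, then build the pip argv.

    Hypothetical helper for illustration; the action itself runs bash.
    """
    # export CFLAGS="$CFLAGS ${{ inputs.cflags_adds }}"
    env = {"CFLAGS": f"{base_cflags} {cflags_adds}"}

    argv = ["pip", "install"]
    if editable:
        argv.append("-e")
    argv += [".", "--no-build-isolation", "-v", "--no-deps"]
    # ${{ inputs.meson_args }} is pasted into the command line verbatim,
    # so the shell splits it into words and strips the quotes.
    argv += shlex.split(meson_args)
    argv.append("--config-settings=setup-args=--werror")
    return env, argv


env, argv = build_command(
    editable=True,
    meson_args='--config-settings=setup-args="-Db_sanitize=address,undefined"',
    cflags_adds="-fno-sanitize-recover=all",
)
print(env["CFLAGS"].strip())
print(" ".join(argv))
```

When both inputs are empty, as in every other matrix entry, the argument list is the same as before this change.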
9 changes: 8 additions & 1 deletion .github/actions/run-tests/action.yml
@@ -1,9 +1,16 @@
 name: Run tests and report results
+inputs:
+  preload:
+    description: Preload arguments for sanitizer
+    required: false
+  asan_options:
+    description: Arguments for Address Sanitizer (ASAN)
+    required: false
 runs:
   using: composite
   steps:
     - name: Test
-      run: ci/run_tests.sh
+      run: ${{ inputs.asan_options }} ${{ inputs.preload }} ci/run_tests.sh
       shell: bash -el {0}

     - name: Publish test results
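
The new `run:` line prefixes the script with `VAR=value` assignments, which the shell applies to that one command only. A rough Python equivalent is sketched below. The function names are hypothetical, and the reason given for `LD_PRELOAD` is general ASAN practice rather than something the PR states: when the interpreter itself is not built with ASAN, the ASAN runtime usually has to be loaded first.

```python
import os
import subprocess


def sanitizer_env(asan_options, libasan_path, base=None):
    """Return a copy of the environment with the two sanitizer variables set.

    Hypothetical helper; the action passes these as shell prefix assignments.
    """
    env = dict(os.environ if base is None else base)
    env["ASAN_OPTIONS"] = asan_options  # e.g. "detect_leaks=0"
    env["LD_PRELOAD"] = libasan_path    # ASAN runtime loaded before anything else
    return env


def find_libasan():
    """Ask gcc where its libasan.so lives (requires gcc on PATH)."""
    result = subprocess.run(
        ["gcc", "-print-file-name=libasan.so"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()
```

A call such as `subprocess.run(["ci/run_tests.sh"], env=sanitizer_env("detect_leaks=0", find_libasan()), check=True)` would then correspond to the workflow line. Note that the matrix input already contains the variable name (`ASAN_OPTIONS=detect_leaks=0`), whereas this sketch takes only the value.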
19 changes: 18 additions & 1 deletion .github/workflows/unit-tests.yml
@@ -96,6 +96,14 @@ jobs:
- name: "Pyarrow Nightly"
env_file: actions-311-pyarrownightly.yaml
pattern: "not slow and not network and not single_cpu"
- name: "ASAN / UBSAN"
env_file: actions-311-sanitizers.yaml
pattern: "not slow and not network and not single_cpu and not skip_ubsan"
asan_options: "ASAN_OPTIONS=detect_leaks=0"
preload: LD_PRELOAD=$(gcc -print-file-name=libasan.so)
meson_args: --config-settings=setup-args="-Db_sanitize=address,undefined"
cflags_adds: -fno-sanitize-recover=all
pytest_workers: -1 # disable pytest-xdist as it swallows stderr from ASAN
fail-fast: false
name: ${{ matrix.name || format('ubuntu-latest {0}', matrix.env_file) }}
env:
@@ -105,7 +113,7 @@ jobs:
       PANDAS_COPY_ON_WRITE: ${{ matrix.pandas_copy_on_write || '0' }}
       PANDAS_CI: ${{ matrix.pandas_ci || '1' }}
       TEST_ARGS: ${{ matrix.test_args || '' }}
-      PYTEST_WORKERS: 'auto'
+      PYTEST_WORKERS: ${{ matrix.pytest_workers || 'auto' }}
       PYTEST_TARGET: ${{ matrix.pytest_target || 'pandas' }}
       # Clipboard tests
       QT_QPA_PLATFORM: offscreen
@@ -174,16 +182,25 @@ jobs:
- name: Build Pandas
id: build
uses: ./.github/actions/build_pandas
with:
meson_args: ${{ matrix.meson_args }}
cflags_adds: ${{ matrix.cflags_adds }}

- name: Test (not single_cpu)
uses: ./.github/actions/run-tests
if: ${{ matrix.name != 'Pypy' }}
with:
preload: ${{ matrix.preload }}
asan_options: ${{ matrix.asan_options }}
env:
# Set pattern to not single_cpu if not already set
PATTERN: ${{ env.PATTERN == '' && 'not single_cpu' || matrix.pattern }}

- name: Test (single_cpu)
uses: ./.github/actions/run-tests
with:
preload: ${{ matrix.preload }}
asan_options: ${{ matrix.asan_options }}
env:
PATTERN: 'single_cpu'
PYTEST_WORKERS: 0
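
The expressions `matrix.pytest_workers || 'auto'` and `env.PATTERN == '' && 'not single_cpu' || matrix.pattern` depend on GitHub Actions truthiness rules, where `false`, `0`, `-0`, `''` and `null` are falsy. The sketch below emulates those two operators in Python; `gha_or` and `gha_and` are hypothetical names. One consequence, offered as a likely reason and not something the PR states, is that a matrix value of `0` would fall back to `'auto'`, whereas `-1` is passed through.

```python
def _is_falsy(value):
    """GitHub Actions treats false, 0, -0, '' and null as falsy."""
    return value is None or value is False or value == 0 or value == ""


def gha_or(left, right):
    """Emulate `left || right`: the first truthy operand, else the right one."""
    return right if _is_falsy(left) else left


def gha_and(left, right):
    """Emulate `left && right`: the left operand if falsy, else the right one."""
    return left if _is_falsy(left) else right


# matrix.pytest_workers || 'auto'
print(gha_or(None, "auto"))  # entry without pytest_workers
print(gha_or(-1, "auto"))    # the ASAN / UBSAN entry
print(gha_or(0, "auto"))     # a literal 0 would not survive

# env.PATTERN == '' && 'not single_cpu' || matrix.pattern
pattern_unset = True
print(gha_or(gha_and(pattern_unset, "not single_cpu"), "matrix pattern"))
```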
32 changes: 32 additions & 0 deletions ci/deps/actions-311-sanitizers.yaml
@@ -0,0 +1,32 @@
name: pandas-dev
channels:
- conda-forge
dependencies:
- python=3.11

# build dependencies
- versioneer[toml]
- cython>=0.29.33
- meson[ninja]=1.2.1
- meson-python=0.13.1

# test dependencies
- pytest>=7.3.2
- pytest-cov
- pytest-xdist>=2.2.0
- pytest-localserver>=0.7.1
- pytest-qt>=4.2.0
- boto3
- hypothesis>=6.46.1
- pyqt>=5.15.9

# required dependencies
- python-dateutil
- numpy<2
- pytz

# pandas dependencies
- pip

- pip:
- "tzdata>=2022.7"
2 changes: 2 additions & 0 deletions pandas/tests/frame/test_constructors.py
@@ -3206,6 +3206,7 @@ def test_from_out_of_bounds_ns_datetime(
assert item.asm8.dtype == exp_dtype
assert dtype == exp_dtype

@pytest.mark.skip_ubsan
def test_out_of_s_bounds_datetime64(self, constructor):
scalar = np.datetime64(np.iinfo(np.int64).max, "D")
result = constructor(scalar)
@@ -3241,6 +3242,7 @@ def test_from_out_of_bounds_ns_timedelta(
assert item.asm8.dtype == exp_dtype
assert dtype == exp_dtype

@pytest.mark.skip_ubsan
@pytest.mark.parametrize("cls", [np.datetime64, np.timedelta64])
def test_out_of_s_bounds_timedelta64(self, constructor, cls):
scalar = cls(np.iinfo(np.int64).max, "D")
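
Both tests marked here build a scalar from the largest int64 value in day units. The PR only labels them as known UB. A plausible explanation, stated as an assumption, is that converting such a value to a finer unit multiplies in signed 64-bit C arithmetic, where overflow is undefined behavior. The sketch uses Python's arbitrary-precision integers to show that the product cannot fit; `days_to_seconds_checked` is a hypothetical helper, not pandas code.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
SECONDS_PER_DAY = 86_400


def days_to_seconds_checked(days):
    """Convert days to seconds, raising where int64 arithmetic would overflow."""
    seconds = days * SECONDS_PER_DAY  # exact in Python, no wraparound
    if not INT64_MIN <= seconds <= INT64_MAX:
        raise OverflowError(f"{days} days does not fit in int64 seconds")
    return seconds


print(days_to_seconds_checked(1))
try:
    days_to_seconds_checked(INT64_MAX)
except OverflowError as exc:
    print("overflow:", exc)
```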
1 change: 1 addition & 0 deletions pandas/tests/groupby/test_cumulative.py
@@ -60,6 +60,7 @@ def test_groupby_cumprod():
tm.assert_series_equal(actual, expected)


@pytest.mark.skip_ubsan
def test_groupby_cumprod_overflow():
# GH#37493 if we overflow we return garbage consistent with numpy
df = DataFrame({"key": ["b"] * 4, "value": 100_000})
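
The test comment says an overflowing result is "garbage consistent with numpy". With four rows of 100_000 in one group, the fourth cumulative product is 10**20, which exceeds the int64 maximum of 2**63 - 1. The sketch below emulates two's-complement wraparound in pure Python to show the value that results. It is an illustration only and does not call pandas or numpy. In C the same signed overflow is undefined behavior, which is presumably why this test is excluded from the UBSAN run.

```python
def wrap_int64(value):
    """Reduce an exact integer to what a wrapping signed 64-bit value holds."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value


def cumprod_int64(values):
    """Cumulative product with int64 wraparound applied after every step."""
    out, running = [], 1
    for v in values:
        running = wrap_int64(running * v)
        out.append(running)
    return out


print(cumprod_int64([100_000] * 4))
# [100000, 10000000000, 1000000000000000, 7766279631452241920]
```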
10 changes: 9 additions & 1 deletion pandas/tests/io/parser/common/test_float.py
@@ -40,7 +40,14 @@ def test_scientific_no_exponent(all_parsers_all_precisions):
     tm.assert_frame_equal(df_roundtrip, df)


-@pytest.mark.parametrize("neg_exp", [-617, -100000, -99999999999999999])
+@pytest.mark.parametrize(
+    "neg_exp",
+    [
+        -617,
+        -100000,
+        pytest.param(-99999999999999999, marks=pytest.mark.skip_ubsan),
+    ],
+)
 def test_very_negative_exponent(all_parsers_all_precisions, neg_exp):
     # GH#38753
     parser, precision = all_parsers_all_precisions
@@ -51,6 +58,7 @@ def test_very_negative_exponent(all_parsers_all_precisions, neg_exp):
tm.assert_frame_equal(result, expected)


@pytest.mark.skip_ubsan
@xfail_pyarrow # AssertionError: Attributes of DataFrame.iloc[:, 0] are different
@pytest.mark.parametrize("exp", [999999999999999999, -999999999999999999])
def test_too_many_exponent_digits(all_parsers_all_precisions, exp, request):
2 changes: 2 additions & 0 deletions pandas/tests/scalar/timedelta/methods/test_round.py
@@ -61,6 +61,7 @@ def test_round_invalid(self):
with pytest.raises(ValueError, match=msg):
t1.round(freq)

@pytest.mark.skip_ubsan
def test_round_implementation_bounds(self):
# See also: analogous test for Timestamp
# GH#38964
@@ -86,6 +87,7 @@ def test_round_implementation_bounds(self):
with pytest.raises(OutOfBoundsTimedelta, match=msg):
Timedelta.max.round("s")

@pytest.mark.skip_ubsan
@given(val=st.integers(min_value=iNaT + 1, max_value=lib.i8max))
@pytest.mark.parametrize(
"method", [Timedelta.round, Timedelta.floor, Timedelta.ceil]
1 change: 1 addition & 0 deletions pandas/tests/scalar/timedelta/test_arithmetic.py
@@ -966,6 +966,7 @@ def test_td_op_timedelta_timedeltalike_array(self, op, arr):


class TestTimedeltaComparison:
@pytest.mark.skip_ubsan
def test_compare_pytimedelta_bounds(self):
# GH#49021 don't overflow on comparison with very large pytimedeltas

1 change: 1 addition & 0 deletions pandas/tests/scalar/timedelta/test_timedelta.py
@@ -551,6 +551,7 @@ def test_timedelta_hash_equality(self):
ns_td = Timedelta(1, "ns")
assert hash(ns_td) != hash(ns_td.to_pytimedelta())

@pytest.mark.skip_ubsan
@pytest.mark.xfail(
reason="pd.Timedelta violates the Python hash invariant (GH#44504).",
)
1 change: 1 addition & 0 deletions pandas/tests/scalar/timestamp/methods/test_tz_localize.py
@@ -25,6 +25,7 @@


class TestTimestampTZLocalize:
@pytest.mark.skip_ubsan
def test_tz_localize_pushes_out_of_bounds(self):
# GH#12677
# tz_localize that pushes away from the boundary is OK
1 change: 1 addition & 0 deletions pandas/tests/scalar/timestamp/test_constructors.py
@@ -822,6 +822,7 @@ def test_barely_out_of_bounds(self):
with pytest.raises(OutOfBoundsDatetime, match=msg):
Timestamp("2262-04-11 23:47:16.854775808")

@pytest.mark.skip_ubsan
def test_bounds_with_different_units(self):
out_of_bounds_dates = ("1677-09-21", "2262-04-12")

1 change: 1 addition & 0 deletions pandas/tests/tools/test_to_datetime.py
@@ -1140,6 +1140,7 @@ def test_to_datetime_dt64s_out_of_ns_bounds(self, cache, dt, errors):
assert ts.unit == "s"
assert ts.asm8 == dt

@pytest.mark.skip_ubsan
def test_to_datetime_dt64d_out_of_bounds(self, cache):
dt64 = np.datetime64(np.iinfo(np.int64).max, "D")

1 change: 1 addition & 0 deletions pyproject.toml
@@ -523,6 +523,7 @@ markers = [
"db: tests requiring a database (mysql or postgres)",
"clipboard: mark a pd.read_clipboard test",
"arm_slow: mark a test as slow for arm64 architecture",
"skip_ubsan: Tests known to fail UBSAN check",
]

[tool.mypy]
1 change: 1 addition & 0 deletions scripts/tests/data/deps_minimum.toml
@@ -382,6 +382,7 @@ markers = [
"db: tests requiring a database (mysql or postgres)",
"clipboard: mark a pd.read_clipboard test",
"arm_slow: mark a test as slow for arm64 architecture",
"skip_ubsan: tests known to invoke undefined behavior",
]

[tool.mypy]