Integrate NVTabular into Morpheus Core and replace existing column_info based workflows. #938

Merged on Jul 11, 2023 (75 commits)
b541e53
First commit -- update environments to pull nvtabular
drobison00 Apr 13, 2023
e18642e
Testing
drobison00 Apr 17, 2023
cbad33b
More testing code, developing a workaround for being able to add dyn…
drobison00 Apr 19, 2023
b00afb1
More experimental code, converging on 'MutateOp' implementation
drobison00 Apr 20, 2023
249d801
Land operators and transforms in their proper physical location and a…
drobison00 Apr 20, 2023
c111b10
Merge 23.07 and fix protobuf incompatibility problem
drobison00 Apr 20, 2023
d72fd33
Column converters are in place, working through unit tests and proble…
drobison00 Apr 21, 2023
f345f5c
Checkpoint -- Finally have operators all working correctly
drobison00 Apr 26, 2023
ff5680c
Functional graph composition on top of legacy Schema declaration
drobison00 Apr 27, 2023
46c118e
Working!
drobison00 Apr 27, 2023
be42d6a
Resolved remaining unittest issues
drobison00 Apr 27, 2023
ae7b088
Add more unit tests
drobison00 Apr 27, 2023
554db56
Switch to BFS approach, much cleaner and easier to remove duplicates
drobison00 Apr 28, 2023
383d001
More unit tests, modularize BFS code
drobison00 Apr 28, 2023
9b27206
More unit tests
drobison00 Apr 28, 2023
a377682
More unit tests, coverage for post filtering and column preservation …
drobison00 Apr 28, 2023
233c80b
NVT Tests passing, still working on data mappings
drobison00 May 2, 2023
9622a02
Implement workarounds for Merlin StructDType handling, all unit test…
drobison00 May 3, 2023
806da3c
Updates
drobison00 May 3, 2023
a9b167d
Add a few more unit tests, add process_workflow args
drobison00 May 3, 2023
6e51004
Merge latest 23.07
drobison00 May 3, 2023
8f99acb
Fix ColumnInfo bug
drobison00 May 4, 2023
b81006d
Checkpoint -- Blocked on NVT issue
drobison00 May 5, 2023
eae44ef
Revert 'process_dataframe' until DASK cudf issue is resolved. Formatt…
drobison00 May 8, 2023
11a8544
Merge branch 'branch-23.07' into devin_issue_862
drobison00 May 10, 2023
9842f30
Merge branch-23.07, formatting fixes
drobison00 May 10, 2023
1f73e27
Formatting fixes
drobison00 May 10, 2023
ab99cff
Revert to NVT impl since Dask is nearing a fix
drobison00 May 10, 2023
f09acf3
Merge branch 'branch-23.07' into devin_issue_862
mdemoret-nv May 16, 2023
2a80fa6
Removing requirement on sphinx
mdemoret-nv May 16, 2023
712865b
Formatting updates that fix all was missing
drobison00 May 16, 2023
246f7ee
Flake8 fixes
drobison00 May 16, 2023
7305e1f
Merge branch-23.07
drobison00 Jul 5, 2023
ae8138d
Update to handle dtype mappings, clean up circular dependencies creat…
drobison00 Jul 6, 2023
a06ff23
Update to fix Non-root CustomColumns
drobison00 Jul 6, 2023
08650cf
Formatting fixes
drobison00 Jul 6, 2023
e31a172
Merge branch 'cudf-23.06' into devin_issue_862
mdemoret-nv Jul 8, 2023
a03c911
Fix indentation [no ci]
dagardner-nv Jul 10, 2023
aaf1668
Mock the dask client for constructor test
dagardner-nv Jul 10, 2023
ecb53dd
Refactor setup into fixtures
dagardner-nv Jul 10, 2023
49f9851
Merge pull request #10 from dagardner-nv/devin_issue_862_dg
drobison00 Jul 10, 2023
3a44b8a
Refactor setup into fixtures
dagardner-nv Jul 10, 2023
b2d5306
Merge upstream
drobison00 Jul 10, 2023
05c3c0c
Refactor data into a fixture
dagardner-nv Jul 10, 2023
f219293
Refactor df as a fixture
dagardner-nv Jul 10, 2023
e26a295
Expose dataframe class as a property
dagardner-nv Jul 10, 2023
7044a6f
Remove invalid test, fix test_single_object_to_dataframe
drobison00 Jul 10, 2023
74d9fc5
Refactor to use parameterized fixture, consolidate cudf and pandas te…
dagardner-nv Jul 10, 2023
1a4893d
Merge pull request #11 from dagardner-nv/devin_issue_862_dg
drobison00 Jul 10, 2023
01d67db
Cleanup imports and type hints
dagardner-nv Jul 10, 2023
82e7b05
Use compare to allow for subtle rounding differences
dagardner-nv Jul 10, 2023
6fa9da7
Merge pull request #12 from dagardner-nv/devin_issue_862_dg
drobison00 Jul 10, 2023
9bfc33f
Merge branch 'devin_issue_862' of github.com:drobison00/devin-morpheu…
dagardner-nv Jul 10, 2023
64f24bc
Fix import sorting
dagardner-nv Jul 10, 2023
ce6f9d1
Update morpheus/utils/nvt/schema_converters.py
drobison00 Jul 10, 2023
2411732
pylint fixes for morpheus/utils/column_info.py
dagardner-nv Jul 10, 2023
90b4f8f
Update morpheus/utils/schema_transforms.py
drobison00 Jul 10, 2023
4e6c9b1
PR feedback update
drobison00 Jul 10, 2023
463cbd9
pylint fixes for morpheus/utils/nvt/mutate.py
dagardner-nv Jul 10, 2023
b7191ea
Merge pull request #13 from dagardner-nv/devin_issue_862_dg_lint
drobison00 Jul 10, 2023
fee9cf8
More PR feedback updates - docstrings, generalized sync decorator, ot…
drobison00 Jul 10, 2023
c5c665e
PR feedback updates
drobison00 Jul 10, 2023
f840705
Update tritonclient install, docs updates
drobison00 Jul 11, 2023
1f581e9
PR Feedback updates. Adds additional explicit type hints
drobison00 Jul 11, 2023
5aa6d1e
Additional PR feedback updates
drobison00 Jul 11, 2023
61dec6c
Formatting fixes
drobison00 Jul 11, 2023
1f5acb2
Merge branch-23.07
drobison00 Jul 11, 2023
227509e
Docs fixes, formatting
drobison00 Jul 11, 2023
3e41f93
Formatting fixes
drobison00 Jul 11, 2023
a2bfc14
Formatting fixes
drobison00 Jul 11, 2023
d01eda0
Formatting fixes
drobison00 Jul 11, 2023
a739181
Formatting fixes
drobison00 Jul 11, 2023
85e7d68
Formatting fixes
drobison00 Jul 11, 2023
37388e8
PR feedback + test numba channel removal
drobison00 Jul 11, 2023
bb75522
Removing the data class descriptor since pylint doesn't like it.
mdemoret-nv Jul 11, 2023
2 changes: 1 addition & 1 deletion ci/scripts/bootstrap_local_ci.sh
@@ -1,5 +1,5 @@
 #!/bin/bash
-# SPDX-FileCopyrightText: Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-FileCopyrightText: Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 # SPDX-License-Identifier: Apache-2.0
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
2 changes: 2 additions & 0 deletions docker/conda/environments/cuda11.8_dev.yml
@@ -68,6 +68,7 @@ dependencies:
 - nodejs=18.15.0
 - numba>=0.56.2
 - numpydoc=1.4
+- nvtabular=23.06
 - pandas=1.3
 - pip
 - pkg-config # for mrc cmake
@@ -91,6 +92,7 @@ dependencies:
 - sphinx
 - sphinx_rtd_theme
 - sysroot_linux-64=2.17
+- tritonclient=2.26 # Required by NvTabular, force the version, so we get protobufs compatible with 4.21
 - tqdm=4
 - typing_utils=0.1
 - watchdog=2.1
1 change: 0 additions & 1 deletion docker/conda/environments/requirements.txt
@@ -6,5 +6,4 @@ jupyterlab
 nvidia-pyindex
 # Duplicated in conda dev to ensure parity with libprotobuf
 protobuf==4.21.*
-tritonclient[all]==2.17.*
 websockets
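
The environment changes above pin tritonclient so that the installed protobuf stays compatible with 4.21. A minimal sketch of how such pins could be checked in an existing environment follows. The helper names `matches_pin` and `check_pins` are hypothetical and not part of Morpheus or this PR, and the sketch assumes that the conda spec `tritonclient=2.26` can be treated as the prefix pin `2.26.*`.

```python
from importlib import metadata


def matches_pin(installed: str, pin: str) -> bool:
    """Return True if `installed` satisfies an exact pin or an `X.Y.*` prefix pin.

    Hypothetical helper: handles only these two simple forms, not full
    PEP 440 or conda match specifications.
    """
    if pin.endswith(".*"):
        prefix = pin[:-2].split(".")
        # Compare whole version components so "4.210" does not match "4.21.*".
        return installed.split(".")[:len(prefix)] == prefix
    return installed == pin


def check_pins(pins: dict[str, str]) -> dict[str, str]:
    """Map each package name to 'ok', 'mismatch (found V)', or 'not installed'."""
    report = {}
    for name, pin in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        report[name] = "ok" if matches_pin(found, pin) else f"mismatch (found {found})"
    return report


if __name__ == "__main__":
    # Pins taken from the diff above; "2.26.*" is the assumed reading of "=2.26".
    print(check_pins({"protobuf": "4.21.*", "tritonclient": "2.26.*"}))
```

Running the script inside the dev environment prints one status per package, which gives a quick sanity check after rebuilding the conda environment.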