Create tests for examples with custom stages #885

Merged on Apr 28, 2023 (90 commits)
9e9222d
First 20 lines of examples/abp_pcap_detection/pcap_out.jsonlines pull…
dagardner-nv Mar 29, 2023
16de6f7
Remove flow_id and rollup_time cols
dagardner-nv Mar 29, 2023
a239f41
Test data
dagardner-nv Mar 29, 2023
59dc4f9
Fix type-o
dagardner-nv Mar 29, 2023
d7e23be
Add a get_stage_class method to StageInfo
dagardner-nv Mar 29, 2023
fb7829e
Fixture to reset the PluginManager and GlobalStageRegistry
dagardner-nv Mar 29, 2023
454eeea
Add helper method to assist pulling a stage from the plugin manager
dagardner-nv Mar 29, 2023
e3007b7
End-to-end test for AbpPcapPreprocessingStage
dagardner-nv Mar 29, 2023
c354361
Test asserting that needed columns are defined
dagardner-nv Mar 29, 2023
3fcb19a
Request pre-allocation
dagardner-nv Mar 29, 2023
dd41ddc
Fix setting get_stage_class in LazyStageInfo
dagardner-nv Mar 29, 2023
1781bd6
Merge branch 'branch-23.03' into pcap_preprocessing-pre-allocating-797
dagardner-nv Mar 29, 2023
5d4eb0a
Merge branch 'branch-23.07' into pcap_preprocessing-pre-allocating-797
dagardner-nv Apr 5, 2023
493e7fb
reload_modules fixture reloads requested modules before and after the…
dagardner-nv Apr 5, 2023
81002ac
Move test_abp_pcap_preprocessing.py to an examples subdir
dagardner-nv Apr 5, 2023
ad9e578
Fixture to restore sys.path
dagardner-nv Apr 5, 2023
664b2d7
Add type hints
dagardner-nv Apr 5, 2023
c8992cd
Add test for pass-thru stages
dagardner-nv Apr 5, 2023
f460f66
Fix invocation of reload_modules fixture
dagardner-nv Apr 5, 2023
7c3a2bd
Deleting redundant copies of bert hash files
dagardner-nv Apr 5, 2023
e0185a2
Update references to files bert-base-cased-hash.txt and bert-base-unc…
dagardner-nv Apr 6, 2023
219e911
Revert "Deleting redundant copies of bert hash files"
dagardner-nv Apr 6, 2023
c49c039
Remove redundant copies of bert-base-cased-hash.txt and bert-base-unc…
dagardner-nv Apr 6, 2023
af42993
Remove redundant copue of bert-base-cased-vocab.txt
dagardner-nv Apr 6, 2023
59b1e88
Subset of models/log-parsing-models/log-parsing-config-20220418.json …
dagardner-nv Apr 6, 2023
6b52af3
Fetch the bert cased vocab file for tests
dagardner-nv Apr 6, 2023
85d2924
Test for log parsing post proc example
dagardner-nv Apr 6, 2023
4df19b0
Move to sub-dir
dagardner-nv Apr 7, 2023
7a7f4d5
Test for https://github.com/rapidsai/cudf/issues/130850
dagardner-nv Apr 7, 2023
5985006
Explicitly set the encoding as a work-around for https://github.com/n…
dagardner-nv Apr 7, 2023
6956143
Add kw_only to C++ impls of tensor memory classes to match Python API
dagardner-nv Apr 10, 2023
49b31e4
wip
dagardner-nv Apr 10, 2023
e628b56
wip
dagardner-nv Apr 10, 2023
49b8474
Merge branch 'branch-23.07' into david-test-examples
dagardner-nv Apr 11, 2023
4f41628
Revert unneeded changes
dagardner-nv Apr 11, 2023
faf29f4
Unimport modules after each test
dagardner-nv Apr 11, 2023
c0ab1d9
Use new import_mod fixture
dagardner-nv Apr 11, 2023
8555f08
Update to use new import_mod fixture
dagardner-nv Apr 11, 2023
a1c96fb
Set the configs for this subdir to NLP
dagardner-nv Apr 11, 2023
b3aecc9
Seed cupy's number generator as well
dagardner-nv Apr 11, 2023
241b564
wip
dagardner-nv Apr 11, 2023
c62a014
Fix type hints
dagardner-nv Apr 12, 2023
a5e80ba
Exclude tensors from repr
dagardner-nv Apr 12, 2023
f895ce3
wip
dagardner-nv Apr 12, 2023
2cedf1d
Remove _infer_callback method as it was identicle to parent's _infer_…
dagardner-nv Apr 12, 2023
462b758
wip
dagardner-nv Apr 12, 2023
f31a4c5
Use from_message
dagardner-nv Apr 12, 2023
a74d529
move test data
dagardner-nv Apr 12, 2023
6440ab8
wip
dagardner-nv Apr 13, 2023
18ea2c6
Tests for developer guide ex2
dagardner-nv Apr 13, 2023
3fa6d0f
Merge branch 'branch-23.07' of github.com:nv-morpheus/Morpheus into d…
dagardner-nv Apr 13, 2023
829e924
Fix license headers
dagardner-nv Apr 13, 2023
d03bbef
Merge branch 'branch-23.07' into david-test-examples
dagardner-nv Apr 13, 2023
923f452
Remove unused imports
dagardner-nv Apr 13, 2023
610b288
Merge branch 'branch-23.07' into david-test-examples
dagardner-nv Apr 13, 2023
517b964
Merge branch 'branch-23.07' into david-test-examples
dagardner-nv Apr 14, 2023
a3aef0a
Optionally bypass the cache and optionally pass additional kwargs
dagardner-nv Apr 14, 2023
d2065ad
Fix usage of assert_df_equal which is now a static method of DatasetM…
dagardner-nv Apr 14, 2023
b42d7c8
Use datasetmanger
dagardner-nv Apr 14, 2023
de73c01
Use new dataset fixtures [no ci]
dagardner-nv Apr 14, 2023
3cb8b98
Revert code updates due to removing redundant bert files, will replac…
dagardner-nv Apr 14, 2023
1215fbe
Add bert vocab files to morpheus/data to be available for testing, pr…
dagardner-nv Apr 14, 2023
ce57304
Updates due to moving the bert vocabs [no ci]
dagardner-nv Apr 14, 2023
e990cbf
wip [no ci]
dagardner-nv Apr 14, 2023
2e97a48
Leave symlinks from the old locations for bert data to the new locati…
dagardner-nv Apr 14, 2023
049c5d5
Remove unused import
dagardner-nv Apr 14, 2023
fc91053
Shorted comment to fix under 100 column limit
dagardner-nv Apr 14, 2023
225bf9f
Move example tests data to its own subfolder
dagardner-nv Apr 17, 2023
7fa07b5
Update tests to reflect data move [no ci]
dagardner-nv Apr 17, 2023
e4460d2
Copy of the first 10 lines of examples/data/email_with_addresses.json…
dagardner-nv Apr 17, 2023
7b9c440
Update test to reflect data move [no ci]
dagardner-nv Apr 17, 2023
fc8edeb
Merge branch 'branch-23.07' of github.com:nv-morpheus/Morpheus into d…
dagardner-nv Apr 17, 2023
39f4f3b
Add shared libs from examples to tar
dagardner-nv Apr 17, 2023
fea441f
Merge branch 'branch-23.07' of github.com:nv-morpheus/Morpheus into d…
dagardner-nv Apr 17, 2023
e45508f
Add the shared libs from examples not the build dir
dagardner-nv Apr 17, 2023
c53366d
Move tests for examples
dagardner-nv Apr 20, 2023
5c9e338
Organize test data for examples
dagardner-nv Apr 20, 2023
d86fc17
Explain the overload of the config fixture
dagardner-nv Apr 20, 2023
49b7476
Fix dirs
dagardner-nv Apr 20, 2023
65af7c6
Update test paths
dagardner-nv Apr 20, 2023
70e0b82
Replace emptry strings with nulls for the purposes of testing filter_…
dagardner-nv Apr 20, 2023
4c4eee1
Warn on implied no_cache
dagardner-nv Apr 20, 2023
597cdfd
Test reader_args
dagardner-nv Apr 20, 2023
1aa9b2f
Merge branch 'branch-23.07' of github.com:nv-morpheus/Morpheus into d…
dagardner-nv Apr 20, 2023
232afb8
Merge branch 'branch-23.07' into david-test-examples
dagardner-nv Apr 20, 2023
cb94631
Fix type-hints for modules
dagardner-nv Apr 20, 2023
367a68c
Merge branch 'david-test-examples' of github.com:dagardner-nv/Morpheu…
dagardner-nv Apr 20, 2023
b50782d
Merge branch 'branch-23.07' of github.com:nv-morpheus/Morpheus into d…
dagardner-nv Apr 24, 2023
f7d67f1
Merge branch 'branch-23.07' of github.com:nv-morpheus/Morpheus into d…
dagardner-nv Apr 25, 2023
56b9393
Merge branch 'branch-23.07' into david-test-examples
dagardner-nv Apr 28, 2023
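
Several of the commits above add fixtures that restore `sys.path` and un-import modules after each test, so that importing one example's stages does not leak into the next test. The snippet below is only a minimal sketch of that idea, not the fixtures from this PR: `restored_import_state` and the directory name are hypothetical, and it uses a plain context manager from the standard library in place of a pytest fixture.

```python
import contextlib
import sys


@contextlib.contextmanager
def restored_import_state():
    """Hypothetical helper: undo sys.path and sys.modules changes on exit."""
    saved_path = list(sys.path)
    saved_modules = set(sys.modules)
    try:
        yield
    finally:
        # Restore the search path in place so other references to sys.path see it.
        sys.path[:] = saved_path
        # Drop any module that was first imported inside the block.
        for name in set(sys.modules) - saved_modules:
            del sys.modules[name]


before = list(sys.path)

with restored_import_state():
    # Hypothetical example directory, added only for the duration of the block.
    sys.path.insert(0, "/tmp/hypothetical_example_dir")
    import colorsys  # any module imported here is forgotten again on exit
```

In a pytest suite the same body could sit inside a fixture function, with the `yield` separating setup from teardown.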
3 changes: 2 additions & 1 deletion ci/scripts/github/build.sh
@@ -46,7 +46,8 @@ sccache --show-stats
 rapids-logger "Archiving results"
 tar cfj "${WORKSPACE_TMP}/wheel.tar.bz" build/dist

-MORPHEUS_LIBS=($(find ${MORPHEUS_ROOT}/build/morpheus/_lib -name "*.so" -exec realpath --relative-to ${MORPHEUS_ROOT} {} \;))
+MORPHEUS_LIBS=($(find ${MORPHEUS_ROOT}/build/morpheus/_lib -name "*.so" -exec realpath --relative-to ${MORPHEUS_ROOT} {} \;) \
+$(find ${MORPHEUS_ROOT}/examples -name "*.so" -exec realpath --relative-to ${MORPHEUS_ROOT} {} \;))
 tar cfj "${WORKSPACE_TMP}/morhpeus_libs.tar.bz" "${MORPHEUS_LIBS[@]}"

 CPP_TESTS=($(find ${MORPHEUS_ROOT}/build/morpheus/_lib/tests -name "*.x" -exec realpath --relative-to ${MORPHEUS_ROOT} {} \;))
4 changes: 2 additions & 2 deletions examples/log_parsing/README.md
@@ -63,7 +63,7 @@ python run.py \
 --num_threads 1 \
 --input_file ${MORPHEUS_ROOT}/models/datasets/validation-data/log-parsing-validation-data-input.csv \
 --output_file ./log-parsing-output.jsonlines \
---model_vocab_hash_file=${MORPHEUS_ROOT}/models/training-tuning-scripts/sid-models/resources/bert-base-cased-hash.txt \
+--model_vocab_hash_file=${MORPHEUS_ROOT}/morpheus/data/bert-base-cased-hash.txt \
 --model_vocab_file=${MORPHEUS_ROOT}/models/training-tuning-scripts/sid-models/resources/bert-base-cased-vocab.txt \
 --model_seq_length=256 \
 --model_name log-parsing-onnx \
@@ -114,7 +114,7 @@ morpheus --log_level INFO \
 pipeline-nlp \
 from-file --filename ./models/datasets/validation-data/log-parsing-validation-data-input.csv \
 deserialize \
-preprocess --vocab_hash_file ./models/training-tuning-scripts/sid-models/resources/bert-base-cased-hash.txt --stride 64 --column=raw \
+preprocess --vocab_hash_file ${MORPHEUS_ROOT}/morpheus/data/bert-base-cased-hash.txt --stride 64 --column=raw \
 monitor --description "Preprocessing rate" \
 inf-logparsing --model_name log-parsing-onnx --server_url localhost:8001 --force_convert_inputs=True \
 monitor --description "Inference rate" --unit inf \
36 changes: 6 additions & 30 deletions examples/log_parsing/inference.py
@@ -30,12 +30,10 @@
 from morpheus.cli.register_stage import register_stage
 from morpheus.config import Config
 from morpheus.config import PipelineModes
-from morpheus.messages import InferenceMemory
 from morpheus.messages import MultiInferenceMessage
 from morpheus.pipeline.stream_pair import StreamPair
 from morpheus.stages.inference.inference_stage import InferenceStage
 from morpheus.stages.inference.inference_stage import InferenceWorker
-from morpheus.stages.inference.triton_inference_stage import InputWrapper
 from morpheus.stages.inference.triton_inference_stage import _TritonInferenceWorker
 from morpheus.utils.producer_consumer_queue import ProducerConsumerQueue

@@ -97,7 +95,7 @@ def default_inout_mapping(cls) -> typing.Dict[str, str]:
         # Some models use different names for the same thing. Set that here but allow user customization
         return {"attention_mask": "input_mask"}

-    def build_output_message(self, x: MultiInferenceMessage) -> MultiResponseLogParsingMessage:
+    def build_output_message(self, x: MultiInferenceMessage) -> MultiPostprocLogParsingMessage:

         memory = PostprocMemoryLogParsing(
             count=x.count,
@@ -111,7 +109,7 @@ def build_output_message(self, x: MultiInferenceMessage) -> MultiResponseLogPars
             mess_offset=x.mess_offset,
             mess_count=x.mess_count,
             memory=memory,
-            offset=x.offset,
+            offset=0,
             count=x.count)
         return output_message

@@ -131,25 +129,6 @@ def _build_response(self, batch: MultiInferenceMessage,

         return mem

-    def _infer_callback(self,
-                        cb: typing.Callable[[ResponseMemoryLogParsing], None],
-                        m: InputWrapper,
-                        b: MultiInferenceMessage,
-                        result: tritonclient.InferResult,
-                        error: tritonclient.InferenceServerException):
-
-        # If its an error, return that here
-        if (error is not None):
-            raise error
-
-        # Build response
-        response_mem = self._build_response(b, result)
-
-        # Call the callback with the memory
-        cb(response_mem)
-
-        self._mem_pool.return_obj(m)
-

 @register_stage("inf-logparsing", modes=[PipelineModes.NLP])
 class LogParsingInferenceStage(InferenceStage):
@@ -261,7 +240,9 @@ def set_output_fut(resp: ResponseMemoryLogParsing, b, f: mrc.Future):
         return stream, out_type

     @staticmethod
-    def _convert_one_response(memory: InferenceMemory, inf: MultiInferenceMessage, res: ResponseMemoryLogParsing):
+    def _convert_one_response(memory: PostprocMemoryLogParsing,
+                              inf: MultiInferenceMessage,
+                              res: ResponseMemoryLogParsing):

         memory.input_ids[inf.offset:inf.count + inf.offset, :] = inf.input_ids
         memory.seq_ids[inf.offset:inf.count + inf.offset, :] = inf.seq_ids
@@ -280,12 +261,7 @@ def _convert_one_response(memory: InferenceMemory, inf: MultiInferenceMessage, r
             memory.confidences[idx, :] = cp.maximum(memory.confidences[idx, :], res.confidences[i, :])
             memory.labels[idx, :] = cp.maximum(memory.labels[idx, :], res.labels[i, :])

-        return MultiPostprocLogParsingMessage(meta=inf.meta,
-                                              mess_offset=inf.mess_offset,
-                                              mess_count=inf.mess_count,
-                                              memory=memory,
-                                              offset=inf.offset,
-                                              count=inf.count)
+        return MultiPostprocLogParsingMessage.from_message(inf, memory=memory, offset=inf.offset, count=inf.mess_count)

     def _get_inference_worker(self, inf_queue: ProducerConsumerQueue) -> InferenceWorker:

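
The `cp.maximum` lines in `_convert_one_response` fold each inference result row into a pre-allocated output row by taking the element-wise maximum. The following is a rough, GPU-free sketch of that pattern, assuming that more than one inference row can map to the same output row (for example when a long input is split into overlapping windows). Plain Python lists stand in for CuPy arrays, and `merge_max`, `result_rows`, and `row_index` are hypothetical names that do not appear in the PR.

```python
def merge_max(memory, result_rows, row_index):
    """Fold result_rows[i] into memory[row_index[i]] using an element-wise max."""
    for i, idx in enumerate(row_index):
        memory[idx] = [max(old, new) for old, new in zip(memory[idx], result_rows[i])]
    return memory


# Two output rows, pre-allocated with zeros (three values per row).
memory = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

# Three inference rows; the first two both belong to output row 0.
result_rows = [[0.9, 0.1, 0.2], [0.3, 0.8, 0.1], [0.5, 0.5, 0.5]]
row_index = [0, 0, 1]

merged = merge_max(memory, result_rows, row_index)
```

Because the maximum is taken against whatever is already stored, the order in which rows for the same output index arrive does not change the result.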
7 changes: 5 additions & 2 deletions examples/log_parsing/postprocessing.py
@@ -54,11 +54,14 @@ def __init__(self, c: Config, vocab_path: pathlib.Path, model_config_path: pathl
         self._model_config_path = model_config_path

         self._vocab_lookup = {}
-        with open(vocab_path) as f:
+
+        # Explicitly setting the encoding, we know we have unicode chars in this file and we need to avoid issue:
+        # https://github.com/nv-morpheus/Morpheus/issues/859
+        with open(vocab_path, encoding='UTF-8') as f:
             for index, line in enumerate(f):
                 self._vocab_lookup[index] = line.split()[0]

-        with open(model_config_path) as f:
+        with open(model_config_path, encoding='UTF-8') as f:
             config = json.load(f)

         self._label_map = {int(k): v for k, v in config["id2label"].items()}
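
When `open()` is called in text mode without an `encoding` argument, Python uses the locale's preferred encoding, which is not guaranteed to be UTF-8. Passing the encoding explicitly makes the vocabulary load the same way on every machine. The sketch below mirrors the lookup-building loop from the diff on a small temporary file; `load_vocab` and the sample tokens are illustrative only and are not taken from the real vocabulary file.

```python
import os
import tempfile


def load_vocab(vocab_path):
    """Map each line index to the first whitespace-separated token on that line."""
    lookup = {}
    # The explicit encoding avoids depending on the locale's default.
    with open(vocab_path, encoding="UTF-8") as f:
        for index, line in enumerate(f):
            lookup[index] = line.split()[0]
    return lookup


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "vocab.txt")
    # Illustrative tokens, including non-ASCII characters.
    with open(path, "w", encoding="UTF-8") as f:
        f.write("[PAD]\ncafé\n日本\n")
    vocab = load_vocab(path)
```

As in the original loop, a blank line in the file would make `line.split()[0]` raise an `IndexError`; the sample file deliberately contains none.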