Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve message bus topic matching performance #2151

Merged

Conversation

ryantam626
Copy link
Contributor

Pull Request

Improve nautilus_trader.common.component.is_matching exact match performance as suggested in #2150.

Also adds some more test cases.

Type of change

Bug fix (non-breaking change which fixes an issue)

How has this change been tested?

Existing tests are almost sufficient.

For completeness, I have also added some more tests for

  1. ? pattern which we do support
  2. [seq] and [!seq] pattern which do NOT support

@CLAassistant
Copy link

CLAassistant commented Dec 29, 2024

CLA assistant check
All committers have signed the CLA.

@ryantam626
Copy link
Contributor Author

Actually I am not certain if topic_contains_wildcard = contains_wildcard(topic) and its check is required.

I was under the impression that it's required, because of this test

in particular.

However upon trying to add more test which flips all the existing tests' topic and pattern around, they all failed, which led me to think that topic_contains_wildcard = contains_wildcard(topic) is actually not required - if that's the case then I think this can almost be reduced to a fnmatch.fnmatch (well depending on if we want to support other shell-style patterns?)

@cjdsellers
Copy link
Member

Hey @ryantam626

There is a chance the implementation is not correct. We only have to preserve the current functionality, and this will be revisited and optimized in Rust later (current impl here).

@ryantam626
Copy link
Contributor Author

Whoops, forgot to follow the contributor guide, will fix the CI failures.

Polymarket has excessively long names for instruments[1], the `is_matching`
O(nm) algorithm becomes prohibitively expensive when dealing with
Polymarket instruments - this is further exacerbated when we maintain
all instruments from Polymarket instead of a selected few.

I think there is an alternative optimisation here, which involves
converting the inputs into deterministic finite automata and try to walk
them in sync, however this takes time to implement and validate, and
thus is out of scope right now.

[1] - example of Polymarket related channel/topic:
`data.book.deltas.POLYMARKET.0xf6ea0a571a43a122471e1c7604d6bd8c267082a39758bdd8162dbfae0cd8d137-40739562057794550541187194569338873089092359730180442205123214215534855241164`
- Add some test involving `?` which is a wildcard we support;
- Add some test involving `[seq]` and `[!seq]` (UNIX shell-style
  wildcard that we do NOT support);
@ryantam626 ryantam626 force-pushed the rt.msgbus-is-matching-opt branch 2 times, most recently from 530821d to a9f49ec Compare December 29, 2024 22:06
Following the previous commit for completeness.
@ryantam626 ryantam626 force-pushed the rt.msgbus-is-matching-opt branch from a9f49ec to 5c92022 Compare December 29, 2024 22:07
@ryantam626
Copy link
Contributor Author

Should be fixed now, sorry for the repeated force push 😓

FWIW, I think it probably is correct - I just got the wrong end of the stick about the purpose of the ["*", "*", True] test!

I will give this patch a spin and follow up with another PR if I find a better solution in the future.

@cjdsellers cjdsellers changed the title Improve nautilus_trader.common.component.is_matching exact match performance Improve message bus topic matching performance Dec 29, 2024
Copy link

codspeed-hq bot commented Dec 29, 2024

CodSpeed Performance Report

Merging #2151 will degrade performances by 42.43%

Comparing ryantam626:rt.msgbus-is-matching-opt (5c92022) with develop (3ac3a39)

Summary

⚡ 1 improvements
❌ 1 regressions
✅ 50 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark develop ryantam626:rt.msgbus-is-matching-opt Change
test_is_limit_filled 25.7 µs 16.2 µs +58.06%
test_symbol_equality 13.3 µs 23 µs -42.43%

cdef inline bint is_matching(str topic, str pattern):
topic_contains_wildcard = contains_wildcard(topic)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small Cython thing: if a local variable isn't typed with a cdef then the Python interpreter is needed to figure out the type. Will merge like this though and optimize on develop.

@cjdsellers cjdsellers merged commit 465c3db into nautechsystems:develop Dec 29, 2024
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants