Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix[BMQ]: unsafe schemalearner use #618

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

dorjesinpo
Copy link
Collaborator

Cannot use bmqp::SchemaLearner in both FSM and event threads.
Instead, call the leaner in FSM and cache the result (schema) in the generated event.

@dorjesinpo dorjesinpo added the bug Something isn't working label Feb 19, 2025
@dorjesinpo dorjesinpo requested a review from a team as a code owner February 19, 2025 16:32
Signed-off-by: dorjesinpo <129227380+dorjesinpo@users.noreply.github.com>
Signed-off-by: dorjesinpo <129227380+dorjesinpo@users.noreply.github.com>
@dorjesinpo dorjesinpo force-pushed the fix/unsafe-scheamalearner-use branch from a454b0d to 91d01b5 Compare February 19, 2025 17:33
Copy link

@bmq-oss-ci bmq-oss-ci bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Build 2484 of commit 91d01b5 has completed with FAILURE

@dorjesinpo dorjesinpo requested a review from 678098 February 19, 2025 18:58
@678098 678098 changed the title Fix unsafe scheamalearner use Fix unsafe schemalearner use Feb 24, 2025
@678098 678098 changed the title Fix unsafe schemalearner use Fix[BMQ]: unsafe schemalearner use Feb 24, 2025
@@ -60,6 +60,7 @@
// BMQ

#include <bmqa_queueid.h>
#include <bmqp_messageproperties.h>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only change in this header is this new field:
bmqp::MessageProperties::SchemaPtr d_schema_sp

It is a pointer type, so we don't need to know the class sizeof in advance. We can replace include to a heavy header with a forward declaration to this type. At some point in the future we might want to optimize includes to achieve a better compilation time.

If you think it's helpful, you can make this change now.

@@ -251,8 +259,7 @@ class Event {
// has been constructed in its buffer. We
// cannot rely on 'd_msgEventMode'

bsl::vector<bsl::pair<bmqt::CorrelationId, unsigned int> >
d_correlationIds;
bsl::vector<MessageContext> d_contexts;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to update the comment below


return d_correlationIds[position].second;
return d_contexts[position];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Less code possible

We can turn position arg to type size_t (it allows to remove the check 0 <= position.
And use vector.at(size_t pos) that has another range check with performance hint:
https://github.com/bloomberg/bde/blob/2c7ef378b00a779198ef4bfc774ce835d741c545/groups/bsl/bslstl/bslstl_vector.h#L2496

@@ -811,8 +803,11 @@ inline void Event::addCorrelationId(const bmqt::CorrelationId& correlationId,
// BSLS_ASSERT_SAFE(messageEventMode() == MessageEventMode::e_READ);
// BSLS_ASSERT_SAFE(d_rawEvent.isAckEvent());

d_correlationIds.push_back(
bsl::make_pair(correlationId, subscriptionHandleId));
d_contexts.resize(d_contexts.size() + 1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, we construct MessageContext with a default constructor, then reassign all fields one by one. Might be better to emplace_back:
https://github.com/bloomberg/bde/blob/65f2de5338d10f6a947357af66b100b1490616bb/groups/bsl/bslstl/bslstl_vector.h#L3681

void addContext(const bmqt::CorrelationId& correlationId,
unsigned int subscriptionHandleId =
bmqt::SubscriptionHandle::k_INVALID_HANDLE_ID,
bmqp::MessageProperties::SchemaPtr schema =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New arg schema not mentioned in docs


const SchemaPtr& result = contextHandle->d_schema_sp;

if (!result) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we assume that it's an unlikely scenario? Or is it better to evaluate the expression without such assumptions?

MessageProperties mps(d_allocator_p);
int rc = mps.streamIn(blob, input, result);

if (rc == 0) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If rc != 0 we will silently return an empty SchemaPtr. Should we log something or skip the message?

@@ -99,6 +99,8 @@ class MessageProperties_Schema {
bslma::Allocator* basicAllocator);
MessageProperties_Schema(const MessageProperties_Schema& other);

~MessageProperties_Schema();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also add BSLS_KEYWORD_FINAL to the class head, since we declared a non-virtual destructor?

class MessageProperties_Schema BSLS_KEYWORD_FINAL {

@@ -45,6 +45,8 @@ struct SchemaLearner::SchemaHandle {
// If 'false', needs new translation.

SchemaHandle(SchemaIdType id);

~SchemaHandle();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also add BSLS_KEYWORD_FINAL to the class head, since we declared a non-virtual destructor?

struct SchemaLearner::SchemaHandle BSLS_KEYWORD_FINAL {

bmqp::MessageProperties::SchemaPtr schema = schemaLearner().learn(
schemaLearner().createContext(queueId),
bmqp::MessagePropertiesInfo(info.d_header),
bdlbb::Blob());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at learn implementation:

SchemaLearner::SchemaPtr
SchemaLearner::learn(Context&                     context,
                     const MessagePropertiesInfo& input,
                     const bdlbb::Blob&           blob)
{
    const SchemaIdType inputId = input.schemaId();

    if (!isPresentAndValid(inputId)) {
        // Nothing to do
        return SchemaPtr();  // RETURN
    }

We have an early return if schemaId is not present or valid. Does it happen for all messages that don't have message properties? If it can happen that frequently, we do a lot of unnecessary work here:

  • Pass args and call schemaLearner().createContext(queueId)
  • Within schemaLearner().createContext(queueId) make lookup in a map (we don't use its result)
  • Construct bdlbb::Blob() that we don't use
  • Pass args and call SchemaLearner::learn just to return early

Do you think it's worth to avoid these extra operations if we don't have schemaId?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants