Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extending BatchActuateStreamResponse #103

Merged
merged 5 commits into from
Nov 21, 2024
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
32 changes: 28 additions & 4 deletions databroker/src/grpc/kuksa_val_v2/val.rs
Original file line number Diff line number Diff line change
Expand Up @@ -626,8 +626,10 @@ impl proto::val_server::Val for broker::DataBroker {
// e.g. if sending an unsupported enum value
// - if the published value is out of the min/max range specified
//
// - Provider returns BatchActuateStreamResponse <- Databroker sends BatchActuateStreamRequest
// No error definition, a BatchActuateStreamResponse is expected from provider.
// - Databroker sends BatchActuateStreamRequest -> Provider may return one or more BatchActuateStreamResponse upon error,
// but should for performance reasons send nothing upon success.
// It is up to the provider to decide if the stream shall be closed,
// as of today Databroker will not react on the received error message.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should anyone close the stream?

Copy link
Contributor Author

@erikbosch erikbosch Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see it as a reasonable behavior if you as a provider get a lot of junk from the Databroker, like if you get a lot of BatchActuateStreamRequest where the signal id does not match the IDs provided in the ProvideActuationRequest you have sent, or the values does not match the value limitations the Provider have internally. I think it is up to the Provider to decide how many errors it will tolerate before it gets pissed off and closes the stream.

Like if a provider has received 1000 requests and all of them have been faulty, why should it keep the stream open

//
async fn open_provider_stream(
&self,
Expand Down Expand Up @@ -678,8 +680,30 @@ impl proto::val_server::Val for broker::DataBroker {
}
}
},
Some(BatchActuateStreamResponse(_batch_actuate_stream_response)) => {
// TODO discuss and implement
Some(BatchActuateStreamResponse(batch_actuate_stream_response)) => {

if let Some(error) = batch_actuate_stream_response.error {
match error.code {
0 => {},
_ => {
let mut msg : String = "Batch actuate stream response error".to_string();
if let Some(signal_id) = batch_actuate_stream_response.signal_id {
match signal_id.signal {
Some(proto::signal_id::Signal::Path(path)) => {
msg = msg + ", path: " + &path;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe better to use format!() here to create the debug msg messages?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean having something like
msg =format! ("{}, path: {}" msg, &path);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. exactly that

}
Some(proto::signal_id::Signal::Id(id)) => {
msg = msg + ", id: " + &id.to_string();
}
None => {}
}
}
msg = msg + ", error code: " + &error.code.to_string() + ", error message: " + &error.message;
debug!(msg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is just a first step in implementing this. But shouldn't the error coming from the provider be forwarded to the originator of the actuate(...) call, i.e. the "consumer"?

Copy link
Contributor Author

@erikbosch erikbosch Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I looked mostly at the Provider rpc.

But I assume we need quite a lot refactoring if we are to return results to the "consumer", some aspects I think of:

  • If Actuate / BatchActuate are to return actual results, the Databroker will need to wait until the providers has replied (for BatchActuate there could be multiple, right?)
  • That means the providers MUST send a response also on OK. Possibly combined with a timeout in databroker so that Actuation is considered unsuccessful if Databroker does not get a response within X seconds.
  • Then we maybe also need some form of the request-id/statefulness, so we know which request the error belong to. At least theoretically there might be multiple consumers trying to change the same signal, right? And it can happen partially in parallel.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What you are discussing here was postponed if I remember correctly as we were not sure if we wanted to put databroker to wait until it receives ACK of receipt and forwards it to the consumer.
So in my opinion, this PR is enough for now if databroker can output some debug actuate information.

}
}
}

},
None => {

Expand Down
11 changes: 9 additions & 2 deletions proto/kuksa/val/v2/val.proto
Original file line number Diff line number Diff line change
Expand Up @@ -178,8 +178,10 @@ service VAL {
// e.g. if sending an unsupported enum value
// - if the published value is out of the min/max range specified
//
// - Provider returns BatchActuateStreamResponse <- Databroker sends BatchActuateStreamRequest
// No error definition, a BatchActuateStreamResponse is expected from provider.
// - Databroker sends BatchActuateStreamRequest -> Provider may return one or more BatchActuateStreamResponse upon error,
// but should for performance reasons send nothing upon success.
// It is up to the provider to decide if the stream shall be closed,
// as of today Databroker will not react on the received error message.
//
rpc OpenProviderStream(stream OpenProviderStreamRequest) returns (stream OpenProviderStreamResponse);

Expand Down Expand Up @@ -280,7 +282,12 @@ message BatchActuateStreamRequest {
repeated ActuateRequest actuate_requests = 1;
}

// Message that can be used by provider to indicate that an actuation request was not accepted.
// It is optional for the provider to use this message, and for performance reasons it is recommended
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really disagree with what you said here, just to qualify it a bit, I think it won't add a big overhead if the provider returns an OK response in case of success, since kuksa_client actuate is an unary call (it is not supposed to be used with a high frequency as publish datapoint stream). If we add one day actuate_stream to kuksa_client, it should not definitely return an OK response up success to reduce stream overhead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, returning feedback to the originator of the actuate call seems way better than introducing any form of optimization here. But it should probably be made a lot more clear what the expected provider behaviour is here.

Copy link
Contributor Author

@erikbosch erikbosch Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no problems changing it so that it is expected or even required that a result is returned. But if so, do we want say something more on the meaning of OK and when it should be sent. Like does it mean:

I acknowledge reception and did not find any problem with the request

or

I confirm that the request has been executed!

Or something in between? Or leave it to the provider? It could have an impact if we have the ambition to return a response to the requester - like how long do we want to wait for a response?

// not to send it upon success.
message BatchActuateStreamResponse {
SignalID signal_id = 1;
Error error = 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit of a architectural question, but do we really want a single Error enum that will contain every error possible for all functions in the whole service?

Wouldn't it be better to have specific errors for different functions, i.e. an ActuationError that only contains the errors possible when trying to actuate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the enum I used matched quite well. At least I can find use-cases for all of them

  • OK - Good to have a default value that indicates no error
  • INVALID_ARGUMENT - value is received is out of expected range
  • NOT_FOUND - signal id is not known by provider
  • PERMISSION_DENIED - signal is known by provider, but it does not accept that it is changes. A possible example could be that the provider has not registered itself for the signal id, so it does not expect to get it.

Off course one can argue that everything here is INVALID_ARGUMENT, so we do not need the rest. But to me the set seems quite reasonable.

https://github.com/eclipse-kuksa/kuksa-databroker/blob/main/proto/kuksa/val/v2/types.proto#L64

message Error {
  ErrorCode code = 1;
  string message = 2;
}

enum ErrorCode {
  OK                = 0;
  INVALID_ARGUMENT  = 1;
  NOT_FOUND         = 2;
  PERMISSION_DENIED = 3;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this could be further refined in the future I think to use ErrorCode if it meets current needs is sufficient for now.

}

message OpenProviderStreamRequest {
Expand Down